Dataflows Basic Overview (Designer)
About Dataflows
A dataflow is a sequence of linked data processing actions (such as extractors and transformers). Many actions in XM Discover, such as uploading data, running classifications, and calculating sentiment, are processed as dataflows.
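As a rough illustration of "a sequence of linked data processing actions", the idea can be sketched in Python as a list of functions where each action receives the output of the previous one. This is a conceptual sketch only: `extract_rows`, `add_length`, and `run_dataflow` are hypothetical names invented for this example and are not part of XM Discover.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical types for illustration; these are not XM Discover APIs.
Document = Dict[str, object]
Action = Callable[[List[Document]], List[Document]]


def extract_rows(_: List[Document]) -> List[Document]:
    """Extractor: produces documents (here, two hard-coded rows)."""
    return [{"text": "Great service"}, {"text": "Slow delivery"}]


def add_length(docs: List[Document]) -> List[Document]:
    """Transformer: adds a derived field to every document."""
    return [{**d, "length": len(str(d["text"]))} for d in docs]


def run_dataflow(actions: Iterable[Action]) -> List[Document]:
    """Run each action in order, feeding its output to the next action."""
    docs: List[Document] = []
    for action in actions:
        docs = action(docs)
    return docs


# An extractor linked to a transformer forms a two-step dataflow.
result = run_dataflow([extract_rows, add_length])
```

Here the extractor plays the role of a data upload and the transformer stands in for a processing step such as classification or sentiment calculation.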
You can access your dataflows in the Dataflows tab in Designer. The Dataflows tab is broken into 4 sections:
- Active: Dataflows that are currently running.
- Queued: Dataflows that have been triggered and put into a queue. These jobs will start once the currently active job is completed.
- Completed: Dataflows that have been completed, stopped, or failed.
- Scheduled: Dataflows that have been scheduled to run periodically.
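The relationship between the Active, Queued, and Completed sections can be sketched as a small state model. This is a simplified illustration, not a description of how XM Discover is implemented: the `DataflowBoard` class and its methods are hypothetical, the sketch assumes only one job is active at a time, and the Scheduled section is not modeled.

```python
from collections import deque

# Hypothetical model for illustration only; not an XM Discover API.
# Final statuses that place a job in the Completed section.
COMPLETED_STATUSES = {"Completed", "Completed with skipped", "Stopped", "Failed"}


class DataflowBoard:
    def __init__(self):
        self.active = None      # assumption: at most one running job
        self.queued = deque()   # triggered jobs waiting their turn
        self.completed = []     # (job, final status) pairs

    def trigger(self, job):
        """Start the job if nothing is running; otherwise queue it."""
        if self.active is None:
            self.active = job
        else:
            self.queued.append(job)

    def finish(self, status="Completed"):
        """Move the active job to Completed and start the next queued job."""
        if self.active is None:
            raise RuntimeError("no active job to finish")
        if status not in COMPLETED_STATUSES:
            raise ValueError(f"unknown final status: {status}")
        self.completed.append((self.active, status))
        self.active = self.queued.popleft() if self.queued else None


board = DataflowBoard()
board.trigger("Data upload")      # becomes the active job
board.trigger("Classification")   # waits in the queue
board.finish("Completed")         # Classification becomes active
```

Note that a stopped or failed job also lands in the Completed section, which is why `finish` accepts any of the four final statuses.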
Active Dataflows
You can view dataflows that are currently running in the Active section of Dataflows.
This page contains a table that shows you the following information about each dataflow:
- Job: The active job’s title.
- Project: The name of the project to which this job belongs.
- Details: Additional details about the job and its progress. This information is constantly updated as the job processes.
- Started: The date and time when this job was started.
- Duration: How long the job has been processing.
- Action: You can stop an active dataflow. See Stopping Dataflows for more information.
Queued Dataflows
In the Queued section of Dataflows, you can view dataflows that are waiting to run once the currently active dataflow finishes processing.
This page contains a table that shows you the following information about each dataflow:
- Job: The queued job’s title.
- Project: The name of the project to which this job belongs.
- Details: Additional details about the job and its progress. This information is constantly updated as the job processes.
- Queued: The date and time when this job was triggered and moved to the queue.
- Action: You can remove a queued dataflow from the queue. See Canceling a Queued Dataflow for more information.
Completed Dataflows
You can view dataflows that have been completed, stopped, or failed in the Completed section of Dataflows.
This page contains a table that shows you the following information about each dataflow:
- Job: The job’s title.
- Project: The name of the project to which this job belongs.
- Status: The job’s status. Possible statuses include Completed, Completed with skipped, Stopped, and Failed. Click Details to view the following information about the job’s components and processing details:
  - Component: The name of a component in the dataflow. Some dataflows contain multiple components, while others have only 1.
  - Processing time: The amount of time a component was running.
  - Documents: The number of documents processed by a component.
- Documents: The number of documents processed by the dataflow, if applicable. This number is equal to the number of rows in an uploaded Microsoft Excel file or the number of posts extracted from social media sources.
- Actions: The actions you can take on your dataflow. Which actions are available depends on the dataflow. They include:
  - Clean: Clean session data for AdHoc upload or Realtime Downstream dataflows. See Cleaning Project Data for more information.
  - Schedule: Schedule a dataflow to run at a future date and time. See Scheduling Dataflows for more information.
  - Resume: Resume stopped or failed dataflows. See Stopping and Resuming Dataflows for more information.
  - Force Complete: Manually change a stopped or failed dataflow’s status to Completed. See Force Completing Dataflows for more information.
  - View Log: View and export an error log for failed dataflows to get additional information about an underlying issue.
- Started: The date and time when this job was started.
- Completed: The date and time when this job was completed.
- Duration: How long the job took to process.
Scheduled Dataflows
You can view dataflows that have been scheduled to run in the future in the Scheduled section of Dataflows.
This page contains a table that shows you the following information about each dataflow:
- On/Off: Enable or disable the job.
- Job: The scheduled job’s title.
- Project: The name of the project to which this job belongs.
- Action: Reschedule or cancel a job.
- Details: The schedule that the job runs on.
- Next Run: The next time the job will run.