Merge Task
About the Merge Task
With the merge task, you can combine data from multiple datasets into a single merged dataset. This allows you to view data from multiple sources in 1 convenient place, and gives you the ability to build dashboards and analyze results across data sources.
Data can be merged from up to 3 sources at the same time. The merging of data is referred to as a join. When setting up a merge task, refer to the task limits for information on the limitations that apply to your datasets.
Available Joining Methods
Before selecting a join method, decide which dataset will be the “left input” and which dataset will be the “right input”. This matters when you choose a join operation, because left outer and right outer joins keep all rows from different inputs.
Once you have set your left and right inputs, there are 3 join options to create the merged dataset:
- Inner: The merged dataset will only include matching rows found in both datasets.
- Left outer: The merged dataset will include all rows from the left input and matching rows found in the right input.
- Right outer: The merged dataset will include all rows from the right input and matching rows found in the left input.
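The difference between the 3 join options can be seen in a small sketch in plain Python. This is not how the merge task is implemented. The `join` function, the sample rows, and the field names (`email`, `contact_email`, `nps`, `plan`) are hypothetical, and filling unmatched fields with `None` is an assumption made for illustration.

```python
def join(left, right, left_key, right_key, how="inner"):
    """Join two lists of dict rows on one field from each input.

    how is "inner", "left_outer", or "right_outer". Field names in the
    two inputs are assumed not to collide.
    """
    if how not in ("inner", "left_outer", "right_outer"):
        raise ValueError(f"unknown join type: {how}")
    if how == "right_outer":
        # A right outer join is a left outer join with the inputs swapped.
        return join(right, left, right_key, left_key, how="left_outer")

    # Group the right rows by join value and collect their field names.
    index = {}
    right_fields = []
    for row in right:
        index.setdefault(row[right_key], []).append(row)
        for name in row:
            if name not in right_fields:
                right_fields.append(name)

    merged = []
    for row in left:
        matches = index.get(row[left_key], [])
        for match in matches:
            merged.append({**row, **match})
        if not matches and how == "left_outer":
            # Keep the unmatched left row and leave the right fields empty.
            padded = dict(row)
            for name in right_fields:
                padded.setdefault(name, None)
            merged.append(padded)
    return merged


# Hypothetical sample data: two surveys that share one email value.
survey_a = [{"email": "a@example.com", "nps": 9},
            {"email": "b@example.com", "nps": 4}]
survey_b = [{"contact_email": "b@example.com", "plan": "pro"},
            {"contact_email": "c@example.com", "plan": "free"}]

inner = join(survey_a, survey_b, "email", "contact_email", "inner")
left_outer = join(survey_a, survey_b, "email", "contact_email", "left_outer")
right_outer = join(survey_a, survey_b, "email", "contact_email", "right_outer")

print(len(inner), len(left_outer), len(right_outer))  # prints: 1 2 2
```

In this sample, only `b@example.com` appears in both inputs, so the inner join returns 1 row. Each outer join returns 2 rows: every row from the input it keeps in full, with empty values where the other input has no match.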
Setting Up a Merge Task
Once you have at least 2 datasets you want to merge, you can begin setting up your merge task.
- Click the navigation menu in the top-left corner.
- Choose Workflows.
- Click Create a workflow.
- Select Started at a specific time.
- Within your newly created workflow, click Select frequency and choose a frequency from the dropdown.
- Finish setting up the frequency of your workflow. See Scheduled Workflows for information on the options available for each frequency.
- Click the plus sign (+) underneath the frequency.
- Select Task.
- Select Merge Task.
- Click Select a data set and choose one of your datasets using the dropdown menu. Only survey projects and imported data projects that you’ve created or that have been shared with you will be available.
Qtip: To remove a dataset from the merge task, click the 3 horizontal dots to the right of the dataset.
- Click the dropdown underneath the “Fields” column, and select the fields you want to include by clicking the checkbox next to each field name.
Qtip: The maximum number of fields in the resulting dataset is 500. If your combined sources have more than 500 fields, only the first 500 will be selected.
Attention: Fields that start with an underscore (_) will not be available for selection.
- Click Add data set to add an additional survey. You can add up to 3 surveys.
- Repeat steps 10-11 for each dataset you add.
- Once you have added all your datasets, click Next. You will not be able to proceed until you have added at least 2 datasets.
- Click Select a data set under the left “Input” dropdown and select one of the surveys you added.
- Repeat step 15 and select a dataset for the right input.
- Click the join button between the left and right input dropdowns, and select how you want to join your datasets. See Available Joining Methods for information on the available join types.
- Click Select a field under the left “Join condition” dropdown, and select which field in your left input will be used to match your datasets. The field you select must have a field with matching values in your other dataset(s).
Qtip: When you join multiple datasets, you can only join them on 1 field.
- Repeat step 18 for your right input dataset.
- Click Next.
Attention: If you are merging 3 datasets, you will need to click Add join and repeat steps 15-19, selecting Join 1 as the left dataset and your third dataset on the right.
Qtip: To delete a join, click the 3 horizontal dots and select Delete.
- Enter a name for your merged dataset.
- If you want the merged data to be loaded into a new imported data project, click the checkbox. This will create a new imported data project the first time the task runs.
Attention: If you leave the box unchecked, you will need to create a load into a data project task in your workflow. Within this task, you can select an existing imported data project to be the destination for your merged dataset.
- If you are loading the merged data into a new imported data project, you will need to select one of your fields to be a Unique ID.
Attention: Unique IDs must be a string. Non-string fields cannot be selected.
Qtip: Survey choices cannot be used as a Unique ID because they are not unique.
- By default, each field will be loaded into your dataset with the survey name added at the beginning. If desired, you can click on the field names for each field and rename them.
Attention: Fields cannot start with an underscore (_). Any field that starts with an underscore will be filtered out of your merged dataset.
- Click Save.
- Make sure your workflow is enabled.
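The field rules described in the steps above (the survey name added at the beginning of each field, underscore fields filtered out, and a cap of 500 fields) can be sketched as follows. This is only an illustration of those rules, not the product's code. The `build_field_names` function and the sample names are hypothetical, and the separator between the survey name and the field name is an assumption, since the exact default format is not stated here.

```python
MAX_FIELDS = 500  # field cap for the merged dataset, as stated in this article


def build_field_names(sources, separator=" - ", max_fields=MAX_FIELDS):
    """Return default merged field names for the selected fields.

    sources is a list of (survey_name, field_names) pairs in the order
    the datasets were added. The separator is an assumption.
    """
    names = []
    for survey_name, field_names in sources:
        for field in field_names:
            if field.startswith("_"):
                # Fields starting with an underscore are filtered out.
                continue
            names.append(f"{survey_name}{separator}{field}")
    # Only the first max_fields fields are kept.
    return names[:max_fields]


names = build_field_names([
    ("Survey A", ["Q1", "_recordId", "Q2"]),
    ("Survey B", ["Q1"]),
])
print(names)  # ['Survey A - Q1', 'Survey A - Q2', 'Survey B - Q1']
```

Because the survey name is added to each field, two surveys that both contain a field called `Q1` still produce distinct names in the merged dataset.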
Execution Results
Once your merge task has run, you can view the task execution results. If your join does not produce any rows for the resulting dataset on the first run, a merged dataset will not be created.
Viewing Your Merged Dataset
After your merge task runs successfully, you will be able to view your combined dataset.
- Navigate to the Projects page.
- Click on your imported data project. This should have the name you used in step 22 while setting up your merge task.
- The first thing you will see in your project is your merged data. Each column will begin with the survey name so it is clear which dataset the column belongs to.
Qtip: If you decided to create a new imported data project within your merge task, your Unique ID column will be denoted by “id” in front of the column name.
- If you want to do analysis on your responses, you can click on the other tabs within Data & Analysis. See the following pages for more information:
Qtip: The tabs you see here will depend on your individual user permissions.
Once your data is imported into Qualtrics, you can use Stats iQ, Text iQ, crosstabs, and response weighting to analyze your data. You can also use your project as a data source in CX dashboards.
Limits
Merge tasks have the following limits. If you exceed these limits when setting up your merge task, it will not execute successfully.
Source Dataset Limits:
Number of fields within each source dataset | 500 fields |
Number of records within source dataset | 1 million records |
Row size | 100 KB |
Number of fields selected for merged dataset | 500 fields |
Merged Dataset Limits:
Number of records within joined dataset | 1 million records |
Size of joined dataset | 1 GB |
Task Limits:
Number of datasets within a join | 3 datasets |
Number of workflows that can contain a merge task | 5 workflows |
Merge task execution frequency | 1 time every 24 hours |
Inner Join
The following limits are specific to inner joins.
Number of fields within source dataset | 500 fields |
Number of records within source dataset | 1 million records |
Size of merged dataset | 1 million records |
Left Outer Join
The following limits are specific to left outer joins.
Number of fields within source dataset | 250 fields |
Number of rows within each source dataset | 50,000 rows |
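If you track the size of your datasets outside of Qualtrics, you could check them against the limits above before setting up the task. The following is a minimal sketch under that assumption. The `SourceDataset` class and the `check_limits` function are hypothetical and are not part of any Qualtrics API. No limits specific to right outer joins are listed above, so the sketch falls back to the general source dataset limits for that join type, which is also an assumption.

```python
from dataclasses import dataclass

MAX_DATASETS = 3
MAX_SELECTED_FIELDS = 500
# (max fields, max records) per source dataset, keyed by join type.
SOURCE_LIMITS = {
    "inner": (500, 1_000_000),
    "left_outer": (250, 50_000),
}
# General source dataset limits, used when no join-specific limit is listed.
DEFAULT_SOURCE_LIMITS = (500, 1_000_000)


@dataclass
class SourceDataset:
    name: str
    field_count: int
    record_count: int


def check_limits(sources, selected_field_count, join_type="inner"):
    """Return a list of limit violations; an empty list means none found."""
    problems = []
    if not 2 <= len(sources) <= MAX_DATASETS:
        problems.append(
            f"expected 2 to {MAX_DATASETS} datasets, got {len(sources)}")
    if selected_field_count > MAX_SELECTED_FIELDS:
        problems.append(
            f"{selected_field_count} fields selected, limit is {MAX_SELECTED_FIELDS}")
    max_fields, max_records = SOURCE_LIMITS.get(join_type, DEFAULT_SOURCE_LIMITS)
    for source in sources:
        if source.field_count > max_fields:
            problems.append(
                f"{source.name}: {source.field_count} fields, limit is {max_fields}")
        if source.record_count > max_records:
            problems.append(
                f"{source.name}: {source.record_count} records, limit is {max_records}")
    return problems


sources = [SourceDataset("Survey A", 300, 10_000),
           SourceDataset("Survey B", 100, 10_000)]
print(check_limits(sources, 400, "inner"))       # []
print(check_limits(sources, 400, "left_outer"))  # ['Survey A: 300 fields, limit is 250']
```

The same pair of datasets passes as an inner join but not as a left outer join, because the left outer join limits on fields and rows are lower.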
Troubleshooting
There are a few issues that may arise when setting up and running a merge task:
- The merge task will not run because of an ambiguous field name. This is caused by 2 fields on step 25 of the task setup being given the same name. Change the name of 1 of the fields to proceed.
- The merge task will not run because datasets are currently building. This is caused by the task being run with an older survey as one of the source datasets. Run the task again in 24 hours to proceed.
- The merge task will not run when a new survey is selected as a source dataset. This is caused by new surveys not being available for processing, even though they can be selected while configuring the task. Wait up to 24 hours before re-running the task.
- The merge task will not run when there were recent modifications to a source survey dataset. This is caused by modifications not being available for processing, even though they are visible while configuring the task. Wait a few hours before re-running the task.
- New survey responses are not available when setting up the merge task or within the merged dataset. Wait a few hours for these responses to be available within your task.
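To catch the ambiguous field name issue before the task runs, you could look for duplicates in the list of names you plan to use. The sketch below assumes names are compared exactly as typed. Whether the merge task itself ignores case or whitespace when comparing names is not stated here. The `find_ambiguous_names` function and the sample names are hypothetical.

```python
from collections import Counter


def find_ambiguous_names(field_names):
    """Return the field names that appear more than once, sorted."""
    counts = Counter(field_names)
    return sorted(name for name, count in counts.items() if count > 1)


renamed = ["Survey A - Q1", "Score", "Survey B - Q1", "Score"]
print(find_ambiguous_names(renamed))  # ['Score']
```

Any name returned by this check would need to be changed on 1 of the fields before saving the task.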