Tutorial: Data Merging - Step 1
In this tutorial, we will use the Wrangell Data tools to clean and transform data. No account is required, and we will work with sample data sets.
By the end of this four-step tutorial, you will be able to merge two datasets together, clean up the data, split the data into two different datasets, and then save the datasets as JSON files.
Here are the steps in the tutorial:
Let’s get started!
Starting Data Sets
We are going to start with two data sets you might use for web page search engine optimization (SEO).
- SEO Research Data: Contains information about keywords, paths, and search volume.
- Website Path Data: Contains information about the URLs on your website, the number of views, and the bounce rate.
Start by downloading these two data sets - we will be using these two CSV files in the next step, data merging.
Upload the Data to the Data Merging Tool
These two data sets are related - both CSV files share a path column that we can use to merge the data together into one data set.
While you could use a spreadsheet or a programming language like Python to merge these two data sets, we are going to use the Wrangell Data tools to merge the data together.
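For comparison, here is a minimal sketch of what a merge on the shared path column could look like in Python, using only the standard library. The sample rows and every column name other than path are made up for illustration, and the sketch assumes an inner join (only paths present in both files are kept), which may differ from how the Data Merger tool handles unmatched rows.

```python
import csv
import io

# Made-up sample rows; only the `path` column name comes from the tutorial.
SEO_RESEARCH_CSV = """path,keyword,search_volume
/pricing,pricing plans,1200
/blog,data cleaning tips,800
/about,company history,150
"""

WEBSITE_PATHS_CSV = """path,views,bounce_rate
/pricing,5400,0.42
/blog,9100,0.61
/contact,700,0.35
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on(left_rows, right_rows, key):
    """Inner join: keep only rows whose key value appears in both inputs.

    Assumes each key value appears at most once in right_rows.
    """
    right_by_key = {row[key]: row for row in right_rows}
    merged = []
    for left in left_rows:
        right = right_by_key.get(left[key])
        if right is not None:
            # Combine the columns from both rows into one dictionary.
            merged.append({**left, **right})
    return merged


merged = merge_on(read_rows(SEO_RESEARCH_CSV), read_rows(WEBSITE_PATHS_CSV), "path")
for row in merged:
    print(row)
```

In this sample, /about and /contact each appear in only one file, so they are dropped from the merged result.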
To get started, visit the Data Merger tool and upload the two CSV files you downloaded in Step 1.
It doesn’t matter what order you upload them in - this example shows what happens if you start with the seo_research.csv file.
After uploading a file, click the Next button to move on to the next step.
Notice that there is a small preview of the data from the first file on the second step, so you can quickly see what sort of data you are working with.
Selecting Columns to Merge
After you upload website_paths.csv and click the Next button to move to step 3, you will see a data preview for both files, along with a data column picker tool.
Each of the columns for your data sources will be listed in the data picker. You can choose columns with different names (such as url and website), but this tutorial uses path as the column name for both data sets.
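To illustrate what choosing differently named columns means, the sketch below joins two small in-memory tables where the key is called url in one and website in the other. The rows and the keyword and views columns are hypothetical, and an inner join is assumed; this is not the tool's actual implementation.

```python
def merge_on_columns(left_rows, right_rows, left_key, right_key):
    """Inner join where the key column has a different name in each input.

    Assumes each key value appears at most once in right_rows.
    """
    right_by_key = {row[right_key]: row for row in right_rows}
    merged = []
    for left in left_rows:
        right = right_by_key.get(left[left_key])
        if right is not None:
            # Both key columns are kept, since their names differ.
            merged.append({**left, **right})
    return merged


# Hypothetical sample rows for illustration only.
left = [
    {"url": "/pricing", "keyword": "pricing plans"},
    {"url": "/blog", "keyword": "data cleaning tips"},
]
right = [
    {"website": "/pricing", "views": "5400"},
    {"website": "/contact", "views": "700"},
]

merged_rows = merge_on_columns(left, right, "url", "website")
print(merged_rows)
```

Only /pricing has a match on both sides, so the result contains a single row that carries both the url and website columns.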
Once you choose two columns, you are ready to merge your data! Click the button that says ‘Ready to Merge’ to kick off the process.
Previewing the Merged Data and Downloading
You will see a preview of your merged data set, showing the first ten rows.
Underneath that data preview, you can download the merged data as a CSV file. If you open the CSV file in a spreadsheet application, you should see the entire data set.
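The same preview-then-export idea can be sketched in Python: show only the first ten rows, but write every row to the CSV output. The generated rows and the views column are placeholders, and the sketch writes to an in-memory string rather than a file so it stays self-contained.

```python
import csv
import io

# Placeholder merged rows (25 of them) standing in for real merged data.
all_rows = [{"path": f"/page-{i}", "views": str(i * 100)} for i in range(1, 26)]


def preview(rows, limit=10):
    """Return at most `limit` rows from the start of the data set."""
    return rows[:limit]


def to_csv_text(rows):
    """Serialize all rows to CSV text, using the first row's keys as the header.

    Assumes rows is non-empty and every row has the same keys.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


preview_rows = preview(all_rows)
csv_text = to_csv_text(all_rows)

print(f"Previewing {len(preview_rows)} of {len(all_rows)} rows")
for row in preview_rows:
    print(row)
```

The preview is limited to ten rows, while csv_text holds the header plus all 25 rows, mirroring the difference between the on-screen preview and the downloaded file.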
Note that all of this processing happened on your local computer - none of the data left your web browser. We’ll take the CSV file that you downloaded, and analyze the merged data set for missing values in the next step of the tutorial.
Congratulations! You have successfully merged two data sets together. In the next step of the tutorial, we will be looking for missing values in the data set.