Tutorial: Missing Data - Step 2
This is a continuation of the data cleaning tutorial - check out the Data Merging step if you haven’t already.
We’re going to start with the merged data set from that tutorial, and then we’ll look for missing values in the data set. When doing data analysis, missing values can cause problems, so it’s important to identify and handle them.
In our case, we will use another Wrangell Data tool, the Missing Data tool, to analyze the missing values and then filter out the rows that contain them.
Time to start!
Starting Data Set
We are going to start with the merged data set from the previous step in the tutorial. If you didn’t get a chance to merge the data, you can download the merged data set here:
- Merged SEO Data: The merged data set that contains some rows with missing values.
Feel free to explore the data set in a spreadsheet application before we start the tutorial.
Upload the Data to the Missing Data Tool
Visit the Missing Data tool and upload the merged data set that you downloaded in the previous step.
After you upload a file, you can preview the first few lines of its content. Click the Next button to move to the next step in the tool.
Missing Data Analysis
After uploading the data set, you will see a summary of the missing values in the data set by column.
From the report, we can tell that 56 rows are missing a Keyword value, and 56 rows are also missing a Search Volume value.
For the purposes of our tutorial, we don’t want to use rows with missing values, so we will filter out those rows in the next step.
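If you would like to reproduce this kind of per-column summary outside the tool, the following is a minimal sketch using only Python's standard library. It is not how the Missing Data tool itself works. The `count_missing` helper, the `URL` column, and the sample rows are made up for illustration; only the Keyword and Search Volume column names come from the tutorial data. The sketch assumes a value counts as missing when the cell is empty or contains only whitespace.

```python
import csv
import io


def count_missing(rows, columns):
    """Return {column: number of rows whose value is empty or whitespace}."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            # Assumption: an empty or whitespace-only cell counts as missing.
            if value is None or value.strip() == "":
                counts[col] += 1
    return counts


# Tiny made-up sample standing in for the merged SEO data.
sample = io.StringIO(
    "Keyword,Search Volume,URL\n"
    "blue widgets,1200,/widgets\n"
    ",,/about\n"
    "red widgets,,/red\n"
)
reader = csv.DictReader(sample)
rows = list(reader)
missing = count_missing(rows, reader.fieldnames)
print(missing)
```

To run it against the real file, you could replace the `io.StringIO` sample with `open("your_file.csv", newline="")`, using whatever name you saved the download under.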
Filtering Rows with Missing Data
After clicking the Next: Preview & Filter Data button, you will see a preview of the data set, along with a filter tool.
To filter out rows with missing values, select the checkbox next to a column name: rows that are missing a value in that column will be removed.
Each column shows the number of rows with missing values, so you can decide which columns to filter on.
Select the Keyword and Search Volume columns to filter out rows with missing values for those columns, then click the Next: Preview Filtered Data and Download button to process the data.
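The same filtering idea can be sketched in a few lines of Python, assuming again that an empty or whitespace-only cell counts as missing. The `drop_rows_missing` helper, the `URL` column, and the sample rows are hypothetical and only illustrate the logic; the tool's own implementation may differ.

```python
import csv
import io


def drop_rows_missing(rows, required_columns):
    """Keep only rows that have a non-empty value in every required column."""
    return [
        row
        for row in rows
        if all((row.get(col) or "").strip() for col in required_columns)
    ]


# Tiny made-up sample standing in for the merged SEO data.
sample = io.StringIO(
    "Keyword,Search Volume,URL\n"
    "blue widgets,1200,/widgets\n"
    ",,/about\n"
    "red widgets,,/red\n"
)
rows = list(csv.DictReader(sample))

# Equivalent to ticking the Keyword and Search Volume checkboxes.
filtered = drop_rows_missing(rows, ["Keyword", "Search Volume"])
print(len(rows), "rows before,", len(filtered), "rows after")
```

A row is dropped if it is missing a value in any of the selected columns, so selecting more columns can only remove more rows.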
Previewing the Filtered Data and Downloading
You will see a preview of the filtered data set, showing the first ten rows.
Underneath that data preview, you can download the filtered data as a CSV file. If you open the CSV file in a spreadsheet application, you should see the entire filtered data set.
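For completeness, here is a rough sketch of writing filtered rows back out as CSV with Python's standard `csv` module. The `write_csv` helper and the sample row are made up for illustration, and the sketch writes to an in-memory buffer rather than a real file so that it can run anywhere.

```python
import csv
import io


def write_csv(rows, fieldnames, out):
    """Write a header line followed by one CSV line per row dictionary."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Made-up filtered row; in practice this would be the filtered data set.
filtered = [{"Keyword": "blue widgets", "Search Volume": "1200"}]

buffer = io.StringIO()
write_csv(filtered, ["Keyword", "Search Volume"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, you could pass a file opened with `open("filtered.csv", "w", newline="")` in place of the buffer; the file name here is only an example.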
Congratulations! You have successfully filtered out rows with missing values from your data set. In the next step of the tutorial, we will be splitting the data set into two separate data sets.