When working with a new data set, it is important to make sure that all of your information is ready for analysis before it is imported into Tableau. Through this blog post, I will cover various tools to consider while cleaning up your data in Excel.
With courtesy from Mathforum.org’s Data Sets website, I downloaded a new data set, “Top 100 Private Colleges 2003” for analysis:
Here are a series of steps I took to successfully clean my data.
1) Remove all unnecessary headers, titles, and miscellaneous text. In order to do this, I selected the title and removed the row. I did the same to delete the “Quality Measures” and “Financial Measures” headers, and other text it included at the end of the chart.
2) Remove hyperlinks. Not really necessary, but it will help you to prevent from accidental windows opening while working. To do this, select the column and go towards the left of your screen where there appears an eraser and “Clear”. Click down on the eraser and select “Hyperlinks.”
3) Split columns with multiple information in one cell. In other words, make sure that in each row there is only one piece of data. In this data set, there is a column for both SAT and ACT separated by a “/”. In order for the data to be useful in Tableau, we need to separate these columns. In order to do this, first create two new columns to the right of the column you want to split. Then, select your column data and from the Data toolbar, select “Text to Columns.”
On the following screen, choose your data as “Delimited.” Select the symbol or delimiter with which your data is separated by and click Next. Then, select the Destination, which will be the first column to the right. Once you are done, make sure to name both of the new columns accordingly and delete the old column.
4) Make the percentages into decimals. This is just something I like to do for myself. It will not make any difference, however, I like to make them into decimals in case I need to utilize them in calculated fields. In order to convert from percentages to decimals, select your data and go to “Format” from your toolbar then click on “Cells.” Click on “General.” Now, your numbers are in decimal form.
5) Put your data in a table. Select all of your data including the headers and from the Tables toolbar select “New.” Your information will automatically be formatted into a table.
Finally, do not forget to save your work!
Your data is now ready for Tableau.
Stay tuned for tomorrow’s post, which will cover tools you may use within Tableau in order to get your data ready for analysis.