Recently, I have been working on a new dataset. It is an online news popularity dataset that includes 60 factors for each URL. The dataset has about 40,000 rows and 61 columns. My goal is to find the relationships between the number of shares and these factors.
Once I got the dataset, I created a data dictionary to understand the data well. For example, I needed to know what each column name means and what the numbers in each row stand for. Because I wanted to make the data clearer to viewers, I cleaned the data and created some extra columns. For example, I renamed “weekday_is_Monday” to “Monday”, “data_channel_is_entertainment” to “Entertainment”, and “num_videos” to “Videos”, so that these names display more nicely in Tableau and help viewers understand the charts quickly.
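For readers who prefer to do this renaming step in code before loading the data into Tableau, here is a minimal sketch in plain Python. It only covers the three renames mentioned above; the `rename_columns` helper and the sample row are made up for illustration and are not taken from the real dataset.

```python
# Mapping from original column names to display names.
# Only the three renames mentioned in the post are listed here.
RENAMES = {
    "weekday_is_Monday": "Monday",
    "data_channel_is_entertainment": "Entertainment",
    "num_videos": "Videos",
}


def rename_columns(row, renames=RENAMES):
    """Return a copy of a row (a dict) with column names replaced where a rename exists."""
    return {renames.get(name, name): value for name, value in row.items()}


# Made-up example row, for illustration only.
sample_row = {
    "weekday_is_Monday": 1,
    "data_channel_is_entertainment": 0,
    "num_videos": 2,
    "shares": 593,
}
renamed = rename_columns(sample_row)
print(renamed)
```

Columns that are not in the mapping, such as “shares” here, keep their original names.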
Here are some screenshots of my analysis:
This is a bar chart that shows the number of shares for each category. I excluded the records that do not have a category by using the filter function in Tableau. As you can see, technology-related articles have the most shares and lifestyle articles have the fewest shares during that period.
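The same filter-and-sum logic can be sketched outside Tableau. The sketch below assumes each category is stored as a 0/1 indicator column. Only “data_channel_is_entertainment” is named in this post; the other two column names are hypothetical and simply follow the same pattern, and the rows are made-up numbers, not results from the dataset.

```python
from collections import defaultdict

# Assumed indicator columns. Only "data_channel_is_entertainment" appears in the post;
# the other names are hypothetical and follow the same naming pattern.
CHANNEL_COLUMNS = [
    "data_channel_is_entertainment",
    "data_channel_is_tech",
    "data_channel_is_lifestyle",
]


def shares_by_category(rows, channel_columns=CHANNEL_COLUMNS):
    """Sum shares per category, skipping rows that have no category flag set."""
    totals = defaultdict(int)
    for row in rows:
        for column in channel_columns:
            if row.get(column) == 1:
                totals[column] += row["shares"]
                break  # count each article under at most one category
        # Rows with no flag set fall through here and are left out,
        # which mirrors the filter applied in Tableau.
    return dict(totals)


# Made-up rows, for illustration only.
rows = [
    {"data_channel_is_tech": 1, "shares": 300},
    {"data_channel_is_lifestyle": 1, "shares": 100},
    {"data_channel_is_tech": 1, "shares": 200},
    {"shares": 999},  # no category, so it is excluded
]
totals = shares_by_category(rows)
print(totals)
```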
This is a pie chart that shows the number of shares on workdays and on weekends. As you can see, workdays have more shares than weekends. However, you might think this is simply because there are five workdays and only two weekend days. To address that concern, I created another bar chart below to show the shares for each day. As the bar chart shows, there were more shares on workdays than on weekend days, and Wednesday had the most shares.
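Another way to account for the five-versus-two difference is to compare the average shares per day in each group. This is only a sketch of that idea, not the method used for the charts here, and the daily totals in it are made-up numbers chosen for illustration.

```python
WORKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
WEEKEND = ["Saturday", "Sunday"]


def average_per_day(shares_by_day, days):
    """Average the daily share totals over the given days."""
    return sum(shares_by_day[day] for day in days) / len(days)


# Made-up daily totals, for illustration only.
shares_by_day = {
    "Monday": 100,
    "Tuesday": 120,
    "Wednesday": 150,
    "Thursday": 110,
    "Friday": 90,
    "Saturday": 60,
    "Sunday": 70,
}

workday_avg = average_per_day(shares_by_day, WORKDAYS)
weekend_avg = average_per_day(shares_by_day, WEEKEND)
busiest_day = max(shares_by_day, key=shares_by_day.get)

print(workday_avg, weekend_avg, busiest_day)
```

Dividing by the number of days in each group puts workdays and weekends on the same scale, so the comparison no longer depends on how many days each group contains.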
This is an interesting dataset. I am going to do a more detailed analysis and will post it here when I am done. Also, here is the link to my Tableau Public profile, where you can view more dashboards that I have made: https://public.tableau.com/profile/xinyin.wang#!/