As a music major, I am deeply fascinated by music, and enjoy using my computer science skills to analyze it. I am also a fan of popular music in particular, but I have only had one musicology course which briefly analyzed modern popular music. This project is my attempt to use data visualization to learn more about the salient features of modern popular music, and the pioneering artists who have evolved the genre over many decades. In this project, I hope to create visualizations to help answer the following questions:
Ultimately, I would like this project to use these questions to point to much broader questions, which may or may not be answerable given the scope of this project:
This dataset provides information about every song that has been on the Billboard "Hot 100" from 1958 to 2021. The Billboard "Hot 100" is the music industry's weekly ranking of songs. This dataset is a CSV file with the following headings:
This dataset contains many of the Spotify Audio Features for Taylor Swift Songs specifically. I used this dataset since it seemed like the best Kaggle set I could find with the Audio Features of a pop artist, and I personally know a lot about her music. This dataset is a list of songs with the following headers:
This is a bar chart that shows the top artists for each year on the Billboard Hot 100. The chart can be panned horizontally. It is zoomed in because the focus should be on each artist within their own year rather than on comparisons between artists across years. While it is great that the Beatles have the highest value of 214 in 1964, comparing that to Ed Sheeran's 115 in 2017 doesn't tell the viewer much. What matters more is who stood out the most in a particular year. Right now, the chart can be traversed by clicking on the bars or the arrow buttons. The first slider changes the rank cutoff from the top 100 to any number between 1 and 100 and redraws the graph. The second slider changes the year that the graph is currently showing. This visualization requires the date, rank, title, and artist columns. Rather than reading the CSV directly, the chart uses a heavily pre-processed JSON file that organizes the data by year (see setupGraphData1(), which is unoptimized, but gave me the data I needed). One thing that I find fascinating about this chart is that if the rank cutoff is set to a low number (for instance, the top 1-5), one-hit wonders or popular collab songs (e.g. Old Town Road - Lil Nas X Featuring Billy Ray Cyrus [2019]) are more likely to show up than more established artists.
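The per-year counting behind a chart like this can be sketched in a few lines. This is not setupGraphData1() itself; the function name `topArtistsForYear`, the row shape `{ date, rank, title, artist }`, and the ISO-style `YYYY-MM-DD` date format are all assumptions made for illustration.

```javascript
// Hypothetical helper (not the project's setupGraphData1()).
// Counts how many weekly chart entries each artist has in one year,
// keeping only entries ranked at or above the cutoff `maxRank`.
// Assumes rows look like { date: "1964-02-01", rank: 1, title: "...", artist: "..." }.
function topArtistsForYear(rows, year, maxRank = 100, limit = 10) {
  const counts = new Map(); // artist -> number of qualifying chart entries
  for (const row of rows) {
    const rowYear = Number(String(row.date).slice(0, 4)); // assumes the year comes first
    if (rowYear !== year || Number(row.rank) > maxRank) continue;
    counts.set(row.artist, (counts.get(row.artist) || 0) + 1);
  }
  return [...counts]
    .map(([artist, count]) => ({ artist, count }))
    // Highest count first; ties broken alphabetically so the order is stable.
    .sort((a, b) => b.count - a.count || a.artist.localeCompare(b.artist))
    .slice(0, limit);
}
```

Calling `topArtistsForYear(rows, 2019, 5)` would correspond to moving the first slider to 5 and the second slider to 2019. Grouping everything by year once up front, as the pre-processed JSON does, avoids rescanning the full CSV on every slider change.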
This is a multi-line chart that tracks the average values of Spotify Audio Features in Taylor Swift's music over the years. When the user hovers over a key in the legend, the line it refers to is highlighted. An interesting observation from this chart is that acousticness jumps significantly starting in 2020. This is likely because the two albums released that year (Folklore and Evermore), which are represented in the peak, both leaned into a folk pop style that was significantly different from her other works. This chart required the date and the audio features of the Taylor Swift dataset that range over [0, 1].
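The yearly averaging can be sketched as follows. The function name `averageFeaturesByYear` and the `release_date` field name are assumptions; the actual column names in the Kaggle file may differ, and the list of features is passed in rather than hard-coded for that reason.

```javascript
// Hypothetical helper: averages each audio feature over all songs released in a year.
// Assumes songs look like { release_date: "2020-07-24", acousticness: 0.8, ... }
// and that every listed feature is numeric on every song.
function averageFeaturesByYear(songs, features) {
  const buckets = new Map(); // year -> { count, sums: { feature -> running total } }
  for (const song of songs) {
    const year = Number(String(song.release_date).slice(0, 4));
    if (!buckets.has(year)) {
      const sums = Object.fromEntries(features.map((f) => [f, 0]));
      buckets.set(year, { count: 0, sums });
    }
    const bucket = buckets.get(year);
    bucket.count += 1;
    for (const f of features) bucket.sums[f] += Number(song[f]);
  }
  // One point per year, in chronological order, ready to feed one line per feature.
  return [...buckets]
    .sort(([a], [b]) => a - b)
    .map(([year, { count, sums }]) => {
      const point = { year };
      for (const f of features) point[f] = sums[f] / count;
      return point;
    });
}
```

Averaging per year rather than per album means that two albums released in the same year, as in 2020, contribute to a single point on each line.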
This is a stacked bar chart that shows the counts of each song on the Hot 100 from the top 10 artists in a given decade. Tooltips (TITLE tags) have been added to each segment of a stack, giving the song title and the number of times it was on the Hot 100 during that decade. This chart required the title, artist, and counts of every album in the Billboard 200 dataset for each decade.
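One way to shape the data for a stacked chart like this is one entry per artist with one segment per title. The sketch below is an assumption about the approach, not the project's code: `stackedSongCounts` is a hypothetical name, rows are assumed to have `date`, `title`, and `artist` fields, and "top" artists are assumed to be ranked by total chart appearances in the decade.

```javascript
// Hypothetical helper: builds one stack per artist for a decade.
// Each stack holds a segment per title with the number of chart appearances.
// Assumes rows look like { date: "1964-02-01", title: "...", artist: "..." }.
function stackedSongCounts(rows, decadeStart, topN = 10) {
  const byArtist = new Map(); // artist -> Map(title -> appearances in the decade)
  for (const row of rows) {
    const year = Number(String(row.date).slice(0, 4));
    if (year < decadeStart || year > decadeStart + 9) continue;
    if (!byArtist.has(row.artist)) byArtist.set(row.artist, new Map());
    const titles = byArtist.get(row.artist);
    titles.set(row.title, (titles.get(row.title) || 0) + 1);
  }
  return [...byArtist]
    .map(([artist, titles]) => {
      const segments = [...titles].map(([title, count]) => ({ title, count }));
      const total = segments.reduce((sum, s) => sum + s.count, 0);
      return { artist, total, segments };
    })
    .sort((a, b) => b.total - a.total) // assumption: rank artists by total appearances
    .slice(0, topN);
}
```

Each segment's `title` and `count` are exactly the two values the tooltips display, so the same structure can drive both the bar heights and the TITLE text.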
This pie chart has an inner chart and an outer ring. The inner pie chart shows the number of times each artist from that decade was represented in the Billboard 200 (note that "Various Artists" and "Soundtrack" are included as artists, which is why they take up sizeable pieces). The wedges are ordered by artist name, numeric names first and then alphabetical, starting from the top (12 o'clock) and going clockwise. The ring outside of the pie chart subdivides each inner wedge by the counts of that artist's individual albums. After drawing this chart, I realized I probably should have just set it up to be a radial partition. All the data is stored in tooltips for each wedge, and the graph supports zooming and panning to make it easier to focus on a certain artist and their music if their wedge is small.
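The relationship between the inner wedges and the outer ring comes down to shared angles: each artist's wedge spans exactly the sum of that artist's album segments. Below is a minimal sketch of that angle computation. The function name `nestedWedges` and the input shape are hypothetical, and plain string comparison is assumed for the ordering (it places names that start with digits before names that start with letters). Angles are in radians, measured clockwise from 12 o'clock, which is also the convention `d3.arc` uses, although whether the project uses `d3.arc` is an assumption.

```javascript
// Hypothetical helper: computes start/end angles for an inner pie and its outer ring.
// Assumes artists look like { name: "...", albums: [{ title: "...", count: 3 }, ...] }.
function nestedWedges(artists) {
  // Plain string comparison: digits sort before letters.
  const sorted = [...artists].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0
  );
  const grandTotal = sorted.reduce(
    (sum, artist) => sum + artist.albums.reduce((s, album) => s + album.count, 0),
    0
  );
  const inner = [];
  const outer = [];
  if (grandTotal === 0) return { inner, outer };

  let angle = 0; // 0 radians = 12 o'clock, increasing clockwise
  for (const artist of sorted) {
    const wedgeStart = angle;
    let artistCount = 0;
    for (const album of artist.albums) {
      const span = (album.count / grandTotal) * 2 * Math.PI;
      outer.push({ artist: artist.name, title: album.title, count: album.count,
                   startAngle: angle, endAngle: angle + span });
      angle += span;
      artistCount += album.count;
    }
    // The inner wedge covers exactly the arc of its album segments.
    inner.push({ name: artist.name, count: artistCount,
                 startAngle: wedgeStart, endAngle: angle });
  }
  return { inner, outer };
}
```

Computing both rings in one pass guarantees the outer segments line up with their inner wedge, which is also what a radial partition layout would provide automatically.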
This chart generates a force chart for each Taylor Swift song in the Taylor Swift Kaggle dataset. It uses both the original set and a pre-aggregated JSON file. For each song (selectable using the SELECT tag), the chart presents a root node with the song title and a leaf node for each [0, 1] audio feature of that song (color coding matches chart 2). The leaf nodes are draggable in case the physics makes the text collide. An information box in the bottom right gives more details about the song beyond the audio features. Each leaf node has a tooltip containing that feature's value in [0, 1].
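The root-and-leaves structure described above maps naturally onto the nodes-and-links input that force layouts expect. The sketch below builds that input for one song. The function name `buildSongGraph`, the `name` field, and the node and link shapes are assumptions; the `source`/`target` id format is the kind a layout such as d3-force can resolve through a link id accessor, but the project's actual setup is not shown in this write-up.

```javascript
// Hypothetical helper: turns one song into a star-shaped graph.
// The root is the song title; each leaf is an audio feature whose value lies in [0, 1].
// Assumes the song looks like { name: "...", acousticness: 0.5, tempo: 120, ... }.
function buildSongGraph(song, features) {
  const root = { id: song.name, type: "root" };
  const leaves = features
    // Skip features outside [0, 1] (for example tempo or loudness).
    .filter((f) => typeof song[f] === "number" && song[f] >= 0 && song[f] <= 1)
    .map((f) => ({ id: f, type: "leaf", value: song[f] }));
  // Every leaf connects only to the root, so the graph is a simple star.
  const links = leaves.map((leaf) => ({ source: root.id, target: leaf.id }));
  return { nodes: [root, ...leaves], links };
}
```

Keeping the raw `value` on each leaf node means the tooltip text can be read directly from the node, and using the feature name as the id makes it easy to reuse the color mapping from the line chart.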
Because I do not have the experience or time needed to access the Spotify Developer Tools for my own data collection, many of the visualizations dealing with salient musical features were specific to Taylor Swift and not helpful for pop music in general. Still, there were many things that surprised me while completing this project. For one thing, I was not sure whether calculating averages for the Audio Feature data would show anything significant. The line chart suggests it does, since the 2020 peak of acousticness in Taylor Swift's music tracks with the genre shift in her two albums released that year. For the first bar chart, I thought it was very interesting that when looking at the top 1-5 rather than the top 50-100, one-hit wonders or particularly popular collaboration songs (e.g. Old Town Road - Lil Nas X Featuring Billy Ray Cyrus [2019]) showed up more than consistently high-ranking artists. I don't know that I can accurately answer any of my original questions since my visualizations cover so much data, but these visualizations have given me a list of artists I haven't heard much about and could research further in the future. They also helped me learn how I can use Spotify Audio Feature data in a visualization if I ever collect some on my own.