Pop Music Project

Introduction

As a music major, I am deeply fascinated by music and enjoy using my computer science skills to analyze it. I am also a fan of popular music in particular, but I have taken only one musicology course, and it analyzed modern popular music only briefly. This project is my attempt to use data visualization to learn more about the salient features of modern popular music and the pioneering artists who have shaped the genre over many decades. In this project, I hope to create visualizations that help answer the following questions:

Which artists were the most mainstream each year/decade?
What salient musical elements are most present in these artists' works?
What specific songs and albums were most popular in each year/decade?
Which salient musical elements are most present in these songs/albums?

Ultimately, I would like this project to use these questions to point to much broader questions, which may or may not be answerable given the scope of this project:

What salient musical elements have most impacted the evolution of pop music over the past several decades?
What trajectory has pop music followed over the past few years, and what might that tell us about the near future of pop music?

Datasets

Billboard "The Hot 100" Songs
- Source
- CSV

This dataset provides information about every song that appeared on the Billboard "Hot 100" from 1958 to 2021. The Billboard "Hot 100" is the music industry's weekly ranking of songs. This dataset is a CSV file with the following headings:

Date (week)
Rank (of that week)
Song (title)
Artist
Last-Week (rank in previous week)
Peak-Rank (highest rank ever)
Weeks-On-Board (number of weeks on the Billboard "Hot 100")

The Billboard 200 Acoustic Data
- Source
- JSON (Converted from SQLite). This file is 120 MB, so be careful when opening it in a text editor!

This dataset contains every album on the Billboard 200 from 1963 to 2019. The Billboard 200 is the music industry's weekly ranking of albums. This dataset was originally a table in an SQLite file, but I converted it to a JSON file with the following headings:

- Album (title)
- Artist
- Date (week)
- Id (carried over from SQLite)
- Length (number of tracks)
- Rank (of that week)
- Track_Length (total duration of album in ms)
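
The SQLite-to-JSON conversion described above could look roughly like the following. This is a minimal sketch, not the script that was actually used for this project: the function names `table_to_records` and `records_to_json` are hypothetical, and the table name has to be supplied by the caller because the original table name is not stated here.

```python
import json
import sqlite3


def table_to_records(conn, table):
    """Return every row of one SQLite table as a list of dicts keyed by column name."""
    conn.row_factory = sqlite3.Row  # lets each row be converted with dict(row)
    # Table names cannot be bound as SQL parameters, so the identifier is quoted instead.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    return [dict(row) for row in cursor]


def records_to_json(records, json_path):
    """Write the records as one JSON array (the whole table is held in memory)."""
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(records, handle)
```

A connection opened with `sqlite3.connect(...)` on the database file would be passed to `table_to_records`, and the result handed to `records_to_json`. Holding the whole table in memory is a simplification that is probably acceptable for a file of about 120 MB.
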

Taylor Swift Spotify Data
- Source
- CSV

This dataset contains many of the Spotify Audio Features for Taylor Swift songs specifically. I used this dataset because it seemed to be the best Kaggle dataset I could find with the Audio Features of a pop artist, and I personally know a lot about her music. This dataset is a list of songs with the following headers:

Name (title of track)
Album (album of track)
Artist (artist of track)
Release Date (release date of track)
Length (length of track in ms)
Popularity (popularity percentage from Spotify)
Acousticness (confidence in track being acoustic [0, 1])
Danceability (suitability for dancing [0, 1])
Energy (measure of intensity and activity [0, 1])
Instrumentalness (chance track is non-vocal [0, 1])
Liveness (presence of live audience in recording [0, 1])
Loudness (average level of the track in dB)
Speechiness (presence of spoken words in track [0, 1])
Tempo (estimated tempo in BPM)
Valence (musical positiveness (closer to 1) versus musical negativeness (closer to 0) [0, 1])

Visualizations

NOTE: All visualizations use a pre-aggregated JSON file

Top Artists Yearly (Billboard "Hot 100") [Bar Chart]

This bar chart shows the top artists for each year on the Billboard "Hot 100". The chart can be translated horizontally. It is zoomed in because attention should go to each artist within their given year rather than to comparisons between artists across years. While it is great that the Beatles have the highest value of 214 in 1964, comparing that to Ed Sheeran's 115 in 2017 doesn't tell the viewer much. What matters more is who stood out the most in a particular year.

Right now, the chart can be traversed by clicking on the bars or the arrow buttons. The first slider sets the rank cutoff (Top N) to any number between 1 and 100 and redraws the graph. The second slider changes the year that the graph is currently showing (1958-2021).

This visualization requires the date, rank, title, and artist, along with some pre-processing: the chart uses a heavily pre-processed JSON file that organizes the data from the CSV by year (see setupGraphData1(), which is unoptimized, but gave me the data I needed). One thing that I find fascinating about this chart is that if the rank cutoff is set to a low number (for instance 1-5), one-hit wonders or popular collaboration songs (e.g. Old Town Road - Lil Nas X Featuring Billy Ray Cyrus [2019]) are more likely to show up than more consistently renowned artists.
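
The kind of per-year pre-processing described above can be sketched as follows. This is not the project's setupGraphData1() function. It is a hypothetical Python equivalent that assumes the bar value is the number of weekly chart entries an artist had at or above the rank cutoff in a given year, and that the Date column starts with a four-digit year (for example `1964-02-01`). The function name `top_artists_by_year` and its parameters are made up for illustration.

```python
from collections import Counter, defaultdict


def top_artists_by_year(rows, max_rank=100, top_n=10):
    """Count weekly chart entries per artist per year for rows ranked at or above max_rank.

    rows: iterable of dicts with "Date", "Rank" and "Artist" keys (the CSV headings).
    Returns {year: [(artist, count), ...]} with each list sorted by count, descending.
    """
    counts = defaultdict(Counter)
    for row in rows:
        if int(row["Rank"]) <= max_rank:
            year = int(row["Date"][:4])  # assumption: dates begin with the year
            counts[year][row["Artist"]] += 1
    return {
        year: artist_counts.most_common(top_n)
        for year, artist_counts in sorted(counts.items())
    }
```

Rows could come from `csv.DictReader` over the Hot 100 CSV. Lowering `max_rank` corresponds to moving the first slider, and looking up a single year in the result corresponds to the second.
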

Progression of Audio Features for Artist (Taylor Swift) [Line Chart]

This is a multi-line chart that tracks the average values of Spotify Audio Features in Taylor Swift's music over the years. When the user hovers over a key in the legend, the line it refers to is highlighted. An interesting observation from this chart is that acousticness jumps significantly starting in 2020. This is likely because the two albums released that year (Folklore and Evermore), which are represented in the peak, both leaned into a folk-pop style that was significantly different from her other works. This chart required the date and the [0, 1] audio features of the Taylor Swift dataset.
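
The yearly averages behind a chart like this can be computed along the following lines. This is a minimal sketch under stated assumptions: the function name `yearly_feature_means` is hypothetical, the Release Date value is assumed to start with a four-digit year, and only the features listed above with a [0, 1] range are averaged.

```python
from collections import defaultdict
from statistics import mean

# The dataset's features that are measured on a [0, 1] scale.
UNIT_FEATURES = [
    "Acousticness", "Danceability", "Energy", "Instrumentalness",
    "Liveness", "Speechiness", "Valence",
]


def yearly_feature_means(songs, features=UNIT_FEATURES):
    """Average each audio feature over all songs released in the same year.

    songs: iterable of dicts keyed by the dataset's headers.
    Returns {year: {feature: mean value}}.
    """
    by_year = defaultdict(list)
    for song in songs:
        year = int(song["Release Date"][:4])  # assumption: dates begin with the year
        by_year[year].append(song)
    return {
        year: {f: mean(float(s[f]) for s in group) for f in features}
        for year, group in sorted(by_year.items())
    }
```

Each feature then becomes one line in the chart, with one point per release year.
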

Stacks of Song Counts on Billboard Hot 100 (Top 10 Artists of given decade) [Stack Bar Chart]

This is a stacked bar chart that shows the counts of each song on the Hot 100 from the top 10 artists in a given decade. Tooltips (TITLE tags) have been added to each stack with the song title and the number of times it appeared on the Hot 100 during that decade. This chart required the title, artist, and counts of every song in the Billboard "Hot 100" dataset for each decade.
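
A per-decade aggregation for a chart like this might look like the sketch below. It assumes that the "top 10 artists" are the ones with the most weekly Hot 100 entries in the decade, which is one plausible reading and not necessarily the ranking used here. The function name `decade_song_stacks` is hypothetical, and the Date column is again assumed to start with a four-digit year.

```python
from collections import Counter, defaultdict


def decade_song_stacks(rows, decade, top_n=10):
    """Build one stack per artist: the weekly entry count of each of their songs.

    rows: iterable of dicts with "Date", "Song" and "Artist" keys.
    decade: first year of the decade, e.g. 1960.
    Returns a list of {"artist": ..., "songs": [(title, count), ...]} for the
    top_n artists by total weekly entries in that decade.
    """
    per_artist = defaultdict(Counter)
    for row in rows:
        year = int(row["Date"][:4])  # assumption: dates begin with the year
        if decade <= year < decade + 10:
            per_artist[row["Artist"]][row["Song"]] += 1
    ranked = sorted(
        per_artist.items(),
        key=lambda item: sum(item[1].values()),  # total entries across all songs
        reverse=True,
    )
    return [
        {"artist": artist, "songs": songs.most_common()}
        for artist, songs in ranked[:top_n]
    ]
```

Each `(title, count)` pair maps to one segment of an artist's bar, and the same pair supplies the tooltip text.
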

Albums and Their Artists on the Billboard 200 (given decade) [Pie Chart]

This pie chart has an inner chart and an outer ring. The inner pie chart shows the number of times each artist from that decade appeared on the Billboard 200 (note that "Various Artists" and "Soundtrack" are included as artists, which is why they take up sizeable pieces). The wedges are arranged in numeric-to-alphabetical order, starting from the top (12 o'clock) and going clockwise. The ring outside the pie chart sub-partitions each inner wedge, showing the counts of that artist's individual albums. After drawing this chart, I realized I probably should have just set it up as a radial partition. All the data is stored in tooltips for each wedge, and the graph supports zooming and panning, which makes it easier to focus on a certain artist and their music if their wedge is small.
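
The inner-wedge and outer-ring relationship described above is a two-level hierarchy, which is also the shape a radial partition would consume. The sketch below builds such a structure. It is an illustration only: the function name `album_hierarchy` and the `name`/`value`/`children` keys are assumptions, and sorting names in plain string order (where digits sort before letters) is a guess at the ordering, not a confirmed detail of this chart.

```python
from collections import Counter, defaultdict


def album_hierarchy(rows, decade):
    """Nest Billboard 200 rows as decade -> artist -> album with weekly entry counts.

    rows: iterable of dicts with "Album", "Artist" and "Date" keys.
    decade: first year of the decade, e.g. 1970.
    """
    per_artist = defaultdict(Counter)
    for row in rows:
        year = int(row["Date"][:4])  # assumption: dates begin with the year
        if decade <= year < decade + 10:
            per_artist[row["Artist"]][row["Album"]] += 1
    return {
        "name": f"{decade}s",
        "children": [
            {
                "name": artist,
                "value": sum(albums.values()),  # size of the inner wedge
                "children": [  # outer ring: one entry per album
                    {"name": album, "value": count}
                    for album, count in sorted(albums.items())
                ],
            }
            for artist, albums in sorted(per_artist.items())
        ],
    }
```

Because each artist's value is the sum of their album counts, the outer ring segments always add up exactly to the inner wedge they sit on.
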

Audio Features of Taylor Swift Songs [Force Chart]

This chart generates a force chart for each Taylor Swift song in the Taylor Swift Kaggle dataset. It uses both the original dataset and a pre-aggregated JSON file. For each song (selectable using the SELECT tag), the chart presents a root node with the song title and leaf nodes for each [0, 1] audio feature of that song (the color coding matches chart 2). The leaf nodes are draggable in case the physics makes the text collide. An information box in the bottom right gives more details about the song beyond the audio features. Tooltips on each leaf node show that feature's value on the [0, 1] scale.
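
The root-and-leaves layout described above amounts to a small star-shaped graph per song. The sketch below builds the nodes and links for one song. The function name `song_force_graph` and the `nodes`/`links`/`source`/`target` keys are assumptions modeled on the input shape that force layouts commonly accept, not a description of this project's actual JSON file.

```python
# The dataset's features that are measured on a [0, 1] scale.
UNIT_FEATURES = [
    "Acousticness", "Danceability", "Energy", "Instrumentalness",
    "Liveness", "Speechiness", "Valence",
]


def song_force_graph(song, features=UNIT_FEATURES):
    """Return one root node (the song title) linked to one leaf node per feature.

    song: dict keyed by the dataset's headers, including "Name".
    """
    root_id = song["Name"]
    nodes = [{"id": root_id, "type": "root"}]
    links = []
    for feature in features:
        # Each leaf carries the feature value so a tooltip can display it.
        nodes.append({"id": feature, "type": "leaf", "value": float(song[feature])})
        links.append({"source": root_id, "target": feature})
    return {"nodes": nodes, "links": links}
```

With the seven features above, every song yields eight nodes and seven links, so the layout stays small enough for dragging individual leaves to be practical.
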

Conclusion

Because I do not have the experience or time needed to access the Spotify Developer Tools for my own data collection, many of the visualizations dealing with salient musical features were specific to Taylor Swift and not helpful for understanding pop music in general. Still, there were many things that surprised me while completing this project. For one thing, I was not sure whether calculating averages of the Audio Feature data would show anything significant. The line chart shows that it can: the 2020 peak in the acousticness of Taylor Swift's music tracks with the genre shift in the two albums she released that year. For the first bar chart, I thought it was very interesting that when looking at the top 1-5 rather than the top 50-100, one-hit wonders or particularly popular collaboration songs (e.g. Old Town Road - Lil Nas X Featuring Billy Ray Cyrus [2019]) showed up more than consistently high-ranking artists. I don't know that I can accurately answer any of my original questions, since my visualizations cover so much data, but they have provided me with a list of artists I haven't heard much about and could research further in the future. They also helped me learn how I can use Spotify Audio Feature data in a visualization if I ever collect some on my own.

Made by Collin Presser for CS391 - Data Visualization
