Cupid shuffle

Can data science find true romance?

Love and hip hop or country romance?

What strums your heartstrings: country music or hip hop? Can we classify one as superior when it comes to matters of affection? This project aims to see whether analysis using machine learning algorithms suggests that one genre is more romantic than the other.

My love of country music is in my blood. My mom's family is from the Deep South, specifically Mississippi, and I grew up listening to Hank Williams and Patsy Cline along with another child of the region, Elvis Presley. I am biased toward the belief that, judged on lyrics alone, country music is more romantic than many other styles, and I argue that as a genre it is loaded with phrases evoking and expressing a longing that does not show up as frequently in pop. So, to find out whether this is true, I did what any data professional would do and used the "super powers" of data science to see what might turn up.

Let the music play on!

The approach

I retrieved two randomly generated sets of playlists from two categories, "country romance" and "hip hop romance", using the Spotify API developer tools. This produced two lists of approximately 800 songs each. I then ran the song lists through the API provided by genius.com to obtain the lyric text for the combined total of about 1,600 tracks. The wording from all of these songs was subsequently normalized and cleaned with the Python programming language and processed using natural language machine learning algorithms. This process of extraction and transformation provided the basis for evaluating each genre and finding distinct word patterns along with their usage frequencies.
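To make the cleaning and counting step concrete, here is a minimal sketch of how lyric text could be normalized and tallied. It is not the project's actual code: the stop-word list, the function names `normalize` and `top_words`, and the sample lyrics are all assumptions made up for illustration, and it uses only the Python standard library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "i", "you", "to", "of", "in", "it", "my", "me", "is", "on"}

def normalize(lyrics):
    """Lowercase the text and split it into word tokens, keeping inner apostrophes."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    # Strip stray leading/trailing apostrophes, then drop empties and stop words.
    cleaned = (t.strip("'") for t in tokens)
    return [t for t in cleaned if t and t not in STOP_WORDS]

def top_words(songs, n=10):
    """Count words across all songs and return the n most frequent (word, count) pairs."""
    counts = Counter()
    for lyrics in songs:
        counts.update(normalize(lyrics))
    return counts.most_common(n)

# Made-up sample lines, not taken from the data set.
sample_songs = [
    "Your eyes, your eyes are all I see",
    "Love me, love me like you mean it",
]
print(top_words(sample_songs, 3))
```

Counting per genre would just mean calling `top_words` once on each genre's list of lyrics.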

Country romance

I've been the only country music lover in a crowd of detractors more than once, and when presenting my "case" for the gentle romance found within the lyrics, I point out that men sing the praises of women in an adoring context, like this song from Alan Jackson, "Big Green Eyes". Lo and behold, "eyes" makes the list of most frequently used words in country romance lyrics, as seen in the graph below.

Top words in country romance

Love and hip hop

The detractors referenced above could point out that the word "love" ranks higher in hip hop than in country, as seen in the chart below: it comes in 2nd place, compared with 4th in country. The counterargument is that, even so, the word "love" itself appears approximately 2,500 times in the random sampling of country lyrics versus approximately 1,750 in hip hop.
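The difference between a word's rank within a genre and its raw count can be sketched in a few lines. The helper `rank_and_count` and the toy counts below are invented purely for illustration and are not the project's data. The sketch also reports the word's share of all counted tokens, which is an added assumption about how one might put two samples on the same scale.

```python
from collections import Counter

def rank_and_count(counts, word):
    """Return (rank, raw count, share of all tokens); rank 1 is the most frequent word."""
    ordered = [w for w, _ in counts.most_common()]
    rank = ordered.index(word) + 1  # raises ValueError if the word never appears
    total = sum(counts.values())
    return rank, counts[word], counts[word] / total

# Toy counts, made up for this example only.
genre_a = Counter({"baby": 5, "love": 4, "night": 1})
genre_b = Counter({"eyes": 9, "home": 8, "heart": 7, "love": 6})

print(rank_and_count(genre_a, "love"))  # higher rank, smaller raw count
print(rank_and_count(genre_b, "love"))  # lower rank, larger raw count
```

In the toy data, "love" ranks 2nd in one genre and 4th in the other even though the second genre uses it more often in absolute terms, which is the same shape as the argument above.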

Hip hop top words

"Country Grammar" is a hip hop song

I took that song title to be an invitation to apply grammatical analysis, and the graphs below take you on a tagmemic journey through the language structures of the two musical styles. They appear remarkably similar across all features, even when accounting for the infusion of foreign words, which can be seen in the non-English category.

Country word types
Hip Hop word types
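A word-type breakdown of this kind can be sketched as follows, assuming the lyrics have already been run through some part-of-speech tagger that yields (word, tag) pairs. The tagger itself, the tag names, and the function `tag_shares` are assumptions for illustration; the project's own tagging step may differ.

```python
from collections import Counter

def tag_shares(tagged_tokens):
    """Return each tag's share of all tokens as a dict of tag -> fraction."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

# Hand-tagged toy line, invented for this example.
tagged = [("big", "ADJ"), ("green", "ADJ"), ("eyes", "NOUN"), ("shine", "VERB")]
print(tag_shares(tagged))
```

Comparing shares rather than raw tag counts keeps the two genres comparable even if one sample contains more words than the other.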

Analysis

Putting the fundamental flaws of the project aside, namely a relatively small amount of data and the "elephant in the room" that language is rich and cannot be fully evaluated by examining words out of context, the evidence suggests that country is more romantic. For one thing, nouns dominate hip hop, while country lyrics contain a more even-handed variety of pronouns, adjectives and adverbs. This suggests the possibility of greater language complexity in country music, which can be a byproduct of expressing emotion. Additionally, words suggestive of being polite and content, such as "god", "ma'am" and "home", made the country music list but expletives did not, and the reverse is true for hip hop.

That being said, none of this is conclusive, and it can easily be counterargued. For example, my assumption around the use of the word "god" is anecdotal and stems from familiarity with the culture. From an unbiased perspective, the frequent reference to a higher power could also suggest religious fervor, or could simply come from the saying "Oh my god". Deeper inquiries may produce different conclusions. If I were to stay within the machine learning model for analysis, it would be helpful to get a larger data sample, and next steps could include sentiment analysis, evaluation of common phrases instead of individual words, and the inclusion of song release dates to see whether language has changed over time.
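As a rough illustration of the "common phrases instead of individual words" idea, adjacent word pairs (bigrams) can be counted with the standard library alone. This is only a sketch of one possible next step, not something the project implements; the function name `bigram_counts` and the sample line are made up.

```python
from collections import Counter

def bigram_counts(tokens):
    """Count each pair of adjacent tokens in the order they appear."""
    return Counter(zip(tokens, tokens[1:]))

# Made-up sample line, split on whitespace for simplicity.
tokens = "oh my god oh my love".split()
for pair, count in bigram_counts(tokens).most_common(2):
    print(" ".join(pair), count)
```

Counting pairs would help separate a word like "god" used inside a stock saying from the same word used on its own.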

Cupid Shuffle

Much as a restaurant reviewer enjoys the perk of access to great food, this analysis required a fair amount of listening to good music, and ultimately hip hop kept me tapping my foot while programming. In keeping with the spirit of the project, here's the "Cupid Shuffle" for your enjoyment.

The code

If you are interested in the code, it is available in my GitHub repository: Code repository for cupid_shuffle. Please feel free to fork or clone the project. The code for accessing the Spotify API is housed here: Code repository for sound_effects. The GitHub page I created to explain how to use the API is here: sound_effects webpage.