Beatles song lyrics - a statistical snapshot
A few years ago, I was toying with the idea of making a phone app where players would be challenged to demonstrate their knowledge of song lyrics.
I got distracted and never made the app, but not before writing some code to digest and profile lyrics on a song-by-song basis. I started with the Beatles (of course). I thought one easy and perhaps illuminating view of the songs would be a histogram of words by how many songs each one appeared in. I think I used lyrics.com as my input.
Caveats:
- there's gonna be slop room on "oh", "ooh" and "you're", "yer" etc.
- with fade-outs, it's not always clear what the last audible word is
- I only covered 184 songs, all written by the Beatles and released on Beatles albums
- various verb forms and singular/plural are treated separately. That is, "love" and "loves" are separate entries
- probably a lot more
There were 2291 distinct words used in those 184 songs. 1309 of the 2291 appear in only one song, and 293 appear in exactly two.
Here are the top ten. The numbers reflect how many songs each word appears in, not how many times the word appears overall, which helps dampen out the fudge from the caveats above.
- 161 "the"
- 160 "to"
- 151 "you"
- 151 "and"
- 142 "i"
- 135 "a"
- 123 "me"
- 108 "that"
- 107 "in"
- 97 "my"
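For anyone who wants to try the same kind of count, here is a minimal Python sketch of the idea: tally each word once per song, then see how many songs each word turns up in. This is not the original code. The tokenizing regex, the names `words_in` and `song_counts`, and the placeholder lyrics are all assumptions made up for illustration.

```python
import re
from collections import Counter

def words_in(lyrics):
    """Return the set of distinct lowercase words in one song's lyrics."""
    # Assumption: a "word" is a run of letters and apostrophes,
    # so "you're" stays one token and "love"/"loves" stay separate.
    return set(re.findall(r"[a-z']+", lyrics.lower()))

def song_counts(songs):
    """Map each word to the number of songs it appears in."""
    counts = Counter()
    for lyrics in songs.values():
        # A set per song means repeats within a song count only once.
        counts.update(words_in(lyrics))
    return counts

# Made-up placeholder text, not real lyrics.
songs = {
    "Song A": "the cat sat on the mat, the end",
    "Song B": "you and the cat",
    "Song C": "you're on your own",
}

counts = song_counts(songs)
distinct = len(counts)                                   # distinct words overall
singletons = sum(1 for n in counts.values() if n == 1)   # words in one song only
top = counts.most_common(10)                             # the "top ten" style list
```

Because each song contributes a set rather than a list, a word repeated fifty times in one chorus still only adds one to its total.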
A good question is "What Beatles songs do NOT have the word 'the' or 'to' in them?" ;)
The basis of my never-made game would have been the more interesting words: players would be asked to name songs that contained a given word in a timed trial, to see who came up with more.
For instance:
- Which songs contain "kitchen"? (there are 2)
- Which contain "everywhere"? (8)
- "bag"? (7! WTF?)
- "point"? (3?!)
etc etc.
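Questions like these boil down to a word-to-songs lookup. Here is a rough Python sketch of how that could work, again not the original code: `build_index`, `songs_containing`, and the placeholder lyrics are hypothetical, and the "2 to 8 songs" cutoff for picking interesting words is just an assumption.

```python
import re
from collections import defaultdict

def build_index(songs):
    """Map each word to the set of song titles that contain it."""
    index = defaultdict(set)
    for title, lyrics in songs.items():
        for word in set(re.findall(r"[a-z']+", lyrics.lower())):
            index[word].add(title)
    return index

def songs_containing(index, word):
    """Return the sorted titles of songs containing the given word."""
    return sorted(index.get(word.lower(), set()))

# Made-up placeholder text, not real lyrics.
songs = {
    "Song A": "in the kitchen again",
    "Song B": "nothing to see here",
    "Song C": "Kitchen lights are on again",
}

index = build_index(songs)
answers = songs_containing(index, "kitchen")

# Assumed cutoff: words found in 2 to 8 songs make reasonable quiz material.
candidates = sorted(w for w, titles in index.items() if 2 <= len(titles) <= 8)
```

The same index also answers the reverse kind of question: any title missing from `index["the"]` is a song without the word "the".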
LMK if there is some other analysis that would intrigue you. If you like, I could post the entire list of how many songs each word appears in.