Songs Without Love

I was wondering what songs are about. Most of them are about love of course but what about the other ones?

Terry Gross had a dude on the other day who wrote songs about a bomb that went off in his train carriage on the way to Machu Picchu. Abba won the 1974 Eurovision Song Contest with a song about Waterloo. Are there any other songs about weird topics?

I haven’t done a fun science project for a while and I need to learn about the latest versions of Ruby & Rails & Elasticsearch & D3 & Highcharts. I also want to dabble in some NLP stuff: sentiment analysis, classifiers, that kind of thing.

Here’s the TODO list:

* Grab every top 100 song since Al Martino in 1952.
* Grab all the lyrics to all the songs.
* Build a word cloud for each song.
* Build a word cloud for each week/year/decade.
* Do cluster analysis to find interesting topics.
* Write a classifier that can figure out what each song is about (love, war, bombs, whatever).
* Plot how that changes over time.
* Do sentiment analysis to see if songs are happy or sad.
* Plot how that changes over time.
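The word-cloud steps in the list above boil down to counting words. Here is a minimal sketch of that counting step, written in Python purely for illustration (the project itself is planned in Ruby). The stop-word list and the `word_counts` helper are hypothetical names, not part of any existing repo.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a much fuller one.
STOP_WORDS = {"the", "a", "and", "of", "to", "in", "my", "i", "you"}

def word_counts(lyrics):
    """Count how often each non-stop-word appears in a lyric."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    return Counter(word for word in words if word not in STOP_WORDS)

# A nursery rhyme stands in for a real lyric here.
sample = "Mary had a little lamb, little lamb, little lamb. Mary had a little lamb."
counts = word_counts(sample)
print(counts.most_common(2))
```

Summing these counters per week, year or decade would give the input for the larger clouds.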

The results will be here:

If you know Ruby (or want an excuse to learn) fork my repo and play along.

I need to get this finished by next year. The year after that at the very latest.

What’s your trick, Python?

The folks who wrote The Pragmatic Programmer recommend that you learn a new language frequently because, with each language, you’ll learn a new trick or a new way of thinking about programming that you never thought of before. When you go back to your old language you’ll take your new trick with you. Last year, I learned Objective C.

There is a lot to hate about Objective C. When I first started learning it, I felt like I had been time-warped back to 1987 along with some aliens from the planet Zarg but, over the last year, the language has improved so dramatically and so many of the rough edges have been smoothed that I could almost recommend it.

It’s still an ugly language, of course. The moments when you are confronted with bits of C in the middle of your Objective C method are like discovering that your ice cream topping is cod-liver oil.

It’s verbose too. The libraries feel like they were designed by colonial administrators in early-nineteenth century India. But, with automatic reference counting (ARC), it is no longer daunting to programmers who have forgotten how to alloc and dealloc.

Objective C has a couple of nice tricks though. My favourite is the fact that nil is an object and you can call its methods. In most languages, this would explode (or at least start a small fire):

NSArray *collection = nil;
for (int i = 0; i < collection.count; i++) {
  id item = [collection objectAtIndex: i];
  [item doSomething];
}

but it’s perfectly natural in Objective C. You can happily call methods on nil; they return nil or 0, so you can just get on with your work without dodging NullPointerExceptions at every turn.

I like Objective C’s syntax for calling methods too, strange as it is. There is something heart-warming about the way that the method name wraps itself around the arguments so that in,

[object populate: collection
        fromFile: filename]

the method name is actually populate:fromFile:. It feels more comfortable than named arguments, in my humble opinion, and the way Xcode wraps the method call and aligns the colons makes it easy to read. If only the method names weren’t designed by colonial civil servants who mistook verbosity for clarity, it would be pleasant even. The names in the Cocoa libraries have that odd do the needful feel about them, like the authors learned grammar in a faraway country, probably one with steam trains, punkah wallahs and government forms in triplicate and it’s hard to love a language that doesn’t have a syntax for accessing array elements.

Ruby is the biggest trickster of them all. My only complaint about that language is that sometimes – especially in Rails – the whole language feels like one big trick. Every time I come back to it, I am constantly saying – “Wow! You can do that? That is awesome! Wait! How does that work again?”.

Ruby taught me blocks:

collection.each { |item|  item.do_something }

Sure, every language has blocks or lambdas these days, but there is just something very soothing about the simplicity of Ruby’s syntax that puts me at ease. In C#, I have to concentrate really hard to get the syntax right and, in Objective C, I doubt there is anyone in the world who remembers how to make a callback without looking it up online. I like to imagine that there was one primordial Objective C block written in a prototype at One Infinite Loop in 1994 and it has been copy-pasted ever since.

The trait I like most about Ruby is its humanity. If it seems like you can do something, you can. All these expressions work and do exactly what you might expect:

3.times { print 'Ho! ' }
Time.now + 5.days   # 5.days comes from Rails' ActiveSupport
(1..100).each do |number|
  puts "#{number} is even." if number.even?
end

If only the Objective C folks would glance at the Ruby libraries and learn that terse does not have to be obscure and that verbosity is not intrinsically a good thing. Just ask the COBOL people.

A couple of years ago there was a debate online about the relative benefits of adding methods to objects to make a programmer’s life easier. The proposition was that such methods result in bloat which makes the API harder to learn but, really, how can you seriously argue that this:

if( array.length > 0 )
  element = array[array.length-1];

is more humane than this:

element = array.last

Meta-programming takes the Ruby language into the astroplane where the angels live and foolish mortals tread carefully. Here’s a builder for generating an xml file:

xml.slimmers do
  @slimmers.each do |slimmer|
    xml.slimmer slimmer.first_name
  end
end

And here’s the code for parsing some xml (OK, it’s not meta-programming but it is neat and tidy):

require 'libxml'   # assuming the libxml-ruby gem
xml = File.read('posts.xml')
parser = LibXML::XML::Parser.string(xml)
doc = parser.parse
doc.find('//posts/post').each do |post|
  puts post['title']
end

In Objective C, that would be over 7 million lines of code.

C# learned all of Java’s tricks and smoothed away its rough edges. It added lots of little tricks of its own to make it at least 9% better than Java. But its big, new trick is LINQ.

LINQ is essentially a functional language rammed right in the middle of a curly-braced imperative language. Once you get the hang of it, it’s amazing. I never did get the hang of it though and wrote all of my LINQ by typing it out in longhand and then clicking the helpful green squigglies that cause Resharper to turn this:

public IList<Album> FindAlbumsToGiveAway(IList<Album> albums)
{
  var badAlbums = new List<Album>();

  foreach (Album album in albums)
    if (album.Genre == "Country")
      badAlbums.Add(album);
  return badAlbums;
}

into this:

public IList<Album> FindAlbumsToGiveAway(IList<Album> albums)
{
  return albums.Where(album => album.Genre == "Country").ToList();
}

or, more ambitiously, into:

public IList<Album> FindAlbumsToGiveAway(IList<Album> albums)
{
  return (from album in albums
          where album.Genre == "Country"
          select album).ToList();
}

if I was in a functional mood (example stolen shamelessly from Alvin Ashcraft).

To achieve its lofty status of 9% better than Java, C# has had to add about 83% more syntax and therein lies its downfall. There is no way that one person can fit all that syntax into their brain unless they dedicate a lifetime to learning it, and why would anyone do that when there are so many finer languages to learn?

Less syntax is more, ceteris paribus, and this:

frequency = {}

is nicer than this:

Dictionary<string, int> frequency = new Dictionary<string, int>();

which brings us to Python, the language where whitespace is syntax.

At first blush, significant whitespace is Python’s big trick. There’s no need to add loop delimiters; just indent correctly – and you were going to do that anyway, right? – and Python will know what you mean. Once you get used to it, indenting loops is just so easy and obvious that you wonder a) why all the other languages didn’t copy it years ago and b) if Python has a better trick for me to learn.

Since Ruby, I am no longer impressed by parallel assignment,

a,b = 2,3

or generators,

def fib():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

sequence = fib()
next(sequence)
>>> 1
next(sequence)
>>> 1
next(sequence)
>>> 2

or default values for arguments,

def f(a, b=100):
  return a + b

f(2)
>>> 102

or the myriad other ways that Ruby and Python are more pleasant to use than Java or C# (OK. I am still a little bit impressed by generators).

List comprehension is a nice little trick,

numbers = range(1, 101)
squares = [x*x for x in numbers]

but it’s not dramatically better than Ruby’s collect method,

numbers = 1..100
squares =  numbers.collect { |x| x*x }

or C#’s,

var numbers = Enumerable.Range(1, 100);
var squares = numbers.Select(x => x * x);

(OK, it’s a lot better than C#’s)

In fact, Python is so similar to Ruby that I feel forced to compare based on æsthetic terms alone and, æsthetically, Python loses big time. If Guido and Matz were cousins, Guido would be the awkward, bookish cousin who is perfectly happy typing underbar underbar init underbar underbar open paren self close paren colon instead of initialize. Python has a strong mark of the geek about it.

Python also throws a lot of exceptions and you can barely shake a stick without causing a ShakenStickException. I mean, honestly, what is exceptional about getting something from a hash without checking to see if it’s in the hash first? Even Java gets that right, for Gosling’s sake!
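To be fair, Python does have lookups that don't throw. The sketch below contrasts plain indexing with `dict.get` and `collections.defaultdict`, both from the standard library. The word list is made up for illustration.

```python
from collections import defaultdict

frequency = {}
words = ["snow", "lamb", "snow"]

# Plain indexing raises KeyError when the key is missing...
try:
    frequency["fleece"]
    missing_key_raised = False
except KeyError:
    missing_key_raised = True

# ...but dict.get returns a default instead of raising.
for word in words:
    frequency[word] = frequency.get(word, 0) + 1

# defaultdict supplies the default itself, so no check is needed at all.
auto_frequency = defaultdict(int)
for word in words:
    auto_frequency[word] += 1

print(frequency)
print(frequency.get("fleece"))
```

With no default given, `get` returns `None` for a missing key, which is roughly the behaviour Ruby and Java hashes offer.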

Python’s inclination to hurl exceptions at the slightest provocation has cured me of the last traces of a youthful folly that said you should write the happiest of happy paths inline and put the rarer cases in exception handlers. Exceptions are nasty things and shouldn’t be tossed around lightly and guard clauses are not much better. The PragProgs (again) have a coding kata that requires you to minimize the number of boundary conditions in the implementation of a linked list. It’s a fine aspiration and finessing boundary conditions seems to result in less complexity and complexity is where the bugs hide.
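One well-known way to finesse linked-list boundary conditions is a sentinel node, so that "empty list", "first node" and "last node" stop being special cases. The sketch below illustrates that general idea and is not necessarily the solution the kata has in mind. The class and method names are my own.

```python
class Node:
    def __init__(self, value=None):
        self.value = value
        self.prev = self   # a fresh node points at itself
        self.next = self


class LinkedList:
    """Circular doubly linked list built around a sentinel node."""

    def __init__(self):
        self.sentinel = Node()   # never holds data

    def append(self, value):
        # No "is the list empty?" check: the sentinel is always there.
        node = Node(value)
        last = self.sentinel.prev
        node.prev, node.next = last, self.sentinel
        last.next = node
        self.sentinel.prev = node
        return node

    def remove(self, node):
        # No "is this the head or the tail?" check either.
        node.prev.next = node.next
        node.next.prev = node.prev

    def __iter__(self):
        node = self.sentinel.next
        while node is not self.sentinel:
            yield node.value
            node = node.next


items = LinkedList()
first = items.append("a")
middle = items.append("b")
last = items.append("c")
items.remove(first)
after_removing_first = list(items)
items.remove(last)
items.remove(middle)
after_removing_all = list(items)
```

Neither `append` nor `remove` contains a single `if`, which is the point of the exercise.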

The one Python feature that I haven’t seen anywhere else is the tuple. They are said to be magnificent and the distinction between a tuple and a list is allegedly profound but so far the significance escapes me. I’d be delighted if a commenter would help me understand or point me to some other feature that would make them choose Python over Ruby.
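One commonly cited practical difference is that a tuple is immutable, and therefore hashable, while a list is neither. Here is a minimal sketch of what that means in practice. The variable names are made up for illustration.

```python
point = (3, 4)        # tuple: fixed once created
path = [3, 4]         # list: can grow and change

path.append(5)        # fine, lists are mutable

try:
    point[0] = 10     # tuples do not support item assignment
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

# Because a tuple cannot change, it is hashable and can be a dictionary key.
visits = {}
visits[point] = visits.get(point, 0) + 1

try:
    visits[path] = 1  # a list is unhashable, so this raises TypeError
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```

Whether that counts as profound is another matter.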

All this harsh buzz over Python might make you wonder why I would be foolish enough to choose Python rather than Ruby at my new gig. The answer is that there is a specific library, nltk, that I needed to use.

The natural language toolkit does cool stuff like this:

import nltk

text = 'Mary had a little lamb. Its fleece was white as snow.'
sentences = nltk.sent_tokenize(text)
>>> ['Mary had a little lamb.', 'Its fleece was white as snow.']

which is harder than it looks. Once you have your sentences, you can find the words and, teleporting back to 6th grade language arts (assuming you grew up in America) or first year Latin (if you didn’t) you can analyse the parts of speech with:

words = [nltk.word_tokenize(sentence) for sentence in sentences]
>>> [
  ['Mary', 'had', 'a', 'little', 'lamb', '.'],
  ['Its', 'fleece', 'was', 'white', 'as', 'snow', '.']
]
parts_of_speech = nltk.pos_tag(words[0])
>>> [('Mary', 'NNP'), ('had', 'VBD'),
    ('a', 'DT'), ('little', 'RB'), ('lamb', 'NN'), ('.', '.')]

Hmmm. I think Mr Hickey would’ve gone with adjective rather than adverb for ‘little’ there. So would I. Anyhoo…

Once you have your parts of speech, you can diagram the sentence automatically (ssshhhh. Don’t tell your middle school kids):

That’s gotta be handy for something, right?

Now that we are stuck with Python, we get to wrestle with Django which is like Rails but brought to you by the same people that thought def __init__(self): was a good idea. I’m sure it’ll be great when it catches up with the state of the art but, Dudes! A separate language for templating!? I’m already learning a new language? You’re gonna make me learn another one for generating HTML? Didn’t you learn anything from JSP?

I think the folks who decided that separate languages for templates were a good idea are descended from the folks who thought separate drinking fountains were a good idea. Is it really easier for designer folks to type

{% for slimmer in slimmers %}
    <li>{{|lower }}</li>
{% endfor %}

than

{% for slimmer in slimmers %}
    <li>{{ }}</li>
{% endfor %}

Suddenly that significant whitespace business doesn’t seem so clever, does it? But, seriously, separate is rarely equal when it comes to template languages and the soft bigotry of low expectations hurts those it aims to help.

Template languages are the one area where the microsofties are ahead of the game with their Razor template syntax. It reduces the number of angle brackets and other unwanted syntax by 83%. Guaranteed!

@foreach (var slimmer in slimmers)

How, you might wonder, if you know all these languages, are you supposed to keep all the various syntaxes straight? The plain answer is… I don’t. I immediately forget everything I knew about the previous language about two weeks after I stop using it. That makes for embarrassing interviews when they ask me a Java question and, despite having 12 years of Java on my resume, I can’t remember how to construct and initialize a List, or is that a Vector? Or an ArrayList? One of them, anyway.

Fortunately for the forgetful among us, there is JetBrains. Even more fortunately, they have just released a brilliant Python IDE, PyCharm, to go along with the also brilliant RubyMine and IntelliJ. They also have the brilliant Resharper for the microsofties but you have to use it inside the not-quite-so-brilliant Visual Studio and they don’t get along entirely well together. They both enjoy a lot of memory consumption for a start.

PyCharm amazes me a little bit every day despite my 10 years of being amazed by JetBrains. The type inference system is, frankly, spooky. PyCharm knows the type of a variable that I merely whispered to a colleague the day before and knows all its methods and parameters, what it likes to have for lunch and its taste in science fiction. It handles renaming and more sophisticated refactorings even better than Resharper and it doesn’t even have .NET’s type system to help it along.

So. Python.

To summarize:

  • It’s not quite Ruby.
  • It’s jolly excellent at text mining.
  • It’s a lot nicer than C# (except in html templates) or Java.
  • PyCharm. Oh yeah.

I’m happy with our choice so far but ask me again when I get good enough to stop needing to refer to my cheat sheet every ten minutes. I might have a more informed opinion.

Proudly Powered by WordPress

I was bored with my wordpress theme and Stu’s fresh look made me decide it was time for a refresh. This is my third theme and I wanted to go right back to basics this time rather than copy an existing theme.

I started from the most basic theme template I could find – Starkers – and converted it to use all new html5 tags.  Starkers has no CSS, so I was starting from scratch.

Here, for posterity, are my three themes side by side.

Best part of the whole exercise? I have confirmed once and for all that PHP is absolutely the nastiest programming language I have ever come across. I don’t get why it is so popular at all. Debugging wordpress is like doing a surreal jigsaw puzzle where you are looking for a brightly coloured machine tool to match the giraffe. If there is an organizing principle, I couldn’t find it. It seems completely random whether it grabs markup from a template or spits it out from a function or a widget or a plugin. It’s amazing that WordPress is so good.

I’m not quite done yet. I have a few weird tags left to style. I want to do something with responsive design and I want to do something special for ipad and iphone. When I am done with all that, I might make it work on IE ( < 9.0 ). Google Analytics says I get hardly any visitors with IE (72% of visitor time on my blog comes from macs and ipads!) but my mum has IE so I either need to make it work or fly to England to install Firefox for her. That's probably the cheapest option to be honest.

PS. If those side-by-side images are still aligned vertically when you read this, it's because I haven't figured out how to style the image gallery yet. I didn't even know wordpress had a gallery plugin until just now.

PPS. If anyone needs a site built in wordpress - find someone else.

My Second Rant About Playlists

Rhapsody’s service has been spotty recently so I thought I’d try Spotify to see if it is all it is cracked up to be. And the verdict is…

…it’s OK.

But, like every other music app I have ever tried, it doesn’t do the one thing I want a music player to do.

I want to listen to music that I like.

Spotify excels at playlist management but I don’t want to manage playlists. I want to listen to music.

Here’s a reminder of why playlists suck:

Playlists are very seductive at first. You think Oh yes. I’ll build me a playlist with all my favourite songs. But then, after the third time you play it, you start thinking Oh man! This again!? I’m gonna build me another playlist. Then I’ll have two.

Before you know it, you have hundreds of playlists called things like Early English Folk (I) and Early English Folk (II) and you are spending all your time managing your playlists which, by the way, is exactly what the people who make the playlist managers want you to be doing.

I had a go at writing my own Rhapsody client a couple of years back but dropped that when a) Rhapsody started suing everyone who used their API to build apps and b) Rhapsody made an iPhone client that didn’t suck.

My dream didn’t die though and I am still in the market for an app that will play music I like. Here’s how it will work.

I search for some music, say, Gogol Bordello and play some “Top Tracks”.

I hit the button “Play more stuff like this” and it’ll find some Firewater or The Pogues.

After a while, I get bored with gypsy punk and play some Mediaeval Baebes instead. When it gets to Gaudete, I’ll tell it to play more like this and it’ll drift over into some Steeleye Span or some Fairport Convention.

When I use the app a few days later, I won’t need to tell it what to play because it will already know what I like. But, if I want to hear 23 versions of John Barleycorn I can do that too (this is where Pandora falls short).

Is there an app out there like this? Maybe someone has done something on top of the Spotify API? Don’t make me write it myself!

There is no right or wrong language

Stephen Fry loves language so much that he wants to wrest it away from the pedants who want to control it.

Favourite bit:

You slip into a suit for an interview, and you dress your language up too. You can wear what you like linguistically or sartorially when you are at home or with friends but most people accept the need to smarten up under some circumstances. It’s only considerate.

But please! People! For the love of God! Learn the difference between loose and lose!

It Changed my Life – Book One

I hate internet memes too, but I like this one. List 10(ish) books that had a big influence on your life. Here are Will Wilkinson’s and Conor Friedersdorf’s and Ross Douthat’s.

Sinclair Basic

At the end of the third year at Chis and Sid, I won a prize for the most improved student. After coming dead last in my class in the autumn and winter terms, I came first in class at the end of the year and won a book voucher (I did the same thing in each of the subsequent years too but, by then, they were on to me – no more prizes for me).

On my way home from school, I stopped in the bookshop and picked up a book called Programming in BASIC (Beginner’s All-purpose Symbolic Instruction Code).

My mum’s company had recently bought a mini-computer and mum took me to work one day to show it off. It was the first computer that I ever saw and she left me on my own with it for a couple of hours. I found the games!

It had a really primitive version of 20 Questions that I played over and over, fascinated that this chunk of metal could figure out what I was thinking. The highlight was when it didn’t guess my animal and it asked me for a question that would distinguish apes from monkeys.

The lowlight came soon after when I introduced my first bug into a computer program. All future players, after answering “no” to “Does it have a tail?” would be asked

Is it a chim?


The full page dot-matrix ASCII of Snoopy made an impression too.


                 X    XX
                X  ***  X                XXXXX
               X  *****  X            XXX     XX
            XXXX ******* XXX      XXXX          XX
          XX   X ******  XXXXXXXXX                XX XXX
        XX      X ****  X                           X** X
       X        XX    XX     X                      X***X
      X         //XXXX       X                      XXXX
     X         //   X                             XX
    X         //    X          XXXXXXXXXXXXXXXXXX/
    X     XXX//    X          X
    X    X   X     X         X
    X    X    X    X        X
     X   X    X    X        X                    XX
     X    X   X    X        X                 XXX  XX
      X    XXX      X        X               X  X X  X
      X             X         X              XX X  XXXX
       X             X         XXXXXXXX\     XX   XX  X
        XX            XX              X     X    X  XX
          XX            XXXX   XXXXXX/     X     XXXX
            XXX             XX***         X     X
               XXXXXXXXXXXXX *   *       X     X
                            *---* X     X     X
                           *-* *   XXX X     X
                           *- *       XXX   X
                          *- *X          XXX
                          *- *X  X          XXX
                         *- *X    X            XX
                         *- *XX    X             X
                        *  *X* X    X             X
                        *  *X * X    X             X
                       *  * X**  X   XXXX          X
                       *  * X**  XX     X          X
                      *  ** X** X     XX          X
                      *  **  X*  XXX   X         X
                     *  **    XX   XXXX       XXX
                    *  * *      XXXX      X     X
                   *   * *          X     X     X
     =======*******   * *           X     X      XXXXXXXX\
            *         * *      /XXXXX      XXXXXXXX\      )
       =====**********  *     X                     )  \  )
         ====*         *     X               \  \   )XXXXX
    =========**********       XXXXXXXXXXXXXXXXXXXXXX

A couple of years later, when I won that prize, there was no question but that I would buy myself a book on programming. I didn’t have a computer though, so I wrote my programs on paper and imagined them running.

Another year went by before Sir Clive Sinclair – who inherited the title Greatest Living Englishman when Winston Churchill died – released the first home computer for under £100. I saved up and bought myself one.

As soon as that fuzzy little K cursor started blinking in the corner of my TV screen I was hooked and there was no holding me back.

I drew my own ascii art. I played chess in 1kB. I painstakingly copied the machine code for a draughts program byte by byte from a book. I wrote a Monopoly program. I wrote a program to do Fourier Analysis. I learned Z80 assembly language which I hand-assembled using look-up tables because I didn’t have an assembler.

Non-programmers often don’t understand what a creative activity programming is. They think it’s about following mundane instructions. I can’t think of a more creative activity.

It’s truly liberating to discover that you can make something out of nothing but the thoughts in your head. Maybe people who are gifted at painting or music get a hint of this but to suddenly find that you can imagine something and then go build it! It makes you feel superhuman.

Sinclair also invented the first commercial electric car which turned out not to be so commercial after all and Uncle Clive lost both his fame and his fortune. A fickle nation turned its love to Alan Sugar and his wondrous Amstrads but I’ll always be grateful to Sir Clive for the gift he gave me.

Heaven Knows I’m Miserable Now

I have been a Rhapsody subscriber for several years. The service they provide is fantastic:

Think of a song. Any song. Play it.

I suspect that people who suggest “try Pandora” (and there are many of you) probably don’t get what Rhapsody is about. It’s like owning all the songs in the world and you can play any one at any time.

But their software absolutely sucks.

So when Rhapsody suspended my account (I got a new credit card and forgot to tell them), I took it as an excuse to go see what else is happening in music software in the years that I have been gone.

I tried something like twenty different players this week and they pretty much fall into two basic categories:

  1. Music discovery (like and Pandora)
  2. Playlist management

Within category 2, there are two business models (purchase tracks or monthly subscription) but the software all has the same primary use case:

User wants to manage their playlists.

They are playlist managers with the ability to actually play the music seemingly tacked on as an afterthought.

I don’t want to edit playlists.

I hate playlists.

Playlists are very seductive at first. You think Oh yes. I’ll build me a playlist with all my favourite songs. But then, after the third time you play it, you start thinking Oh man! This again!? I’m gonna build me another playlist. Then I’ll have two.

Before you know it, you have hundreds of playlists called things like Early English Folk (I) and Early English Folk (II) and you are spending all your time managing your playlists which, by the way, is exactly what the people who make the playlist managers want you to be doing.

No. Playlists are not a good solution for anything.

Here’s what I want:

I want to listen to music that I like.

I’ll clarify that a little:

One day, I might have a hankering to play 7 different versions of My Funny Valentine (Chet Baker’s is best) or every single recording of John Barleycorn Must Die (Traffic’s).

Another day I’ll have an urgent need to listen to Rogue’s Gallery: Pirate Ballads, Sea Songs, and Chanteys – because there is a piratefest coming up – or to hear the latest Lily Allen album.

I might have just read that there have only ever been two songs sung in Latin to make the UK Top Twenty and I’ll want to hear them both.

I might be on my way to a Gogol Bordello concert and I want to hear their albums over and over to get myself in the mood.

But most of the time,

I just want the thing to play me stuff that it thinks I’ll like.

Pandora excels at that last one but is a non-starter for the rest. iTunes will do the job if you don’t mind shelling out 99c every time you have a hankering to listen to some early Abba. If you listen to a lot of music, those 99cs will soon rack up.

So given that

a) music subscriptions rock and

b) the software for music subscription services sucks

oh, and by the way,

c) I have been meaning to learn Flex for a while now

there is only one thing for it…

..I’ll have to write my own damn software.

So that’s what I have been doing the last few evenings. It’s fun. I don’t get to program much at work any more so it’s a nice change of pace. I have a prototype that will play Rhapsody or Napster tracks on my wonderful Squeezebox. I have a design all sketched out and I even have a color scheme and icons (step 3 – profit!)


So, meanwhile, in my ongoing quest to find some existing software that doesn’t suck (and to steal ideas) I keep trying out new players and services. So far, they are all – every single one of them – playlist managers until…

…this morning I discovered GrooveShark.

GrooveShark is uncannily like my sketched design (they even copied my color scheme and icons) and I have been playing it all day.

They have a passably good search screen (mine is better of course but, since it is only sketched on paper, doesn’t work as well as theirs) and it is easy to find a song and stick it in your queue. But, what makes them different from everyone else is that tantalizing autoplay button.


If you stop adding tracks to your queue, AutoPlay will start playing stuff that it thinks you will enjoy. That was gonna be my killer feature!

I have figured out their algorithm though.

It is:

Play The Smiths.

Did the user veto it?

No – Play The Smiths all day. Over and over (and over). Throw in the occasional REM track.

Yes – Play REM all day (throw in some Smiths though in case they have changed their mind).

Try playing some rap every now and again to make sure they are paying attention and not just listening to any old crap.

Play some more Smiths.

That’s it.

If I had known it was this easy, I would’ve done it years ago.