Sunday, 15 January 2017

Emotional arcs

I moaned about Scientific American at reference 1, and since then the February number has turned up, nicely illustrating the value it can add for me.

In the section called 'Graphic Science' at the end, we have an article about some work mapping the emotional arcs traced by more than 1,000 books, all in English and mostly fiction, taken from Project Gutenberg, an outfit I use myself occasionally for its excellent free copies of older books. See reference 2. The work itself is written up at reference 3, but I offer a few titbits here.

The headline is that despite the appearance of the graph in the snap above, we can reduce most stories to one of six basic emotional scenarios, with all the emotions being scored in a one dimensional way, from unhappiness to happiness:

  • "Rags to riches" (rise)
  • "Tragedy", or "Riches to rags" (fall)
  • "Man in a hole" (fall-rise)
  • "Icarus" (rise-fall)
  • "Cinderella" (rise-fall-rise)
  • "Oedipus" (fall-rise-fall)

with the six scenarios actually reducing to three pairs, each pair being a shape and its inversion. This summary hides the much longer sequences of rise and fall identified by one of the methods used, much more like that of the snap - which gave the impression that a good story needed to be a sine wave with regular emotional ups and downs. Which we knew already, but it is interesting to see how such waves can be extracted by computers.

The core of this extraction is the assignment of happiness scores to a long list of words and then using those scores to compute scores for chunks of text. The sequence of these scores for successive chunks then gives us our emotional arcs - correlated with, but quite different from, the plot, which is about events rather than feelings.
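For the curious, the scoring just described can be sketched in a few lines of Python. This is only my own toy illustration and not the authors' code: the word list, its scores (on an assumed scale from 1 for unhappy to 9 for happy) and the names TOY_SCORES, chunk_score and emotional_arc are all invented for the purpose.

```python
import re

# Hypothetical toy lexicon: the scores are invented for illustration,
# on an assumed scale from 1 (unhappy) to 9 (happy).
TOY_SCORES = {
    "love": 8.0, "joy": 8.0, "happy": 8.0,
    "death": 2.0, "sad": 2.0, "war": 2.0,
}

def chunk_score(words, scores):
    """Mean happiness of those words in the chunk that are in the lexicon.

    Returns None when no word in the chunk has a score.
    """
    known = [scores[w] for w in words if w in scores]
    return sum(known) / len(known) if known else None

def emotional_arc(text, scores, chunk_size):
    """Cut the text into successive chunks of chunk_size words and score each."""
    words = re.findall(r"[a-z']+", text.lower())
    chunks = [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]
    return [chunk_score(chunk, scores) for chunk in chunks]

arc = emotional_arc("love and joy then war and death then happy love", TOY_SCORES, 4)
print(arc)  # [8.0, 2.0, 8.0]
```

The toy sentence comes out as a fall followed by a rise, a very short "Man in a hole". A real run would need a lexicon of thousands of words and chunks of thousands of words apiece.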

Along the way I learned about a strange Amazon capability called Mechanical Turk, which appears to be a sort of online dating site where I can find people to do odd jobs for me for pay - the sort of odd jobs that a brain worker can do in a few hours armed with a PC and a telephone line. No idea how widely used it is. See reference 4.

I was also reminded of other things.

First, the windowing technique used in the extraction of emotional arcs, in time through the book, seemed very like a version of the convolution used by signals engineers and countless others.
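To make the resemblance concrete, here is a minimal sketch of a sliding window written as a convolution. The flat, equal-weight window is my assumption for the sake of illustration, and the function names are my own; I have not checked what shape of window the authors actually used.

```python
def convolve_valid(signal, kernel):
    """Discrete convolution, keeping only the positions where the
    kernel lies wholly inside the signal."""
    n, k = len(signal), len(kernel)
    flipped = kernel[::-1]  # convolution flips the kernel before sliding it
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(n - k + 1)
    ]

def moving_average(signal, window):
    """Sliding-window mean, expressed as convolution with a flat kernel."""
    kernel = [1.0 / window] * window
    return convolve_valid(signal, kernel)

print(moving_average([2.0, 4.0, 6.0, 8.0], 2))  # [3.0, 5.0, 7.0]
```

Swapping the flat kernel for some other set of weights gives a different sort of smoothing without changing anything else, which is much of the attraction of thinking of windows as convolutions.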

Second, the business of counting up words in books reminded me of the work done to fingerprint authors, as it were, by their use of words. Fingerprints which can be used in provenance and attribution and which, unlike real fingerprints, probably shift over time. Perhaps they even cross over the fingerprints of someone else.

Third, I was reminded of all those people who went in for analysing the structure of folk tales and fairy stories, producing big catalogues of same. Would they recognise the six chosen arcs?

And last but not least, what on earth would F. R. Leavis have made of it all? One imagines that he would have come up with some very withering & snooty remark, if he could have been bothered to think about it at all. A famous man in my school days, famous enough that I even bought a book by him back in 2010 or so - although, to be fair, it got culled before I had read much of it. See reference 5.

All good fun. Almost got my money back already.

Reference 1: http://psmv3.blogspot.co.uk/2017/01/scientific-american.html.

Reference 2: https://www.gutenberg.org/.

Reference 3: the emotional arcs of stories are dominated by six basic shapes - Andrew J. Reagan, Lewis Mitchell, Dilan Kiley, Christopher M. Danforth, and Peter Sheridan Dodds – 2016. Google will turn up a free copy for you.

Reference 4: https://www.mturk.com/mturk/welcome.

Reference 5: http://pumpkinstrokemarrow.blogspot.co.uk/search?q=plumed+leavis.
