Visualizing the sentiments in U.S. Presidential Inauguration Speeches
Tags: projects, research
In previous posts, I wrote about a brief analysis of U.S. Presidential Inauguration Speeches and how to extend this analysis using tf–idf. In this post, I want to present an extended analysis based on sentiment analysis. Sentiment analysis encompasses a class of techniques for detecting whether sentences are meant negatively, neutrally, or positively.
Depicting sentiment over time
Since every inauguration speech has a beginning and an end, it forms a natural time series. Hence, I first calculated the sentiment scores for every sentence in a speech and scaled an artificial time parameter over the speech between 0 and 1. This yields a nice sentiment curve plot, in which the abscissa denotes the time of the speech, and the ordinate denotes the sentiment of a given sentence, with values close to +1 meaning that the sentence is extremely positive, 0 meaning that the sentence is neutral, and -1 meaning that the sentence is extremely negative.
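As a rough illustration of the procedure described above, here is a minimal Python sketch. It assumes the speech has already been split into sentences. The function names, the tiny word lists, and the example sentences are made up for this illustration and are not taken from the project's scripts. `textblob_polarity` relies on TextBlob's `sentiment.polarity`, which lies in [-1, 1], while `toy_score` is only a stand-in so that the example runs without TextBlob installed.

```python
from typing import Callable, List, Tuple


def textblob_polarity(sentence: str) -> float:
    """Polarity in [-1, 1] from TextBlob's default analyzer (requires textblob)."""
    from textblob import TextBlob  # imported lazily so the rest runs without it
    return TextBlob(sentence).sentiment.polarity


def sentiment_curve(
    sentences: List[str], score: Callable[[str], float]
) -> List[Tuple[float, float]]:
    """Return (t, sentiment) pairs, with t scaled linearly from 0 to 1."""
    n = len(sentences)
    if n == 0:
        return []
    if n == 1:
        return [(0.0, score(sentences[0]))]
    return [(i / (n - 1), score(s)) for i, s in enumerate(sentences)]


# Hypothetical word lists, used only by the stand-in scorer below.
POSITIVE = {"great", "hope"}
NEGATIVE = {"fear", "crisis"}


def toy_score(sentence: str) -> float:
    """Stand-in scorer for illustration; not a real sentiment model."""
    words = [w.strip(".,;!?").lower() for w in sentence.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Made-up sentences, not quotes from any speech.
example = ["We face a crisis.", "We gather today.", "We have great hope."]
print(sentiment_curve(example, toy_score))
# [(0.0, -1.0), (0.5, 0.0), (1.0, 1.0)]
```

Passing `textblob_polarity` instead of `toy_score` should give a curve of the kind discussed here, assuming TextBlob and its corpora are installed.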
Here are some example visualizations of the last three inauguration speeches. Positive sentences are shown in green, while negative ones are shown in red. I fill the area between the curve and the neutral line at 0 with the respective colour in order to show the patterns and ‘rhythm’ of different speeches. A black line indicates the mean sentiment over the speech.
It is interesting to see that Obama’s first speech appears to be more subdued and neutral than the subsequent speeches, which exhibit more peaks and thus a larger variability between extremely positive and extremely negative sentiment.
If you want to compare the different sentiment curves for your favourite presidents, you may do so—I have prepared a large visualization that combines all sentiment curves. Watching the appearance and disappearance of negative sentiments over time is quite fascinating. The 1945 speech by Roosevelt is a striking example of evoking very negative imagery for a prolonged period of time.
Comparing mean sentiments
For comparing individual presidents, the curves are well and good, but I was also interested in the global sentiment of a speech and how it evolves over time. To this end, I calculated the mean (average) sentiment over every speech and depicted it over time. This works because sentiments are always bounded to the interval [-1, 1], making their comparison very easy. Here is the average sentiment of a speech, plotted over time:
We can see an interesting pattern: after the Second World War, speeches become more positive on average. They remain that way until the first inauguration speech of Barack Obama, which, as I noted above, is somewhat subdued again. Afterwards, they pick up steam. Donald Trump’s speech evokes more positive sentiments, on average, than most of the speeches since 1945.
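The averaging step behind this plot can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name is hypothetical, and the years and sentiment values below are placeholders chosen for easy arithmetic, not results from the actual speeches. The output is tab-separated, which is one format gnuplot can read.

```python
from statistics import fmean
from typing import Dict, List, Tuple


def mean_sentiment_by_year(
    sentiments: Dict[int, List[float]]
) -> List[Tuple[int, float]]:
    """Return (year, mean sentiment) pairs sorted by year.

    Every per-sentence value lies in [-1, 1], so each mean does as well,
    which keeps speeches of different lengths directly comparable.
    Speeches without any scored sentences are skipped.
    """
    return [
        (year, fmean(scores))
        for year, scores in sorted(sentiments.items())
        if scores
    ]


# Placeholder values for illustration only, not measured data.
placeholder = {
    2005: [0.75, -0.25, 0.5, 1.0],
    2001: [0.0, 0.5],
}

for year, mean in mean_sentiment_by_year(placeholder):
    print(f"{year}\t{mean:.3f}")
```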
All in all, I think that this is a nice tool to assess patterns in speeches. If you want, go take a look at the individual sentiment curves, which are stored on GitHub. Maybe you will spot something interesting.
Code
I used the intriguing TextBlob Python module for this analysis. All visualizations are done using gnuplot. As usual, the scripts and data files, as well as the output, are stored in the GitHub repository for the project.
Have fun!