Looking for Love in Brontë’s Wuthering Heights

Words have always been important when it comes to communicating concepts and emotions. Given the short attention span with which we now consume words on social media platforms, the choice of what words to use has become even more pressing.

I wanted to see how word selection, either individually or in context, could impact an emotional response. This is my search for love, through text mining in R, using “the greatest love story ever written,” Wuthering Heights (1847) by Emily Brontë.

I started with the prescribed textbook approach and came up with a bag of words, or rather a sack of WTH?!

[Image: the resulting bag of words]
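For readers who want to see what a bag of words boils down to, here is a minimal base-R sketch. It is not the script behind the image above: the sentence, the stop-word list, and the variable names (`text`, `stop_words_toy`, `bag`) are all made up for illustration, and a real project would more likely lean on a text-mining package.

```r
# Toy sentence standing in for the novel's text (not a quotation from the book).
text <- "Love endures the storm, and love forgives the storm, but love is not calm."

# Lowercase, then split on anything that is not a letter or an apostrophe.
tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
tokens <- tokens[tokens != ""]

# A tiny hand-picked stop-word list (hypothetical); real projects use a fuller one.
stop_words_toy <- c("the", "and", "but", "is", "not")
tokens <- tokens[!tokens %in% stop_words_toy]

# The bag of words: each distinct word with its frequency, most frequent first.
bag <- sort(table(tokens), decreasing = TRUE)
print(bag)
```

Word order and context are thrown away on purpose here, which is exactly why a raw bag of words says so little about the emotions behind the words.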

I just needed context as to what emotions some of these words evoke, plain and simple.

As I delved further, I stumbled on Text Mining with R by Julia Silge & David Robinson, as well as Silge’s incredible blog. Many thanks for the inspiration and help in the second half of my quest. Namaste.

Using the nrc lexicon from R’s tidytext package, the sentiment words evoked 3,478 moments/sentiments, of which only 43% were “positive,” a.k.a. feelings of love (see below). [Heads up: just for the sake of optimism, I categorized “surprise,” at 4%, as a “positive.”]

[Image: tibble of ranked sentiment counts]

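# Ranked sentiment & sentiment word cloud
library(dplyr)         # %>%, inner_join(), count()
library(tidytext)      # get_sentiments(); recent versions fetch "nrc" via the textdata package
library(wordcloud)     # wordcloud()
library(RColorBrewer)  # brewer.pal()

twh %>%                                                # tokenized text, assumed to have a `word` column
  inner_join(get_sentiments("nrc"), by = "word") %>%   # my choice among the three lexicons available
  count(sentiment, sort = TRUE, name = "freq") %>%     # stop here for the tibble above
  with(wordcloud(sentiment, freq, max.words = 100,
                 colors = brewer.pal(8, "Dark2")))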
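A percentage like the 43% quoted earlier comes from dividing each sentiment's count by the total. The sketch below shows that arithmetic on made-up data. The tokens, the mini-lexicon `toy_lexicon`, and the grouping in `positive_groups` are all assumptions for illustration; in particular, which categories count as "positive" (beyond "surprise," which the post mentions) is a guess, and none of the numbers come from the novel or from the real nrc lexicon.

```r
library(dplyr)

# Hypothetical tokens and a hypothetical mini-lexicon (not the real nrc lexicon).
words <- tibble(word = c("love", "storm", "joy", "grief", "love", "ghost", "shock"))
toy_lexicon <- tibble(
  word      = c("love", "joy", "storm", "grief", "ghost", "shock"),
  sentiment = c("joy", "joy", "fear", "sadness", "fear", "surprise")
)

# Assumption: the categories treated as "positive" (surprise included, per the post).
positive_groups <- c("joy", "trust", "anticipation", "positive", "surprise")

shares <- words %>%
  inner_join(toy_lexicon, by = "word") %>%            # keep only words the lexicon knows
  count(sentiment, sort = TRUE, name = "freq") %>%    # one row per sentiment
  mutate(share    = freq / sum(freq),                 # fraction of all matched sentiments
         positive = sentiment %in% positive_groups)

# Share of matched sentiments that fall in the "positive" group.
positive_share <- sum(shares$share[shares$positive])
print(shares)
print(round(100 * positive_share))
```

Swapping `toy_lexicon` for `get_sentiments("nrc")` and `words` for the tokenized novel would give the real breakdown, provided the positive grouping matches the one used in the post.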

The sentiment word cloud of Wuthering Heights below summarizes what to expect when one does find the greatest love of all. I hope I didn’t put Whitney Houston in your head right at this moment! That said, I see potential future blog(s) to explore further, like looking into the emotive cycle of the book itself, comparing this to other love stories or even other classics, etc. In the meantime, I am contemplating my next post on tidytext and text or sentiment mining! Thank you for reading.

[Image: sentiment word cloud of Wuthering Heights]

Preview the full R script, and before you leave, check out the caption for the picture of Wuthering Heights for a good chuckle. IMPORTANT NOTE: The script contains two text-mining options, Option 1 (oldskool) and Option 2 (with the ever-efficient tidytext). See Terry’s ‘WHeights’ Project on GitHub.



4 comments on “Looking for Love in Brontë’s Wuthering Heights”

  1. Really enjoyed this post; it has inspired me to read the book again. It’s been 30 or so years but I remember this as a very tumultuous story.

  2. Pingback: Three Text Sentiment Lexicons in R’s tidytext – DataCritics

  3. Terry Chang

    Thanks so much Christine. This project was actually a break from my crazy busy school schedule. Glad you enjoyed it.


  4. Pingback: Text Mining and Sentiment Analysis with Canadian Underwriter Magazine Headlines – datacritics
