I am NOT politically savvy, so I decided the quickest way to learn more about the upcoming June election in Ontario was to look at what is being discussed on Twitter.
As this was a quick midday break project, my goal was to sift through “#elxn2018” tweets from the past seven days (as of May 19, 2018). I loaded the twitteR package to pull the tweets from Twitter and used R’s tidyverse and tidytext packages to remove some noise and produce a succinct word count. I did not focus on specific parties or platforms for this quick project, as I had allotted only two hours to it.
DISCLAIMER: Data is as of May 19, 2018. You need to create a Twitter app and token in order to use the twitteR package (see the official CRAN documentation).
    library(ROAuth)
    library(twitteR)
    library(tidyverse)
    library(tidytext)

    # keep your Twitter credentials private
    consumer_key <- "xxxxx"
    consumer_secret <- "xxxx"
    access_token <- "xxxxx"
    access_secret <- "xxxxx"
    setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

    # search query
    ontvote2018 <- searchTwitter("elxn2018", n = 1000)

    # convert to data frame
    ontvote2018_df <- twListToDF(ontvote2018)

    # tokenization: one lowercase word per row
    ontvotetkn <- ontvote2018_df %>%
      select(id, text) %>%
      unnest_tokens(word, text)

    # custom stop words, added to tidytext's built-in stop_words
    stop_words <- stop_words %>%
      bind_rows(tibble(word = c("https", "t.co", "rt", "amp", "elxn2018", "onpoli",
                                "public", "ontario", "ontario's", "2", "7", "june",
                                "day", "don't", "1", "2018", "election", "party",
                                "parties", "9pm", "00", "30pm", "saturday", "brampton")))

    onvote1 <- ontvotetkn %>%
      anti_join(stop_words, by = "word")

    # top 20 word count
    onvote1 %>%
      group_by(word) %>%
      tally(sort = TRUE) %>%
      slice(1:20) %>%
      ggplot(aes(x = reorder(word, -n), y = n)) +
      geom_col(fill = "yellow1") +
      theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
      xlab("") +
      ggtitle("Ontario Votes 2018 - Overall Top 20 words")
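If you do not have Twitter credentials handy, the tokenize, filter and count steps can still be tried on a few made-up tweets. The sketch below is only an illustration: the tweets in toy_tweets are invented, and the names toy_tweets, my_stop_words and toy_counts are placeholders that are not part of the analysis above.

```r
library(dplyr)
library(tidytext)

# Invented tweets standing in for the data frame returned by twListToDF()
toy_tweets <- tibble(
  id   = c("1", "2", "3"),
  text = c(
    "RT Healthcare is the top issue #elxn2018",
    "Doctors and nurses need support #onpoli",
    "Healthcare cuts are hurting patients and doctors"
  )
)

# A few custom noise words added to tidytext's built-in stop_words lexicon
my_stop_words <- bind_rows(
  stop_words,
  tibble(word = c("rt", "elxn2018", "onpoli"), lexicon = "custom")
)

toy_counts <- toy_tweets %>%
  unnest_tokens(word, text) %>%              # one lowercase word per row
  anti_join(my_stop_words, by = "word") %>%  # drop common and custom noise words
  count(word, sort = TRUE)                   # word frequencies, largest first

print(toy_counts)
```

Since unnest_tokens() lowercases the text and strips the "#", hashtags and plain words end up counted together, which is why the custom stop words are written without the "#".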
The top two words (beyond the typical focus on candidates, debates, current party-in-power, challenging the status quo, etc.) were:
- healthcare
- oncall4on
On further examination, oncall4on is actually the Twitter account for Your Ontario Doctors, which uses the hashtag #CarenotCuts.
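One way to do this kind of follow-up is to pull out the tweets that contain the token and read them in context. This is a rough sketch on invented data, not the exact steps I took: toy_df stands in for the data frame from twListToDF() (which has text and screenName columns), and the names mentions and tag_counts are made up for the example.

```r
library(dplyr)
library(stringr)

# Invented tweets standing in for the data frame returned by twListToDF()
toy_df <- tibble(
  id         = c("1", "2", "3"),
  screenName = c("voter_a", "voter_b", "voter_c"),
  text       = c(
    "Join @OnCall4ON and fight for patients #CareNotCuts",
    "Debate night tonight #elxn2018",
    "RT @OnCall4ON: Ontario health care is in crisis #CareNotCuts"
  )
)

# Keep only the tweets that mention the token, ignoring case
mentions <- toy_df %>%
  filter(str_detect(text, regex("oncall4on", ignore_case = TRUE))) %>%
  select(screenName, text)

# Tally the hashtags that appear alongside the token
tag_counts <- table(str_to_lower(unlist(str_extract_all(mentions$text, "#\\w+"))))

print(mentions)
print(tag_counts)
```

Reading the matching tweets shows whether a token is a word, an account or a hashtag, and the hashtag tally points to the next search term.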
I decided to explore #CarenotCuts.
    # Twitter query
    carenotcuts <- searchTwitter("carenotcuts", n = 1000)

    # same steps as above: convert to data frame, then tokenize
    carenotcuts_df <- twListToDF(carenotcuts)
    carenotcutstkn <- carenotcuts_df %>%
      select(id, text) %>%
      unnest_tokens(word, text)

    # update stop words
    stop_words2 <- stop_words %>%
      bind_rows(tibble(word = c("https", "t.co", "rt", "amp", "elxn2018", "onpoli",
                                "public", "ontario", "ontario's", "2", "7", "june",
                                "day", "don't", "1", "2018", "election", "party",
                                "parties", "9pm", "00", "30pm", "saturday", "brampton",
                                "carenotcuts", "candidate", "candidates", "debates",
                                "oncall4on", "healthcare", "dockaurg", "onhealth",
                                "nj6tgy7pcg", "learn", "onelxn", "oma", "mds")))

    carenotcuts1 <- carenotcutstkn %>%
      anti_join(stop_words2, by = "word")

    # top 15 word count
    carenotcuts1 %>%
      group_by(word) %>%
      tally(sort = TRUE) %>%
      slice(1:15) %>%
      ggplot(aes(x = reorder(word, -n), y = n)) +
      geom_col(fill = "yellow1") +
      theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
      xlab("") +
      ggtitle("Ontario Votes 2018 - #1 Issue is Healthcare")
These are the key observations, beyond the actual words themselves:
- There is a definite rallying cry, with action verbs like #join and #fight.
- There is finger pointing at #kathleen_wynne and #ontliberals and their healthcare policy.
- The situation is in #crisis and is #hurting #doctors, #nurses and #patient(s).
We read so much about how amazing healthcare is in Canada. This lunchtime project made me realize this may not be the case, especially in Ontario. I need, or rather we all need, to get on top of this and all other issues. Go on Twitter & explore the Liberals’ Kathleen Wynne, the Progressive Conservatives’ Doug Ford, the NDP’s Andrea Horwath and the Ontario Greens’ Mike Schreiner.
Ontario is Home. June 7, 2018. VOTE.
Fantastic insights.
Thanks Terry
Thanks Patrick. It was an interesting lunch project to find out what’s on the Twittersphere. Thanks and have a nice day!
Thank Mr P
I often watch the House of Commons discuss the issues at hand. It’s very interesting, and you know you’re hearing the news firsthand. There is a lot of attacking of each other’s parties, though, so it can be concerning.