Code to compare Facebook and YouTube comments

May 30, 2017 · 1079 words · 6 minute read




Working with Facebook comments

Cleaning and tidying the data

Here I repeat, for the Facebook comments, the cleaning steps I applied to the YouTube comments.

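library(dplyr)     # pipes, joins and grouped summaries
library(tidyr)     # spread()
library(tidytext)  # unnest_tokens(), stop_words, get_sentiments()
library(ggplot2)   # plots

fb_comments <- fb_comments %>% 
  filter(com_text != "") %>%                             # drop empty comments
  left_join(videos_fb, by = c("post_id_fb" = "id")) %>%  # attach the video metadata
  group_by(short_title) %>% 
  mutate(n = n(),                                        # number of comments per video
         com_created = as.Date(com_created)) %>% 
  ungroup() %>% 
  filter(n >= 100) %>%                                   # keep videos with at least 100 comments
  select(short_title, video_id = ids, post_id_fb, com_text, com_id, com_created)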

tidy_fb_comments <- fb_comments %>%
  tidytext::unnest_tokens(word, com_text) %>%
  anti_join(stop_words, by = "word") 
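
To make the tokenising step concrete, here is a minimal sketch on two made-up comments. The `toy_comments` table and its contents are hypothetical and only mimic the `com_id` and `com_text` columns used above; `unnest_tokens()` and `stop_words` come from tidytext.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy comments, not taken from the real data
toy_comments <- tibble(
  com_id   = c("c1", "c2"),
  com_text = c("I love this video", "This is a terrible idea")
)

toy_tidy <- toy_comments %>%
  unnest_tokens(word, com_text) %>%   # one lowercased word per row
  anti_join(stop_words, by = "word")  # drop common words such as "this"

toy_tidy
```

Each remaining row keeps its `com_id`, which is what later allows the words to be counted per comment.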

Plot the most positive and most negative words

With a tidy data frame in hand, I plot the most positive and most negative words in the Facebook comments, so that they can be compared with the YouTube ones shown in the original article.

fb_pos_neg_words <- tidy_fb_comments %>%
  inner_join(get_sentiments("bing"), by = "word") %>%  # keep only words in the Bing lexicon
  count(word, sentiment, sort = TRUE) %>%
  group_by(sentiment) %>%
  top_n(10, n) %>%                                     # ten most frequent words per sentiment
  ungroup() %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  scale_fill_manual(values = c("red2", "green3")) +    # negative in red, positive in green
  facet_wrap(~sentiment, scales = "free_y") +
  ylim(0, 2500) +
  xlab(NULL) +
  ylab(NULL) +
  coord_flip() +
  theme_minimal()
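
One detail of the grouped `top_n()` step is worth a small illustration: it keeps ties, so a group can return more rows than requested. The sketch below uses hypothetical word counts (`toy_counts` is invented for the example) and asks for the top two words per sentiment.

```r
library(dplyr)

# Hypothetical word counts, invented for illustration
toy_counts <- tibble(
  word      = c("love", "great", "nice", "bad", "hate", "awful"),
  sentiment = c("positive", "positive", "positive",
                "negative", "negative", "negative"),
  n         = c(30, 20, 20, 25, 10, 5)
)

toy_top <- toy_counts %>%
  group_by(sentiment) %>%
  top_n(2, n) %>%   # "great" and "nice" tie for second place, so both are kept
  ungroup()

toy_top
```

Here the positive group returns three words and the negative group two, so a facet in the plot above may show more than ten bars when counts are tied.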

Sentiment by comment and by video

As I did for the YouTube videos, I compute a sentiment score for every comment and then aggregate it for every video.

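# Sentiment per comment: number of positive words minus number of negative words
fb_comment_sent <- tidy_fb_comments %>%
  inner_join(get_sentiments("bing"), by = "word") %>% 
  count(com_id, sentiment) %>%
  spread(sentiment, n, fill = 0) %>%          # one column per sentiment, 0 when absent
  mutate(sentiment = positive - negative) %>% 
  ungroup() %>% 
  left_join(fb_comments, by = "com_id")       # bring back the video title and text

# Sentiment per video: totals and mean comment score
fb_title_sent <- fb_comment_sent %>% 
  group_by(short_title) %>% 
  summarise(pos        = sum(positive),
            neg        = sum(negative),
            sent_mean  = mean(sentiment),     # computed before `sentiment` is redefined below
            sentiment  = pos - neg) %>% 
  ungroup() %>% 
  arrange(desc(sentiment))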
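
The arithmetic behind these two tables can be checked by hand on a tiny example. In this sketch `toy_hits` stands in for the result of the join with the lexicon (one row per matched word), and `toy_videos` maps comments to videos; both are hypothetical and only reuse the column names from the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical lexicon matches: one row per sentiment word found in a comment
toy_hits <- tibble(
  com_id    = c("c1", "c1", "c1", "c2", "c2", "c3"),
  sentiment = c("positive", "positive", "negative",
                "negative", "negative", "positive")
)

# Hypothetical mapping from comments to videos
toy_videos <- tibble(
  com_id      = c("c1", "c2", "c3"),
  short_title = c("A", "A", "B")
)

# Per comment: positive words minus negative words
toy_comment_sent <- toy_hits %>%
  count(com_id, sentiment) %>%
  spread(sentiment, n, fill = 0) %>%
  mutate(sentiment = positive - negative)

# Per video: totals and mean comment score
toy_title_sent <- toy_comment_sent %>%
  left_join(toy_videos, by = "com_id") %>%
  group_by(short_title) %>%
  summarise(pos       = sum(positive),
            neg       = sum(negative),
            sent_mean = mean(sentiment),
            sentiment = pos - neg)

toy_title_sent
```

Comment c1 scores 2 - 1 = 1 and c2 scores 0 - 2 = -2, so video A has a mean of -0.5 and a total of -1. Note that a comment containing no lexicon word never appears in `toy_hits`, so in this approach it does not contribute to the per-video mean.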

Joining Facebook and Youtube comments

I join the YouTube and Facebook per-video sentiment tables to compare the comments. To make the comparison fair, I keep only the videos that are present on both platforms, which is what the inner join does.

comments_by_title <- yt_title_sent %>% 
  inner_join(fb_title_sent, by = "short_title") %>%  # only videos present on both platforms
  select(vid_created, 
         short_title, 
         mean_sent_yt = sent_mean.x,                 # .x columns come from YouTube
         mean_sent_fb = sent_mean.y) %>%             # .y columns come from Facebook
  ungroup() %>% 
  mutate(diff = mean_sent_fb - mean_sent_yt,         # positive when Facebook is more positive
         short_title = reorder(short_title, -diff)) %>% 
  arrange(desc(diff))
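
The `.x` and `.y` suffixes in the `select()` call come from the join itself: when both tables share a column name that is not a join key, dplyr appends `.x` for the left table and `.y` for the right one. A minimal sketch with two hypothetical tables (`yt_toy` and `fb_toy`, with invented values) shows both this and the filtering effect of the inner join.

```r
library(dplyr)

# Hypothetical per-video mean sentiment on each platform
yt_toy <- tibble(short_title = c("A", "B", "C"),
                 sent_mean   = c(0.5, -0.2, 1.0))
fb_toy <- tibble(short_title = c("B", "C", "D"),
                 sent_mean   = c(0.1, 0.4, -0.3))

toy_joined <- yt_toy %>%
  inner_join(fb_toy, by = "short_title") %>%   # keeps only B and C
  select(short_title,
         mean_sent_yt = sent_mean.x,           # left table (YouTube)
         mean_sent_fb = sent_mean.y) %>%       # right table (Facebook)
  mutate(diff = mean_sent_fb - mean_sent_yt)

toy_joined
```

Videos A and D each appear on one platform only and are dropped; for B the difference is 0.1 - (-0.2) = 0.3, meaning the Facebook comments are more positive on average.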

Now I can plot the mean sentiment of every video on each platform, ordered by publication date.

library(plotly)

# Blue line: Facebook; red line: YouTube. The hover text shows the title and date.
sent_plot <- comments_by_title %>%
  ggplot(aes(x = reorder(short_title, vid_created), 
             text = paste(short_title, "<br />", vid_created))) +
  geom_line(aes(y = mean_sent_fb, group = 1), color = "blue") +
  geom_line(aes(y = mean_sent_yt, group = 1), color = "red") +
  geom_hline(yintercept = 0) +
  xlab(NULL) +
  ylab(NULL) +
  theme_minimal() +
  theme(axis.text.x = element_blank())

ggplotly(sent_plot, tooltip = "text")