Visualizing insights with an analytics dashboard
Dec 02, 2016 • 4 min read

In the previous post we discussed which models we tried for sentiment classification and which one demonstrated the best performance. In this post, we’ll show you how to visualize our under-the-hood findings so that others can see the results of our analysis. You can see our Twitter sentiment analysis insights with our demo application here.
The data we have
Let’s briefly recall what we know about the tweets after the sentiment classification is performed. We know the keyword (movie name), the tweet text, the date and exact time, the sentiment (positive or negative), and the number of followers the user who posted the tweet has.
We know that the number of followers can be between zero and hundreds of thousands, and that 50% of Twitter users have 300-600 followers. These are regular users like you or me. There are another ~45% who have up to 5000 followers. These users are likely not professional tweeters, but their influence is much greater than most people’s, since their posts or reposts will be seen by many hundreds or even thousands of people. Let’s call them opinion leaders. And apparently about 5% of Twitter users have more than 5000 followers. In many cases these Twitter accounts belong to businesses like magazines, cinemas, and other movie industry companies. They are professionals. Which of these groups (regular users, opinion leaders, or professionals) makes the greatest buzz around a movie? Let’s make this all visible.
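The three user categories above can be expressed as a simple threshold rule on the follower count. The sketch below is only an illustration: the names `UserCategory` and `categorize_user` are hypothetical, and the exact cut-offs (600 and 5000) are assumptions read off the figures in the text, not values confirmed by the original analysis.

```python
from enum import Enum


class UserCategory(Enum):
    REGULAR = "regular"
    OPINION_LEADER = "opinion_leader"
    PROFESSIONAL = "professional"


# Assumed boundaries, taken from the follower figures quoted in the text.
REGULAR_MAX_FOLLOWERS = 600
OPINION_LEADER_MAX_FOLLOWERS = 5000


def categorize_user(followers: int) -> UserCategory:
    """Map a follower count to one of the three user categories."""
    if followers < 0:
        raise ValueError("follower count cannot be negative")
    if followers <= REGULAR_MAX_FOLLOWERS:
        return UserCategory.REGULAR
    if followers <= OPINION_LEADER_MAX_FOLLOWERS:
        return UserCategory.OPINION_LEADER
    return UserCategory.PROFESSIONAL
```

Keeping the thresholds as named constants makes it easy to adjust them if the follower distribution turns out to be different for another data set.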
Live and historical sentiments
Imagine it’s 5pm on a Friday. What are people tweeting about the Star Wars movie that will be released this weekend? How many tweets are there in total, how many of them are buzz from regular users, and how many are part of promo campaigns “warming people up” and “pushing them to spontaneous decisions”? A real-time chart showing the cumulative number of tweets split by sentiment and user category will show us how the situation is developing.
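A cumulative chart of this kind needs a running total per (sentiment, user category) pair. Here is a minimal sketch of how such a series could be computed; the `Tweet` record and the `cumulative_counts` function are hypothetical names introduced for illustration, not the code behind the demo application.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Tweet:
    # Hypothetical record holding only the fields the chart needs.
    created_at: datetime
    sentiment: str       # "positive" or "negative"
    user_category: str   # e.g. "regular", "opinion_leader", "professional"


def cumulative_counts(tweets):
    """Return (timestamp, running totals) pairs in chronological order.

    The running totals are keyed by (sentiment, user_category).
    """
    running = Counter()
    series = []
    for tweet in sorted(tweets, key=lambda t: t.created_at):
        running[(tweet.sentiment, tweet.user_category)] += 1
        # Copy the totals so each point is a snapshot, not a live reference.
        series.append((tweet.created_at, dict(running)))
    return series
```

Each element of the returned list corresponds to one point on the chart, so a live view would only need to append a new snapshot when a tweet arrives.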
This kind of diagram gives a clear picture, so we want to use it to see long-term trends for the same movie. The period of interest might cover several days or several months, so we need a chart showing historical data that gives us the ability to choose a range of dates.
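Choosing a range of dates amounts to filtering the stored tweets before charting them. The sketch below assumes each tweet is a dictionary with a `created_at` timestamp; that layout and the function name `select_date_range` are illustrative assumptions.

```python
from datetime import date, datetime


def select_date_range(tweets, start: date, end: date):
    """Keep tweets posted between start and end, both days inclusive."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return [t for t in tweets if start <= t["created_at"].date() <= end]
```

Treating both ends as inclusive matches how a date picker is usually read: selecting December 1 to December 2 returns every tweet from both days.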
Will all that give us a really clear picture? It definitely won’t. Our sentiment classification is far from perfect, so it’s good to see the real tweet stream to understand what people are actually tweeting about. The next useful capability is to see negative or positive tweets representing historical data. Having all that in front of you, you get a comprehensive picture of what is happening now, and what happened hours or days or a month ago, as you can see from the picture below.
This informative application can still be improved. Cumulative diagrams are good for showing the total number of tweets received in a given period of time, but they are not very clear in showing cycles and patterns in the number of tweets. Stacked-type diagrams are good for showing the relative contribution of every user type, but they don’t make it easy to perceive absolute values. So the next step could be to check different visualization options to show more “contrasts” in different data aspects for the same movie in the same range of dates.
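One way to make cycles visible is to count tweets per time bucket instead of accumulating them. The following sketch groups tweets by hour and user category; the input format (pairs of timestamp and category) and the name `hourly_counts` are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def hourly_counts(tweets):
    """Count tweets per hour and user category.

    `tweets` is an iterable of (created_at, user_category) pairs.
    Returns a dict ordered by hour: {hour_start: Counter({category: n})}.
    """
    buckets = defaultdict(Counter)
    for created_at, user_category in tweets:
        # Truncate the timestamp to the start of its hour.
        hour = created_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour][user_category] += 1
    return dict(sorted(buckets.items()))
```

Unlike the running total, these per-hour counts rise and fall, so daily or weekly rhythms show up directly as peaks and troughs.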
We can easily recognize cyclic patterns with the “stacked bars” diagram and also see the contribution from regular users, opinion leaders, and professionals with the “bubble diagram.” Now we can say we have a pretty comprehensive picture of the sentiments around one selected movie. Our ultimate goal, of course, is to see if we can visually recognize patterns specific to a particular movie. That becomes possible if we see the data for several movies simultaneously. So let’s select two movies and a particular type of diagram to check if there is a significant difference in the contribution of the various classes of Twitter users to the actual information atmosphere around those movies.
Now we can clearly see that the structure of the public sentiments around the movies is very different. In the picture above, we can see that the activity level of professional Twitter accounts might be significantly higher for one movie than for the other. The diagram shows that the activity of professional users toward “Suicide Squad,” which was released in August, looks very different from activity for the just-released “Inferno.” That might mean a significant difference in social media promotion strategies for these two movies. And that small finding may become the basis for further, deeper investigation into both products’ successes.
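A comparison like this can also be put into numbers by computing, for each movie, the share of tweets coming from every user category. The sketch below is a hedged illustration: the input format (pairs of movie keyword and category) and the name `category_shares` are hypothetical, and the function says nothing about the actual figures for the movies discussed here.

```python
from collections import Counter, defaultdict


def category_shares(tweets):
    """Return {movie: {user_category: share of that movie's tweets}}.

    `tweets` is an iterable of (movie, user_category) pairs.
    """
    counts = defaultdict(Counter)
    for movie, user_category in tweets:
        counts[movie][user_category] += 1

    shares = {}
    for movie, per_category in counts.items():
        total = sum(per_category.values())
        shares[movie] = {cat: n / total for cat, n in per_category.items()}
    return shares
```

Working with shares rather than raw counts keeps the comparison fair when one movie simply attracts many more tweets than the other.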
With this blog post we close the series of Data Scientist Kitchen blog posts, but we will keep developing new approaches and techniques for data analysis using open source projects. Please subscribe to our blog to keep up to date on the newest posts.