Tracking DBZ's Brand Sentiment on Reddit using R

In a research project for my MBA, I used public sentiment and consumer behavior analysis to evaluate brand reputation, locate opportunities for improvement, and assess the brand’s position relative to its competitors.

This blog post outlines a structured process for collecting, analyzing, and visualizing audience sentiment using Dragon Ball (DB), a globally beloved anime and manga series by Akira Toriyama, as a case study. DB follows the adventures of Goku and his friends as they seek wish-granting Dragon Balls, train in martial arts, and fight enemies across fictional worlds.

I do not use “DBZ” as the abbreviation because I am observing the entire franchise, not just the Dragon Ball Z show.

By collecting data from Reddit, applying sentiment analysis techniques in R, and visualizing trends in Tableau, I explore engagement surrounding DB; I also highlight gaps in my approach. Ultimately, this post and process serve as a guide for people wanting to obtain audience sentiment for product and competitor analysis.


Step 1: Data Collection in R

1. Choose a Platform

To analyze DB, I gathered my data on Reddit. Reddit communities (such as r/dbz and r/dragonball) share experiences, opinions, and analyses of the series. These discussions provide insights into the series’ impact and how it resonates with different audiences.

Subreddits with Posts Containing 500+ comments relating to Dragon Ball.

I could have also used X. While X hosts DB interactions, tweets are generally shorter and lack the depth of Reddit threads. While X users might post quick reactions, fan art, or brief opinions, the detailed and often analytical nature of Reddit discussions makes it a better source for exploring DB’s reputation.

2. Set Up R and Required Libraries

Use the rtweet package for X or RedditExtractoR for Reddit. Both packages allow me to fetch posts based on specific keywords.

install.packages("RedditExtractoR")


3. Download at Least 1,000 Comments

	#### example: accessing data using the reddit api ####
	# reference:
	# https://cran.r-project.org/web/packages/RedditExtractoR/RedditExtractoR.pdf
	# https://www.reddit.com/dev/api/
	# https://cran.r-project.org/web/packages/RedditExtractoR/index.html
	# https://github.com/ivan-rivera/RedditExtractoR


	# TODO FOR YOU – install required packages (only needed once)
	# install.packages("RedditExtractoR")

	# load required library
	library(RedditExtractoR)

	# download reddit thread urls containing key phrase "Dragon Ball"
	urls <- find_thread_urls(keywords = "Dragon Ball")

	# filter urls to only include urls from specific subreddits ("dbz" and "dragonball")
	dbz_urls <- subset(urls, subreddit == "dbz")
	dragonball_urls <- subset(urls, subreddit == "dragonball")
	db_urls <- rbind(dbz_urls, dragonball_urls)

	# filter threads with more than 500 comments, across all subreddits
	# (bigurls is used only for the box plot below)
	bigurls <- subset(urls, comments > 500)

	# display titles of threads from filtered subreddits
	db_urls$title

	# create box plot to compare number of comments across subreddits
	boxplot(comments ~ subreddit, data = bigurls)

	# retrieve content from all threads in filtered subreddits
	db_threads <- lapply(db_urls$url, get_thread_content)

	# combine comments from all threads into a single data frame
	comments_list <- lapply(db_threads, function(thread) thread$comments)
	comments_df <- do.call(rbind, comments_list)

	# export combined comments to a csv for sentiment analysis
	write.csv(comments_df, "dragonball_comments3.csv")


In this program, I used the Reddit API via the RedditExtractoR package to gather, filter, analyze, and export data related to posts and comments on Reddit about “Dragon Ball.”

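The program loads the necessary package (RedditExtractoR; the install line is commented out because it only needs to run once) and fetches the URLs of Reddit threads that match the phrase “Dragon Ball.”

Then, the program narrows the data to specific subreddits (i.e., “dbz” and “dragonball”) and stores the result in db_urls. Separately, it filters the full set of URLs to posts with more than 500 comments (bigurls). As written, that subset feeds only the box plot, while the comment download uses db_urls.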

I create a boxplot to visualize which subreddits receive the most comments, but I do not actually use this information for anything else. I included it because I think it is interesting.

Subreddits with Posts Containing 500+ comments relating to Dragon Ball.

Then, it retrieves content from the posts associated with the filtered URLs. The comments from these posts are extracted and consolidated into a single data frame. Finally, the combined dataset of comments is saved as a CSV, dragonball_comments3.csv.

If I rerun this script, I can obtain the most recent Reddit posts that fit the criteria described.
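If you wanted the downloaded threads to satisfy both criteria at once (chosen subreddits and more than 500 comments), the two filters could be combined into one step. The sketch below is a minimal illustration on a toy data frame, not the script I ran: filter_threads is a hypothetical helper, and the column names are assumed to match what find_thread_urls() returns.

```r
# Toy stand-in for the data frame returned by find_thread_urls();
# the column names (title, subreddit, comments, url) are assumptions.
urls <- data.frame(
  title     = c("Thread A", "Thread B", "Thread C", "Thread D"),
  subreddit = c("dbz", "dragonball", "anime", "dbz"),
  comments  = c(812, 140, 950, 501),
  url       = paste0("https://example.com/thread-", 1:4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep threads from the chosen subreddits that also
# have more than `min_comments` comments.
filter_threads <- function(threads, subreddits, min_comments = 500) {
  keep <- threads$subreddit %in% subreddits & threads$comments > min_comments
  threads[keep, , drop = FALSE]
}

db_big_urls <- filter_threads(urls, c("dbz", "dragonball"))
db_big_urls$title
```

Using %in% also replaces the two separate subset() calls and the rbind() with a single condition.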

Step 2: Conduct Sentiment Analysis in R

Next, I use the syuzhet library to perform sentiment analysis.

	# load syuzhet library – provides tools for sentiment analysis
	library(syuzhet)

	# only run line if syuzhet package is not installed
	# install.packages("syuzhet")

	# read csv comments into a data frame
	rawdata <- read.csv('dragonball_comments3.csv')

	# perform sentiment analysis using 'syuzhet' method on 'comment' column
	# of dataset
	sent <- get_sentiment(rawdata$comment, method="syuzhet")

	# perform sentiment analysis using 'afinn' method
	sent2 <- get_sentiment(rawdata$comment, method="afinn")

	# analyze comments using NRC sentiment lexicon
	# scores presence of various emotions (e.g., joy, anger, sadness)
	nrcscores <- get_nrc_sentiment(rawdata$comment)

	# boxplot visualizing distribution of NRC emotion scores
	boxplot(nrcscores)

	# combine original data frame with sentiment scores from 'syuzhet' and 'afinn'
	# into new data frame
	finaldata <- cbind(rawdata, sent, sent2)

	# save combined data frame in new csv
	write.csv(finaldata, 'commentswithsentiment3.csv')


This program performs sentiment analysis on the dataset of comments stored in the CSV file dragonball_comments3.csv. It loads the syuzhet library and reads the comments into a data frame called rawdata.

Then, the get_sentiment function is applied to the comment column of the dataset using two different sentiment analysis methods: “syuzhet” and “afinn”. The resulting sentiment scores are stored in the variables sent and sent2, respectively.

Additionally, the get_nrc_sentiment function analyzes the comments and scores the presence of various emotions (such as joy, anger, and sadness) based on the NRC sentiment lexicon, with the scores stored in nrcscores.

After that, a box plot is created to visualize the distribution of these NRC emotion scores.

NRC Emotion Score Distribution (Box Plot).

Finally, the sentiment scores are combined with the original dataset using the cbind function, creating a new dataset called finaldata, which is then saved as a CSV file named commentswithsentiment3.csv. This CSV will be imported into Tableau to create more visualizations of the data.
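The script above writes only the sent and sent2 columns to the CSV. If the NRC emotion columns are also wanted in the exported file, one option is to bind nrcscores in the same cbind call. This is a minimal sketch on three made-up comments, assuming the syuzhet package is installed; the output file name is hypothetical.

```r
library(syuzhet)

# Made-up comments standing in for rawdata$comment
rawdata <- data.frame(
  comment = c("I love this fight, it is amazing",
              "This arc was terrible and boring",
              "Goku trains with Vegeta"),
  stringsAsFactors = FALSE
)

sent      <- get_sentiment(rawdata$comment, method = "syuzhet")
sent2     <- get_sentiment(rawdata$comment, method = "afinn")
nrcscores <- get_nrc_sentiment(rawdata$comment)

# Bind the NRC emotion columns alongside the two sentiment scores
finaldata <- cbind(rawdata, sent, sent2, nrcscores)
names(finaldata)

# row.names = FALSE avoids writing an extra index column to the CSV
out_file <- file.path(tempdir(), "commentswithsentiment_nrc.csv")  # hypothetical name
write.csv(finaldata, out_file, row.names = FALSE)
```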

Step 3: Visualization of Sentiment and Patterns using Tableau

1. Import to Tableau

Here, I take the CSV, commentswithsentiment3.csv, and import it into Tableau.

2. Visualize Patterns

Now, I can look for spikes in sentiment that may align with specific events, product launches, or news. Since I am only analyzing a few posts, it is difficult to gauge the overall sentiment toward DB.

If I wanted a more meaningful analysis, I would run this script every few days to collect data over time. Alternatively, I would adjust the script’s parameters to pull the top posts for successive date ranges and build a backlog of data. Data over time would give me a more comprehensive overview, but it would also make the analysis harder: the computer I am using might not have the processing power to handle that much data in Tableau. Adding data is not usually computationally demanding in itself, but Tableau is a heavy application, so building more views on a larger dataset taxes the machine’s resources.

The small sample is problematic because there have been many notable events this year that might affect the overall sentiment towards the series. I will discuss this later.
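Collecting over time means repeated runs will download some of the same comments again. Below is a minimal sketch of how each new download could be merged into a running archive without duplicates. merge_comments is a hypothetical helper, and I am assuming that a comment is uniquely identified by its thread url plus a comment_id column.

```r
# Hypothetical helper: add newly downloaded comments to an existing archive,
# keeping one row per (url, comment_id) pair. Both key columns are assumptions.
merge_comments <- function(archive, fresh, key = c("url", "comment_id")) {
  combined <- rbind(archive, fresh)
  combined[!duplicated(combined[key]), , drop = FALSE]
}

# Toy data: the second download repeats one comment and adds a new thread
archive <- data.frame(
  url        = c("t1", "t1"),
  comment_id = c("1", "2"),
  comment    = c("first", "second"),
  stringsAsFactors = FALSE
)
fresh <- data.frame(
  url        = c("t1", "t2"),
  comment_id = c("2", "1"),
  comment    = c("second", "new thread"),
  stringsAsFactors = FALSE
)

updated <- merge_comments(archive, fresh)
nrow(updated)
```

The merged data frame can then be written back to the same CSV after each run, so Tableau always reads one file.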

Sentiment Overview

Now, let’s explore some sentiment overview trends.

Popular Authors & Active Authors

Authors Dashboard

The Popular Authors chart highlights the most upvoted authors, indicating their influence in driving discussions. The Active Authors chart shows authors contributing the most content, though not all are equally upvoted. Regarding patterns, the popularity (upvotes) and activity (comment count) are not strongly correlated, suggesting that a few highly upvoted authors dominate sentiment trends.
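The same comparison can be checked outside Tableau. The sketch below totals score and counts comments per author on toy data, then computes a rank correlation between the two. The author and score column names are assumptions about the comments CSV, and the numbers are invented.

```r
# Toy comments; `author` and `score` are assumed column names
comments <- data.frame(
  author = c("a", "a", "a", "b", "c", "c"),
  score  = c(1, 2, 1, 250, 10, 5),
  stringsAsFactors = FALSE
)

# Popularity: total score per author. Activity: number of comments per author.
total_score   <- tapply(comments$score, comments$author, sum)
comment_count <- tapply(comments$score, comments$author, length)

authors <- data.frame(
  author        = names(total_score),
  total_score   = as.numeric(total_score),
  comment_count = as.integer(comment_count),
  stringsAsFactors = FALSE
)

# Spearman rank correlation between popularity and activity
rho <- cor(authors$total_score, authors$comment_count, method = "spearman")

most_popular <- authors$author[which.max(authors$total_score)]
most_active  <- authors$author[which.max(authors$comment_count)]
```

A correlation near zero or below would be consistent with the charts: the most active authors are not necessarily the most upvoted.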

Sentiment Distribution

Sentiment Dashboard – Ways of Analyzing Sentiment over Time.

Using the Sentiment Overview sheet, the distribution pie chart shows a higher proportion of positive sentiment than negative sentiment. The Sentiment, Time, Emotions line chart shows that positive sentiment consistently outpaces negative. Overall, the sentiment skews positive, with notable negativity possibly tied to specific controversies or negative reviews.
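For reference, the positive/negative split behind a pie chart like this can be reproduced in R from a numeric sentiment column. This is a minimal sketch that labels each score by its sign. Treating exactly zero as neutral is my assumption, and the scores are toy values.

```r
# Toy sentiment scores standing in for the `sent` column
sent <- c(1.5, -0.75, 0, 2.1, -0.2, 0.4)

# Label each comment by the sign of its score (zero is treated as neutral)
label_sentiment <- function(score) {
  ifelse(score > 0, "positive", ifelse(score < 0, "negative", "neutral"))
}

labels <- label_sentiment(sent)
counts <- table(labels)        # number of comments per label
shares <- prop.table(counts)   # proportion of comments per label
```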

Sentiment Over Time

Using the Sentiment, Time, Emotions sheet, peaks in sentiment fall on specific dates, which may point to events. Negative sentiment spikes suggest polarizing moments. In theory, these trends indicate that DB content generates both excitement and polarizing reactions, depending on the event.

Emotion Insights

In this section, I want to highlight prevalent emotions (e.g., joy, trust, and anger) and what they suggest about customer perception.

Boxplot – Emotion Distribution of the Comments.

Emotion Overview

Using the Emotions Over Time sheet (below), I see emotions like “anticipation” and “trust” frequently peak, which I will discuss later. Anger and fear spikes likely correlate with contentious debates. Emotional reactions vary widely over time, showing strong community involvement and responses to news or developments.

Each Emotion Over Time.

The Emotions Over Time sheet highlights significant emotional fluctuations. Peaks in anger and fear might align with notable events, suggesting adverse reactions or contentious debates, while trust and joy likely reflect positive topics (e.g., character achievements or fan appreciation). Repeated spikes in trust and joy point to recurring moments of positivity within the community. However, the frequent occurrence of anger and fear might indicate polarizing discussions or dissatisfaction with certain developments.
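A rough equivalent of this sheet can be built in R by summing emotion scores per day. The sketch below uses toy rows with three of the NRC emotion columns; the date column name and format are assumptions about the comments data.

```r
# Toy rows standing in for comments with NRC emotion columns attached
emotions <- data.frame(
  date  = c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02", "2024-01-02"),
  joy   = c(2, 0, 1, 0, 0),
  anger = c(0, 1, 0, 0, 4),
  trust = c(1, 0, 0, 2, 0),
  stringsAsFactors = FALSE
)

# Sum each emotion per day
daily <- aggregate(cbind(joy, anger, trust) ~ date, data = emotions, FUN = sum)

# For each day, name the emotion with the highest total
emotion_cols <- c("joy", "anger", "trust")
daily$dominant <- emotion_cols[
  max.col(as.matrix(daily[emotion_cols]), ties.method = "first")
]
```

The dominant column gives a quick way to see which emotion leads on a given date before looking for an event that might explain it.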

Emotions Over Time

Emotion variability can highlight how fans engage with DB. Peaks in joy and trust suggest moments of appreciation, whereas anger and fear indicate divisive topics (no duh). As shown above, emotions dominate specific dates, suggesting their tie to particular discussions or events, which I discuss later.

Patterns Over Time

While the dataset captures thousands of comments, the limited number of posts used as a source means little news is explicitly mentioned. However, as a fan of the series, I know there is a wealth of newsworthy events to discuss.

At present, I am around episode 250 of the series.

The dataset reveals that high-scoring comments reference “Goku,” “Vegeta,” “Super Saiyan,” “Broly,” “art,” and “style.” These suggest discussions focused on character developments, transformations, and visual aspects of the series. However, there is no clear evidence in the dataset of direct mentions of specific news events, such as movie releases, theme parks (one is being built in Saudi Arabia), or product launches.
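One way to check which terms appear in high-scoring comments is a simple keyword count. This is a minimal sketch on invented comments; the score threshold of 50 is arbitrary, and the comment and score column names are assumptions. Word boundaries keep “art” from matching inside longer words such as “start.”

```r
# Invented comments; `comment` and `score` are assumed column names
comments <- data.frame(
  comment = c("Goku vs Vegeta never gets old",
              "The art style in the new movie is great",
              "Broly deserved more screen time",
              "goku's Super Saiyan debut"),
  score = c(120, 85, 40, 300),
  stringsAsFactors = FALSE
)

keywords <- c("Goku", "Vegeta", "Super Saiyan", "Broly", "art", "style")

# Keep only high-scoring comments (the threshold is an arbitrary choice)
top <- comments[comments$score >= 50, , drop = FALSE]

# Count comments mentioning each keyword as a whole word, ignoring case
mentions <- vapply(keywords, function(k) {
  pattern <- paste0("\\b", k, "\\b")
  sum(grepl(pattern, top$comment, ignore.case = TRUE, perl = TRUE))
}, integer(1))
```

The same approach could be pointed at event terms (a movie title, for example) to test whether events really are absent from the comments.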

As such, while the dataset reflects ongoing discussions about the series, it doesn’t capture notable events like the announcement of a DB theme park in Saudi Arabia, movie or game announcements, or news about Akira Toriyama (i.e., his passing). Consequently, this analysis remains surface-level and lacks the depth I prefer when drawing meaningful conclusions.

Patterns Over Time

Comments (Hourly).

The Comments (Hourly) sheet reveals a surge in activity during specific hours of the day, with peaks likely driven by event announcements or trending discussions. Most engagement occurs in the evening, which might align with typical online activity patterns for fans. Spikes in activity may be tied to time-specific announcements or streaming events.

I could not determine which time zone Tableau is using (my dataset records the time zone, but I think Tableau did some sort of conversion), which weakens the “engagement occurs in the evening” claim.
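One way around the ambiguity would be to compute the hour in R, with the time zone stated explicitly, before the data reaches Tableau. The sketch below assumes the comments data has a timestamp column holding Unix epoch seconds; the two values are toy examples.

```r
# Toy Unix timestamps (seconds since 1970-01-01 00:00:00 UTC)
timestamp <- c(1700000000, 1700040000)

# Interpret the epoch seconds explicitly as UTC
posted_utc <- as.POSIXct(timestamp, origin = "1970-01-01", tz = "UTC")

# Hour of day in UTC
hour_utc <- as.integer(format(posted_utc, "%H", tz = "UTC"))

# The same instants expressed in another zone, for comparison
hour_new_york <- as.integer(format(posted_utc, "%H", tz = "America/New_York"))

# Comments per UTC hour
table(hour_utc)
```

Exporting hour_utc as its own column would let Tableau treat the hour as a plain number, so no further conversion can creep in.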

Overall Analysis

Sentiment Dashboard – Ways of Analyzing Sentiment over Time.

The sentiment distribution reveals that positive sentiment dominates overall, though notable spikes in negativity indicate polarizing topics or events. Emotion trends show high levels of trust and anticipation, reflecting fan loyalty and excitement, while anger and fear suggest more contentious or emotional debates. Spikes in both sentiment and emotions align with specific dates, yet few events are mentioned in the dataset.

Potential Debates Based on Observed Trends

Earlier, I hinted at a disconnect because no events are explicitly mentioned in the comments, yet extreme emotions might be linked to events.

While my analysis did not reveal direct mentions of notable events (like new releases or announcements), the datasets and charts indicate debates or spikes in sentiment and emotion. These spikes suggest that even if explicit events weren’t mentioned in the comments, events could still drive debates.

For example, discussions surround character dynamics, visual quality, and evolving content. Frequent mentions of “Goku,” “Vegeta,” and “Super Saiyan” highlight very old (yet continuous) debates about character power levels, transformations, and story arcs, including comparisons like Goku vs. Vegeta and Gohan’s development. Conversations about “art” and “style” suggest ongoing debates regarding the visual quality of newer DB adaptations versus older ones. The character “Broly” sparks discussions about his portrayal in DB Super: Broly compared to earlier versions, reflecting fan interest in his evolution. Frustrations related to DB’s games often revolve around gameplay mechanics, microtransactions, and perceived imbalances, contributing to spikes in negative sentiment. Finally, fan divisions over new content (e.g., plotlines in DB Super and DB Heroes) drive positive and negative reactions, fueled by polarizing opinions on new characters, transformations, and retcons.

Why Events Might Not Be Explicitly Mentioned

The dataset’s focus on sentiment and emotions suggests the comments capture reactions to events rather than mentioning them explicitly. These debates may reflect responses to ongoing discussions, trends, or rumors. Additionally, long-standing debates, such as “Who’s the best Saiyan?” might reignite due to related events like an anniversary or a new product release, fueling fan conversations. There is also a chance that they are not actually related to any of these things.

Missing Analysis That Would Have Been Useful

Competitors

Analyzing other popular anime—such as Naruto and One Piece—would provide more context around my sentiment analysis of DB. These series share similar themes (e.g., training, personal growth, battles, and friendship), making them natural competitors.

For example, Naruto’s journey as a ninja mirrors Goku’s path of self-improvement. With Luffy’s quest to become Pirate King, One Piece combines humor, fights, and a deep storyline. One Piece’s longevity and popularity rival DB’s.

That said, I got banned because I did not rate-limit my API requests well initially, so I could not do this. I tried using a VPN to see if that would help, but it still was not working properly. Given the time constraints, I had to skip this.
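In hindsight, spacing out the requests might have avoided the ban. The sketch below shows one way to do that. fetch_politely is a hypothetical helper (not part of RedditExtractoR), the two-second default pause is a guess rather than a documented limit, and the demo uses a fake fetch function so it runs without touching Reddit.

```r
# Hypothetical helper: call `fetch_fun` on one URL at a time, pausing
# between requests and recording NULL for URLs that raise an error.
fetch_politely <- function(urls, fetch_fun, pause_seconds = 2) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    value <- tryCatch(
      fetch_fun(urls[[i]]),
      error = function(e) {
        message("Skipping ", urls[[i]], ": ", conditionMessage(e))
        NULL
      }
    )
    results[i] <- list(value)  # list() keeps a NULL entry in place
    if (i < length(urls)) Sys.sleep(pause_seconds)
  }
  names(results) <- urls
  results
}

# Fake fetch function for the demo: fails on any URL containing "bad"
fake_fetch <- function(url) {
  if (grepl("bad", url)) stop("request failed")
  paste("content of", url)
}

out <- fetch_politely(c("thread-1", "bad-thread", "thread-2"),
                      fake_fetch, pause_seconds = 0)
```

With the real package, the call would look something like fetch_politely(db_urls$url, get_thread_content), with a pause long enough to stay within Reddit’s limits.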

Conclusion

In conclusion, this research demonstrates a structured approach to collecting, analyzing, and visualizing audience sentiment, using the globally beloved DB franchise as a case study. By collecting data from Reddit, applying sentiment analysis techniques in R, and visualizing trends in Tableau, I explored audience engagement with DB and identified areas for improvement in my methodology. This process provides insights into the franchise’s reception. Additionally, it serves as a guide for people interested in evaluating brand reputation, locating opportunities for improvement, and conducting competitive analysis using audience sentiment.

If you enjoyed this post on my sentiment analysis of DB, consider reading Visualizing Cancer Trends in Tableau.

Olivia A. Gallucci