This module focuses on the principles and practices of effective data storytelling and communication, with an emphasis on multivariate visualisation, ethical considerations, and practical coding skills.
Reading: The Ethics of Data Visualization by Alberto Cairo
Data storytelling is the bridge between raw data analysis 📊 and meaningful action. While exploratory data analysis is about finding the signal in the noise, explanatory storytelling is about presenting that signal to stakeholders in a way that is clear, persuasive, and memorable.
Think of your data as the “facts” of a case. Without a narrative 📖, those facts are just a list. Storytelling provides the “argument” that tells the stakeholders why those facts matter to their specific business goals.
Narrative structure transforms a series of charts into a compelling argument. Instead of just showing data, we use a story arc to lead stakeholders through a journey of discovery. A classic framework for this is the Context-Complication-Resolution model.
This exercise is designed to shift students from “making charts” to “building a case.” By framing data points as characters, they learn to highlight the tension (the problem) and the resolution (the recommendation).
In this scenario, students act as Lead Data Analysts for Stream-It, a fictional video streaming service. Recent reports show a dip in revenue, and it’s their job to find the “Villain” causing the loss and the “Hero” that will save the quarter.
Your stakeholders are the Marketing and Product teams. They don’t want a 50-page technical report; they want to know:
Python code to generate a synthetic dataset with a hidden narrative:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Generate synthetic data
np.random.seed(42)
n_users = 1000
data = {
    'User_ID': range(n_users),
    'Subscription_Type': np.random.choice(['Basic', 'Premium', 'Family'], n_users),
    'Monthly_Charges': np.random.uniform(10, 30, n_users),
    'Region': np.random.choice(['North', 'South', 'East', 'West'], n_users),
    'Churned': np.random.choice([0, 1], n_users, p=[0.7, 0.3]),
    'Customer_Support_Calls': np.random.poisson(2, n_users),
    'App_Engagement_Score': np.random.normal(50, 15, n_users)
}
df = pd.DataFrame(data)
# Inject the 'Villain': Higher churn for Basic users with high support calls
df.loc[(df['Subscription_Type'] == 'Basic') & (df['Customer_Support_Calls'] > 3), 'Churned'] = 1
# Inject the 'Hero': Users with high App_Engagement_Score almost never churn
df.loc[df['App_Engagement_Score'] > 70, 'Churned'] = 0
print(df.head())
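Before handing the dataset to students, it can be worth confirming that the hidden narrative actually survives the two injection steps. The sketch below is one possible check, not part of the original exercise: make_stream_it_data and narrative_check are hypothetical helper names, and the thresholds (more than 3 calls, engagement above 70) are copied from the generation code above. Because the "Hero" rule runs last, it overrides the "Villain" rule for users who match both, so the villain rate is measured on users outside the hero segment.

```python
import numpy as np
import pandas as pd


def make_stream_it_data(n_users=1000, seed=42):
    """Rebuild the synthetic Stream-It data with the same recipe as above."""
    np.random.seed(seed)
    df = pd.DataFrame({
        'User_ID': range(n_users),
        'Subscription_Type': np.random.choice(['Basic', 'Premium', 'Family'], n_users),
        'Monthly_Charges': np.random.uniform(10, 30, n_users),
        'Region': np.random.choice(['North', 'South', 'East', 'West'], n_users),
        'Churned': np.random.choice([0, 1], n_users, p=[0.7, 0.3]),
        'Customer_Support_Calls': np.random.poisson(2, n_users),
        'App_Engagement_Score': np.random.normal(50, 15, n_users),
    })
    # Villain first, then Hero: the second rule wins where both apply
    villain = (df['Subscription_Type'] == 'Basic') & (df['Customer_Support_Calls'] > 3)
    df.loc[villain, 'Churned'] = 1
    df.loc[df['App_Engagement_Score'] > 70, 'Churned'] = 0
    return df


def narrative_check(df, calls_threshold=3, engagement_threshold=70):
    """Summarise churn rates for the injected 'Villain' and 'Hero' segments."""
    villain = (df['Subscription_Type'] == 'Basic') & (df['Customer_Support_Calls'] > calls_threshold)
    hero = df['App_Engagement_Score'] > engagement_threshold
    return {
        'overall_churn': float(df['Churned'].mean()),
        # Villain users who are not also in the hero segment
        'villain_churn': float(df.loc[villain & ~hero, 'Churned'].mean()),
        'hero_churn': float(df.loc[hero, 'Churned'].mean()),
        # Users matching both rules; the hero rule set these to 0
        'overlap_users': int((villain & hero).sum()),
    }


rates = narrative_check(make_stream_it_data())
for name, value in rates.items():
    print(f"{name}: {value}")
```

A non-zero overlap_users count is a useful discussion point: it explains why a chart of the full "Basic, 4+ calls" segment may show a churn rate slightly below 100%.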
Students must create three specific visualizations that tell the story:
Goal: Use a bar chart or heatmap to show that churn isn’t happening everywhere—it’s concentrated.
Focus on Subscription_Type and Customer_Support_Calls.
Goal: Translate the data into business impact.
Goal: Find a segment that is succeeding and turn that insight into a recommendation.
Focus on App_Engagement_Score and Churned.
Students should be graded not just on the code, but on their annotations.
This model solution focuses on Explanatory Data Viz. Instead of just showing the data, we are going to use “Active Titles” and annotations to guide the stakeholder’s eye.
Below is the Python code using Seaborn and Matplotlib. You can share this with your students as the “Goal” they should strive for.
First, we ensure the environment is set up and the “Villain” and “Hero” are baked into the data.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Set the storytelling theme
sns.set_theme(style="white")
plt.rcParams['font.family'] = 'sans-serif'
# 1. Setup (Data Generation)
np.random.seed(42)
n_users = 1000
data = {
    'Subscription_Type': np.random.choice(['Basic', 'Premium', 'Family'], n_users),
    'Monthly_Charges': np.random.uniform(10, 30, n_users),
    'Customer_Support_Calls': np.random.poisson(2, n_users),
    'App_Engagement_Score': np.random.normal(50, 15, n_users),
    'Churned': np.random.choice([0, 1], n_users, p=[0.7, 0.3])
}
df = pd.DataFrame(data)
# Inject the 'Villain': High churn for Basic users with >3 support calls
df.loc[(df['Subscription_Type'] == 'Basic') & (df['Customer_Support_Calls'] > 3), 'Churned'] = 1
# Inject the 'Hero': High engagement prevents churn
df.loc[df['App_Engagement_Score'] > 75, 'Churned'] = 0
The Story: We aren’t losing everyone; we are specifically failing our Basic tier users who need help.
# Create a pivot table for the heatmap
heatmap_data = df.groupby(['Subscription_Type', 'Customer_Support_Calls'])['Churned'].mean().unstack()
plt.figure(figsize=(10, 5))
sns.heatmap(heatmap_data, annot=True, cmap='Reds', fmt=".1f", cbar=False)
# Storytelling elements
plt.title("THE VILLAIN: Support Friction is Killing the 'Basic' Tier", fontsize=16, loc='left', pad=20)
plt.xlabel("Number of Customer Support Calls")
plt.ylabel("Subscription Plan")
# Rows sort alphabetically, so 'Basic' is the top row (y=0.5).
# The later 'Hero' override resets high-engagement users to 0,
# so this segment can sit just below 100% rather than exactly at it.
plt.annotate('CRITICAL ZONE:\nBasic users with 4+ calls\nchurn at or near 100%.',
             xy=(5, 0.5), xytext=(7, 0.5),
             arrowprops=dict(facecolor='black', shrink=0.05))
plt.show()
The Story: This isn’t just a “metric”—it is a direct hit to our monthly revenue.
# Calculate lost revenue (monthly charges of churned users, per tier)
lost_revenue = df[df['Churned'] == 1].groupby('Subscription_Type')['Monthly_Charges'].sum()
basic_loss = lost_revenue['Basic']
plt.figure(figsize=(8, 6))
# Tiers sort alphabetically, so the first (highlighted) bar is 'Basic'.
# Note: recent seaborn versions warn when palette is passed without hue.
ax = sns.barplot(x=lost_revenue.index, y=lost_revenue.values, palette=['#ff9999', '#cccccc', '#cccccc'])
# Storytelling elements: build the active title from the data, not a hard-coded figure
plt.title(f"THE STAKES: We are losing ${basic_loss:,.0f} Monthly in 'Basic' alone", fontsize=16, loc='left', pad=20)
plt.ylabel("Potential Monthly Revenue Lost ($)")
plt.xlabel("Subscription Tier")
sns.despine()
# Add data labels above each bar
for p in ax.patches:
    ax.annotate(f'${p.get_height():.0f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', xytext=(0, 9), textcoords='offset points', fontweight='bold')
plt.show()
The Story: High app engagement is our “shield.” If we can move users into the app, the “Villain” (support friction) loses its power.
plt.figure(figsize=(10, 6))
sns.kdeplot(data=df[df['Churned'] == 0], x='App_Engagement_Score', fill=True, label='Retained', color='teal')
sns.kdeplot(data=df[df['Churned'] == 1], x='App_Engagement_Score', fill=True, label='Churned', color='red')
# Storytelling elements
plt.title("THE HERO: High App Engagement is a Churn Vaccine", fontsize=16, loc='left', pad=20)
plt.axvline(75, color='green', linestyle='--')
plt.text(76, 0.02, "THE HERO ZONE:\nScores >75 = Zero Churn", color='green', fontweight='bold')
plt.legend()
sns.despine()
plt.show()
We removed the chart borders (sns.despine()) and removed the color bar from the heatmap to keep the focus on the data.
Video by Scott Klemmer on storyboards
🤔 comic strip: show flow, how does user figure in this?
star people: how to draw people
Sequence: what steps are involved?
Helps get stakeholders on the same page.
Here is an example of a storyboard

Paper prototypes, transparencies and sticky notes
Digital mockups
High fidelity mockups (controlled experiments)
Storyboarding for data visualization is like writing a script 📽️ before filming a movie. It helps us map out the Sequence—the logical flow of insights—so stakeholders don’t get lost between charts. It moves the focus from “how do I code this?” to “what am I trying to say?”
In Python, we can simulate this “sketching” phase by having students create a Story Skeleton. Instead of rendering complex charts immediately, they define the “Panels” of their story using a data structure. This ensures the narrative holds up before they spend hours on formatting.
Here are three ways we could structure a Python-based storyboarding exercise:
Students define a StoryFrame class. They must “instantiate” 4-5 frames of their story, specifying the Sequence, the Persona (the “Star Person” 👤 viewing the data), and the Key Takeaway.
Students use plt.text() to describe what the chart will show and where the annotations will go. This mimics the Paper Prototype 📝 approach.
The Narrative Audit 📋: Students take an existing set of charts and write a Python “wrapper” or function that prints out the transition logic between them (e.g., “Because we see [X] in Frame 1, we must investigate [Y] in Frame 2”).
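To make the first option concrete, here is a minimal sketch of what a StoryFrame could look like. The field names (sequence, persona, key_takeaway, chart_idea), the storyboard_lines helper, and the example personas are all assumptions for illustration; only the class name and the three required attributes come from the description above. Three frames are shown for brevity where students would write four or five.

```python
from dataclasses import dataclass


@dataclass
class StoryFrame:
    sequence: int            # position of the panel in the story
    persona: str             # the "Star Person" viewing the data
    key_takeaway: str        # the one thing this panel must say
    chart_idea: str = "TBD"  # rough note on the planned visual (hypothetical field)


def storyboard_lines(frames):
    """Return one summary line per frame, ordered by sequence."""
    ordered = sorted(frames, key=lambda frame: frame.sequence)
    return [
        f"Panel {f.sequence} | {f.persona} | {f.key_takeaway} | chart: {f.chart_idea}"
        for f in ordered
    ]


# Frames can be drafted in any order; the sequence field fixes the final flow
frames = [
    StoryFrame(2, "Product Manager", "Churn is concentrated in Basic users with many support calls", "heatmap"),
    StoryFrame(1, "Product Manager", "Revenue dipped this quarter"),
    StoryFrame(3, "Marketing Lead", "Highly engaged users rarely churn", "density plot"),
]

for line in storyboard_lines(frames):
    print(line)
```

Leaving chart_idea at its "TBD" default is deliberate: a frame with a clear takeaway but no chart yet is a valid storyboard panel, which keeps the focus on what the panel says rather than how it is coded.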
A Narrative Audit focuses on the “connective tissue” between your data visualizations. In storyboarding, this ensures that the transition from one chart to the next feels like a logical progression rather than a random jump.
Think of it like a comic strip 🎞️: if Panel A shows a character at home and Panel B shows them on Mars, the reader needs a “transition” panel (the rocket ship 🚀) to understand how they got there. In data, this means explaining why a specific insight in Chart 1 leads us to investigate the metric in Chart 2.
In this exercise, students are given a Python script that generates three correct but disconnected charts. Their job is to perform an “audit” and write the narrative bridge that connects them.
Provide students with this “broken” narrative. The charts are technically fine, but the story is missing.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# Sample Data: Website Traffic and Sales
data = pd.DataFrame({
    'Day': range(1, 8),
    'Visitors': [1000, 1100, 1050, 1200, 1500, 1600, 1550],
    'Bounce_Rate': [40, 42, 41, 39, 65, 68, 70],
    'Conversion_Rate': [5, 5, 4.8, 5.2, 2.1, 1.8, 1.5]
})
def plot_narrative_gap():
    # Chart 1: Traffic is growing
    plt.figure(figsize=(5, 3))
    sns.lineplot(data=data, x='Day', y='Visitors', marker='o')
    plt.title("Total Website Visitors")
    plt.show()
    # Chart 2: Bounce rate spiked
    plt.figure(figsize=(5, 3))
    sns.lineplot(data=data, x='Day', y='Bounce_Rate', color='red')
    plt.title("Bounce Rate Percentage")
    plt.show()
    # Chart 3: Conversion dropped
    plt.figure(figsize=(5, 3))
    sns.barplot(data=data, x='Day', y='Conversion_Rate')
    plt.title("Sales Conversion Rate")
    plt.show()
# Render the three disconnected charts
plot_narrative_gap()
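Before writing any bridges, it can help to find where the three metrics change together, since that shared turning point is what the narrative has to explain. The following is a rough sketch using the same sample data; treating the largest day-over-day jump in bounce rate as the turning point is an assumption made for illustration, not a general rule.

```python
import pandas as pd

# Same sample data as the handout above
data = pd.DataFrame({
    'Day': range(1, 8),
    'Visitors': [1000, 1100, 1050, 1200, 1500, 1600, 1550],
    'Bounce_Rate': [40, 42, 41, 39, 65, 68, 70],
    'Conversion_Rate': [5, 5, 4.8, 5.2, 2.1, 1.8, 1.5],
})

# Day-over-day change for every metric
changes = data.set_index('Day').diff()

# Assumption: the day with the largest bounce-rate jump is the candidate turning point
break_day = int(changes['Bounce_Rate'].idxmax())

# Average of each metric before and from the turning point onward
before = data[data['Day'] < break_day].drop(columns='Day').mean()
after = data[data['Day'] >= break_day].drop(columns='Day').mean()
summary = pd.DataFrame({'before': before, 'after': after})

print(f"Candidate turning point: Day {break_day}")
print(summary.round(1))
```

If visitors, bounce rate, and conversion all shift on the same day, the before/after table gives students concrete numbers to cite in their Observation statements.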
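Students must create a Python dictionary called narrative_audit. For each transition, they must identify the Observation (what the current chart shows), the Question it raises, and the Bridge that leads to the next chart: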
narrative_audit = {
    "Transition_1_to_2": {
        "Observation": "Traffic is hitting record highs in the second half of the week.",
        "The Question": "Is this high-volume traffic actually high-quality traffic?",
        "Bridge": "To find out, we need to look at the **Bounce Rate** to see if people are sticking around."
    },
    "Transition_2_to_3": {
        "Observation": "Bounce rates nearly doubled as traffic increased.",
        "The Question": "How did this inability to retain users impact our bottom line?",
        "Bridge": "We will now examine **Conversion Rates** to quantify the cost of this technical friction."
    }
}
Instead of checking if the code runs, you are checking for Causality.
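A script cannot judge causality, but it can confirm that every transition is complete and lay the bridges out in reading order so a grader can assess the logic. The sketch below is one way to do that; audit_gaps, transition_script, and the deliberately incomplete example_audit are hypothetical, and the three field names are taken from the sample dictionary.

```python
REQUIRED_FIELDS = ("Observation", "The Question", "Bridge")


def audit_gaps(audit):
    """Return (transition, field) pairs that are missing or blank."""
    gaps = []
    for transition, fields in audit.items():
        for field in REQUIRED_FIELDS:
            if not str(fields.get(field, "")).strip():
                gaps.append((transition, field))
    return gaps


def transition_script(audit):
    """Join Observation, Question, and Bridge for each complete transition."""
    incomplete = {transition for transition, _ in audit_gaps(audit)}
    return [
        f"{name}: " + " ".join(fields[field] for field in REQUIRED_FIELDS)
        for name, fields in audit.items()
        if name not in incomplete
    ]


# Example submission with one bridge left out on purpose
example_audit = {
    "Transition_1_to_2": {
        "Observation": "Traffic is hitting record highs in the second half of the week.",
        "The Question": "Is this high-volume traffic actually high-quality traffic?",
        "Bridge": "To find out, we need to look at the Bounce Rate.",
    },
    "Transition_2_to_3": {
        "Observation": "Bounce rates rose sharply as traffic increased.",
        "The Question": "How did this affect our bottom line?",
    },
}

for transition, field in audit_gaps(example_audit):
    print(f"Incomplete: {transition} is missing '{field}'")
for line in transition_script(example_audit):
    print(line)
```

Reading the joined lines aloud is a quick test of the story: if the Bridge does not follow from the Question, the gap is in the reasoning rather than in the code.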
How do you think your students would react to critiquing “broken” stories like this versus building their own from scratch? Would they find it easier to spot logic gaps in someone else’s work first?

🥳 2 experts might figure it out, but the rest of the 8 billion people?
As shown in the figure below, overly complex visuals can fail to communicate outside a small expert audience.
Lovable
Replit
Cursor
Google AI Studio
Base44
The User Experience: A detailed look at the components of user experience design.