Data Visualization Project

🎯 Project: Complete Data Visualization Analysis

This project will help you apply everything you've learned about data visualization. You'll analyze a dataset, create multiple visualizations, and extract insights using Matplotlib and Seaborn.

Visualization is crucial for understanding data before building ML models. This project will give you hands-on experience with real visualization workflows.

Exploring Data with Visualizations

Visualizations help you understand your data. Start by loading the dataset, checking its shape and columns, and planning which plot answers which question:

exploratory_analysis.py
# Complete Data Visualization Workflow
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Sample dataset
data = {
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
    'sales': [100, 120, 140, 130, 150, 180],
    'region': ['North', 'South', 'North', 'South', 'North', 'South'],
    'product': ['A', 'B', 'A', 'B', 'A', 'B']
}
df = pd.DataFrame(data)

print("Dataset Overview:")
print(df)
print(f"\nDataset shape: {df.shape}")
print(f"Columns: {list(df.columns)}")

print("\nVisualization Plan:")
print(" 1. Time series: Sales trend over months")
print(" 2. Comparison: Sales by region")
print(" 3. Distribution: Sales distribution")
print(" 4. Relationship: Sales vs other features")
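Before drawing anything, it can help to look at the numbers each planned plot will show. The following is a minimal sketch, assuming the same sample data as above; the names `region_summary` and `change` are illustrative and not part of the lesson's code.

```python
import pandas as pd

# Same sample data as exploratory_analysis.py
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "sales": [100, 120, 140, 130, 150, 180],
    "region": ["North", "South", "North", "South", "North", "South"],
    "product": ["A", "B", "A", "B", "A", "B"],
})

# Numeric summary of the sales column (count, mean, std, quartiles)
print(df["sales"].describe())

# Mean and total sales per region: the values a region comparison would plot
region_summary = df.groupby("region")["sales"].agg(["mean", "sum"])
print(region_summary)

# Month-over-month change: a numeric view of the trend a line plot would show
# (hypothetical helper column, added only for this sketch)
df["change"] = df["sales"].diff()
print(df[["month", "sales", "change"]])
```

Checking these tables first makes it easier to spot a plot that is drawn incorrectly, because you already know roughly what it should look like.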

Creating Multiple Visualizations

A complete analysis usually combines several plot types, each answering a different question. The script below prints the call for each one:

multiple_viz.py
# Creating Multiple Visualizations
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Sample sales data
df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'sales': [100, 120, 140, 130, 150],
    'region': ['North', 'South', 'North', 'South', 'North']
})

print("Visualization 1: Time Series (Line Plot)")
print(" plt.plot(df['month'], df['sales'])")
print(" Shows trend over time")

print("\nVisualization 2: Comparison (Bar Chart)")
print(" sns.barplot(data=df, x='region', y='sales')")
print(" Compares sales by region")

print("\nVisualization 3: Distribution (Histogram)")
print(" sns.histplot(data=df, x='sales')")
print(" Shows sales distribution")

print("\nVisualization 4: Statistical Summary (Box Plot)")
print(" sns.boxplot(data=df, x='region', y='sales')")
print(" Shows quartiles and outliers")

Combining Visualizations

Combine several plots in one figure to build a dashboard. The script below creates the figure and prints the planned 2x2 layout:

dashboard.py
# Creating a Visualization Dashboard
import matplotlib.pyplot as plt
import seaborn as sns

# Create figure with multiple subplots
fig = plt.figure(figsize=(14, 10))

print("Dashboard Layout:")
print(" Top Row: Overview plots (trend, summary)")
print(" Bottom Row: Detailed analysis (distribution, comparison)")

# Layout structure
print("\nSubplot Structure:")
print(" [0,0] - Time series line plot")
print(" [0,1] - Summary statistics bar chart")
print(" [1,0] - Distribution histogram")
print(" [1,1] - Comparison box plot")

print("\nBenefits of dashboard:")
print(" - See multiple perspectives at once")
print(" - Identify patterns and outliers")
print(" - Communicate insights effectively")

Exercise: Complete Visualization Project

Complete the following tasks:

  • Task 1: Create a DataFrame with sales data (month, sales, region)
  • Task 2: Create a line plot showing sales trend over months
  • Task 3: Create a bar chart comparing sales by region
  • Task 4: Calculate and print summary statistics (mean, max, min sales)
  • Task 5: Create a dashboard with 2x2 subplots showing all visualizations

Write your code to complete this comprehensive visualization project!

💡 Project Tips

Break the project into smaller tasks. Complete and test each part before moving to the next. Don't try to do everything at once: building the project one step at a time makes mistakes easier to find and fix.
