Chapter 4: Data Visualization / Lesson 17

Creating Visualizations with Matplotlib

Visualizations help you understand your data, identify patterns, and communicate insights. Matplotlib is the most widely used plotting library in Python, and other tools such as pandas plotting and seaborn are built on top of it, so it is worth learning well for data analysis and ML.

Good visualizations can reveal trends, outliers, and relationships that numbers alone cannot show. They're crucial for exploratory data analysis before building ML models.

Line Plots

Line plots are perfect for showing trends over time or relationships between variables:

line_plot.py
    # Creating Line Plots
    import matplotlib.pyplot as plt

    # Sample data
    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
    sales = [100, 120, 140, 130, 150, 180]

    # Create line plot
    plt.figure(figsize=(8, 5))
    plt.plot(months, sales, marker='o', linewidth=2, color='#22d3ee')
    plt.title('Monthly Sales Trend', fontsize=14, fontweight='bold')
    plt.xlabel('Month')
    plt.ylabel('Sales ($)')
    plt.grid(True, alpha=0.3)  # Light grid to make values easier to read
    plt.tight_layout()
    plt.show()

    print("Line plot created showing sales trend over 6 months")
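A trend is often easier to judge against a reference series. The sketch below reuses the lesson's `months` and `sales` values and adds a `target` list, which is a hypothetical series invented for illustration. It uses Matplotlib's object-oriented interface (`fig, ax = plt.subplots()`) so the plotted objects can be inspected afterwards.

```python
import matplotlib.pyplot as plt


def plot_sales_vs_target(months, sales, target):
    """Draw two line series on one Axes and return the figure and axes."""
    fig, ax = plt.subplots(figsize=(8, 5))
    # label= is what the legend displays for each line
    ax.plot(months, sales, marker='o', linewidth=2, label='Sales')
    ax.plot(months, target, linestyle='--', linewidth=2, label='Target')
    ax.set_title('Monthly Sales vs Target')
    ax.set_xlabel('Month')
    ax.set_ylabel('Sales ($)')
    ax.grid(True, alpha=0.3)
    ax.legend()
    fig.tight_layout()
    return fig, ax


months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [100, 120, 140, 130, 150, 180]
target = [110, 115, 125, 135, 145, 160]  # hypothetical values, not lesson data

fig, ax = plot_sales_vs_target(months, sales, target)
```

Call `plt.show()` afterwards to display the figure in an interactive session.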

Bar Charts

Bar charts are ideal for comparing categories or discrete values:

bar_chart.py
    # Creating Bar Charts
    import matplotlib.pyplot as plt

    # Category data
    categories = ['Product A', 'Product B', 'Product C', 'Product D']
    revenue = [45000, 52000, 38000, 61000]

    # Create bar chart
    plt.figure(figsize=(8, 5))
    plt.bar(categories, revenue, color=['#22d3ee', '#06b6d4', '#a855f7', '#8b5cf6'])
    plt.title('Revenue by Product', fontsize=14, fontweight='bold')
    plt.xlabel('Product')
    plt.ylabel('Revenue ($)')
    plt.xticks(rotation=45)
    plt.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()

    print("Bar chart created comparing revenue across products")
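Comparing categories is usually easier when the bars are ordered by value. As a minimal sketch, the code below sorts the lesson's product data from highest to lowest revenue before plotting. The variable names `pairs`, `sorted_categories`, and `sorted_revenue` are illustrative choices.

```python
import matplotlib.pyplot as plt

categories = ['Product A', 'Product B', 'Product C', 'Product D']
revenue = [45000, 52000, 38000, 61000]

# Pair each category with its value, then sort by value, largest first
pairs = sorted(zip(categories, revenue), key=lambda pair: pair[1], reverse=True)
sorted_categories = [name for name, _ in pairs]
sorted_revenue = [value for _, value in pairs]

fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.bar(sorted_categories, sorted_revenue, color='#22d3ee')
ax.set_title('Revenue by Product (sorted)')
ax.set_xlabel('Product')
ax.set_ylabel('Revenue ($)')
ax.grid(axis='y', alpha=0.3)
fig.tight_layout()
```

Sorting suits unordered categories like products. For categories with a natural order, such as months, keep that order instead.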

Scatter Plots

Scatter plots show relationships between two continuous variables:

scatter_plot.py
    # Creating Scatter Plots
    import matplotlib.pyplot as plt
    import numpy as np

    # Generate sample data
    np.random.seed(42)
    hours_studied = np.random.randint(10, 50, 30)
    test_scores = hours_studied * 2 + np.random.randint(-10, 10, 30)

    # Create scatter plot
    plt.figure(figsize=(8, 5))
    plt.scatter(hours_studied, test_scores, alpha=0.6, s=100, color='#22d3ee')
    plt.title('Study Hours vs Test Scores', fontsize=14, fontweight='bold')
    plt.xlabel('Hours Studied')
    plt.ylabel('Test Score')
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

    print("Scatter plot shows positive correlation between study hours and scores")
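A scatter plot suggests a relationship visually, and a number can back it up. The sketch below regenerates the same synthetic study data, computes the Pearson correlation coefficient with `np.corrcoef`, fits a straight line with `np.polyfit`, and draws that line over the points. The least-squares line is an addition for illustration, not part of the original lesson.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same synthetic data as the scatter plot example
np.random.seed(42)
hours_studied = np.random.randint(10, 50, 30)
test_scores = hours_studied * 2 + np.random.randint(-10, 10, 30)

# Pearson correlation: the off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(hours_studied, test_scores)[0, 1]

# Degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(hours_studied, test_scores, 1)

fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(hours_studied, test_scores, alpha=0.6, s=100, color='#22d3ee')

# Draw the fitted line across the observed range of x values
x_line = np.array([hours_studied.min(), hours_studied.max()])
ax.plot(x_line, slope * x_line + intercept, color='black', linewidth=2,
        label=f'Linear fit (r = {r:.2f})')

ax.set_title('Study Hours vs Test Scores')
ax.set_xlabel('Hours Studied')
ax.set_ylabel('Test Score')
ax.legend()
fig.tight_layout()

print(f"Correlation: {r:.2f}, slope: {slope:.2f}")
```

Because the scores were generated as roughly twice the hours plus noise, the fitted slope should land near 2 and the correlation should be strongly positive.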

Histograms

Histograms show the distribution of a single variable:

histogram.py
    # Creating Histograms
    import matplotlib.pyplot as plt
    import numpy as np

    # Generate sample data (ages)
    np.random.seed(42)
    ages = np.random.normal(35, 10, 1000)  # Mean=35, Std=10, 1000 samples

    # Create histogram
    plt.figure(figsize=(8, 5))
    plt.hist(ages, bins=30, color='#22d3ee', edgecolor='black', alpha=0.7)
    plt.title('Age Distribution', fontsize=14, fontweight='bold')
    plt.xlabel('Age')
    plt.ylabel('Frequency')
    plt.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()

    print("Histogram shows the distribution of ages in the dataset")
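The `bins` argument strongly affects how a histogram reads: too few bins hide the shape, too many make it noisy. The sketch below draws the same synthetic ages with three bin counts side by side. The values 5, 30, and 100 are arbitrary choices for comparison. It also keeps the counts and bin edges that `hist` returns, which can be inspected without looking at the figure.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same synthetic data as the histogram example
np.random.seed(42)
ages = np.random.normal(35, 10, 1000)

# One row of three Axes; with a single row, axes is a 1D array
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

results = {}
for ax, bins in zip(axes, [5, 30, 100]):
    # hist returns the per-bin counts, the bin edges, and the drawn patches
    counts, edges, _ = ax.hist(ages, bins=bins, color='#22d3ee',
                               edgecolor='black', alpha=0.7)
    results[bins] = (counts, edges)
    ax.set_title(f'{bins} bins')
    ax.set_xlabel('Age')
    ax.set_ylabel('Frequency')

fig.tight_layout()

for bins, (counts, edges) in results.items():
    print(f"{bins} bins: bin width {edges[1] - edges[0]:.2f}, "
          f"tallest bin holds {int(counts.max())} samples")
```

Whatever the bin count, the counts always sum to the number of samples, and there is always one more edge than there are bins.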

Multiple Subplots

You can create multiple plots in one figure using subplots:

subplots.py
    # Creating Multiple Subplots
    import matplotlib.pyplot as plt
    import numpy as np

    np.random.seed(42)  # Make the random data reproducible

    # Create figure with 2x2 subplots; axes is a 2x2 array indexed [row, column]
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))

    # Plot 1: Line plot
    x = np.linspace(0, 10, 100)
    axes[0, 0].plot(x, np.sin(x))
    axes[0, 0].set_title('Sine Wave')

    # Plot 2: Bar chart
    categories = ['A', 'B', 'C']
    values = [10, 20, 15]
    axes[0, 1].bar(categories, values)
    axes[0, 1].set_title('Bar Chart')

    # Plot 3: Scatter plot
    x_scatter = np.random.randn(50)
    y_scatter = np.random.randn(50)
    axes[1, 0].scatter(x_scatter, y_scatter)
    axes[1, 0].set_title('Scatter Plot')

    # Plot 4: Histogram
    data = np.random.normal(0, 1, 1000)
    axes[1, 1].hist(data, bins=30)
    axes[1, 1].set_title('Histogram')

    plt.tight_layout()
    plt.show()

    print("Created 4 different plots in one figure using subplots")
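When every panel gets the same kind of plot, indexing each cell by hand becomes repetitive. A minimal sketch of the alternative: iterate over `axes.flat`, which walks the grid row by row. The four curves in the `curves` dictionary are arbitrary examples.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Title -> y-values for each panel (arbitrary example curves)
curves = {
    'sin(x)': np.sin(x),
    'cos(x)': np.cos(x),
    'sin(2x)': np.sin(2 * x),
    'cos(2x)': np.cos(2 * x),
}

# sharex=True makes all panels use the same x-axis limits
fig, axes = plt.subplots(2, 2, figsize=(10, 8), sharex=True)

# axes.flat visits the 2x2 grid in row-major order:
# [0, 0], [0, 1], [1, 0], [1, 1]
for ax, (title, y) in zip(axes.flat, curves.items()):
    ax.plot(x, y)
    ax.set_title(title)
    ax.grid(True, alpha=0.3)

fig.suptitle('Four curves on a shared x-axis')
fig.tight_layout()
```

The same loop works for any grid shape, as long as the number of datasets does not exceed the number of panels.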

Exercise: Create Visualizations

Complete the following tasks:

  • Task 1: Create a line plot showing temperature over 7 days
  • Task 2: Create a bar chart comparing sales for 4 products
  • Task 3: Create a scatter plot showing the relationship between two variables
  • Task 4: Add titles and labels to all plots

Write code to create these visualizations. (Note: in this environment, plots are described in text output rather than displayed.)
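In an environment that cannot open a plot window, a figure can still be rendered and written to an image file. The sketch below assumes nothing about the exercise environment itself. It selects Matplotlib's non-interactive `Agg` backend and saves to a temporary directory. The file name `squares_plot.png` and the sample data are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # Non-interactive backend: renders to files, no window
import matplotlib.pyplot as plt

# Small illustrative dataset (not from the lesson)
x = [1, 2, 3, 4, 5]
y = [value ** 2 for value in x]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, marker='o')
ax.set_title('Squares')
ax.set_xlabel('x')
ax.set_ylabel('x squared')
fig.tight_layout()

# Write the figure to a PNG file instead of calling plt.show()
output_path = os.path.join(tempfile.gettempdir(), 'squares_plot.png')
fig.savefig(output_path, dpi=100)
plt.close(fig)  # Release the figure's memory once it is saved

print(f"Saved plot to {output_path}")
```

Calling `matplotlib.use('Agg')` before importing `pyplot` keeps the backend choice explicit, which is useful in scripts that run on servers or in automated jobs.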

💡 Learning Tip

Practice is essential. Try modifying the code examples, experiment with different parameters, and see how changes affect the results. Hands-on experience is the best teacher!
