Introduction to NumPy

NumPy (Numerical Python) is the foundation of numerical computing in Python and is essential for machine learning. It provides an N-dimensional array type, ndarray, together with functions that operate on whole arrays efficiently.

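Unlike Python lists, which hold references to arbitrary objects, NumPy arrays are homogeneous (every element has the same type) and store their values in a single, typically contiguous, block of memory. That layout lets operations run in compiled code over raw values, which is crucial when working with large datasets in machine learning.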
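As a small illustration of homogeneity, the sketch below builds an array from a list that mixes integers and a float. The values are made up for the example; the point is that NumPy settles on one element type for the whole array.

```python
import numpy as np

# A Python list can hold objects of different types
mixed_list = [1, 2.5, 3]

# np.array() picks a single dtype that can represent every element;
# here the integers are upcast to 64-bit floats
arr = np.array(mixed_list)

print(arr.dtype)     # float64
print(arr.itemsize)  # 8  (bytes per element)
print(arr.nbytes)    # 24 (3 elements * 8 bytes)
```

Because every element has the same fixed size, NumPy can locate any element by simple arithmetic on its position instead of following a reference per item.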

Why NumPy for Machine Learning?

NumPy is the backbone of most ML libraries because:

  • Speed: Most array operations run in compiled C code, so they are typically far faster than equivalent Python loops
  • Memory Efficiency: Arrays store raw values of a fixed-size type, so large numeric arrays typically use less memory than Python lists of number objects
  • Vectorization: Perform operations on entire arrays without explicit loops
  • Integration: Works seamlessly with pandas, scikit-learn, and other ML libraries
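The speed and vectorization points above can be checked with a rough timing sketch like the one below. The array size and the use of time.perf_counter() are choices made for this example, and the measured times will vary from machine to machine, so treat the numbers as indicative only.

```python
import time
import numpy as np

n = 1_000_000
values = list(range(n))
arr = np.arange(n, dtype=np.float64)

# Pure-Python loop: square every element one at a time
start = time.perf_counter()
squares_loop = [v * v for v in values]
loop_seconds = time.perf_counter() - start

# Vectorized: one expression squares the whole array in compiled code
start = time.perf_counter()
squares_vec = arr * arr
vec_seconds = time.perf_counter() - start

print(f"Loop:       {loop_seconds:.4f} s")
print(f"Vectorized: {vec_seconds:.4f} s")

# Both approaches produce the same numbers
print("Same result:", np.allclose(squares_loop, squares_vec))
```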

Creating NumPy Arrays

The most common way to create an array is np.array(), which builds one from a Python list (or from nested lists for more dimensions):

creating_arrays.py
# Create arrays from Python lists
import numpy as np

# 1D array (vector)
arr1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", arr1d)
print("Shape:", arr1d.shape)      # (5,)
print("Dimensions:", arr1d.ndim)  # 1

# 2D array (matrix)
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D Array:")
print(arr2d)
print("Shape:", arr2d.shape)      # (2, 3)
print("Dimensions:", arr2d.ndim)  # 2

# 3D array
arr3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\n3D Array shape:", arr3d.shape)  # (2, 2, 2)
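np.array() also accepts a dtype argument, which is useful when a model or library expects a particular element type. The following is a minimal sketch with made-up values; the variable names are only for illustration.

```python
import numpy as np

# Request a specific element type at creation time
floats = np.array([1, 2, 3], dtype=np.float32)
print(floats.dtype)  # float32

# Convert an existing array; astype() returns a new array
ints = np.array([1.9, 2.1, 3.7]).astype(np.int64)
print(ints)          # [1 2 3]  (fractional part is truncated, not rounded)

# Nested lists with equal-length rows form a regular 2D array
grid = np.array([[1, 2], [3, 4]], dtype=np.float64)
print(grid.shape)    # (2, 2)
```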

Array Operations

Arithmetic operators on NumPy arrays work element by element, so a single expression applies to the whole array:

array_operations.py
# Element-wise operations
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Arithmetic operations
print("Original:", arr)
print("Multiply by 2:", arr * 2)
print("Add 10:", arr + 10)
print("Square:", arr ** 2)

# Operations between arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print("\nArray 1:", arr1)
print("Array 2:", arr2)
print("Sum:", arr1 + arr2)
print("Product:", arr1 * arr2)
print("Dot product:", np.dot(arr1, arr2))
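The difference between an element-wise product and a dot product is easier to see on 2D arrays. The sketch below uses two small made-up matrices and the @ operator, which performs matrix multiplication on 2D arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# * multiplies values at matching positions
elementwise = a * b
print(elementwise)
# [[ 5 12]
#  [21 32]]

# @ is the matrix product: rows of a combined with columns of b
matmul = a @ b
print(matmul)
# [[19 22]
#  [43 50]]
```

For 2D inputs like these, np.dot(a, b) gives the same result as a @ b.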

Useful Array Functions

NumPy provides many useful functions for array manipulation:

array_functions.py
# Common NumPy functions
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print("Array:", arr)
print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Max:", np.max(arr))
print("Min:", np.min(arr))
# np.std computes the population standard deviation by default (ddof=0)
print("Standard deviation:", np.std(arr))

# Reshaping arrays (the new shape must hold the same number of elements)
arr = np.array([1, 2, 3, 4, 5, 6])
print("\nOriginal shape:", arr.shape)
reshaped = arr.reshape(2, 3)
print("Reshaped to (2, 3):")
print(reshaped)
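On 2D data, the same functions take an axis argument that controls whether they reduce down columns or across rows, and reshape() can infer one dimension when given -1. A short sketch with made-up values:

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

# No axis: reduce over every element
total = np.sum(scores)
print(total)      # 21

# axis=0 collapses the rows, giving one value per column
col_sums = np.sum(scores, axis=0)
print(col_sums)   # [5 7 9]

# axis=1 collapses the columns, giving one value per row
row_means = np.mean(scores, axis=1)
print(row_means)  # [2. 5.]

# -1 asks reshape() to work out that dimension from the element count
flat = np.arange(12)
auto = flat.reshape(3, -1)
print(auto.shape)  # (3, 4)
```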

Array Indexing and Slicing

NumPy arrays support powerful indexing and slicing operations:

indexing.py
# Array indexing and slicing
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])

# Single element
print("First element:", arr[0])
print("Last element:", arr[-1])

# Slicing
print("First 3 elements:", arr[:3])
print("Last 3 elements:", arr[-3:])
print("Middle elements:", arr[2:5])

# 2D array indexing
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\nMatrix:")
print(matrix)
print("Element at [1, 2]:", matrix[1, 2])
print("First row:", matrix[0, :])
print("Second column:", matrix[:, 1])
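Two behaviours often surprise people coming from Python lists: a basic slice of an array is a view onto the same data rather than a copy, and a boolean array can be used as an index to filter values. The sketch below illustrates both with made-up numbers.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# A basic slice is a view: writing through it changes the original
view = arr[1:4]
view[0] = 99
print(arr)  # [10 99 30 40 50]

# Call .copy() when an independent array is needed
independent = arr[1:4].copy()
independent[0] = -1
print(arr[1])  # 99 (unchanged by the edit to the copy)

# Boolean indexing: keep only the elements where the condition is True
mask = arr > 35
selected = arr[mask]
print(selected)  # [99 40 50]
```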

Creating Arrays with Built-in Functions

NumPy provides convenient functions to create arrays with specific patterns:

array_creation.py
# Creating arrays with NumPy functions
import numpy as np

# Zeros array
zeros = np.zeros((3, 4))
print("Zeros array (3x4):")
print(zeros)

# Ones array
ones = np.ones((2, 3))
print("\nOnes array (2x3):")
print(ones)

# Array with a range
range_arr = np.arange(0, 10, 2)  # Start, stop (exclusive), step
print("\nRange array:", range_arr)

# Array with evenly spaced values
linspace_arr = np.linspace(0, 1, 5)  # Start, stop (inclusive), num_points
print("Linspace array:", linspace_arr)

# Random array
random_arr = np.random.rand(3, 3)  # Uniform random values in [0, 1)
print("\nRandom array:")
print(random_arr)
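The lesson's example uses np.random.rand. As an alternative, NumPy also offers a Generator interface through np.random.default_rng(), which makes it straightforward to seed random arrays so that experiments can be repeated. The seed value 42 and the variable names below are arbitrary choices for this sketch.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))                          # floats in [0, 1)
normal = rng.normal(loc=0.0, scale=1.0, size=(2, 4))  # standard normal samples
dice = rng.integers(1, 7, size=5)                     # integers 1..6 (upper bound excluded)

print(uniform.shape)  # (3, 3)
print(normal.shape)   # (2, 4)
print(dice.shape)     # (5,)

# A second generator with the same seed reproduces the first draw exactly
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((3, 3))))  # True
```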

💡 Key Takeaway

NumPy arrays are the foundation of machine learning in Python. Libraries such as pandas and scikit-learn are built on NumPy arrays, and frameworks such as TensorFlow convert to and from them. Mastering NumPy will make learning these libraries much easier!
