First ML Project

🎯 Project: Your First ML Model

Congratulations! You're ready to build your first machine learning model. This project will walk you through the complete ML workflow using a simple, understandable example.

We'll teach a model the simple linear relationship y = 2x + 1. While this seems trivial, it walks through the same core steps you'll use in more complex projects.

Project Goal

Create a model that learns the relationship between input (x) and output (y) values. Given training examples, the model should learn to predict y for any new x value.

project_goal.py
# Training data: learning y = 2x + 1
# When x = 1, y = 3 (because 2×1 + 1 = 3)
# When x = 2, y = 5 (because 2×2 + 1 = 5)
# When x = 3, y = 7 (because 2×3 + 1 = 7)

training_data = [
    (1, 3),
    (2, 5),
    (3, 7),
    (4, 9),
    (5, 11)
]

print("Training Examples:")
for x, y in training_data:
    print(f"x={x}, y={y}")

Step 1: Prepare the Data

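First, we need to format our data for the machine learning algorithm. We'll separate the inputs (X) from the outputs (y). scikit-learn expects X as a 2D array with one row per example and one column per feature, which is why the code calls reshape(-1, 1):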

prepare_data.py
import numpy as np

# Separate features (X) and target (y)
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)  # Input features
y = np.array([3, 5, 7, 9, 11])  # Target values

print("Input features (X):")
print(X)
print("\nTarget values (y):")
print(y)
print("\nShape of X:", X.shape)
print("Shape of y:", y.shape)
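
To see what reshape(-1, 1) changes, here is a minimal sketch comparing the array before and after the call. The names x_flat and X_column are illustrative and do not appear in the lesson's own files.

```python
import numpy as np

# A flat list of five x values is a 1-D array of shape (5,)
x_flat = np.array([1, 2, 3, 4, 5])

# reshape(-1, 1) turns it into a single column: 5 samples, 1 feature each.
# The -1 asks NumPy to infer the number of rows from the array's size.
X_column = x_flat.reshape(-1, 1)

print("Flat shape:", x_flat.shape)
print("Column shape:", X_column.shape)
print("Dimensions:", x_flat.ndim, "->", X_column.ndim)
```

The target y can stay 1-D, since it holds one value per example.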

Step 2: Train the Model

Now we'll use scikit-learn's LinearRegression to learn the pattern from our data:

train_model.py
from sklearn.linear_model import LinearRegression
import numpy as np

# Prepare data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 5, 7, 9, 11])

# Create and train the model
model = LinearRegression()
model.fit(X, y)

print("Model trained successfully!")
print("Learned coefficient (slope):", model.coef_[0])
print("Learned intercept:", model.intercept_)
print("\nThe model learned: y = {:.2f}x + {:.2f}".format(model.coef_[0], model.intercept_))
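
For intuition about what fit computes: LinearRegression performs ordinary least squares, choosing the slope and intercept that minimize the sum of squared differences between predictions and targets. The sketch below repeats that calculation with NumPy's np.linalg.lstsq as a cross-check. The names A, slope, and intercept are illustrative, and this is not meant as a description of scikit-learn's internals.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)

# Design matrix: one column for x, one column of ones for the intercept
A = np.column_stack([x, np.ones_like(x)])

# Least-squares solution of A @ [slope, intercept] = y
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Because the training data lies exactly on a line, the least-squares fit recovers that line up to floating-point rounding.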

Step 3: Make Predictions

Once trained, use the model to predict y for new x values it hasn't seen:

make_predictions.py
from sklearn.linear_model import LinearRegression
import numpy as np

# Train model (same as before)
X_train = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y_train = np.array([3, 5, 7, 9, 11])
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on new data
X_new = np.array([[6], [7], [8]])  # New x values
predictions = model.predict(X_new)

print("Predictions on new data:")
for x, pred in zip(X_new, predictions):
    expected = 2 * x[0] + 1  # True relationship
    print(f"x={x[0]}, predicted y={pred:.2f}, expected y={expected}")
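
For a linear model with a single feature, predict amounts to applying the learned line to each new x. The sketch below checks this by hand, assuming the same training data as above. The name manual is illustrative.

```python
from sklearn.linear_model import LinearRegression
import numpy as np

X_train = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y_train = np.array([3, 5, 7, 9, 11])
model = LinearRegression().fit(X_train, y_train)

X_new = np.array([[6], [7], [8]])

# Apply slope * x + intercept directly, without calling predict
manual = model.coef_[0] * X_new[:, 0] + model.intercept_

print("predict():", model.predict(X_new))
print("by hand:  ", manual)
print("Match:", np.allclose(manual, model.predict(X_new)))
```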

Step 4: Evaluate the Model

Check how well the model learned by comparing its predictions with the true values. We'll use mean squared error (MSE) as the metric:

evaluate_model.py
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np

# Train model
X_train = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y_train = np.array([3, 5, 7, 9, 11])
model = LinearRegression()
model.fit(X_train, y_train)

# Test on new data
X_test = np.array([[6], [7], [8]])
y_test = np.array([13, 15, 17])  # True values
y_pred = model.predict(X_test)

# Calculate error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)
print("\nModel Performance:")
print("An error close to 0 means the predictions closely match the true values.")
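
Mean squared error is the average of the squared differences between true and predicted values. The sketch below computes it by hand. It uses made-up predictions (y_guess is hypothetical, not output from the model above) so that the error is visibly nonzero.

```python
import numpy as np

y_true = np.array([13.0, 15.0, 17.0])
# Hypothetical predictions, each off by a known amount
y_guess = np.array([13.5, 14.0, 17.0])

errors = y_true - y_guess       # [-0.5, 1.0, 0.0]
squared = errors ** 2           # [0.25, 1.0, 0.0]
mse_by_hand = squared.mean()    # (0.25 + 1.0 + 0.0) / 3

print("Squared errors:", squared)
print("MSE:", mse_by_hand)
```

Squaring makes every error positive and penalizes large misses more heavily than small ones.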

💡 Key Insight

This simple example demonstrates the complete ML cycle: data → training → prediction → evaluation. More complex supervised learning projects follow the same basic pattern, usually with more work at each stage.
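
The four stages can be sketched as one short script. This version adds small random noise to the data and holds out the last 10 examples for testing. Both choices are assumptions introduced for illustration, and the noise level and seed are arbitrary.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np

# 1. Data: y = 2x + 1 plus small random noise (an assumption for this sketch)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2 * x + 1 + rng.normal(0, 0.5, size=50)
X = x.reshape(-1, 1)

# Hold out the last 10 examples so evaluation uses data the model has not seen
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

# 2. Training
model = LinearRegression().fit(X_train, y_train)

# 3. Prediction
y_pred = model.predict(X_test)

# 4. Evaluation
mse = mean_squared_error(y_test, y_pred)
print(f"Learned: y = {model.coef_[0]:.2f}x + {model.intercept_:.2f}")
print(f"Test MSE: {mse:.3f}")
```

With noisy data the learned slope and intercept should land near 2 and 1 rather than exactly on them, and the test error will be small but not zero.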
