GitHub Repository: suyashi29/python-su
Path: blob/master/Data Analytics Using Python/5 Simple Linear regression.ipynb
Kernel: Python 3 (ipykernel)

Regression in Machine Learning

What is Regression?

Regression is a supervised learning technique used when the target variable is continuous. It models the relationship between a dependent variable (target) and one or more independent variables (features).

Goal: Predict a numerical value (e.g., salary, price, temperature) based on input data.


Why Use Regression?

  • Predict housing prices based on size, location, etc.

  • Forecast sales or stock prices

  • Estimate student marks based on study hours


Types of Regression

| Type | Description | Example |
| --- | --- | --- |
| Linear Regression | One dependent variable, one or more independent variables | Salary vs Experience |
| Polynomial Regression | Features raised to a power (non-linear curves) | Age vs Cholesterol levels |
| Ridge/Lasso Regression | Regularized linear regression | Used to prevent overfitting |
| Logistic Regression (Special) | Used for classification problems | Yes/No outcomes |
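To make the rows of the table concrete, here is a minimal sketch that fits linear, polynomial, ridge, and lasso models to the same data with scikit-learn. The data is a hypothetical noisy quadratic invented for illustration, and the `alpha` values are arbitrary assumptions, not recommendations. Logistic regression is left out because it targets classification rather than a continuous value.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical toy data (not from the notebook): a noisy quadratic curve
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 40).reshape(-1, 1)   # 2D feature matrix, one column
y = 0.5 * X.ravel() ** 2 + X.ravel() + rng.normal(scale=0.3, size=40)

models = {
    "linear": LinearRegression(),
    # Polynomial regression = linear regression on x and x^2
    "polynomial": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    # Regularized variants; alpha values are arbitrary choices for this sketch
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
}

scores = {}
for name, estimator in models.items():
    estimator.fit(X, y)
    scores[name] = estimator.score(X, y)   # R^2 on the training data
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

On curved data like this, the polynomial model should score noticeably higher than the straight-line models, because a line cannot follow the bend.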

Simple Linear Regression Formula

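In its standard form, simple linear regression models the target as a straight-line function of one feature:

$$ y = \beta_0 + \beta_1 x + \varepsilon $$

Here y is the dependent variable (target), x is the independent variable (feature), \beta_0 is the intercept, \beta_1 is the slope, and \varepsilon is the error term.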
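Assuming the coefficients are estimated by ordinary least squares, which is the method scikit-learn's `LinearRegression` uses, the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ for $n$ observations $(x_i, y_i)$ has a closed-form solution. The symbols $\bar{x}$ and $\bar{y}$ denote the sample means:

$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
$$

The slope is the covariance of x and y divided by the variance of x, and the intercept forces the line through the point $(\bar{x}, \bar{y})$.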

Example: Salary Prediction

| Experience (Years) | Salary (Lakhs) |
| --- | --- |
| 1 | 2.5 |
| 2 | 3.2 |
| 3 | 3.8 |
| 4 | 4.5 |
| 5 | 5.0 |

A least-squares fit to these five rows gives approximately:

$$ \text{Salary} = 0.63 \times \text{Experience} + 1.91 $$
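The coefficients for the five rows in the table can be computed directly. The sketch below assumes an ordinary least-squares fit: the slope is the sum of products of deviations divided by the sum of squared deviations in x, and the intercept follows from the means. `np.polyfit` is used only as a cross-check.

```python
import numpy as np

# The five rows from the example table
experience = np.array([1, 2, 3, 4, 5], dtype=float)
salary = np.array([2.5, 3.2, 3.8, 4.5, 5.0])

# Closed-form ordinary least squares for a single feature
x_dev = experience - experience.mean()
y_dev = salary - salary.mean()
slope = (x_dev * y_dev).sum() / (x_dev ** 2).sum()
intercept = salary.mean() - slope * experience.mean()

# Cross-check with NumPy's degree-1 polynomial fit (returns slope, then intercept)
check_slope, check_intercept = np.polyfit(experience, salary, deg=1)

print(f"Salary = {slope:.2f} * Experience + {intercept:.2f}")
# prints: Salary = 0.63 * Experience + 1.91
```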


Visual Representation

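  • The regression line is chosen to minimize the error between actual and predicted values; in ordinary least squares, that error is the sum of squared vertical distances from the points to the line.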

  • In simple linear regression, it’s a straight line.

  • In polynomial regression, it’s a curve.
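The idea of minimizing error can be checked numerically. The sketch below assumes the error is the sum of squared vertical distances (the ordinary least-squares criterion) and compares the fitted line with a few hypothetical alternative lines chosen only for comparison. It uses the five-row salary example.

```python
import numpy as np

experience = np.array([1, 2, 3, 4, 5], dtype=float)
salary = np.array([2.5, 3.2, 3.8, 4.5, 5.0])

def sum_squared_error(slope, intercept):
    """Sum of squared vertical distances between the data and a given line."""
    predicted = slope * experience + intercept
    residuals = salary - predicted
    return float((residuals ** 2).sum())

# Least-squares line for these points
best_slope, best_intercept = np.polyfit(experience, salary, deg=1)
best_sse = sum_squared_error(best_slope, best_intercept)
print(f"least-squares line: SSE = {best_sse:.4f}")

# Hypothetical alternative lines (slope, intercept), picked only for comparison
candidates = [(0.50, 2.30), (0.70, 1.70), (0.63, 2.10)]
candidate_sse = [sum_squared_error(s, b) for s, b in candidates]
for (s, b), sse in zip(candidates, candidate_sse):
    print(f"slope={s:.2f}, intercept={b:.2f}: SSE = {sse:.4f}")
```

Every alternative line should report a larger SSE than the least-squares line, which is what "best fit" means under this criterion.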


Summary

  • Regression is used when the target is continuous.

  • Linear regression is the most common starting point.

  • Helps understand relationships and make predictions.

```python
# Importing required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Step 1: Create Simple Dataset
data = {
    'Experience_Years': [1, 2, 3, 4, 5, 6, 7, 8],
    'Salary_Lakhs': [2.5, 3.2, 3.8, 4.5, 5.0, 5.8, 6.4, 7.0]
}
df = pd.DataFrame(data)
print("Sample Data")
print(df)

# 🔧 Step 2: Define features (X) and target (y)
X = df[['Experience_Years']]  # 2D input
y = df['Salary_Lakhs']        # 1D target

# 🔧 Step 3: Build and Train Model
model = LinearRegression()
model.fit(X, y)

# Step 4: Predict on training data
df['Predicted_Salary'] = model.predict(X)

# Step 5: Visualize the regression line
plt.figure(figsize=(8, 5))
plt.scatter(X, y, color='blue', label='Actual Salary')
plt.plot(X, df['Predicted_Salary'], color='red', label='Predicted Line')
plt.xlabel("Years of Experience")
plt.ylabel("Salary (in Lakhs)")
plt.title("Simple Linear Regression Example")
plt.legend()
plt.grid(True)
plt.show()

# Step 6: Display model coefficient and intercept
print("\n Regression Equation:")
print(f"Salary = {model.coef_[0]:.2f} * Experience + {model.intercept_:.2f}")
```
Sample Data
   Experience_Years  Salary_Lakhs
0                 1           2.5
1                 2           3.2
2                 3           3.8
3                 4           4.5
4                 5           5.0
5                 6           5.8
6                 7           6.4
7                 8           7.0
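[Figure: "Simple Linear Regression Example". Scatter plot of actual salary (blue points) against years of experience, with the predicted regression line in red. x-axis: Years of Experience; y-axis: Salary (in Lakhs).]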
Regression Equation:
Salary = 0.64 * Experience + 1.88
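Once fitted, the model can be used to predict salaries for new experience values and to check how well the line fits. The sketch below refits the same eight-row dataset so that it runs on its own. The new inputs (10 and 12 years) are hypothetical values chosen for illustration, and the choice of R² as the fit measure is an assumption, not something the notebook specifies.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Same dataset as the notebook cell above
df = pd.DataFrame({
    'Experience_Years': [1, 2, 3, 4, 5, 6, 7, 8],
    'Salary_Lakhs': [2.5, 3.2, 3.8, 4.5, 5.0, 5.8, 6.4, 7.0],
})
X = df[['Experience_Years']]
y = df['Salary_Lakhs']
model = LinearRegression().fit(X, y)

# Hypothetical new inputs; the column name must match the training feature
new_X = pd.DataFrame({'Experience_Years': [10, 12]})
predictions = model.predict(new_X)
for years, pred in zip(new_X['Experience_Years'], predictions):
    print(f"{years} years -> predicted salary {pred:.2f} lakhs")

# R^2 on the training data: share of variance in salary explained by the line
r2 = r2_score(y, model.predict(X))
print(f"R^2 on training data: {r2:.4f}")
```

Both new inputs lie outside the 1 to 8 year range the model was trained on, so these predictions are extrapolations and should be treated with caution.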