Linear Regression Overview
Linear Regression is a statistical method for modelling the relationship between a dependent variable 𝑌 (the output) and one or more independent variables 𝑋 (the inputs). The simplest form is Simple Linear Regression, which uses a single input feature and fits a straight line, 𝑌 = slope × 𝑋 + intercept.
Assumptions of Linear Regression
Linearity: The relationship between independent and dependent variables is linear.
Independence: Observations are independent.
Homoscedasticity: The residuals have constant variance across all levels of 𝑋.
Normality of Residuals: Residuals should be approximately normally distributed.
No Multicollinearity (for multiple regression): Independent variables should not be highly correlated.
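Some of these assumptions can be checked roughly from the residuals of a fitted line. The following is a minimal sketch, not a formal test: the data are synthetic and hypothetical, and the variable names and the particular checks (mean residual, variance ratio between the two halves of 𝑋, skewness) are illustrative choices.

```python
import numpy as np

# Hypothetical synthetic data: y depends linearly on x plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 1000)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)

# Fit a straight line and compute residuals (observed minus fitted).
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# With an intercept in the model, least-squares residuals average to ~0.
mean_residual = residuals.mean()

# Homoscedasticity (rough check): residual variance in the lower and upper
# halves of x should be of similar size, so the ratio should be near 1.
half = x.size // 2
variance_ratio = residuals[half:].var() / residuals[:half].var()

# Normality (rough check): skewness of the residuals should be near 0.
standardized = (residuals - residuals.mean()) / residuals.std()
skewness = np.mean(standardized ** 3)

print(f"mean residual:  {mean_residual:.4f}")
print(f"variance ratio: {variance_ratio:.2f}")
print(f"skewness:       {skewness:.2f}")
```

In practice, residual plots and dedicated statistical tests give a fuller picture than these summary numbers.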
Implementation Using NumPy
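A minimal sketch of simple linear regression with NumPy, using the closed-form least-squares formulas: slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope × x̄. The function name `fit_simple_linear_regression` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the line through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical toy data lying exactly on y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear_regression(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```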
Implementation Using sklearn
To implement Linear Regression with sklearn, we use the LinearRegression class from sklearn.linear_model.
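A minimal sketch of the same idea with `LinearRegression`, assuming scikit-learn is installed. The experience and salary values are made-up toy numbers, not real data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: years of experience vs. salary (in thousands).
# X must be 2-D with shape (n_samples, n_features), hence the inner brackets.
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([35, 40, 45, 50, 55])

model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]        # one coefficient per input feature
intercept = model.intercept_
prediction = model.predict(np.array([[6]]))[0]  # predict for 6 years

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted salary for 6 years: {prediction:.2f}")
```

Because the toy points lie exactly on a line, the fitted slope and intercept recover that line; real data would leave non-zero residuals.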
Examples of Linear Regression
Linear regression is widely used in various fields. Here are some examples:
Predicting House Prices. Input: square footage, number of bedrooms. Output: price of the house.
Salary Prediction. Input: years of experience. Output: annual salary.
Health Analysis. Input: Body Mass Index (BMI). Output: risk of certain diseases.
Advertising Effectiveness. Input: money spent on advertising. Output: sales revenue.
Quick Practice: Predicting the Attrition Rate of a Company
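| Year | Attrition Rate (%) |
|------|--------------------|
| 2004 | 12.5 |
| 2005 | 13.2 |
| 2006 | 11.8 |
| 2007 | 12.1 |
| 2008 | 14.0 |
| 2009 | 15.5 |
| 2010 | 14.8 |
| 2011 | 13.3 |
| 2012 | 13.0 |
| 2013 | 12.7 |
| 2014 | 11.5 |
| 2015 | 11.0 |
| 2016 | 12.2 |
| 2017 | 12.9 |
| 2018 | 13.5 |
| 2019 | 14.0 |
| 2020 | 16.5 |
| 2021 | 18.0 |
| 2022 | 17.5 |
| 2023 | 16.0 |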
Find the attrition equation (the fitted regression line) and use it to predict the attrition rate for 2025 and 2026.
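Attrition_Rate ≈ 0.1848 × Year − 358.32, with a mean squared error of about 2.62 on the data above. The 2.62 describes how far the points scatter around the line; it is not a term to add to a prediction. Keep the extra decimal places in the coefficients: because Year is around 2000, rounding the slope to 0.18 shifts the prediction by several percentage points. This line gives about 15.9% for 2025 and about 16.1% for 2026.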
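The least-squares fit for the attrition table can be checked with a short NumPy sketch. The data are taken from the table above; the variable names are illustrative, and `np.polyfit` with `deg=1` is one of several ways to fit a straight line.

```python
import numpy as np

# Attrition data from the table above.
years = np.arange(2004, 2024)
rates = np.array([12.5, 13.2, 11.8, 12.1, 14.0, 15.5, 14.8, 13.3, 13.0, 12.7,
                  11.5, 11.0, 12.2, 12.9, 13.5, 14.0, 16.5, 18.0, 17.5, 16.0])

# Degree-1 polynomial fit returns [slope, intercept].
slope, intercept = np.polyfit(years, rates, deg=1)

# Mean squared error of the fitted line on the training data.
fitted = slope * years + intercept
mse = np.mean((rates - fitted) ** 2)

# Predictions for the two requested years.
pred_2025 = slope * 2025 + intercept
pred_2026 = slope * 2026 + intercept

print(f"Attrition_Rate = {slope:.4f} * Year + ({intercept:.2f})")
print(f"MSE = {mse:.2f}")
print(f"2025: {pred_2025:.1f}%  2026: {pred_2026:.1f}%")
```

The fit explains only part of the year-to-year variation, so predictions a few years past 2023 should be read as a rough trend rather than a precise forecast.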