GitHub Repository: YStrano/DataScience_GA
Path: blob/master/lessons/lesson_11/MLB - Decision Tree Practice.ipynb
Kernel: Python [conda env:Anaconda3]

Import the MLB Hitters Data Set

import numpy as np
import pandas as pd

df = pd.read_csv('data/hitters.csv')
df.head()
df.shape
(322, 20)

Create a Few New Features for the Data Set (e.g. batting average)
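One possible way to approach this step is sketched below. The column names (AtBat, Hits, Years, CAtBat, CHits) are assumptions about what hitters.csv contains, so check them against df.columns first. The sample rows and the add_features helper are hypothetical and exist only to keep the example self-contained.

```python
import pandas as pd

# Hypothetical stand-in rows; the column names are assumed, not confirmed
# by the notebook. Replace `sample` with the real `df` once names are verified.
sample = pd.DataFrame({
    "AtBat": [400, 250, 600],
    "Hits": [100, 50, 180],
    "Years": [4, 2, 10],
    "CAtBat": [1600, 500, 5000],
    "CHits": [400, 110, 1500],
})


def add_features(frame):
    """Return a copy of `frame` with a few ratio features added."""
    out = frame.copy()
    # Season batting average: hits per at-bat this season.
    out["BattingAvg"] = out["Hits"] / out["AtBat"]
    # Career batting average: career hits per career at-bat.
    out["CareerBattingAvg"] = out["CHits"] / out["CAtBat"]
    # Average number of hits per year played.
    out["HitsPerYear"] = out["CHits"] / out["Years"]
    return out


features = add_features(sample)
print(features[["BattingAvg", "CareerBattingAvg", "HitsPerYear"]])
```

Ratios like these are undefined for players with zero at-bats or zero years, so the real data may need a check for infinite or missing values afterwards.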

Create a train/test split

from sklearn.model_selection import train_test_split
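A minimal sketch of the split, using a small synthetic frame in place of the real data. It assumes the target column is named Salary, that some Salary values may be missing, and that non-numeric columns should be left out of the features for now. The demo frame, its column names, and the 25% test size are illustrative choices, not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for `df`: two numeric columns, one text column,
# and a Salary target with a couple of missing values.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Hits": rng.integers(20, 200, size=40),
    "Years": rng.integers(1, 20, size=40),
    "League": rng.choice(["A", "N"], size=40),
})
demo["Salary"] = 5.0 * demo["Hits"] + 30.0 * demo["Years"]
demo.loc[[3, 7], "Salary"] = np.nan

# A regressor cannot be trained on rows with a missing target, so drop them.
clean = demo.dropna(subset=["Salary"])

# Keep numeric predictors only; Salary is the target, not a feature.
X = clean.select_dtypes(include="number").drop(columns="Salary")
y = clean["Salary"]

# Hold out 25% of the rows; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(X_train.shape, X_test.shape)
```

Text columns can be brought back later with an encoding such as pd.get_dummies if they turn out to be useful.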

Use scikit-learn to build a decision tree regressor to predict Salary
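The fitting step might look roughly like the sketch below. It uses synthetic features and a synthetic target so that it runs on its own; with the real data, X_train and y_train would come from the train/test split. The max_depth=3 setting is an assumed starting point to limit overfitting, not a value from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: three numeric features and a noisy target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 100.0 * X[:, 0] + 20.0 * X[:, 1] + rng.normal(0, 10, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# A shallow tree; an unrestricted tree tends to memorize the training set.
tree = DecisionTreeRegressor(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# Predictions for the held-out rows.
y_pred = tree.predict(X_test)

# Relative importance of each feature in the fitted tree.
print(tree.feature_importances_)
```

Trying a few values of max_depth or min_samples_leaf and comparing test error is a simple way to see how tree size affects the fit.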

Does it work well? What is the MSE? Plot your predictions against the actual salaries.
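One way to answer these questions is sketched below. The y_test and y_pred arrays are made-up values used only to show the calls; in the exercise they would be the held-out salaries and the tree's predictions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted salaries, for illustration only.
y_test = np.array([500.0, 750.0, 300.0, 1200.0])
y_pred = np.array([450.0, 800.0, 350.0, 1000.0])

# MSE is in squared salary units; its square root (RMSE) is in salary units.
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
print(f"MSE: {mse:.1f}  RMSE: {rmse:.1f}")

# Scatter of predicted vs. actual; points on the dashed line are perfect.
fig, ax = plt.subplots()
ax.scatter(y_test, y_pred)
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
ax.plot(lims, lims, linestyle="--")
ax.set_xlabel("Actual salary")
ax.set_ylabel("Predicted salary")
ax.set_title("Decision tree predictions vs. actual")
# In a notebook the figure renders inline; in a script, call plt.show().
```

Because a regression tree predicts one constant per leaf, the points usually form horizontal bands in this plot. Comparing the RMSE with the spread of the actual salaries gives a rough sense of whether the model is useful.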