GitHub Repository: guipsamora/pandas_exercises
Path: blob/master/10_Deleting/Wine/Solutions.ipynb
Kernel: Python [default]

Wine

Introduction:

This exercise is an adaptation of the UCI Wine dataset. Its only purpose is to practice deleting data with pandas.

Step 1. Import the necessary libraries

Step 2. Import the dataset from this address.

Step 3. Assign it to a variable called wine
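A minimal sketch of Steps 1 to 3. The address itself is not reproduced here, so the example reads from a hypothetical in-memory buffer (`raw`) holding made-up rows with 14 fields each. Passing `header=None` is an assumption that the file has no header line; drop it if the first row holds column names.

```python
import io

import pandas as pd

# Hypothetical stand-in for the remote file: three comma-separated rows with
# 14 numeric fields each and no header line. The values are made up.
raw = io.StringIO(
    "1,14.2,1.7,2.4,15.6,127,2.8,3.1,0.28,2.3,5.6,1.04,3.9,1065\n"
    "1,13.2,1.8,2.1,11.2,100,2.7,2.8,0.26,1.3,4.4,1.05,3.4,1050\n"
    "1,13.2,2.4,2.7,18.6,101,2.8,3.2,0.30,2.8,5.7,1.03,3.2,1185\n"
)

# read_csv accepts a URL string in the same position as the buffer used here.
# header=None keeps the first row as data; with the default header handling,
# pandas would use that row as the column labels instead.
wine = pd.read_csv(raw, header=None)

print(wine.shape)  # (3, 14)
```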

Step 4. Delete the first, fourth, seventh, ninth, eleventh, thirteenth and fourteenth columns
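One way to delete columns by position is to translate the ordinals into zero-based positions and pass the matching labels to `drop`. The 2 x 14 frame below is a hypothetical stand-in for the loaded data.

```python
import numpy as np
import pandas as pd

# Hypothetical 2 x 14 frame standing in for the loaded data.
wine = pd.DataFrame(np.arange(28).reshape(2, 14))

# 1st, 4th, 7th, 9th, 11th, 13th and 14th columns as zero-based positions.
positions = [0, 3, 6, 8, 10, 12, 13]

# Look the labels up by position, then drop them along the column axis.
wine = wine.drop(columns=wine.columns[positions])

print(wine.shape)  # (2, 7)
```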

Step 5. Assign the columns as below:

The attributes are (donated by Riccardo Leardi, riclea '@' anchem.unige.it):

  1. alcohol

  2. malic_acid

  3. alcalinity_of_ash

  4. magnesium

  5. flavanoids

  6. proanthocyanins

  7. hue
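Renaming can be done by assigning a list of seven names to `DataFrame.columns`, in the order listed above. The all-zero frame is a hypothetical placeholder for the seven columns that remain after Step 4.

```python
import numpy as np
import pandas as pd

# Hypothetical 2 x 7 frame standing in for the data left after the deletions.
wine = pd.DataFrame(np.zeros((2, 7)))

# The list length must match the number of columns, or pandas raises ValueError.
wine.columns = [
    "alcohol", "malic_acid", "alcalinity_of_ash", "magnesium",
    "flavanoids", "proanthocyanins", "hue",
]

print(wine.columns.tolist())
```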

Step 6. Set the values of the first 3 rows of alcohol to NaN
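A sketch of Step 6 using positional indexing, so it works regardless of the index labels. The small frame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with two of the seven columns.
wine = pd.DataFrame({
    "alcohol": [14.2, 13.2, 13.2, 14.4, 13.2],
    "magnesium": [127.0, 100.0, 101.0, 113.0, 118.0],
})

# Rows at positions 0, 1 and 2 of the "alcohol" column.
wine.iloc[0:3, wine.columns.get_loc("alcohol")] = np.nan

print(wine)
```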

Step 7. Now set the values of rows 3 and 4 of magnesium to NaN
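The wording "rows 3 and 4" is ambiguous. The sketch below assumes it means the third and fourth rows, that is, positions 2 and 3; if index labels 3 and 4 are meant, use the slice `3:5` instead. The sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; magnesium is stored as float so it can hold NaN.
wine = pd.DataFrame({
    "alcohol": [14.2, 13.2, 13.2, 14.4, 13.2],
    "magnesium": [127.0, 100.0, 101.0, 113.0, 118.0],
})

# Assumption: third and fourth rows, i.e. positions 2 and 3.
wine.iloc[2:4, wine.columns.get_loc("magnesium")] = np.nan

print(wine)
```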

Step 8. Fill the NaN values with 10 in alcohol and 100 in magnesium
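`fillna` accepts a dict that maps column names to fill values, which covers both columns in one call. The frame below is a made-up sample that already contains the NaN values from the previous steps.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with missing values in both columns.
wine = pd.DataFrame({
    "alcohol": [np.nan, np.nan, np.nan, 14.4, 13.2],
    "magnesium": [127.0, 100.0, np.nan, np.nan, 118.0],
})

# Each key names a column; each value replaces the NaN entries in that column.
wine = wine.fillna({"alcohol": 10, "magnesium": 100})

print(wine)
```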

Step 9. Count the number of missing values

alcohol              0
malic_acid           0
alcalinity_of_ash    0
magnesium            0
flavanoids           0
proanthocyanins      0
hue                  0
dtype: int64
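A per-column count like the one above can be produced by chaining `isnull()` and `sum()`. The two-column sample is hypothetical and deliberately contains one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with a single missing value in "alcohol".
wine = pd.DataFrame({
    "alcohol": [10.0, 14.4, np.nan],
    "hue": [1.04, 1.05, 1.03],
})

# isnull() yields a boolean frame; summing counts the True entries per column.
missing = wine.isnull().sum()

print(missing)
```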

Step 10. Create an array of 10 random integers, each less than 10

array([2, 3, 0, 5, 0, 9, 4, 0, 7, 2])
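One way to draw such an array is NumPy's `Generator.integers`, whose upper bound is exclusive by default. The seed is an arbitrary choice for reproducibility, and the values drawn will not match the array printed above.

```python
import numpy as np

# Arbitrary seed so the sketch is reproducible.
rng = np.random.default_rng(0)

# Ten integers drawn from 0 to 9; the upper bound 10 is excluded.
random_numbers = rng.integers(0, 10, size=10)

print(random_numbers)
```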

Step 11. Use the random numbers you generated as an index and assign a NaN value to each of the corresponding cells.
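A sketch of Step 11, assuming only the alcohol column is targeted, since the count in the next step reports missing values for alcohol alone. The ten-row frame is made up; the index array is the one printed in Step 10, and its repeated values collapse to seven distinct rows.

```python
import numpy as np
import pandas as pd

# Hypothetical ten-row sample.
wine = pd.DataFrame({
    "alcohol": np.linspace(12.0, 14.7, 10),
    "hue": np.ones(10),
})

# The array printed in Step 10; it contains repeated values.
random_numbers = np.array([2, 3, 0, 5, 0, 9, 4, 0, 7, 2])

# Duplicates address the same row, so reduce to the distinct labels first.
rows = np.unique(random_numbers)
wine.loc[rows, "alcohol"] = np.nan

print(wine["alcohol"].isnull().sum())  # 7
```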

Step 12. How many missing values do we have?

alcohol              7
malic_acid           0
alcalinity_of_ash    0
magnesium            0
flavanoids           0
proanthocyanins      0
hue                  0
dtype: int64

Step 13. Delete the rows that contain missing values
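`dropna` removes every row that has at least one missing value and keeps the original index labels of the rows that survive. The sample values below are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with NaN in rows 0 and 2.
wine = pd.DataFrame({
    "alcohol": [np.nan, 10.0, np.nan, 14.06],
    "hue": [1.04, 1.05, 1.03, 0.86],
})

# axis=0 drops rows; how="any" drops a row if any of its cells is NaN.
wine = wine.dropna(axis=0, how="any")

print(wine)
```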

Step 14. Print only the non-null values in alcohol

1      True
6      True
8      True
10     True
11     True
12     True
13     True
14     True
15     True
16     True
17     True
18     True
19     True
20     True
21     True
22     True
23     True
24     True
25     True
26     True
27     True
28     True
29     True
30     True
31     True
32     True
33     True
34     True
35     True
36     True
       ...
147    True
148    True
149    True
150    True
151    True
152    True
153    True
154    True
155    True
156    True
157    True
158    True
159    True
160    True
161    True
162    True
163    True
164    True
165    True
166    True
167    True
168    True
169    True
170    True
171    True
172    True
173    True
174    True
175    True
176    True
Name: alcohol, dtype: bool

1      10.00
6      14.06
8      13.86
10     14.12
11     13.75
12     14.75
13     14.38
14     13.63
15     14.30
16     13.83
17     14.19
18     13.64
19     14.06
20     12.93
21     13.71
22     12.85
23     13.50
24     13.05
25     13.39
26     13.30
27     13.87
28     14.02
29     13.73
30     13.58
31     13.68
32     13.76
33     13.51
34     13.48
35     13.28
36     13.05
       ...
147    13.32
148    13.08
149    13.50
150    12.79
151    13.11
152    13.23
153    12.58
154    13.17
155    13.84
156    12.45
157    14.34
158    13.48
159    12.36
160    13.69
161    12.85
162    12.96
163    13.78
164    13.73
165    13.45
166    12.82
167    13.58
168    13.40
169    12.20
170    12.77
171    14.16
172    13.71
173    13.40
174    13.27
175    13.17
176    14.13
Name: alcohol, dtype: float64
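The two outputs above look like a boolean mask followed by the values it selects. A sketch of that pattern, using a made-up three-row sample that still contains one NaN so the filtering is visible:

```python
import numpy as np
import pandas as pd

# Hypothetical sample; row 0 is missing.
wine = pd.DataFrame({"alcohol": [np.nan, 10.0, 14.06]})

# True where the value is present, False where it is NaN.
mask = wine["alcohol"].notnull()

# Boolean indexing keeps only the rows where the mask is True.
non_null = wine.loc[mask, "alcohol"]

print(mask)
print(non_null)
```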

Step 15. Reset the index, so it starts with 0 again
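After rows are dropped, the index has gaps. `reset_index(drop=True)` renumbers from 0 and discards the old labels instead of keeping them as a new column. The gapped index below is a hypothetical example.

```python
import pandas as pd

# Hypothetical sample whose index has gaps left by deleted rows.
wine = pd.DataFrame({"alcohol": [10.0, 14.06, 13.86]}, index=[1, 6, 8])

# drop=True prevents the old index from being added as an "index" column.
wine = wine.reset_index(drop=True)

print(wine)
```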

BONUS: Create your own question and answer it.