GroupBy

Introduction:

GroupBy can be summarized as Split-Apply-Combine.
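As a rough illustration of what Split-Apply-Combine means, the sketch below performs the three stages by hand and then compares the result with a single `groupby` call. The five-row frame is hypothetical toy data, not the exercise dataset.

```python
import pandas as pd

# Hypothetical toy data, NOT the exercise dataset
df = pd.DataFrame({
    "continent": ["EU", "EU", "AS", "AS", "AF"],
    "beer_servings": [200, 100, 50, 30, 60],
})

# Split: one sub-frame per continent
pieces = {key: group for key, group in df.groupby("continent")}

# Apply: reduce each piece to a single number
means = {key: group["beer_servings"].mean() for key, group in pieces.items()}

# Combine: assemble the per-group results into one Series
manual = pd.Series(means, name="beer_servings").rename_axis("continent")

# The same three stages in a single chained call
auto = df.groupby("continent")["beer_servings"].mean()

print(manual)
print(auto)
```

Both Series hold the same per-continent means; `groupby` simply hides the intermediate pieces.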

Special thanks to: https://github.com/justmarkham for sharing the dataset and materials.

Check out this Diagram

Step 1. Import the necessary libraries

In [1]:
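The steps in this exercise should only need pandas. A minimal sketch, assuming the conventional `pd` alias:

```python
import pandas as pd  # conventional alias; an assumption, the original cell is not shown
```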

Step 2. Import the dataset from this address.

Step 3. Assign it to a variable called drinks. (Watch the values of the 'continent' column: NA stands for North America, but pandas reads it as a missing value by default.)

In [4]:
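The original cell and the dataset address are not reproduced here, so the sketch below reads a hypothetical three-row CSV excerpt from memory instead. It shows how `read_csv` turns the string "NA" into a missing value by default, and one possible way to keep it as text using the `keep_default_na` and `na_values` parameters. The rows and numbers are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical excerpt standing in for the real file (values are invented)
csv_text = """country,beer_servings,continent
Canada,240,NA
France,127,EU
Japan,77,AS
"""

# Default parsing: the string "NA" is treated as a missing value (NaN)
default = pd.read_csv(io.StringIO(csv_text))

# Keep "NA" as text; only truly empty fields are treated as missing
kept = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=[""])

print(default["continent"].isna().sum())
print(kept["continent"].tolist())
```

With the default settings, North American rows drop out of any `groupby("continent")` result, which is why no NA group appears in the outputs of the later steps.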

Out[4]:

Step 4. Which continent drinks the most beer on average?

In [6]:
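One way to answer this, sketched on hypothetical toy data (the real `drinks` frame would be used the same way): group by `continent`, take the mean of `beer_servings`, and use `idxmax()` to pick the label with the highest mean.

```python
import pandas as pd

# Hypothetical toy data standing in for the real drinks DataFrame
drinks = pd.DataFrame({
    "continent": ["EU", "EU", "AS", "AS", "AF"],
    "beer_servings": [200, 100, 50, 30, 60],
})

# Mean beer servings per continent
beer_mean = drinks.groupby("continent")["beer_servings"].mean()

# Label of the continent with the highest mean
top = beer_mean.idxmax()

print(beer_mean)
print(top)
```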

Out[6]:

continent
AF     61.471698
AS     37.045455
EU    193.777778
OC     89.687500
SA    175.083333
Name: beer_servings, dtype: float64

Step 5. For each continent, print the summary statistics for wine consumption.

In [9]:
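A possible approach is `describe()` on the grouped column. The sketch uses hypothetical toy data, and the column name `wine_servings` is an assumption by analogy with `beer_servings`. In current pandas versions this returns a DataFrame with one row per continent; the stacked layout in Out[9] likely comes from an older pandas version.

```python
import pandas as pd

# Hypothetical toy data; "wine_servings" is an assumed column name
drinks = pd.DataFrame({
    "continent": ["EU", "EU", "AS", "AS"],
    "wine_servings": [100, 300, 0, 10],
})

# count, mean, std, min, quartiles and max for each continent
wine_stats = drinks.groupby("continent")["wine_servings"].describe()

print(wine_stats)
```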

Out[9]:

continent       
AF         count     53.000000
           mean      16.264151
           std       38.846419
           min        0.000000
           25%        1.000000
           50%        2.000000
           75%       13.000000
           max      233.000000
AS         count     44.000000
           mean       9.068182
           std       21.667034
           min        0.000000
           25%        0.000000
           50%        1.000000
           75%        8.000000
           max      123.000000
EU         count     45.000000
           mean     142.222222
           std       97.421738
           min        0.000000
           25%       59.000000
           50%      128.000000
           75%      195.000000
           max      370.000000
OC         count     16.000000
           mean      35.625000
           std       64.555790
           min        0.000000
           25%        1.000000
           50%        8.500000
           75%       23.250000
           max      212.000000
SA         count     12.000000
           mean      62.416667
           std       88.620189
           min        1.000000
           25%        3.000000
           50%       12.000000
           75%       98.500000
           max      221.000000
dtype: float64

Step 6. Print the mean alcohol consumption per continent for every column.

In [10]:
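A minimal sketch on hypothetical toy data. The frame includes a text column (`country`) to show why `numeric_only=True` matters: in pandas 2.0 and later, calling `mean()` on a group that contains non-numeric columns raises an error unless those columns are excluded.

```python
import pandas as pd

# Hypothetical toy data; column names other than beer_servings are assumptions
drinks = pd.DataFrame({
    "country": ["France", "Germany", "Japan", "India"],
    "continent": ["EU", "EU", "AS", "AS"],
    "beer_servings": [100, 200, 80, 20],
    "wine_servings": [300, 100, 10, 0],
})

# numeric_only=True skips the text column instead of failing on it
means = drinks.groupby("continent").mean(numeric_only=True)

print(means)
```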

Out[10]:

Step 7. Print the median alcohol consumption per continent for every column.

In [14]:
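The median version looks the same, with `median()` in place of `mean()`. The hypothetical toy data below contains one unusually large value to show that the median is less sensitive to outliers than the mean.

```python
import pandas as pd

# Hypothetical toy data with one outlier (380) in the EU group
drinks = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "continent": ["EU", "EU", "EU", "AS", "AS"],
    "beer_servings": [100, 120, 380, 10, 30],
})

grouped = drinks.groupby("continent")

# Median and mean of every numeric column, per continent
medians = grouped.median(numeric_only=True)
means = grouped.mean(numeric_only=True)

print(medians)
print(means)
```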

Out[14]:

Step 8. Print the mean, min and max values for spirit consumption by continent.

This time, output a DataFrame.

In [15]:
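Passing a list of function names to `agg()` is one way to get several statistics at once; the result is a DataFrame with one column per statistic. The data is a hypothetical toy frame, and `spirit_servings` is an assumed column name.

```python
import pandas as pd

# Hypothetical toy data; "spirit_servings" is an assumed column name
drinks = pd.DataFrame({
    "continent": ["EU", "EU", "AS", "AS"],
    "spirit_servings": [100, 200, 0, 50],
})

# One row per continent, one column per requested statistic
spirit = drinks.groupby("continent")["spirit_servings"].agg(["mean", "min", "max"])

print(spirit)
```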

Out[15]:
