
[Pandas] Pandas03 - Alcohol_Consumption Solution


Ex - GroupBy

Introduction:

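GroupBy can be summarized as Split-Apply-Combine: split the rows into groups, apply a function to each group, and combine the results into one object.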
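To make the three steps concrete, here is a minimal sketch that performs them by hand and then compares the result with the `groupby` one-liner. The `toy` frame and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy data (hypothetical values, not taken from drinks.csv)
toy = pd.DataFrame({
    "continent": ["EU", "EU", "AF", "AF", "AF"],
    "beer_servings": [100, 200, 10, 20, 30],
})

# Split: one sub-DataFrame per continent
pieces = {name: group for name, group in toy.groupby("continent")}

# Apply: compute the mean of each piece
means = {name: group["beer_servings"].mean() for name, group in pieces.items()}

# Combine: gather the per-group results into one Series
manual = pd.Series(means, name="beer_servings").rename_axis("continent")

# The one-liner performs the same three steps internally
auto = toy.groupby("continent")["beer_servings"].mean()

print(manual)
print(auto)
```

Both printed Series should hold the same per-continent means, which is the point of the Split-Apply-Combine description.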

Special thanks to: https://github.com/justmarkham for sharing the dataset and materials.

Check out this Diagram

Step 1. Import the necessary libraries

In [ ]:
import pandas as pd 

Step 2. Import the dataset from this address.

Step 3. Assign it to a variable called drinks.

In [5]:
url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/drinks.csv'
drinks = pd.read_csv(url, sep=',')  # sep must be passed by keyword in pandas 2.0+
drinks.head()
Out[5]:
       country  beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol continent
0  Afghanistan              0                0              0                           0.0        AS
1      Albania             89              132             54                           4.9        EU
2      Algeria             25                0             14                           0.7        AF
3      Andorra            245              138            312                          12.4        EU
4       Angola            217               57             45                           5.9        AF
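One thing to watch for when loading: the grouped outputs in this post list only five continent codes (AF, AS, EU, OC, SA). A likely cause, assuming the file encodes North America as `NA`, is that `pd.read_csv` treats the string `NA` as a missing value by default, and `groupby` then drops those rows. The sketch below reproduces the effect on a tiny inline CSV; the second row is hypothetical and only stands in for a North American country.

```python
import io
import pandas as pd

# Tiny inline CSV standing in for drinks.csv.
# "Country A" is a hypothetical row used only to show the "NA" code.
sample = io.StringIO(
    "country,beer_servings,continent\n"
    "Albania,89,EU\n"
    "Country A,100,NA\n"
)

# Default parsing: the string "NA" becomes NaN
default = pd.read_csv(sample)

# keep_default_na=False keeps "NA" as an ordinary string
sample.seek(0)
kept = pd.read_csv(sample, keep_default_na=False)

print(default["continent"].tolist())
print(kept["continent"].tolist())
```

If the same thing happens with the real file, passing `keep_default_na=False` to `pd.read_csv(url, ...)` should bring the sixth group back, at the cost of no longer treating other default markers as missing.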

Step 4. Which continent drinks more beer on average?

In [18]:
drinks.groupby('continent')['beer_servings'].mean()
Out[18]:
continent
AF     61.471698
AS     37.045455
EU    193.777778
OC     89.687500
SA    175.083333
Name: beer_servings, dtype: float64
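The table answers the question by eye (EU has the highest mean), but the answer can also be pulled out in code. The sketch below rebuilds the Series from the values printed above so it runs on its own; with the real data you would call `idxmax()` directly on the `groupby` result.

```python
import pandas as pd

# Means copied from the Step 4 output above
beer_mean = pd.Series(
    {"AF": 61.471698, "AS": 37.045455, "EU": 193.777778,
     "OC": 89.687500, "SA": 175.083333},
    name="beer_servings",
)

# idxmax returns the index label of the largest value
top_continent = beer_mean.idxmax()

# sort_values gives the full ranking, highest first
ranked = beer_mean.sort_values(ascending=False)

print(top_continent)
print(ranked)
```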

Step 5. For each continent print the statistics for wine consumption.

In [19]:
drinks.groupby('continent')['wine_servings'].describe()
Out[19]:
           count        mean        std  min   25%    50%     75%    max
continent
AF          53.0   16.264151  38.846419  0.0   1.0    2.0   13.00  233.0
AS          44.0    9.068182  21.667034  0.0   0.0    1.0    8.00  123.0
EU          45.0  142.222222  97.421738  0.0  59.0  128.0  195.00  370.0
OC          16.0   35.625000  64.555790  0.0   1.0    8.5   23.25  212.0
SA          12.0   62.416667  88.620189  1.0   3.0   12.0   98.50  221.0

Step 6. Print the mean alcohol consumption per continent for every column

In [27]:
drinks.groupby('continent').mean(numeric_only=True)  # skip the text column 'country'; needed in pandas 2.0+
Out[27]:
           beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol
continent
AF             61.471698        16.339623      16.264151                      3.007547
AS             37.045455        60.840909       9.068182                      2.170455
EU            193.777778       132.555556     142.222222                      8.617778
OC             89.687500        58.437500      35.625000                      3.381250
SA            175.083333       114.750000      62.416667                      6.308333
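The result only contains the numeric columns, because a mean of the text column `country` is not defined. As far as I know, pandas 2.0 and later raise an error instead of silently dropping such columns, so it helps to be explicit. The sketch below shows two equivalent ways to do that on a made-up `toy` frame (the values are hypothetical).

```python
import pandas as pd

# Toy frame with one text column, mirroring the shape of drinks (values are made up)
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "continent": ["EU", "EU", "AF", "AF"],
    "beer_servings": [100, 200, 10, 30],
    "wine_servings": [50, 150, 0, 4],
})

# Option 1: let pandas skip non-numeric columns
by_flag = toy.groupby("continent").mean(numeric_only=True)

# Option 2: name the columns to aggregate explicitly
cols = ["beer_servings", "wine_servings"]
by_cols = toy.groupby("continent")[cols].mean()

print(by_flag)
print(by_cols)
```

Naming the columns is a bit more typing, but it keeps the output stable if new columns are added to the frame later.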

Step 7. Print the median alcohol consumption per continent for every column

In [28]:
drinks.groupby('continent').median(numeric_only=True)  # skip the text column 'country'; needed in pandas 2.0+
Out[28]:
           beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol
continent
AF                  32.0              3.0            2.0                          2.30
AS                  17.5             16.0            1.0                          1.20
EU                 219.0            122.0          128.0                         10.00
OC                  52.5             37.0            8.5                          1.75
SA                 162.5            108.5           12.0                          6.85

Step 8. Print the mean, min and max values for spirit consumption.

This time output a DataFrame

In [33]:
drinks.groupby('continent').spirit_servings.agg(['mean','min','max'])
# agg combines several aggregations into one DataFrame
Out[33]:
                 mean  min  max
continent
AF          16.339623    0  152
AS          60.840909    0  326
EU         132.555556    0  373
OC          58.437500    0  254
SA         114.750000   25  302
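Passing a list to `agg` names the result columns after the functions. If you want your own column names, pandas also supports named aggregation, where each keyword maps to a `(column, function)` pair. A minimal sketch on a made-up `toy` frame follows; the output names `spirit_mean`, `spirit_min` and `spirit_max` are my own choice, not part of the exercise.

```python
import pandas as pd

# Toy data (hypothetical values, not taken from drinks.csv)
toy = pd.DataFrame({
    "continent": ["EU", "EU", "AF", "AF"],
    "spirit_servings": [100, 300, 0, 50],
})

# Named aggregation: keyword = (source column, aggregation function)
summary = toy.groupby("continent").agg(
    spirit_mean=("spirit_servings", "mean"),
    spirit_min=("spirit_servings", "min"),
    spirit_max=("spirit_servings", "max"),
)

print(summary)
```

This form is handy when several source columns are aggregated at once, since each output column says exactly where it came from.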

