[Pandas] Pandas03 - Regiment Solutions

Regiment

Introduction:

Special thanks to: http://chrisalbon.com/ for sharing the dataset and materials.

Step 1. Import the necessary libraries

In [16]:
import pandas as pd

Step 2. Create the DataFrame with the following values:

In [17]:
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'], 
        'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'], 
        'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'], 
        'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
        'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}

Step 3. Assign it to a variable called regiment.

Don't forget to name each column

In [18]:
regiment = pd.DataFrame(raw_data,columns=raw_data.keys())
regiment.head()
Out[18]:
     regiment company      name  preTestScore  postTestScore
0  Nighthawks     1st    Miller             4             25
1  Nighthawks     1st  Jacobson            24             94
2  Nighthawks     2nd       Ali            31             57
3  Nighthawks     2nd    Milner             2             62
4    Dragoons     1st     Cooze             3             70

Step 4. What is the mean preTestScore from the regiment Nighthawks?

In [19]:
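# numeric_only=True skips the string columns (company, name); pandas 2.0+ raises a TypeError without it
regiment[regiment['regiment'] == 'Nighthawks'].groupby('regiment').mean(numeric_only=True)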
Out[19]:
            preTestScore  postTestScore
regiment
Nighthawks         15.25           59.5
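
Step 4 asks only for the preTestScore mean, so a narrower query is possible: a boolean mask combined with a column selection returns a single number instead of a one-row DataFrame. The following is a minimal sketch of that alternative, assuming the same data as in Step 2; the variable name nighthawks_pre_mean is only illustrative.

```python
import pandas as pd

# Same data as in Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# Pick the Nighthawks rows and the preTestScore column in one .loc call,
# then average the resulting Series (hypothetical variable name)
nighthawks_pre_mean = regiment.loc[regiment['regiment'] == 'Nighthawks', 'preTestScore'].mean()
print(nighthawks_pre_mean)  # 15.25
```

Because only a numeric column is selected, this form does not need numeric_only.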

Step 5. Present general statistics by company

In [20]:
regiment.groupby('company').describe()
Out[20]:
         preTestScore                                               postTestScore
         count       mean        std  min   25%   50%    75%   max  count       mean        std   min    25%   50%   75%   max
company
1st        6.0   6.666667   8.524475  2.0  3.00   3.5   4.00  24.0    6.0  57.666667  27.485754  25.0  34.25  66.0  70.0  94.0
2nd        6.0  15.500000  14.652645  2.0  2.25  13.5  29.25  31.0    6.0  67.000000  14.057027  57.0  58.25  62.0  68.0  94.0
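
describe() returns all eight statistics for every numeric column, which makes for a wide table. When only a few of them are needed, agg with a list of function names is one way to narrow the result. This is a sketch under the same data assumptions as Step 2; the chosen statistics and the name summary are illustrative.

```python
import pandas as pd

# Same data as in Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# One row per company, one column per requested statistic
summary = regiment.groupby('company')['preTestScore'].agg(['count', 'mean', 'min', 'max'])
print(summary)
```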

Step 6. What is the mean preTestScore of each company?

In [21]:
regiment.groupby('company')['preTestScore'].mean()
Out[21]:
company
1st     6.666667
2nd    15.500000
Name: preTestScore, dtype: float64

Step 7. Present the mean preTestScores grouped by regiment and company

In [24]:
regiment.groupby(['regiment','company']).preTestScore.mean()
Out[24]:
regiment    company
Dragoons    1st         3.5
            2nd        27.5
Nighthawks  1st        14.0
            2nd        16.5
Scouts      1st         2.5
            2nd         2.5
Name: preTestScore, dtype: float64

Step 8. Present the mean preTestScores grouped by regiment and company without hierarchical indexing

In [26]:
regiment.groupby(['regiment','company']).preTestScore.mean().unstack()
Out[26]:
company      1st   2nd
regiment
Dragoons     3.5  27.5
Nighthawks  14.0  16.5
Scouts       2.5   2.5
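
unstack() removes the hierarchy by pivoting the company level into columns. Another reading of "without hierarchical indexing" is a flat, long-format table in which regiment and company are ordinary columns; reset_index() is one way to get that. A minimal sketch, assuming the same data as in Step 2 (the name flat is illustrative):

```python
import pandas as pd

# Same data as in Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# reset_index() moves both index levels back into regular columns,
# leaving a default integer index and one row per (regiment, company) pair
flat = regiment.groupby(['regiment', 'company'])['preTestScore'].mean().reset_index()
print(flat)
```

The long format tends to be easier to merge or plot, while the unstacked form is easier to read at a glance.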

Step 9. Group the entire dataframe by regiment and company

In [28]:
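# numeric_only=True skips the string column (name); pandas 2.0+ raises a TypeError without it
regiment.groupby(['regiment','company']).mean(numeric_only=True)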
Out[28]:
                    preTestScore  postTestScore
regiment   company
Dragoons   1st               3.5           47.5
           2nd              27.5           75.5
Nighthawks 1st              14.0           59.5
           2nd              16.5           59.5
Scouts     1st               2.5           66.0
           2nd               2.5           66.0

Step 10. What is the number of observations in each regiment and company?

In [32]:
# use unstack to make the result easier to read
regiment.groupby(['regiment','company']).size().unstack()
Out[32]:
company     1st  2nd
regiment
Dragoons      2    2
Nighthawks    2    2
Scouts        2    2
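
The same count table can also be produced with pd.crosstab, which tabulates how often each combination of two columns occurs. This is offered as an alternative sketch, not as the exercise's intended answer; it assumes the same data as in Step 2, and the name counts is illustrative.

```python
import pandas as pd

# Same data as in Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# Rows are regiments, columns are companies, cells are observation counts
counts = pd.crosstab(regiment['regiment'], regiment['company'])
print(counts)
```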

Step 11. Iterate over the groups and print each regiment's name and all of its rows

In [41]:
for name, reg in regiment.groupby('regiment'):
    print(name)
    print(reg)
Dragoons
   regiment company    name  preTestScore  postTestScore
4  Dragoons     1st   Cooze             3             70
5  Dragoons     1st   Jacon             4             25
6  Dragoons     2nd  Ryaner            24             94
7  Dragoons     2nd    Sone            31             57
Nighthawks
     regiment company      name  preTestScore  postTestScore
0  Nighthawks     1st    Miller             4             25
1  Nighthawks     1st  Jacobson            24             94
2  Nighthawks     2nd       Ali            31             57
3  Nighthawks     2nd    Milner             2             62
Scouts
   regiment company   name  preTestScore  postTestScore
8    Scouts     1st  Sloan             2             62
9    Scouts     1st  Piger             3             70
10   Scouts     2nd  Riani             2             62
11   Scouts     2nd    Ali             3             70
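
Iterating over a groupby yields (key, sub-DataFrame) pairs, as the loop above shows. When only one group is of interest, get_group retrieves it directly without a loop. A minimal sketch, assuming the same data as in Step 2; picking 'Scouts' and the name scouts are illustrative choices.

```python
import pandas as pd

# Same data as in Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# get_group returns the rows belonging to a single group key,
# keeping their original index labels (8 through 11 here)
scouts = regiment.groupby('regiment').get_group('Scouts')
print(scouts)
```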

