Regiment

Introduction:

Special thanks to: http://chrisalbon.com/ for sharing the dataset and materials.

Step 1. Import the necessary libraries

In [16]:

import pandas as pd

Step 2. Create the DataFrame with the following values:

In [17]:

raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'], 
        'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'], 
        'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'], 
        'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
        'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}

Step 3. Assign it to a variable called regiment.

Don't forget to name each column

In [18]:

# Build the DataFrame; passing the dict keys as columns states the column order explicitly
regiment = pd.DataFrame(raw_data, columns=list(raw_data.keys()))
regiment.head()

Out[18]:

	regiment	company	name	preTestScore	postTestScore
0	Nighthawks	1st	Miller	4	25
1	Nighthawks	1st	Jacobson	24	94
2	Nighthawks	2nd	Ali	31	57
3	Nighthawks	2nd	Milner	2	62
4	Dragoons	1st	Cooze	3	70

Step 4. What is the mean preTestScore from the regiment Nighthawks?

In [19]:

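# numeric_only=True skips the text columns (company, name); without it, pandas 2.0+ raises a TypeError
regiment[regiment['regiment'] == 'Nighthawks'].groupby('regiment').mean(numeric_only=True)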

Out[19]:

	preTestScore	postTestScore
regiment
Nighthawks	15.25	59.5
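The question asks for a single number, so a boolean mask is a possible shortcut that skips the groupby. This is a minimal sketch, not the original author's solution: the names is_nighthawks and nighthawks_pre_mean are made up for illustration, and the data is the Step 2 data written more compactly.

```python
import pandas as pd

# Same values as Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# The mask is True only for Nighthawks rows (illustrative name)
is_nighthawks = regiment['regiment'] == 'Nighthawks'

# .loc picks those rows and the single column, then averages it
nighthawks_pre_mean = regiment.loc[is_nighthawks, 'preTestScore'].mean()
print(nighthawks_pre_mean)  # 15.25
```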

Step 5. Present general statistics by company

In [20]:

regiment.groupby('company').describe()

Out[20]:

	preTestScore								postTestScore
	count	mean	std	min	25%	50%	75%	max	count	mean	std	min	25%	50%	75%	max
company
1st	6.0	6.666667	8.524475	2.0	3.00	3.5	4.00	24.0	6.0	57.666667	27.485754	25.0	34.25	66.0	70.0	94.0
2nd	6.0	15.500000	14.652645	2.0	2.25	13.5	29.25	31.0	6.0	67.000000	14.057027	57.0	58.25	62.0	68.0	94.0
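When the full describe() table is wider than needed, one option is to select a single column and pass a list of statistics to agg. The sketch below assumes only count, mean, min and max are of interest; pre_summary is an illustrative name.

```python
import pandas as pd

# Same values as Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# One row per company, one column per requested statistic
pre_summary = regiment.groupby('company')['preTestScore'].agg(
    ['count', 'mean', 'min', 'max']
)
print(pre_summary)
```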

Step 6. What is the mean of each company's preTestScore?

In [21]:

regiment.groupby('company')['preTestScore'].mean()

Out[21]:

company
1st     6.666667
2nd    15.500000
Name: preTestScore, dtype: float64

Step 7. Present the mean preTestScores grouped by regiment and company

In [24]:

regiment.groupby(['regiment','company']).preTestScore.mean()

Out[24]:

regiment    company
Dragoons    1st         3.5
            2nd        27.5
Nighthawks  1st        14.0
            2nd        16.5
Scouts      1st         2.5
            2nd         2.5
Name: preTestScore, dtype: float64
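The result above is a Series with a two-level index (regiment, company). As a small sketch of how such a result can be read back, a tuple passed to .loc picks one cell and a single outer label picks a whole regiment. The variable names here are illustrative.

```python
import pandas as pd

# Same values as Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

means = regiment.groupby(['regiment', 'company'])['preTestScore'].mean()

# A (regiment, company) tuple selects a single value
dragoons_2nd = means.loc[('Dragoons', '2nd')]
print(dragoons_2nd)  # 27.5

# An outer-level label alone returns a Series indexed by company
nighthawks = means.loc['Nighthawks']
print(nighthawks)
```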

Step 8. Present the mean preTestScores grouped by regiment and company without hierarchical indexing

In [26]:

regiment.groupby(['regiment','company']).preTestScore.mean().unstack()

Out[26]:

company	1st	2nd
regiment
Dragoons	3.5	27.5
Nighthawks	14.0	16.5
Scouts	2.5	2.5
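unstack() produces a wide table with one column per company. Another possible reading of "without hierarchical indexing" is a long table in which regiment and company are ordinary columns; reset_index() gives that shape. This is an alternative sketch rather than the intended answer, and flat is an illustrative name.

```python
import pandas as pd

# Same values as Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# reset_index() turns both index levels back into regular columns
flat = (
    regiment.groupby(['regiment', 'company'])['preTestScore']
    .mean()
    .reset_index()
)
print(flat)
```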

Step 9. Group the entire dataframe by regiment and company

In [28]:

# numeric_only=True leaves out the text column name, which cannot be averaged
regiment.groupby(['regiment', 'company']).mean(numeric_only=True)

Out[28]:

		preTestScore	postTestScore
regiment	company
Dragoons	1st	3.5	47.5
Dragoons	2nd	27.5	75.5
Nighthawks	1st	14.0	59.5
Nighthawks	2nd	16.5	59.5
Scouts	1st	2.5	66.0
Scouts	2nd	2.5	66.0

Step 10. What is the number of observations in each regiment and company?

In [32]:

# Use unstack for a more readable layout
regiment.groupby(['regiment','company']).size().unstack()

Out[32]:

company	1st	2nd
regiment
Dragoons	2	2
Nighthawks	2	2
Scouts	2	2
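The same count table can also be built with pd.crosstab, which tallies how often each pair of values occurs. This is offered as an alternative sketch alongside the groupby version; counts is an illustrative name.

```python
import pandas as pd

# Same values as Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# Rows are regiments, columns are companies, cells are row counts
counts = pd.crosstab(regiment['regiment'], regiment['company'])
print(counts)
```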

Step 11. Iterate over the groups and print each regiment's name and all of its rows

In [41]:

for name, reg in regiment.groupby('regiment'):
    print(name)
    print(reg)

Dragoons
   regiment company    name  preTestScore  postTestScore
4  Dragoons     1st   Cooze             3             70
5  Dragoons     1st   Jacon             4             25
6  Dragoons     2nd  Ryaner            24             94
7  Dragoons     2nd    Sone            31             57
Nighthawks
     regiment company      name  preTestScore  postTestScore
0  Nighthawks     1st    Miller             4             25
1  Nighthawks     1st  Jacobson            24             94
2  Nighthawks     2nd       Ali            31             57
3  Nighthawks     2nd    Milner             2             62
Scouts
   regiment company   name  preTestScore  postTestScore
8    Scouts     1st  Sloan             2             62
9    Scouts     1st  Piger             3             70
10   Scouts     2nd  Riani             2             62
11   Scouts     2nd    Ali             3             70
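The loop visits every group in turn. When only one regiment is needed, get_group can fetch it directly. A minimal sketch, with scouts as an illustrative name:

```python
import pandas as pd

# Same values as Step 2, written more compactly
raw_data = {
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'company': ['1st', '1st', '2nd', '2nd'] * 3,
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon',
             'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
}
regiment = pd.DataFrame(raw_data)

# get_group returns the rows of one group, keeping the original row labels
scouts = regiment.groupby('regiment').get_group('Scouts')
print(scouts)
```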
