[Pandas] Pandas05 - Housing Market Solution

Housing Market

Introduction:

This time we will create our own dataset with fictional numbers to describe a housing market. Because the data are random, don't try to read any meaning into the numbers.

Step 1. Import the necessary libraries

In [4]:
import pandas as pd
import numpy as np
import random

Step 2. Create 3 different Series, each of length 100, as follows:

  1. The first: a random integer from 1 to 4
  2. The second: a random integer from 1 to 3
  3. The third: a random integer from 10,000 to 30,000
In [13]:
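# np.random.randint excludes the upper bound, so each high value is one past the target maximum
A = pd.Series(np.random.randint(1,5,100))          # integers 1 to 4
B = pd.Series(np.random.randint(1,4,100))          # integers 1 to 3
C = pd.Series(np.random.randint(10000,30001,100))  # integers 10,000 to 30,000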
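
Because the numbers are random, every run produces different values. If reproducible output is wanted, one option is NumPy's Generator interface with a fixed seed. The sketch below is an alternative to the cell above, not part of the original exercise; the seed value 42 and the names rng, bedrs, bathrs, and price are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# The seed 42 is an arbitrary choice; any fixed integer gives repeatable draws.
rng = np.random.default_rng(42)

# endpoint=True makes the upper bound inclusive, so the ranges read exactly
# as the exercise states them (1 to 4, 1 to 3, 10,000 to 30,000).
bedrs = pd.Series(rng.integers(1, 4, size=100, endpoint=True))
bathrs = pd.Series(rng.integers(1, 3, size=100, endpoint=True))
price = pd.Series(rng.integers(10_000, 30_000, size=100, endpoint=True))

print(len(bedrs), len(bathrs), len(price))
```

Re-running this cell with the same seed should give the same three Series each time.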

Step 3. Let's create a DataFrame by joining the Series by column

In [17]:
SC = pd.concat([A,B,C],axis=1)
SC.head()
Out[17]:
   0  1      2
0  3  1  11401
1  3  1  19543
2  2  3  15902
3  3  1  21490
4  2  2  24496

Step 4. Change the name of the columns to bedrs, bathrs, price_sqr_meter

In [19]:
SC.columns = ['bedrs','bathrs','price_sqr_meter']
SC.head()
Out[19]:
   bedrs  bathrs  price_sqr_meter
0      3       1            11401
1      3       1            19543
2      2       3            15902
3      3       1            21490
4      2       2            24496
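
Steps 3 and 4 can also be combined. A minimal sketch, assuming the same three Series as in Step 2 (regenerated here so the snippet runs on its own): the keys argument of pd.concat labels each Series as it is joined, so the columns never pass through the default names 0, 1, 2.

```python
import numpy as np
import pandas as pd

# Same kind of random Series as in Step 2 (upper bounds are exclusive).
A = pd.Series(np.random.randint(1, 5, 100))
B = pd.Series(np.random.randint(1, 4, 100))
C = pd.Series(np.random.randint(10000, 30001, 100))

# keys= supplies one label per Series; with axis=1 they become column names.
SC = pd.concat([A, B, C], axis=1, keys=['bedrs', 'bathrs', 'price_sqr_meter'])

print(SC.columns.tolist())  # ['bedrs', 'bathrs', 'price_sqr_meter']
print(SC.shape)             # (100, 3)
```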

Step 5. Create a one-column DataFrame with the values of the 3 Series and assign it to 'bigcolumn'

In [27]:
bigcolumn = pd.concat([A,B,C],axis=0)
bigcolumn
Out[27]:
0         3
1         3
2         2
3         3
4         2
5         3
6         3
7         1
8         4
9         4
10        3
11        2
12        1
13        1
14        3
15        1
16        1
17        4
18        3
19        3
20        4
21        2
22        2
23        3
24        4
25        4
26        3
27        1
28        3
29        1
      ...  
70    23279
71    15295
72    24821
73    24792
74    28154
75    21395
76    19312
77    19569
78    15967
79    16023
80    11870
81    12675
82    13671
83    12488
84    16074
85    26507
86    10980
87    24407
88    12235
89    26258
90    13673
91    22398
92    14715
93    19513
94    26824
95    27098
96    17801
97    22476
98    18173
99    11456
Length: 300, dtype: int64
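
Note that concatenating Series along axis=0 returns a Series, as the dtype line in the output above shows, not a DataFrame. If a real one-column DataFrame is needed, Series.to_frame() can wrap the result. This is a sketch; the names bigframe and 'value' are illustrative assumptions, not part of the original exercise.

```python
import numpy as np
import pandas as pd

A = pd.Series(np.random.randint(1, 5, 100))
B = pd.Series(np.random.randint(1, 4, 100))
C = pd.Series(np.random.randint(10000, 30001, 100))

# Stacking Series end to end yields another Series.
bigcolumn = pd.concat([A, B, C], axis=0)
print(type(bigcolumn).__name__)  # Series

# to_frame() converts it to a DataFrame with a single column.
bigframe = bigcolumn.to_frame(name='value')
print(type(bigframe).__name__)   # DataFrame
print(bigframe.shape)            # (300, 1)
```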

Step 6. Oops, it seems the index only goes up to 99. Is that true?

In [22]:
len(bigcolumn)
Out[22]:
300
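
The length is 300, yet the largest index label is 99, because pd.concat keeps each Series' original labels 0 to 99 and so every label appears three times. A short sketch that makes this visible, assuming the same Series construction as in the earlier steps:

```python
import numpy as np
import pandas as pd

A = pd.Series(np.random.randint(1, 5, 100))
B = pd.Series(np.random.randint(1, 4, 100))
C = pd.Series(np.random.randint(10000, 30001, 100))

bigcolumn = pd.concat([A, B, C], axis=0)

print(len(bigcolumn))              # 300
print(bigcolumn.index.max())       # 99
print(bigcolumn.index.is_unique)   # False

# Selecting label 0 returns one value from each original Series.
print(len(bigcolumn.loc[0]))       # 3
```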

Step 7. Reindex the DataFrame so it goes from 0 to 299

In [28]:
# bigcolumn.index = range(0,300)
# bigcolumn

# Either approach works: the index assignment above or reset_index below

bigcolumn.reset_index(drop=True, inplace=True)
bigcolumn
Out[28]:
0          3
1          3
2          2
3          3
4          2
5          3
6          3
7          1
8          4
9          4
10         3
11         2
12         1
13         1
14         3
15         1
16         1
17         4
18         3
19         3
20         4
21         2
22         2
23         3
24         4
25         4
26         3
27         1
28         3
29         1
       ...  
270    23279
271    15295
272    24821
273    24792
274    28154
275    21395
276    19312
277    19569
278    15967
279    16023
280    11870
281    12675
282    13671
283    12488
284    16074
285    26507
286    10980
287    24407
288    12235
289    26258
290    13673
291    22398
292    14715
293    19513
294    26824
295    27098
296    17801
297    22476
298    18173
299    11456
Length: 300, dtype: int64
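
A third option is to avoid the duplicate labels in the first place. A minimal sketch, assuming the same Series as before: passing ignore_index=True to pd.concat discards the original labels and numbers the result from 0 to 299 directly.

```python
import numpy as np
import pandas as pd

A = pd.Series(np.random.randint(1, 5, 100))
B = pd.Series(np.random.randint(1, 4, 100))
C = pd.Series(np.random.randint(10000, 30001, 100))

# ignore_index=True builds a fresh 0..n-1 index during concatenation.
bigcolumn = pd.concat([A, B, C], axis=0, ignore_index=True)

print(bigcolumn.index[0], bigcolumn.index[-1])  # 0 299
print(bigcolumn.index.is_unique)                # True
```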

