Pandas: Appending a row to a DataFrame and specifying its index label

cyworld 2021. 10. 4. 20:51

When appending a row to a DataFrame, is there a way to specify the index label you want for the new row?

In [1301]: df = DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])

In [1302]: df
Out[1302]: 
          A         B         C         D
0 -1.137707 -0.891060 -0.693921  1.613616
1  0.464000  0.227371 -0.496922  0.306389
2 -2.290613 -1.134623 -1.561819 -0.260838
3  0.281957  1.523962 -0.902937  0.068159
4 -0.057873 -0.368204 -1.144073  0.861209
5  0.800193  0.782098 -1.069094 -1.099248
6  0.255269  0.009750  0.661084  0.379319
7 -0.008434  1.952541 -1.056652  0.533946

In [1303]: s = df.xs(3)

In [1304]: df.append(s, ignore_index=True)
Out[1304]: 
          A         B         C         D
0 -1.137707 -0.891060 -0.693921  1.613616
1  0.464000  0.227371 -0.496922  0.306389
2 -2.290613 -1.134623 -1.561819 -0.260838
3  0.281957  1.523962 -0.902937  0.068159
4 -0.057873 -0.368204 -1.144073  0.861209
5  0.800193  0.782098 -1.069094 -1.099248
6  0.255269  0.009750  0.661084  0.379319
7 -0.008434  1.952541 -1.056652  0.533946
8  0.281957  1.523962 -0.902937  0.068159

Here the new row gets its index label automatically. Is there a way to control the new label?

The name of the Series becomes the index label of the row in the DataFrame:

In [99]: df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])

In [100]: s = df.xs(3)

In [101]: s.name = 10

In [102]: df.append(s)
Out[102]: 
           A         B         C         D
0  -2.083321 -0.153749  0.174436  1.081056
1  -1.026692  1.495850 -0.025245 -0.171046
2   0.072272  1.218376  1.433281  0.747815
3  -0.940552  0.853073 -0.134842 -0.277135
4   0.478302 -0.599752 -0.080577  0.468618
5   2.609004 -1.679299 -1.593016  1.172298
6  -0.201605  0.406925  1.983177  0.012030
7   1.158530 -2.240124  0.851323 -0.240378
10 -0.940552  0.853073 -0.134842 -0.277135
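
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the session above only runs on older versions. A minimal sketch of the same idea with pd.concat, assuming a recent pandas: the Series is turned into a one-row frame with to_frame().T, and its name still ends up as the row's index label. The variable name `result` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Copy row 3 as a Series and rename it; the name becomes the row's index label.
s = df.loc[3].copy()
s.name = 10

# to_frame() makes a single column named 10; .T flips it into a single row
# whose index label is 10 and whose columns are A..D.
result = pd.concat([df, s.to_frame().T])

print(result.index.tolist())
```

The original df is left unchanged; pd.concat returns a new DataFrame.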

df.loc will do the job:

>>> df = pd.DataFrame(np.random.randn(3, 2), columns=['A','B'])
>>> df
          A         B
0 -0.269036  0.534991
1  0.069915 -1.173594
2 -1.177792  0.018381
>>> df.loc[13] = df.loc[1]
>>> df
           A         B
0  -0.269036  0.534991
1   0.069915 -1.173594
2  -1.177792  0.018381
13  0.069915 -1.173594
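
Because assigning through df.loc replaces the row when the label already exists, it can help to check the index first. The following is a sketch only; `add_row` is a hypothetical helper name, not part of pandas, and raising KeyError is an arbitrary choice for the example.

```python
import numpy as np
import pandas as pd

def add_row(df, label, values):
    """Add a row under `label` in place, refusing to overwrite an existing row.

    Hypothetical helper for illustration; not a pandas API.
    """
    if label in df.index:
        raise KeyError(f"index label {label!r} already exists")
    df.loc[label] = values  # label is new, so this enlarges the frame

df = pd.DataFrame(np.random.randn(3, 2), columns=['A', 'B'])
add_row(df, 13, df.loc[1])
print(df.index.tolist())

# A second insert under the same label is rejected instead of overwriting.
try:
    add_row(df, 13, [0.0, 0.0])
except KeyError as exc:
    print("refused:", exc)
```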

Referring to the same data sample as posted in the question:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
print('The original data frame is: \n{}'.format(df))

Running this code gives:

The original data frame is:

          A         B         C         D
0  0.494824 -0.328480  0.818117  0.100290
1  0.239037  0.954912 -0.186825 -0.651935
2 -1.818285 -0.158856  0.359811 -0.345560
3 -0.070814 -0.394711  0.081697 -1.178845
4 -1.638063  1.498027 -0.609325  0.882594
5 -0.510217  0.500475  1.039466  0.187076
6  1.116529  0.912380  0.869323  0.119459
7 -1.046507  0.507299 -0.373432 -1.024795

Now suppose you want to append a new row to this data frame, one that need not be a copy of any existing row. @Alon suggested using df.loc to add a row under a chosen index label. The problem with that approach is that if a row already exists at that label, it is overwritten with the new values. This matters for datasets whose row index is not unique, such as store IDs in transaction data. A more general solution is to create the row as a pandas Series, set its name to the index label you want, and append it to the data frame. Remember to assign the result back to the original variable: df.append returns a new DataFrame and does not modify the original in place. The code follows:

row = pd.Series({'A':10,'B':20,'C':30,'D':40},name=3)
df = df.append(row)
print('The new data frame is: \n{}'.format(df))

Following would be the new output:

The new data frame is:

           A          B          C          D
0   0.494824  -0.328480   0.818117   0.100290
1   0.239037   0.954912  -0.186825  -0.651935
2  -1.818285  -0.158856   0.359811  -0.345560
3  -0.070814  -0.394711   0.081697  -1.178845
4  -1.638063   1.498027  -0.609325   0.882594
5  -0.510217   0.500475   1.039466   0.187076
6   1.116529   0.912380   0.869323   0.119459
7  -1.046507   0.507299  -0.373432  -1.024795
3  10.000000  20.000000  30.000000  40.000000
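
On pandas 2.0 and later, where DataFrame.append no longer exists, the same result can be sketched with pd.concat. The names `base`, `appended`, and `rejected` are illustrative assumptions. The example also shows that a duplicated label makes .loc return two rows, and that verify_integrity=True can be used when duplicates should be rejected instead.

```python
import numpy as np
import pandas as pd

base = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
row = pd.Series({'A': 10, 'B': 20, 'C': 30, 'D': 40}, name=3)

# concat returns a new DataFrame; `base` itself is not modified.
appended = pd.concat([base, row.to_frame().T])

print(appended.index.tolist())  # label 3 now appears twice
print(appended.loc[3])          # selecting a duplicated label returns both rows

# If duplicate labels are not wanted, verify_integrity=True makes concat raise.
rejected = False
try:
    pd.concat([base, row.to_frame().T], verify_integrity=True)
except ValueError as exc:
    rejected = True
    print("duplicate label rejected:", exc)
```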

ReferenceURL : https://stackoverflow.com/questions/16824607/pandas-appending-a-row-to-a-dataframe-and-specify-its-index-label
