"Unfair" pandas Categorical.from_codes

Categorical data needs to be labelled. Let's look at the iris example:
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
iris = load_iris()
print "targets: ", np.unique(iris.target)
print "targets: ", iris.target.shape
print "target_names: ", np.unique(iris.target_names)
print "target_names: ", iris.target_names.shape
It prints:
targets: [0 1 2]
targets: (150L,)
target_names: ['setosa' 'versicolor' 'virginica']
target_names: (3L,)
To produce the desired labels I use pandas.Categorical.from_codes:
print pd.Categorical.from_codes(iris.target, iris.target_names)
[setosa, setosa, setosa, setosa, setosa, ..., virginica, virginica, virginica, virginica, virginica]
Length: 150
Categories (3, object): [setosa, versicolor, virginica]
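As a minimal sketch of what this call does (written for Python 3, with made-up codes rather than the real iris targets): from_codes treats each code as a zero-based position in the categories, and -1 marks a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration: codes are zero-based positions in `categories`.
categories = np.array(["setosa", "versicolor", "virginica"])
codes = np.array([0, 2, 1, -1, 0])  # -1 stands for a missing value

labels = pd.Categorical.from_codes(codes, categories)

# Each non-negative code selects the category at that position.
print(list(labels))
print(list(labels.codes))
```

The iris targets work directly because they are already 0, 1 and 2, which are valid positions in a list of three names.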
Let's try a different example:
# I define new targets
target = np.array([123,123,54,123,123,54,2,54,2])
target = np.array([1,1,3,1,1,3,2,3,2])
target_names = np.array(['paglia','gioele','papa'])
#---
print "targets: ", np.unique(target)
print "targets: ", target.shape
print "target_names: ", np.unique(target_names)
print "target_names: ", target_names.shape
I try again to convert the categorical values to labels:
print pd.Categorical.from_codes(target, target_names)
and I get this error message:
C:\Users\ianni\Anaconda2\lib\site-packages\pandas\core\categorical.pyc in from_codes(cls, codes, categories, ordered)
    459
    460         if len(codes) and (codes.max() >= len(categories) or codes.min() < -1):
--> 461             raise ValueError("codes need to be between -1 and "
    462                              "len(categories)-1")
    463
ValueError: codes need to be between -1 and len(categories)-1
Do you know why?
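The check shown in the traceback suggests one reading of the error: with three categories the codes must lie between -1 and 2, but the targets here run from 1 to 3. Below is a minimal sketch (Python 3) of one possible remapping. It assumes that the distinct target values, taken in sorted order, should pair with the entries of target_names in order; that pairing is an assumption, not something the data states.

```python
import numpy as np
import pandas as pd

target = np.array([1, 1, 3, 1, 1, 3, 2, 3, 2])
target_names = np.array(["paglia", "gioele", "papa"])

# Same condition as in the traceback: the largest code (3) is not below len(categories) (3).
out_of_range = target.max() >= len(target_names) or target.min() < -1

# Remap the distinct values to 0..n-1; `codes` holds each value's position in sorted order.
# Assumption: the smallest target pairs with target_names[0], the next with target_names[1], ...
values, codes = np.unique(target, return_inverse=True)
labels = pd.Categorical.from_codes(codes, target_names)
print(labels)
```

The same remapping would apply to arbitrary integer labels such as 123, 54 and 2, since np.unique only looks at the sorted distinct values.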