Accounting for errors when building a histogram

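I have a set of N observations scattered in 2D space as points (x[i], y[i]), i=0..N. Each point has errors in both coordinates (e_x[i], e_y[i], i=0..N) and an associated weight (w[i], i=0..N).

I would like to generate a 2D histogram of these N points that accounts not only for the weights but also for the errors. If the error values are large enough, a single point could end up spread across many bins (assume a standard Gaussian distribution for the errors, although other distributions could be considered).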

I see that numpy.histogram2d has a weights parameter, so that part is taken care of. The issue is how to account for the errors in each of the N observed points.

Is there a function that would let me do this? I am open to anything in numpy and scipy.

Source

2014-10-06 Gabriel

What do these error values represent? Are they standard deviations along the principal axes? –

@Dabrion. – Gabriel

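That set of parameters makes up a multivariate GMM with the given weights (\pi_i), means (\mu_i) and covariance matrices (\Sigma_i), where for each sample i the covariance is [[e_x[i]**2, 0], [0, e_y[i]**2]]. Unlike the standard normal case you assumed (all e_x and e_y equal to 1.0), these covariance matrices can have different eigenvalues on the diagonal. That corresponds to ellipses whose principal axes lie along the coordinate axes, as opposed to circles. Does that help you move forward? –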

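Building on user1415946's comment, you can assume that each point is a bivariate normal distribution with covariance matrix [[e_x[i]**2, 0], [0, e_y[i]**2]]. The resulting overall distribution, however, is not a normal distribution. After running the example you will see that the histogram does not resemble a single Gaussian but rather a group of them.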
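To make the "group of Gaussians" picture concrete, here is a minimal sketch that evaluates the weighted mixture density at a few locations. It assumes the errors are one-sigma standard deviations along x and y, and it uses the same artificial rows as the example further down. The helper name `mixture_density` is hypothetical and not part of numpy or scipy.

```python
import numpy as np

# Artificial rows (assumption, for illustration only): x, y, e_x, e_y, w
data = np.array([[1, 2, 1, 1, 1],
                 [3, 0, 1, 1, 2],
                 [0, 1, 2, 1, 5],
                 [7, 7, 1, 3, 1]], dtype=float)


def mixture_density(px, py, data):
    """Weighted sum of axis-aligned bivariate normal densities at (px, py)."""
    x, y, ex, ey, w = data.T
    # With a diagonal covariance the 2D density factorises into two 1D normals.
    gx = np.exp(-0.5 * ((px - x) / ex) ** 2) / (ex * np.sqrt(2.0 * np.pi))
    gy = np.exp(-0.5 * ((py - y) / ey) ** 2) / (ey * np.sqrt(2.0 * np.pi))
    # Weights are normalised so that the mixture integrates to 1.
    return np.sum(w * gx * gy) / np.sum(w)


# Density near the heavy point, in the gap, and near the isolated point
print(mixture_density(0.0, 1.0, data))
print(mixture_density(4.0, 4.0, data))
print(mixture_density(7.0, 7.0, data))
```

The density is lower in the gap around (4, 4) than at either (0, 1) or (7, 7), which is the multi-modal shape a single Gaussian cannot produce.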

To build a histogram from this set of distributions, I suggest generating random samples around each point with numpy.random.multivariate_normal. See the example code below, which uses artificial data.

import numpy as np 
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection (needed on matplotlib < 3.2)
import matplotlib.pyplot as plt 


# This is a function I like to use for plotting histograms 
def plotHistogram3d(hist, xedges, yedges): 
    fig = plt.figure() 
    ax = fig.add_subplot(111, projection='3d') 
    hist = hist.transpose() 
    # Transposing is done so that bar3d x and y match hist shape correctly 
    dx = np.mean(np.diff(xedges)) 
    dy = np.mean(np.diff(yedges)) 

    # Computing the number of elements 
    elements = (len(xedges) - 1) * (len(yedges) - 1) 
    # Generating mesh grids. 
    xpos, ypos = np.meshgrid(xedges[:-1]+dx/2.0, yedges[:-1]+dy/2.0) 

    # Vectorizing matrices 
    xpos = xpos.flatten() 
    ypos = ypos.flatten() 
    zpos = np.zeros(elements) 
    # The 0.5 factor leaves room between bars;
    # use 1.0 if you want all bars 'glued' to each other.
    dx = dx * np.ones_like(zpos) * 0.5
    dy = dy * np.ones_like(zpos) * 0.5
    dz = hist.flatten() 

    ax.bar3d(xpos, ypos, zpos, dx, dy, dz, color='b') 
    ax.set_xlabel('x') 
    ax.set_ylabel('y') 
    ax.set_zlabel('Count') 
    return 

""" 
INPUT DATA 
""" 
# Columns: x, y, e_x, e_y, w
data = np.array([[1, 2, 1, 1, 1], 
       [3, 0, 1, 1, 2], 
       [0, 1, 2, 1, 5], 
       [7, 7, 1, 3, 1]]) 

""" 
Generate samples 
""" 
# Sample size (100 samples will be generated for each data point) 
SAMPLE_SIZE = 100 
# I want to fill in a table with columns [x, y, w]. Each data point generates SAMPLE_SIZE 
# samples, so we have SAMPLE_SIZE * (number of data points) generated points 
points = np.zeros((SAMPLE_SIZE * data.shape[0], 3)) # Initializing this matrix 

for i, element in enumerate(data): # For each row in the data set 
    meanVector = element[:2] 
    covarianceMatrix = np.diag(element[2:4]**2) # Diagonal matrix with elements equal to error^2 
    # For columns 0 and 1, add generated x and y samples 
    points[SAMPLE_SIZE*i:SAMPLE_SIZE*(i+1), :2] = \ 
     np.random.multivariate_normal(meanVector, covarianceMatrix, SAMPLE_SIZE) 
    # For column 2, simply copy original weight 
    points[SAMPLE_SIZE*i:SAMPLE_SIZE*(i+1), 2] = element[4] # weights 

hist, xedges, yedges = np.histogram2d(points[:, 0], points[:, 1], weights=points[:, 2]) 
plotHistogram3d(hist, xedges, yedges) 
plt.show()

The result is the following plot:

[Figure: 3D bar plot of the weighted 2D histogram; axes x, y, Count]
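As an alternative to sampling (this is not part of the original answer), the expected bin contents can be computed without random numbers. Because each covariance matrix is diagonal, the probability that a point lands in a given bin is the product of two 1D normal CDF differences. The sketch below assumes Gaussian one-sigma errors. The names `normal_cdf` and `expected_histogram2d` and the bin edges are hypothetical choices for illustration. The CDF is built from `math.erf` to keep the dependencies to numpy; `scipy.stats.norm.cdf` should give the same values.

```python
import math
import numpy as np

# Artificial rows (assumption, for illustration only): x, y, e_x, e_y, w
data = np.array([[1, 2, 1, 1, 1],
                 [3, 0, 1, 1, 2],
                 [0, 1, 2, 1, 5],
                 [7, 7, 1, 3, 1]], dtype=float)

# Element-wise error function; math.erf only accepts scalars
_erf = np.vectorize(math.erf)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def expected_histogram2d(data, xedges, yedges):
    """Spread each point's weight over the bins according to its Gaussian errors."""
    x, y, ex, ey, w = data.T
    # Probability of point i falling in each x bin: shape (n_points, n_xbins)
    px = np.diff(normal_cdf((xedges[None, :] - x[:, None]) / ex[:, None]), axis=1)
    # Same for the y bins: shape (n_points, n_ybins)
    py = np.diff(normal_cdf((yedges[None, :] - y[:, None]) / ey[:, None]), axis=1)
    # hist[a, b] = sum_i w[i] * px[i, a] * py[i, b]
    return np.einsum('i,ia,ib->ab', w, px, py)


# Hypothetical bin edges, wide enough to hold nearly all of the probability mass
xedges = np.linspace(-8.0, 12.0, 11)
yedges = np.linspace(-10.0, 18.0, 11)
hist = expected_histogram2d(data, xedges, yedges)
print(hist.shape)
print(hist.sum())  # close to the total weight when the edges cover the tails
```

The returned array uses the same `hist[x_bin, y_bin]` layout as numpy.histogram2d, so it can be passed to a plotting helper such as the one above. Its total is close to the sum of the weights, whereas the sampled histogram is larger by a factor of SAMPLE_SIZE.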

Source

2014-11-29 22:19:05

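Gabriel, could you add comments explaining what each line in the example does? Also, which version of `matplotlib` are you running? I have version 1.3.1, and when I try to run your example I get `ValueError: Unknown projection '3d'`. This is odd, because the example given at http://stackoverflow.com/q/3810865/1391441 works without any problem. – Gabriel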

I am using the same version as you, but I accidentally removed an import line before posting the answer. It should work now. Thanks –
