Difference between two versions of the cross_entropy calculation in TensorFlow

I am using TensorFlow to train a neural network for binary classification.

Half a year ago I followed the tutorial on the TensorFlow website (Deep MNIST for Experts).

Today, comparing the two pieces of code (the tutorial's and the one I wrote), I noticed a difference in the cross-entropy calculation that I cannot explain.

In the tutorial, the cross entropy is calculated as follows:

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_conv, y_))

While in my code it is:

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))

I am new to TensorFlow and I feel I am missing something. Maybe the difference comes from two different versions of the TensorFlow tutorial? What is the actual difference between the two lines?

Thanks a lot!

Relevant code from the tutorial:

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
...
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_conv, y_))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.global_variables_initializer())

My code:

# load data
folds = build_database_tuple.load_data(data_home_dir=data_home_dir,validation_ratio=validation_ratio,patch_size=patch_size)

# starting the session. using the InteractiveSession we avoid building the entire comp. graph before starting the session
sess = tf.InteractiveSession()

# start building the computational graph
# the 'None' leaves the batch size open for now
x = tf.placeholder(tf.float32, shape=[None, patch_size**2]) # input images, flattened
y_ = tf.placeholder(tf.float32, shape=[None, 2]) # output classes (using one-hot vectors)

# the variables for the linear layer
W = tf.Variable(tf.zeros([(patch_size**2),2])) # weights - patch_size**2 input features and 2 outputs
b = tf.Variable(tf.zeros([2])) # biases - 2 classes

# initialize all the variables using the session, so that they can be used in it
sess.run(tf.initialize_all_variables())

# implementation of the regression model
y = tf.nn.softmax(tf.matmul(x,W) + b)
# Done!

# FIRST LAYER:
# build the first layer
W_conv1 = weight_variable([first_conv_kernel_size, first_conv_kernel_size, 1, first_conv_output_channels]) # 5x5 patch, 1 input channel, 32 output channels (features)
b_conv1 = bias_variable([first_conv_output_channels])
x_image = tf.reshape(x, [-1,patch_size,patch_size,1]) # reshape x to a 4d tensor. 2,3 are the image dimensions, 4 is the color channel

# apply the layers
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# SECOND LAYER:
# 64 features each 5x5 patch
W_conv2 = weight_variable([sec_conv_kernel_size, sec_conv_kernel_size, patch_size, sec_conv_output_channels])
b_conv2 = bias_variable([sec_conv_output_channels])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# FULLY CONNECTED LAYER:
# 1024 neurons, 8x8 - new size after 2 pooling layers
W_fc1 = weight_variable([(patch_size/4) * (patch_size/4) * sec_conv_output_channels, fc_vec_size])
b_fc1 = bias_variable([fc_vec_size])
h_pool2_flat = tf.reshape(h_pool2, [-1, (patch_size/4) * (patch_size/4) * sec_conv_output_channels])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# dropout layer - meant to reduce over-fitting
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# READOUT LAYER:
# softmax regression
W_fc2 = weight_variable([fc_vec_size, 2])
b_fc2 = bias_variable([2])
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# TRAIN AND EVALUATION:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.initialize_all_variables())

Source

2017-01-25 roishik

The difference is small but quite significant.

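softmax_cross_entropy_with_logits takes logits (real numbers without any range limit), passes them through the softmax function, and then computes the cross-entropy. Combining the two steps into one function allows optimizations that improve numerical accuracy.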
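To make the numerical point concrete, here is a minimal pure-Python sketch. It assumes the optimization in question is the usual log-sum-exp trick; it is not TensorFlow's actual implementation, and the names naive_cross_entropy and fused_cross_entropy are made up for this illustration.

```python
import math

def naive_cross_entropy(labels, logits):
    """Softmax first, then -sum(y * log(p)), like the hand-written line."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # A probability that underflowed to 0.0 has log -inf, which is what
    # floating-point tensor math would produce as well.
    logs = [math.log(p) if p > 0.0 else float("-inf") for p in probs]
    return -sum(y * lp for y, lp in zip(labels, logs))

def fused_cross_entropy(labels, logits):
    """Cross-entropy computed straight from logits via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log(softmax(z)) = z - log_sum_exp, so no tiny probability is ever formed.
    return -sum(y * (z - log_sum_exp) for y, z in zip(labels, logits))

# Moderate logits: both versions agree.
mild_naive = naive_cross_entropy([1.0, 0.0], [2.0, -1.0])
mild_fused = fused_cross_entropy([1.0, 0.0], [2.0, -1.0])

# Extreme logits with the true class strongly disfavoured:
# the naive version blows up, the fused version stays finite.
hard_naive = naive_cross_entropy([0.0, 1.0], [800.0, 0.0])
hard_fused = fused_cross_entropy([0.0, 1.0], [800.0, 0.0])
```

With moderate logits the two functions return the same loss up to rounding. In the extreme case the softmax probability of the true class underflows to zero, so the naive loss becomes infinite, while the fused version still returns a finite value.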

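The second version applies the cross-entropy directly to y_conv, which appears to be the output of a softmax function. This is correct, and the two versions should give similar but not identical results. softmax_cross_entropy_with_logits is superior because of its numerical stability. Just remember to give it logits and not the output of softmax.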
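The last point is easy to get wrong, so here is a small pure-Python sketch of what happens when softmax outputs are fed into a function that expects logits. The helper names log_softmax and cross_entropy_with_logits are hypothetical stand-ins for the illustration, not TensorFlow calls.

```python
import math

def log_softmax(values):
    """Numerically stable log(softmax(values)) using log-sum-exp."""
    m = max(values)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_sum_exp for v in values]

def cross_entropy_with_logits(labels, logits):
    """Stand-in for a fused op: applies softmax internally, then cross-entropy."""
    return -sum(y * lp for y, lp in zip(labels, log_softmax(logits)))

labels = [1.0, 0.0]
logits = [4.0, -4.0]  # the network is confident and right

# Correct usage: pass the raw logits.
correct_loss = cross_entropy_with_logits(labels, logits)

# Mistake: pass probabilities, so softmax is effectively applied twice.
probs = [math.exp(lp) for lp in log_softmax(logits)]
double_softmax_loss = cross_entropy_with_logits(labels, probs)

# For two classes the probabilities differ by at most 1, so the
# double-softmax loss can never drop below log(1 + exp(-1)).
loss_floor = math.log(1.0 + math.exp(-1.0))
```

In this example the correct loss is close to zero, while the double-softmax loss stays stuck near its floor of about 0.31 no matter how confident the network is, which is why the readout layer should not apply tf.nn.softmax before the fused op.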

Source

2017-01-25 12:09:00
