Recently I have been working on the leaf classification problem on Kaggle. I saw the notebook Simple Keras 1D CNN + features split. However, when I try to build the same model with TensorFlow, I get very low accuracy and the loss barely changes. Why does the same network architecture work in Keras but not in TensorFlow (leaf classification)? The training output looks like this:
Iter: 0 loss: 0.69941 accuracy: 0.0
Iter: 10 loss: 0.69941 accuracy: 0.0
Iter: 20 loss: 0.69941 accuracy: 0.0
Iter: 30 loss: 0.69941 accuracy: 0.0
Iter: 40 loss: 0.69941 accuracy: 0.0
Iter: 50 loss: 0.698778 accuracy: 0.0625
Iter: 60 loss: 0.698778 accuracy: 0.0625
Iter: 70 loss: 0.69941 accuracy: 0.0
Iter: 80 loss: 0.69941 accuracy: 0.0
Iter: 90 loss: 0.69941 accuracy: 0.0
Iter: 100 loss: 0.69941 accuracy: 0.0
Iter: 110 loss: 0.69941 accuracy: 0.0
Iter: 120 loss: 0.69941 accuracy: 0.0
Iter: 130 loss: 0.69941 accuracy: 0.0
Iter: 140 loss: 0.69941 accuracy: 0.0
Iter: 150 loss: 0.69941 accuracy: 0.0
Iter: 160 loss: 0.69941 accuracy: 0.0
Iter: 170 loss: 0.698778 accuracy: 0.0625
......
Here is my TensorFlow code:
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.preprocessing import scale,StandardScaler
#preparing data
train=pd.read_csv('E:\\DataAnalysis\\Kaggle\\leaf\\train.csv',sep=',')
test=pd.read_csv('E:\\DataAnalysis\\Kaggle\\leaf\\test.csv',sep=',')
subexp=pd.read_csv('E:/DataAnalysis/Kaggle/leaf/sample_submission.csv')
x_train=np.asarray(train.drop(['species','id'],axis=1),dtype=np.float32)
x_train=scale(x_train).reshape([990,64,3])
ids=list(subexp)[1:]
spec=np.asarray(train['species'])
y_train=np.asarray([[int(x==ids[i]) for i in range(len(ids))] for x in spec],dtype=np.float32)
drop=0.75
batch_size=16
max_epoch=10
iter_per_epoch=int(990/batch_size)
max_iter=int(max_epoch*iter_per_epoch)
features=192
keep_prob=0.75
#inputs, weights, and biases
x=tf.placeholder(tf.float32,[None,64,3])
y=tf.placeholder(tf.float32,[None,99])
w = {
    'w1': tf.Variable(tf.truncated_normal([1, 3, 512], dtype=tf.float32)),
    'wd1': tf.Variable(tf.truncated_normal([64*512, 2048], dtype=tf.float32)),
    'wd2': tf.Variable(tf.truncated_normal([2048, 1024], dtype=tf.float32)),
    'wd3': tf.Variable(tf.truncated_normal([1024, 99], dtype=tf.float32))
}
b = {
    'b1': tf.Variable(tf.truncated_normal([512], dtype=tf.float32)),
    'bd1': tf.Variable(tf.truncated_normal([2048], dtype=tf.float32)),
    'bd2': tf.Variable(tf.truncated_normal([1024], dtype=tf.float32)),
    'bd3': tf.Variable(tf.truncated_normal([99], dtype=tf.float32))
}
#model.
def conv(x, we, bi):
    l1a = tf.nn.relu(tf.nn.conv1d(value=x, filters=we['w1'], stride=1, padding='SAME'))
    l1a = tf.reshape(tf.nn.bias_add(l1a, bi['b1']), [-1, 64*512])
    l1 = tf.nn.dropout(l1a, keep_prob=0.4)
    l2a = tf.nn.relu(tf.add(tf.matmul(l1, we['wd1']), bi['bd1']))
    l3a = tf.nn.relu(tf.add(tf.matmul(l2a, we['wd2']), bi['bd2']))
    out = tf.nn.softmax(tf.matmul(l3a, we['wd3']))
    return out
#optimizer and accuracy
out=conv(x,w,b)
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=out,targets=y))
train_op=tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
correct_pred = tf.equal(tf.argmax(out, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
#train
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step = 0
    while step < max_iter:
        d = (step % iter_per_epoch) * batch_size
        batch_x = x_train[d:d+batch_size]
        batch_y = y_train[d:d+batch_size]
        sess.run(train_op, feed_dict={x: batch_x, y: batch_y})
        if step % 10 == 0:
            loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x, y: batch_y})
            print("Iter: ", step, " loss:", loss, " accuracy:", acc)
        step += 1
    print('Training finished!')
But when I use the same model on the same data in Keras, it produces very good results. Here is the Keras code:
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import StratifiedShuffleSplit
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten, Convolution1D, Dropout
from keras.optimizers import SGD
from keras.utils import np_utils
model = Sequential()
model.add(Convolution1D(nb_filter=512, filter_length=1, input_shape=(64, 3)))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dropout(0.4))
model.add(Dense(2048, activation='relu'))
model.add(Dense(1024, activation='relu'))
model.add(Dense(99))
model.add(Activation('softmax'))
sgd = SGD(lr=0.01, nesterov=True, decay=1e-6, momentum=0.9)
model.compile(loss='categorical_crossentropy',optimizer=sgd,metrics=['accuracy'])
model.fit(x_train, y_train, nb_epoch=5, batch_size=16)
The results:
Epoch 1/5
990/990 [==============================] - 78s - loss: 4.3229 - acc: 0.1404
Epoch 2/5
990/990 [==============================] - 76s - loss: 1.6020 - acc: 0.6384
Epoch 3/5
990/990 [==============================] - 74s - loss: 0.2723 - acc: 0.9384
Epoch 4/5
990/990 [==============================] - 73s - loss: 0.1061 - acc: 0.9758
By the way, Keras is using the TensorFlow backend. Any suggestions?
Softmax output instead of logits is the biggest cause. All the other differences may or may not hinder convergence, but this one prevents it. :) –
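To illustrate the comment above, here is a small NumPy sketch (not from the original post; the function names are mine) of why feeding softmax probabilities into a sigmoid cross-entropy pins the loss near ln 2 ≈ 0.693, which matches the flat 0.699 in the training log, while a categorical cross-entropy computed on the raw logits still responds to the network's output:

```python
import numpy as np

def softmax(z):
    # Stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid_xent(logits, targets):
    # Numerically stable form of tf.nn.sigmoid_cross_entropy_with_logits.
    return np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))

def softmax_xent(logits, targets):
    # Per-example categorical cross-entropy on raw logits,
    # analogous to tf.nn.softmax_cross_entropy_with_logits.
    logp = logits - logits.max(axis=-1, keepdims=True)
    logp = logp - np.log(np.exp(logp).sum(axis=-1, keepdims=True))
    return -(targets * logp).sum(axis=-1)

rng = np.random.default_rng(0)
logits = 5.0 * rng.standard_normal((16, 99))        # arbitrary raw scores
targets = np.eye(99)[rng.integers(0, 99, size=16)]  # one-hot labels, 99 classes

# The bug: sigmoid cross-entropy applied to softmax *probabilities*.
# Every input is squashed into [0, 1], so the loss barely moves from ln 2.
broken = sigmoid_xent(softmax(logits), targets).mean()

# The fix: a cross-entropy that receives the raw logits.
fixed = softmax_xent(logits, targets).mean()

print(broken)  # ≈ 0.69, almost independent of the logits
print(fixed)   # varies freely with the logits
```

With 99 classes, each softmax probability sits near 0.01, and sigmoid cross-entropy of values that close to zero is essentially log 2 for every element, so the gradients are tiny and training stalls at 0.699. In the question's graph, the corresponding fix would be to return the pre-softmax matmul from `conv` and pass it to a softmax cross-entropy loss instead of `sigmoid_cross_entropy_with_logits`.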