
My goal is to implement a one-hidden-layer neural network with rectified linear units nn.relu() and 1024 hidden nodes.

# These are all the modules we'll be using later. Make sure you can import them 
# before proceeding further. 
from __future__ import print_function 
import matplotlib.pyplot as plt 
import numpy as np 
import os 
import sys 
import tarfile 
from IPython.display import display, Image 
from scipy import ndimage 
from sklearn.linear_model import LogisticRegression 
from six.moves.urllib.request import urlretrieve 
from six.moves import cPickle as pickle 
from six.moves import range 
import tensorflow as tf 

url = 'https://commondatastorage.googleapis.com/books1000/' 
last_percent_reported = None 
data_root = '.' # Change me to store data elsewhere 

def download_progress_hook(count, blockSize, totalSize): 
    """A hook to report the progress of a download. This is mostly intended for users with 
    slow internet connections. Reports every 5% change in download progress. 
    """ 
    global last_percent_reported 
    percent = int(count * blockSize * 100 / totalSize) 

    if last_percent_reported != percent: 
        if percent % 5 == 0: 
            sys.stdout.write("%s%%" % percent) 
            sys.stdout.flush() 
        else: 
            sys.stdout.write(".") 
            sys.stdout.flush() 

        last_percent_reported = percent 

def maybe_download(filename, expected_bytes, force=False): 
    """Download a file if not present, and make sure it's the right size.""" 
    dest_filename = os.path.join(data_root, filename) 
    if force or not os.path.exists(dest_filename): 
        print('Attempting to download:', filename) 
        filename, _ = urlretrieve(url + filename, dest_filename, reporthook=download_progress_hook) 
        print('\nDownload Complete!') 
    statinfo = os.stat(dest_filename) 
    if statinfo.st_size == expected_bytes: 
        print('Found and verified', dest_filename) 
    else: 
        raise Exception(
            'Failed to verify ' + dest_filename + '. Can you get to it with a browser?') 
    return dest_filename 

# If error in download get it here: http://yaroslavvb.com/upload/notMNIST/ 
train_filename = maybe_download('notMNIST_large.tar.gz', 247336696) 
test_filename = maybe_download('notMNIST_small.tar.gz', 8458043) 

num_classes = 10 
np.random.seed(133) 

def maybe_extract(filename, force=False): 
    root = os.path.splitext(os.path.splitext(filename)[0])[0]  # remove .tar.gz 
    if os.path.isdir(root) and not force: 
        # You may override by setting force=True. 
        print('%s already present - Skipping extraction of %s.' % (root, filename)) 
    else: 
        print('Extracting data for %s. This may take a while. Please wait.' % root) 
        tar = tarfile.open(filename) 
        sys.stdout.flush() 
        tar.extractall(data_root) 
        tar.close() 
    data_folders = [ 
        os.path.join(root, d) for d in sorted(os.listdir(root)) 
        if os.path.isdir(os.path.join(root, d))] 
    if len(data_folders) != num_classes: 
        raise Exception(
            'Expected %d folders, one per class. Found %d instead.' % (
                num_classes, len(data_folders))) 
    print(data_folders) 
    return data_folders 

train_folders = maybe_extract(train_filename) 
test_folders = maybe_extract(test_filename) 

pickle_file = 'notMNIST.pickle' 

with open(pickle_file, 'rb') as f: 
    save = pickle.load(f,encoding='latin1') 
    train_dataset = save['train_dataset'] 
    train_labels = save['train_labels'] 
    valid_dataset = save['valid_dataset'] 
    valid_labels = save['valid_labels'] 
    test_dataset = save['test_dataset'] 
    test_labels = save['test_labels'] 
    del save # hint to help gc free up memory 
    print('Training set', train_dataset.shape, train_labels.shape) 
    print('Validation set', valid_dataset.shape, valid_labels.shape) 
    print('Test set', test_dataset.shape, test_labels.shape) 

image_size = 28 
num_labels = 10 

def reformat(dataset, labels): 
    dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32) 
    # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...] 
    labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32) 
    return dataset, labels 
train_dataset, train_labels = reformat(train_dataset, train_labels) 
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels) 
test_dataset, test_labels = reformat(test_dataset, test_labels) 
print('Training set', train_dataset.shape, train_labels.shape) 
print('Validation set', valid_dataset.shape, valid_labels.shape) 
print('Test set', test_dataset.shape, test_labels.shape) 
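
As a quick sanity check on the one-hot conversion in reformat above, here is a tiny standalone example (the label values are chosen purely for illustration):

# labels [0, 2] with 3 classes become [[1, 0, 0], [0, 0, 1]]
demo_labels = np.array([0, 2])
print((np.arange(3) == demo_labels[:, None]).astype(np.float32))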

batch_size = 128 
hidden_nodes = 1024 
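
Note that the training loop further down calls an accuracy() helper that is never defined in this snippet; a minimal version in the spirit of the notMNIST assignments (percentage of rows whose argmax matches the one-hot label) would be:

def accuracy(predictions, labels):
    # predicted class = argmax of the softmax output; compare against the one-hot label
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])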

graph = tf.Graph() 
with graph.as_default(): 

    x_train = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size)) 
    y_ = tf.placeholder(tf.float32, shape=(batch_size, num_labels)) 
    x_valid = tf.constant(valid_dataset) 
    x_test = tf.constant(test_dataset) 

    hidden_layer = tf.contrib.layers.fully_connected(x_train,hidden_nodes) 

    logits = tf.contrib.layers.fully_connected(hidden_layer, num_labels, activation_fn=None) 
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,labels=y_)) 

    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss) 

    train_prediction = tf.nn.softmax(logits) 
    valid_relu = tf.contrib.layers.fully_connected(x_valid,hidden_nodes) 
    valid_prediction = tf.nn.softmax(tf.contrib.layers.fully_connected(valid_relu,num_labels)) 

    test_relu = tf.contrib.layers.fully_connected(x_test,hidden_nodes, activation_fn=None) 
    test_prediction = tf.nn.softmax(tf.contrib.layers.fully_connected(test_relu,num_labels, activation_fn=None)) 

steps = 3001 

with tf.Session(graph=graph) as session: 
    tf.global_variables_initializer().run() 

    for step in range(steps): 
        # Select a minibatch from within the training data 
        # Note: it is better to randomize the dataset for minibatch SGD 
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size) 
        # Generate the minibatch 
        batch_data = train_dataset[offset:(offset + batch_size), :] 
        batch_labels = train_labels[offset:(offset + batch_size), :] 
        # Feed the minibatch to the placeholders 
        feed_dict = {x_train: batch_data, y_: batch_labels} 
        _, l, prediction = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict) 
        if step % 500 == 0: 
            print("Minibatch Loss at step %d: %f" % (step, l)) 
            print("Minibatch accuracy: %.1f%%" % accuracy(prediction, batch_labels)) 
            print("Validation accuracy :%.1f%% " % accuracy(valid_prediction.eval(), valid_labels)) 



    print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels)) 

I was following this tutorial, which achieved better accuracy than my code.

I wanted to get a similar result using tf.contrib.layers.fully_connected for the hidden layer, but did I do it right?

Edit:

Changed the input of logits to hidden_layer

Reworked valid_relu, valid_prediction, test_relu, test_prediction

Results:

Minibatch Loss at step 0: 2.389448 
Minibatch accuracy: 5.5% 
Validation accuracy :8.2% 
Minibatch Loss at step 500: 0.342108 
Minibatch accuracy: 92.2% 
Validation accuracy :8.2% 
Minibatch Loss at step 1000: 0.543803 
Minibatch accuracy: 84.4% 
Validation accuracy :8.2% 
Minibatch Loss at step 1500: 0.299978 
Minibatch accuracy: 93.8% 
Validation accuracy :8.2% 
Minibatch Loss at step 2000: 0.294090 
Minibatch accuracy: 93.8% 
Validation accuracy :8.2% 
Minibatch Loss at step 2500: 0.333070 
Minibatch accuracy: 90.6% 
Validation accuracy :8.2% 
Minibatch Loss at step 3000: 0.365324 
Minibatch accuracy: 89.1% 
Validation accuracy :8.2% 
Test accuracy: 6.8% 

Use an Estimator instead ... –

Answers


You are off to the right start. Just a couple of additions:

  • Since you are ditching the manual FC layers in favour of tf.contrib.layers.fully_connected, drop w and b as well. This also saves you the trouble of picking the right initialization for those weights:
hidden_layer = tf.contrib.layers.fully_connected(x_train, hidden_nodes) 
logits = tf.contrib.layers.fully_connected(hidden_layer, num_labels, 
              activation_fn=None) 
  • It is also bad practice to put the datasets into the graph as constants and to duplicate the inference nodes, even though the tutorial itself does it this way. Instead, feed valid_dataset and test_dataset through feed_dict and evaluate them with train_prediction (a sketch of that approach follows the snippet below).
    # BAD idea: this potentially large value is stored in the graph, can lead to OOM 
    x_valid = tf.constant(valid_dataset) 
    x_test = tf.constant(test_dataset) 
    ... 
    # BAD idea: model duplication 
    valid_relu = tf.contrib.layers.fully_connected(x_valid, hidden_nodes) 
    valid_prediction = tf.nn.softmax(tf.matmul(valid_relu, w) + b) 
    test_relu = tf.contrib.layers.fully_connected(x_test, hidden_nodes) 
    test_prediction = tf.nn.softmax(tf.matmul(test_relu, w) + b) 
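
For illustration only, a minimal sketch of that feed_dict approach (assuming the placeholders are declared with a None batch dimension so the same graph accepts a minibatch as well as the full validation and test sets; the names x, valid_pred and test_pred are just for this sketch):

# placeholders with a flexible batch dimension instead of a fixed batch_size
x = tf.placeholder(tf.float32, shape=(None, image_size * image_size))
y_ = tf.placeholder(tf.float32, shape=(None, num_labels))

hidden_layer = tf.contrib.layers.fully_connected(x, hidden_nodes)
logits = tf.contrib.layers.fully_connected(hidden_layer, num_labels,
                                           activation_fn=None)
train_prediction = tf.nn.softmax(logits)

# later, inside the session: reuse the same prediction node for every split
valid_pred = session.run(train_prediction, feed_dict={x: valid_dataset})
test_pred = session.run(train_prediction, feed_dict={x: test_dataset})
print('Validation accuracy: %.1f%%' % accuracy(valid_pred, valid_labels))
print('Test accuracy: %.1f%%' % accuracy(test_pred, test_labels))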
    
  • Also, tensorflow.contrib is an experimental package. In particular, the fully_connected layer has "graduated" to tf.layers.dense. It does the same thing, but its API is guaranteed to stay stable, whereas fully_connected may be deprecated in the next release.
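
For example, the two layers above could be written with the stable API like this (note that tf.layers.dense defaults to a linear activation, so the ReLU has to be passed explicitly):

# equivalent of the fully_connected pair above, using the graduated API
hidden_layer = tf.layers.dense(x_train, hidden_nodes, activation=tf.nn.relu)
logits = tf.layers.dense(hidden_layer, num_labels, activation=None)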

New ValueError: Input 0 of layer fully_connected_1 is incompatible with the layer: expected min_ndim=2, found ndim=0. Full shape received: [] – Marr


Edited: the input should be the hidden layer output. If you get a new error, please post a **reproducible example** – Maxim


Updated the question with the new edit. I got bad results. – Marr