First neural network

Running a neural network (shallow_net_in_keras.ipynb)

  • We are going to walk through the notebook shallow_net_in_keras.ipynb
  • Create a new notebook: File -> New

  • name the notebook shallow-net-demo

  • The Jupyter Notebook shortcut for a Markdown level-1 heading is Esc + 1

  • This switches the cell to Markdown mode; give it the title "Shallow Net in Keras"

  • Keras is a high-level API that calls TensorFlow in the backend, which makes it an easy tool to start with

  • Press Esc + m to turn a cell into a Markdown cell

  • In the Markdown cell, describe what we are doing: "Build a shallow neural network to classify MNIST digits"

MNIST?
  • The MNIST digits are images of handwritten digits: 60,000 to train on and 10,000 to test on
  • Every digit is a square image, 28 pixels high and 28 pixels wide
  • Each image contains a single handwritten digit, for example a 2

What we are going to do?

  • We have 60,000 handwritten training digits, and we are going to classify them into 10 categories, 0 to 9
  • We also set a random seed so that all of us get the same results
  • Esc + 4 (4th-level heading, equivalent to ####) and add the heading "Set seed for reproducibility"

Setting random seed

import numpy as np
np.random.seed(42)
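A quick way to check that seeding gives repeatable results is to reseed and draw again. This is a minimal sketch using only NumPy; the variable names are illustrative:

```python
import numpy as np

# Seed, then draw three random numbers
np.random.seed(42)
first_draw = np.random.rand(3)

# Reseed with the same value and draw again
np.random.seed(42)
second_draw = np.random.rand(3)

# The two draws are identical because the generator restarted from the same state
print(np.array_equal(first_draw, second_draw))
```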
  • Esc + 4 and add the heading "Load dependencies"
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

What is a dense layer?

  • We flatten the 28 by 28 image (784 pixels in total) into one long array
  • First row (indices 0 to 27), then second row (28 to 55), and so on up to index 783
  • Each pixel has a value from 0 to 255: a white pixel is 0, and darker pixels have higher values
  • We feed these values to the neural network; this array (indices 0 to 783) is the input layer
  • The input layer feeds into the hidden layer
  • The hidden layer then feeds into the output layer
  • input layer -> hidden layer (dense) -> output layer
  • The output layer is simple: it has 10 neurons, one per digit 0 to 9, and it performs the classification
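The row-by-row flattening can be sketched with NumPy. The image here is a stand-in array whose entries equal their own flat index, not real MNIST data:

```python
import numpy as np

# Stand-in 28 x 28 "image": each entry holds its own flat index (0..783)
image = np.arange(28 * 28).reshape(28, 28)

# Flatten row by row into a single 784-element array
flat = image.reshape(784)

# The pixel at (row, col) lands at flat index row * 28 + col
row, col = 1, 0
flat_index = row * 28 + col
print(flat_index, flat[flat_index] == image[row, col])
```

So the first pixel of the second row sits at index 28, right after the 28 pixels of the first row.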

  • What is a dense hidden layer?

    • Every input neuron is connected to every hidden-layer neuron
    • Every hidden-layer neuron is connected to every output neuron
    • Hence this is called a fully connected network
  • We can choose the number of hidden-layer neurons, and we will tune it later

  • For now we choose 64 neurons, indexed 0 to 63, in the hidden layer
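To make "fully connected" concrete, here is a rough NumPy sketch of one forward pass through a 784 -> 64 -> 10 network. The random weights, the zero biases, and the helper names are assumptions for illustration; Keras initializes and trains its own weights:

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(784)                                # stand-in flattened image
W_hidden = rng.normal(scale=0.1, size=(784, 64))   # one weight per input-hidden pair
b_hidden = np.zeros(64)                            # one bias per hidden neuron
W_out = rng.normal(scale=0.1, size=(64, 10))       # one weight per hidden-output pair
b_out = np.zeros(10)                               # one bias per output neuron

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())                        # subtract max for numerical stability
    return e / e.sum()

hidden = sigmoid(x @ W_hidden + b_hidden)          # 64 hidden activations
output = softmax(hidden @ W_out + b_out)           # 10 class probabilities

n_params = W_hidden.size + b_hidden.size + W_out.size + b_out.size
print(hidden.shape, output.shape, n_params)
```

With biases included, this layout has 784 * 64 + 64 + 64 * 10 + 10 = 50,890 trainable parameters.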

What is SGD?
  • Stochastic gradient descent; we will cover it later
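As a preview, the core idea is to nudge each weight a small step against the gradient of the loss, computed on a small random batch. The sketch below fits a single weight on a toy problem (not MNIST); all names and values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(200)          # toy inputs
y = 3.0 * x                  # toy targets; the true weight is 3

w = 0.0                      # initial guess for the weight
learning_rate = 0.5
batch_size = 20

for epoch in range(20):
    order = rng.permutation(len(x))              # reshuffle every epoch
    for start in range(0, len(x), batch_size):
        batch = order[start:start + batch_size]
        error = w * x[batch] - y[batch]
        grad = 2.0 * np.mean(error * x[batch])   # gradient of batch MSE w.r.t. w
        w -= learning_rate * grad                # step against the gradient

print(round(float(w), 3))
```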

Load Data

  • Now that we have all the dependencies, we load the data
  • By convention in ML, inputs are called X and the outputs we are trying to predict are called y
  • X is the input (the array of pixel values for each image) and y is the output, the label (0 to 9)
  • X is capitalized because it is a matrix, and y is lowercase because it is a 1-D array
(X_train, y_train), (X_test, y_test) = mnist.load_data()
  • Let's look at the shape of the input
X_train.shape
  • We can see 60,000 images of 28 by 28 pixels
y_train.shape
y_train[0:99]
X_train[0]

Preprocessing data X

  • We need to flatten each image from a 2-D array (28 by 28) into a 1-D array of 784 values
  • We need to scale the pixel values from the range 0 to 255 down to the range 0 to 1
X_train = X_train.reshape(60000,784).astype('float32')
X_test = X_test.reshape(10000,784).astype('float32')

X_train /= 255
X_test /= 255
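The same two steps can be tried on a small fake batch to see what they do. The random array below is an assumption standing in for real MNIST images:

```python
import numpy as np

rng = np.random.default_rng(0)

# Five fake 28 x 28 images with integer pixel values 0..255
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each image to 784 values and cast to float before dividing
flat = fake_images.reshape(5, 784).astype('float32')
flat /= 255

print(flat.shape, flat.dtype, flat.min() >= 0.0, flat.max() <= 1.0)
```

The cast to float32 has to come before the division, since an integer array cannot hold the fractional results.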

Preprocessing data Y

  • For this kind of neural network we want a one-hot (binary) representation of the labels instead of integers
  • E.g. if the digit is 2, the entry at index 2 is 1 and the other nine entries are 0
  • To do this we use the to_categorical function from keras.utils
n_classes = 10
y_train = keras.utils.to_categorical(y_train, n_classes)
y_test = keras.utils.to_categorical(y_test, n_classes)
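To see what the one-hot encoding looks like, here is a small NumPy sketch of the same idea. It is not how Keras implements to_categorical, and the example labels are made up:

```python
import numpy as np

labels = np.array([2, 0, 9])     # made-up integer labels
n_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes, dtype='float32')[labels]

print(one_hot[0])                # encoding of the digit 2
```

Each row has exactly one 1, at the index of the digit it encodes.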

Design neural network

  • Since each layer feeds into the next, this is a sequential model
model = Sequential()
model.add(Dense(64, activation='sigmoid', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

Configure model

model.compile(loss='mean_squared_error', optimizer=SGD(learning_rate=0.01), metrics=['accuracy'])
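To get a feel for the mean squared error loss, it can be computed by hand for one image. The two predictions below are made-up values, not outputs of the model:

```python
import numpy as np

# One-hot label for the digit 2
y_true = np.zeros(10)
y_true[2] = 1.0

# An undecided prediction: 0.1 for every class
y_unsure = np.full(10, 0.1)

# A confident, correct prediction: 0.91 for class 2, 0.01 elsewhere
y_confident = np.full(10, 0.01)
y_confident[2] = 0.91

mse_unsure = np.mean((y_true - y_unsure) ** 2)
mse_confident = np.mean((y_true - y_confident) ** 2)

print(mse_unsure, mse_confident)
```

The confident, correct prediction gives a much smaller loss, which is what training tries to achieve across all images.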

Train

  • An epoch is one full pass over the training data; the epochs argument sets how many passes to run
model.fit(X_train, y_train, batch_size=120, epochs=1, verbose=1, validation_data=(X_test, y_test))
  • As the number of epochs increases, the network generally trains better, and its training and validation accuracy tend to increase
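The batch size and epoch count together determine how many weight updates the optimizer performs. A small arithmetic sketch using the values from the fit call:

```python
import math

n_train = 60000      # number of training images
batch_size = 120     # images per weight update
epochs = 1           # full passes over the training data

# One weight update per batch; the last batch may be smaller
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

One epoch with these settings is 500 updates, so a single pass is unlikely to be enough for the network to train well.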
