Ch 6: Neural Networks

Neural networks are central to modern machine learning and keep growing in popularity thanks to major breakthroughs on previously unsolved problems. We start with ‘shallow’ neural networks, which are powerful in their own right and can improve the results of our earlier ML algorithms. We begin with the most basic NN unit, the operational gate, then gradually build up the network, and end by training a model to play tic-tac-toe.

  1. Introduction
    • We introduce the concept of neural networks and how TensorFlow is built to handle these algorithms easily.
  2. Implementing Operational Gates
    • We implement an operational gate with one operation. Then we show how to extend this to multiple nested operations.
  3. Working with Gates and Activation Functions
    • We add activation functions to the gates and show how different activation functions behave.
  4. Implementing a One Layer Neural Network
    • We now have all the pieces to implement our first neural network, which we do with a regression on the Iris data set.
  5. Implementing Different Layers
    • This section introduces the convolution layer and the max-pool layer. We show how to chain these together, along with fully connected layers, in a 1D and a 2D example.
  6. Using Multi-layer Neural Networks
    • Here we show how to wrap layer and variable creation in functions for a cleaner multi-layer neural network.
  7. Improving Predictions of Linear Models
    • We show how we can improve the convergence of our prior logistic regression with a set of hidden layers.
  8. Learning to Play Tic-Tac-Toe
    • Given a set of tic-tac-toe boards and corresponding optimal moves, we train a neural network classification model to play. At the end of the script, we can attempt to play against the trained model.

02 Implementing an Operational Gate

# Implementing Gates
# This script shows how to implement
# various gates in TensorFlow
# One gate will be one operation with
# a variable and a placeholder.
# We will ask TensorFlow to change the
# variable based on our loss function

import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Start Graph Session
sess = tf.Session()

#----------------------------------
# Create a multiplication gate:
#   f(x) = a * x
#  a --
#      |
#      |---- (multiply) --> output
#  x --|
#
a = tf.Variable(tf.constant(4.))
x_val = 5.
x_data = tf.placeholder(dtype=tf.float32)

multiplication = tf.multiply(a, x_data)

# Declare the loss function as the squared difference between
# the output and a target value, 50.
loss = tf.square(tf.subtract(multiplication, 50.))

# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)

# Declare optimizer
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)

# Run loop across gate
print('Optimizing a Multiplication Gate Output to 50.')
for i in range(10):
    sess.run(train_step, feed_dict={x_data: x_val})
    a_val = sess.run(a)
    mult_output = sess.run(multiplication, feed_dict={x_data: x_val})
    print(str(a_val) + ' * ' + str(x_val) + ' = ' + str(mult_output))

#----------------------------------
# Create a nested gate:
#   f(x) = a * x + b
#  a --
#      |
#      |-- (multiply)--
#  x --|              |
#                     |-- (add) --> output
#                 b --|
#
# Start a New Graph Session
# (reset first so the multiplication gate's graph does not carry over)
ops.reset_default_graph()
sess = tf.Session()

a = tf.Variable(tf.constant(1.))
b = tf.Variable(tf.constant(1.))
x_val = 5.
x_data = tf.placeholder(dtype=tf.float32)
two_gate = tf.add(tf.multiply(a, x_data), b)

# Declare the loss function as the squared difference between
# the output and a target value, 50.
loss = tf.square(tf.subtract(two_gate, 50.))

# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)

# Declare optimizer
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)

# Run loop across gate
print('\nOptimizing Two Gate Output to 50.')
for i in range(10):
    sess.run(train_step, feed_dict={x_data: x_val})
    a_val, b_val = sess.run([a, b])
    two_gate_output = sess.run(two_gate, feed_dict={x_data: x_val})
    print(str(a_val) + ' * ' + str(x_val) + ' + ' + str(b_val) + ' = ' + str(two_gate_output))

Running the script prints:

Optimizing a Multiplication Gate Output to 50.
7.0 * 5.0 = 35.0
8.5 * 5.0 = 42.5
9.25 * 5.0 = 46.25
9.625 * 5.0 = 48.125
9.8125 * 5.0 = 49.0625
9.90625 * 5.0 = 49.5313
9.95313 * 5.0 = 49.7656
9.97656 * 5.0 = 49.8828
9.98828 * 5.0 = 49.9414
9.99414 * 5.0 = 49.9707

Optimizing Two Gate Output to 50.
5.4 * 5.0 + 1.88 = 28.88
7.512 * 5.0 + 2.3024 = 39.8624
8.52576 * 5.0 + 2.50515 = 45.134
9.01236 * 5.0 + 2.60247 = 47.6643
9.24593 * 5.0 + 2.64919 = 48.8789
9.35805 * 5.0 + 2.67161 = 49.4619
9.41186 * 5.0 + 2.68237 = 49.7417
9.43769 * 5.0 + 2.68754 = 49.876
9.45009 * 5.0 + 2.69002 = 49.9405
9.45605 * 5.0 + 2.69121 = 49.9714
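
To see what the optimizer is doing in the two gate examples, we can reproduce the updates without TensorFlow. The sketch below is not part of the original script. It assumes plain gradient descent with hand-derived gradients of the squared loss, and the function names train_multiplication_gate and train_nested_gate are made up for illustration.

```python
def train_multiplication_gate(a=4.0, x=5.0, target=50.0, lr=0.01, steps=10):
    """Gradient descent on loss = (a * x - target)^2 (hypothetical helper)."""
    history = []
    for _ in range(steps):
        error = a * x - target
        grad_a = 2.0 * error * x          # d(loss)/da by the chain rule
        a -= lr * grad_a
        history.append((a, a * x))        # (variable, gate output) after the step
    return history


def train_nested_gate(a=1.0, b=1.0, x=5.0, target=50.0, lr=0.01, steps=10):
    """Gradient descent on loss = (a * x + b - target)^2 (hypothetical helper)."""
    history = []
    for _ in range(steps):
        error = a * x + b - target
        grad_a = 2.0 * error * x          # d(loss)/da
        grad_b = 2.0 * error              # d(loss)/db
        a -= lr * grad_a
        b -= lr * grad_b
        history.append((a, b, a * x + b))
    return history


# Same starting values, input, target and learning rate as the scripts above
mult_history = train_multiplication_gate()
nested_history = train_nested_gate()

for a_val, output in mult_history:
    print(f"{a_val:.5f} * 5.0 = {output:.4f}")
for a_val, b_val, output in nested_history:
    print(f"{a_val:.5f} * 5.0 + {b_val:.5f} = {output:.4f}")
```

With x = 5 and a learning rate of 0.01, each step multiplies the remaining error by 1 - 2(0.01)(25) = 0.5 for the multiplication gate and by 1 - 2(0.01)(26) = 0.48 for the nested gate, which is why the printed outputs close in on 50 so steadily.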

03 Working with Activation Functions

# Combining Gates and Activation Functions
# This script shows how to implement
# various gates with activation functions
# in TensorFlow
# This script is an extension of the
# prior gates, but with various activation
# functions.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Start Graph Session
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
sess = tf.Session(config=config)
#sess = tf.Session()
# Only NumPy is seeded; the TensorFlow initializers are not, so results vary between runs
np.random.seed(42)

batch_size = 50

a1 = tf.Variable(tf.random_normal(shape=[1,1]))
b1 = tf.Variable(tf.random_uniform(shape=[1,1]))
a2 = tf.Variable(tf.random_normal(shape=[1,1]))
b2 = tf.Variable(tf.random_uniform(shape=[1,1]))
x = np.random.normal(2, 0.1, 500)
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)

sigmoid_activation = tf.sigmoid(tf.add(tf.matmul(x_data, a1), b1))
relu_activation = tf.nn.relu(tf.add(tf.matmul(x_data, a2), b2))

# Declare the loss function as the mean squared difference between
# the output and a target value, 0.75.
loss1 = tf.reduce_mean(tf.square(tf.subtract(sigmoid_activation, 0.75)))
loss2 = tf.reduce_mean(tf.square(tf.subtract(relu_activation, 0.75)))

# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)

# Declare optimizer
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step_sigmoid = my_opt.minimize(loss1)
train_step_relu = my_opt.minimize(loss2)

# Run loop across gate
print('\nOptimizing Sigmoid AND Relu Output to 0.75')
loss_vec_sigmoid = []
loss_vec_relu = []
for i in range(500):
    rand_indices = np.random.choice(len(x), size=batch_size)
    x_vals = np.transpose([x[rand_indices]])
    sess.run(train_step_sigmoid, feed_dict={x_data: x_vals})
    sess.run(train_step_relu, feed_dict={x_data: x_vals})

    loss_vec_sigmoid.append(sess.run(loss1, feed_dict={x_data: x_vals}))
    loss_vec_relu.append(sess.run(loss2, feed_dict={x_data: x_vals}))

    sigmoid_output = np.mean(sess.run(sigmoid_activation, feed_dict={x_data: x_vals}))
    relu_output = np.mean(sess.run(relu_activation, feed_dict={x_data: x_vals}))

    if i%50==0:
        print('sigmoid = ' + str(np.mean(sigmoid_output)) + ' relu = ' + str(np.mean(relu_output)))

# Plot the loss
plt.plot(loss_vec_sigmoid, 'k-', label='Sigmoid Activation')
plt.plot(loss_vec_relu, 'r--', label='Relu Activation')
plt.ylim([0, 1.0])
plt.title('Loss per Generation')
plt.legend(loc='upper right')
plt.show()

Running the script prints:

Optimizing Sigmoid AND Relu Output to 0.75
sigmoid = 0.126552 relu = 2.02276
sigmoid = 0.178638 relu = 0.75303
sigmoid = 0.247698 relu = 0.74929
sigmoid = 0.344675 relu = 0.749955
sigmoid = 0.440066 relu = 0.754
sigmoid = 0.52369 relu = 0.754772
sigmoid = 0.583739 relu = 0.75087
sigmoid = 0.627335 relu = 0.747023
sigmoid = 0.65495 relu = 0.751805
sigmoid = 0.674526 relu = 0.754707
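
The gap between the two columns above is consistent with how the gradients of the two activations differ: the sigmoid derivative s(1 - s) never exceeds 0.25 and shrinks as the unit saturates, while ReLU passes a gradient of 1 for any positive input. The sketch below is an illustration under stated assumptions, not the original script. It uses a fixed input x = 2.0 instead of random batches, made-up starting values for a and b, and a hypothetical helper train_gate that applies hand-derived gradients.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # peaks at 0.25 when z = 0


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    return 1.0 if z > 0 else 0.0  # constant gradient on the positive side


def train_gate(activation, activation_grad, a, b,
               x=2.0, target=0.75, lr=0.01, steps=500):
    """Gradient descent on (activation(a * x + b) - target)^2 (hypothetical helper)."""
    for _ in range(steps):
        z = a * x + b
        error = activation(z) - target
        dloss_dz = 2.0 * error * activation_grad(z)  # chain rule through the activation
        a -= lr * dloss_dz * x
        b -= lr * dloss_dz
    return activation(a * x + b)


# Starting values are assumptions chosen for illustration only
sig_out = train_gate(sigmoid, sigmoid_grad, a=-1.0, b=0.5)
relu_out = train_gate(relu, relu_grad, a=1.0, b=0.5)
print('sigmoid = ' + str(sig_out) + ' relu = ' + str(relu_out))
```

In this sketch the ReLU gate lands on the target within the 500 steps, while the sigmoid gate is still approaching it from below, the same pattern as in the TensorFlow run.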

