Fading Coder

One Final Commit for the Last Sprint


Building Neural Networks with TensorFlow: From Perceptrons to CNNs

Tech · May 13

Perceptrons: The Building Blocks

Introduced by Frank Rosenblatt in 1957, the perceptron is a fundamental unit of artificial neural networks. It takes multiple input values, multiplies each by a corresponding weight, sums these weighted inputs, and then applies an activation function to produce an output. This simple mechanism can solve basic logical operations like AND and OR.
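To make this concrete, here is a minimal sketch of a single perceptron computing logical AND. The weights and bias are illustrative values chosen by hand rather than learned:

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of inputs, followed by a step activation function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# Hand-picked weights implementing AND: the weighted sum only
# exceeds the threshold when both inputs are 1.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), "->", perceptron([a, b], weights=[1.0, 1.0], bias=-1.5))
```

OR can be obtained from the same unit by relaxing the bias (e.g. -0.5), which is exactly why these linearly separable operations are solvable by a single perceptron.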

The relationship between perceptrons and logistic regression is significant. While both use weighted sums and activation functions, perceptrons typically use a step function, whereas logistic regression uses a sigmoid function for probabilistic outputs.
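The difference is easy to see side by side: for the same weighted sum, the step function gives a hard 0/1 decision while the sigmoid gives a smooth probability. A small comparison with arbitrary illustrative inputs:

```python
import math

def step(z):
    # Perceptron-style activation: a hard threshold at zero.
    return 1 if z > 0 else 0

def sigmoid(z):
    # Logistic-regression activation: a smooth value in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, -0.5, 0.5, 2.0):
    print(f"z={z:+.1f}  step={step(z)}  sigmoid={sigmoid(z):.3f}")
```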

Neural Network Fundamentals

Artificial Neural Networks (ANNs) are computational models inspired by biological neural systems, designed to approximate complex functions. They consist of interconnected nodes (neurons) organized into layers.

Neural networks can be categorized by complexity:

  • Basic Networks: Single-layer perceptrons, linear networks, and backpropagation networks.
  • Advanced Networks: Boltzmann machines, restricted Boltzmann machines, and recurrent neural networks.
  • Deep Networks: Deep belief networks, convolutional neural networks (CNNs), and long short-term memory (LSTM) networks.

Key characteristics of neural networks include:

  • Input vectors match the number of input neurons.
  • Each connection has an associated weight.
  • Neurons within the same layer are not connected.
  • Networks typically consist of input, hidden, and output layers.
  • Full connectivity between consecutive layers.

The core components of a neural network are:

  • Architecture: The arrangement of neurons into layers and the weighted connections between them.
  • Activation Function: Determines neuron output based on input.
  • Learning Rule: Specifies how weights are adjusted over time, typically using backpropagation.
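As a sketch of a learning rule in action, here is the classic perceptron update (a single-neuron precursor to backpropagation) learning the AND function. The learning rate and epoch count are illustrative choices:

```python
# Perceptron learning rule: w <- w + lr * (target - prediction) * x
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]  # AND truth table
w, b, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else 0

for _ in range(20):  # a few passes over this tiny dataset suffice
    for x, target in data:
        error = target - predict(x)
        w = [wi + lr * error * xi for xi, wi in zip(x, w)]
        b += lr * error

print([predict(x) for x, _ in data])  # learned AND outputs: [0, 0, 0, 1]
```

Backpropagation generalizes this idea to multi-layer networks by propagating error gradients backward through the layers.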

TensorFlow Modules Overview

TensorFlow provides several modules for neural network operations:

  • tf.nn: Low-level neural network operations including convolutions, pooling, normalization, and loss functions.
  • tf.layers: High-level abstractions for building networks, particularly useful for convolutional layers.
  • tf.contrib: Experimental features and advanced operations; less stable than the core modules, and removed entirely in TensorFlow 2.x.

Shallow Neural Network for MNIST

The MNIST dataset consists of handwritten digits. We'll implement a simple neural network using softmax regression.
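Before the TensorFlow version, the math of softmax regression can be sketched in plain NumPy. The shapes mirror MNIST (784 inputs, 10 classes), but the inputs and weights here are random placeholders rather than real images or trained values:

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability, then normalize.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.random((5, 784))            # batch of 5 fake "images"
W = rng.normal(0, 0.01, (784, 10))  # weights, one column per class
b = np.zeros(10)

probs = softmax(X @ W + b)
print(probs.shape)        # (5, 10)
print(probs.sum(axis=1))  # each row sums to 1
```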

Data Preparation

# Note: this walkthrough uses the TensorFlow 1.x API;
# tensorflow.examples.tutorials was removed in TensorFlow 2.x.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST data with one-hot encoding
mnist = input_data.read_data_sets('data/mnist', one_hot=True)

Model Construction

# Define placeholders for input data
X = tf.placeholder(tf.float32, [None, 784])  # 28x28 images flattened
y_true = tf.placeholder(tf.float32, [None, 10])  # 10 classes

# Initialize weights and biases
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Compute logits and apply softmax
logits = tf.matmul(X, W) + b
y_pred = tf.nn.softmax(logits)

# Calculate cross-entropy loss
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=logits))

# Use gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Training and Evaluation

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    for _ in range(1000):
        batch_X, batch_y = mnist.train.next_batch(100)
        sess.run(optimizer, feed_dict={X: batch_X, y_true: batch_y})
    
    test_accuracy = sess.run(accuracy, feed_dict={X: mnist.test.images, y_true: mnist.test.labels})
    print(f"Test accuracy: {test_accuracy:.4f}")

Convolutional Neural Network for MNIST

CNNs excel at image recognition by leveraging spatial hierarchies. They use convolutional layers to detect features and pooling layers to reduce dimensionality.
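As a sketch of what a pooling layer does, here is 2x2 max pooling in plain NumPy on a toy 4x4 feature map; tf.nn.max_pool performs the same operation over batched, multi-channel tensors:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Downsample a 2-D feature map by taking the max of each 2x2 block."""
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 6, 7, 1],
               [2, 1, 3, 8]])
print(max_pool_2x2(fm))  # [[4 5] [6 8]]
```

Each 2x2 block collapses to its maximum, halving both spatial dimensions while keeping the strongest feature responses.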

Model Architecture

def create_cnn_model():
    # Input layer
    X = tf.placeholder(tf.float32, [None, 784])
    y_true = tf.placeholder(tf.float32, [None, 10])
    X_image = tf.reshape(X, [-1, 28, 28, 1])
    
    # Convolutional layer 1
    W_conv1 = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
    b_conv1 = tf.Variable(tf.constant(0.1, shape=[32]))
    h_conv1 = tf.nn.relu(tf.nn.conv2d(X_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
    h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    
    # Convolutional layer 2
    W_conv2 = tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1))
    b_conv2 = tf.Variable(tf.constant(0.1, shape=[64]))
    h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
    h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    
    # Fully connected layer
    W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024], stddev=0.1))
    b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    
    # Output layer
    W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
    b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
    logits = tf.matmul(h_fc1, W_fc2) + b_fc2
    y_pred = tf.nn.softmax(logits)
    
    return X, y_true, logits, y_pred

# Training and evaluation would follow similar patterns as the shallow network
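The 7 * 7 * 64 flatten size in the fully connected layer follows directly from the architecture: each 'SAME'-padded convolution with stride 1 preserves the spatial size, and each 2x2 max pool halves it. A quick arithmetic check:

```python
# Track the spatial size through the two conv/pool stages.
size = 28
for _ in range(2):
    # 'SAME' convolution with stride 1 keeps the spatial size;
    # 2x2 max pooling with stride 2 halves it.
    size //= 2

channels = 64  # output channels of the second convolutional layer
print(size, size * size * channels)  # 7 3136  (i.e. 7 * 7 * 64)
```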
