Building Neural Networks with TensorFlow: From Perceptrons to CNNs
Perceptrons: The Building Blocks
Introduced by Frank Rosenblatt in 1957, the perceptron is a fundamental unit of artificial neural networks. It takes multiple input values, multiplies each by a corresponding weight, sums these weighted inputs, and then applies an activation function to produce an output. This simple mechanism can solve basic logical operations like AND and OR.
The relationship between perceptrons and logistic regression is significant. While both use weighted sums and activation functions, perceptrons typically use a step function, whereas logistic regression uses a sigmoid function for probabilistic outputs.
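As a minimal illustration (plain Python/NumPy, with hand-picked weights rather than learned ones), the sketch below computes a perceptron's output for the AND operation and contrasts the step activation with the sigmoid used in logistic regression:
import numpy as np

def step(z):
    return 1 if z >= 0 else 0        # perceptron activation: hard threshold

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # logistic-regression activation: graded output

# Hand-picked weights and bias that realize logical AND
w = np.array([1.0, 1.0])
b = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    z = np.dot(w, x) + b             # weighted sum of inputs plus bias
    print(x, step(z), round(sigmoid(z), 2))
# Only (1, 1) crosses the threshold, so step outputs 1 for it and 0 otherwise;
# sigmoid instead gives graded values (~0.18, ~0.38, ~0.38, ~0.62) rather than a hard 0/1.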
Neural Network Fundamentals
Artificial Neural Networks (ANNs) are computational models inspired by biological neural systems, designed to approximate complex functions. They consist of interconnected nodes (neurons) organized into layers.
Neural networks can be categorized by complexity:
- Basic Networks: Single-layer perceptrons, linear networks, and backpropagation networks.
- Advanced Networks: Boltzmann machines, restricted Boltzmann machines, and recurrent neural networks.
- Deep Networks: Deep belief networks, convolutional neural networks (CNNs), and long short-term memory (LSTM) networks.
Key characteristics of neural networks include:
- The length of the input vector matches the number of input neurons.
- Each connection has an associated weight.
- Neurons within the same layer are not connected.
- Networks typically consist of input, hidden, and output layers.
- Consecutive layers are typically fully connected.
The core components of a neural network are:
- Architecture: The structure defining weights and neurons.
- Activation Function: Determines neuron output based on input.
- Learning Rule: Specifies how weights are adjusted over time, typically using backpropagation.
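To make the learning rule concrete, here is a minimal gradient-descent sketch for a single linear neuron with a mean-squared-error loss (NumPy only; the data and learning rate are illustrative). Backpropagation applies this same kind of update to every weight in a multi-layer network:
import numpy as np

np.random.seed(0)
X = np.random.rand(100, 3)                 # 100 samples, 3 input features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                             # targets generated by a known linear rule

w = np.zeros(3)                            # initial weights
lr = 0.1                                   # learning rate
for _ in range(2000):
    y_pred = X @ w                         # forward pass
    grad = X.T @ (y_pred - y) / len(X)     # gradient of the mean squared error w.r.t. w
    w -= lr * grad                         # learning rule: w <- w - lr * dL/dw
print(w)                                   # close to [2.0, -1.0, 0.5]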
TensorFlow Modules Overview
TensorFlow provides several modules for neural network operations:
- tf.nn: Low-level neural network operations including convolutions, pooling, normalization, and loss functions.
- tf.layers: High-level abstractions for building networks, particularly useful for convolutional layers.
- tf.contrib: Experimental features and advanced operations, though less stable than core modules.
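As a rough illustration of the difference in abstraction level (TensorFlow 1.x API; the tensor names here are only examples), the same convolution can be written with either module. With tf.nn you create and manage the filter variables yourself; tf.layers creates them for you:
import tensorflow as tf

images = tf.placeholder(tf.float32, [None, 28, 28, 1])

# tf.nn: explicit filter variable and low-level op
filters = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
conv_low = tf.nn.relu(
    tf.nn.conv2d(images, filters, strides=[1, 1, 1, 1], padding='SAME'))

# tf.layers: the layer creates and tracks its own variables
conv_high = tf.layers.conv2d(
    images, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu)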
Shallow Neural Network for MNIST
The MNIST dataset consists of 28x28 grayscale images of handwritten digits (60,000 training and 10,000 test examples). We'll implement a simple classifier for it using softmax regression.
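Softmax regression maps the 10 raw class scores (logits) for an image to a probability distribution over the digit classes. A small NumPy sketch of what tf.nn.softmax computes:
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))    # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))           # approx. [0.659, 0.242, 0.099]; the entries sum to 1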
Data Preparation
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load MNIST data with one-hot encoding
mnist = input_data.read_data_sets('data/mnist', one_hot=True)
Model Construction
# Define placeholders for input data
X = tf.placeholder(tf.float32, [None, 784]) # 28x28 images flattened
y_true = tf.placeholder(tf.float32, [None, 10]) # 10 classes
# Initialize weights and biases
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# Compute logits and apply softmax
logits = tf.matmul(X, W) + b
y_pred = tf.nn.softmax(logits)
# Calculate cross-entropy loss
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=logits))
# Use gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
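A note on the loss: tf.nn.softmax_cross_entropy_with_logits takes the raw logits rather than y_pred, because it applies the softmax and the cross-entropy H(y, y_hat) = -sum_i y_i * log(y_hat_i) in one numerically stable step. A naive equivalent, shown only for illustration, would be:
# Illustrative only -- prefer the fused op above for numerical stability
naive_cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_true * tf.log(y_pred), axis=1))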
Training and Evaluation
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        batch_X, batch_y = mnist.train.next_batch(100)
        sess.run(optimizer, feed_dict={X: batch_X, y_true: batch_y})
    test_accuracy = sess.run(accuracy,
                             feed_dict={X: mnist.test.images, y_true: mnist.test.labels})
    print(f"Test accuracy: {test_accuracy:.4f}")
Convolutional Neural Network for MNIST
CNNs excel at image recognition by leveraging spatial hierarchies. They use convolutional layers to detect features and pooling layers to reduce dimensionality.
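One way to see why the fully connected layer in the model below expects 7 * 7 * 64 inputs: with 'SAME' padding the 5x5 convolutions preserve the 28x28 spatial size, each 2x2 max pool with stride 2 halves it (28 -> 14 -> 7), and the second convolution raises the channel count to 64. A quick sanity check:
size = 28                      # MNIST images are 28x28
for _ in range(2):             # two conv + pool stages
    size //= 2                 # 'SAME' convolution keeps the size; 2x2/stride-2 pooling halves it
flat_features = size * size * 64
print(flat_features)           # 3136 == 7 * 7 * 64, the fan-in of the first fully connected layer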
Model Architecture
def create_cnn_model():
    # Input layer
    X = tf.placeholder(tf.float32, [None, 784])
    y_true = tf.placeholder(tf.float32, [None, 10])
    X_image = tf.reshape(X, [-1, 28, 28, 1])

    # Convolutional layer 1
    W_conv1 = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
    b_conv1 = tf.Variable(tf.constant(0.1, shape=[32]))
    h_conv1 = tf.nn.relu(tf.nn.conv2d(X_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
    h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # Convolutional layer 2
    W_conv2 = tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1))
    b_conv2 = tf.Variable(tf.constant(0.1, shape=[64]))
    h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
    h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # Fully connected layer
    W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024], stddev=0.1))
    b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # Output layer
    W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
    b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
    logits = tf.matmul(h_fc1, W_fc2) + b_fc2
    y_pred = tf.nn.softmax(logits)

    return X, y_true, logits, y_pred
# Training and evaluation follow the same pattern as the shallow network above; a sketch is given below.
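For reference, a minimal training loop for this CNN could look like the sketch below (the AdamOptimizer, learning rate, step count, and batch size are illustrative choices, not prescribed above; mnist is the dataset object loaded earlier):
X, y_true, logits, y_pred = create_cnn_model()

# Loss, optimizer, and accuracy, mirroring the shallow network
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=logits))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(2000):
        batch_X, batch_y = mnist.train.next_batch(50)
        sess.run(train_step, feed_dict={X: batch_X, y_true: batch_y})
    test_accuracy = sess.run(accuracy,
                             feed_dict={X: mnist.test.images, y_true: mnist.test.labels})
    print(f"Test accuracy: {test_accuracy:.4f}")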