Fading Coder


TensorFlow Data Pipelines and Neural Network Implementation: From CSV to CNN

Tech · May 11

Data Loading and Neural Network Foundations

Approaches to Feeding Data into TensorFlow

There are three primary methods to supply data to a TensorFlow program:

  1. QueueRunner pipeline: Queue-based input pipelines built into the graph read data from files.
  2. Feeding: Python code provides data at each step during runtime.
  3. Preloading: All data is stored as constants or variables in the graph (suitable for small datasets).

File Reading Pipeline

The pipeline consists of three stages:

  1. Build a filename queue
  2. Read and decode
  3. Batch processing

Note: These operations require starting threads that manage the queue operations to ensure smooth enqueue and dequeue during reading.
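The producer/consumer mechanics behind these queues can be illustrated with Python's standard library alone: one thread enqueues filenames while another dequeues and "reads" them, which is conceptually what string_input_producer and the readers do. This is a stdlib analogy, not TensorFlow code; the function name and sentinel convention are made up for the sketch:

```python
import queue
import threading

def demo_filename_queue(filenames):
    """Producer/consumer sketch of a filename queue (stdlib analogy)."""
    q = queue.Queue(maxsize=4)           # bounded, like the capacity argument
    results = []

    def producer():
        for name in filenames:
            q.put(name)                  # enqueue; blocks when the queue is full
        q.put(None)                      # sentinel: no more files

    def consumer():
        while True:
            name = q.get()               # dequeue; blocks when the queue is empty
            if name is None:
                break
            results.append('read:' + name)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # analogous to coord.join(threads)
    return results

print(demo_filename_queue(['a.jpg', 'b.jpg']))  # ['read:a.jpg', 'read:b.jpg']
```

The bounded queue is the key idea: enqueue and dequeue block each other, so reading threads stay busy without unbounded buffering.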

1. Building the Filename Queue
# Construct a queue of filenames
filename_queue = tf.train.string_input_producer(
    string_tensor,  # 1-D tensor of filenames with paths
    shuffle=True    # Randomize order
)

2. Reading and Decoding

Read file contents from the queue and decode them. Each reader typically extracts one sample at a time:

  • Text files: one line per sample (tf.TextLineReader).
  • Image files: one image per sample (tf.WholeFileReader).
  • Binary files: fixed number of bytes per sample (tf.FixedLengthRecordReader).
  • TFRecords: one Example protocol buffer per sample (tf.TFRecordReader).

The common read(file_queue) method returns a tuple (key, value), where key identifies the file and record and value is the raw content of one sample.

Decoding converts raw bytes into tensors:

# Decode text (CSV); record_defaults sets per-column types and fallback values
tf.decode_csv(records, record_defaults)   # -> one tensor per CSV column

# Decode JPEG/PNG images
tf.image.decode_jpeg(contents)   # -> uint8 tensor [height, width, channels]
tf.image.decode_png(contents)    # -> uint8 tensor [height, width, channels]

# Decode raw binary bytes (used with FixedLengthRecordReader)
tf.decode_raw(bytes, out_type=tf.uint8)

Image and raw-byte decoding yields tf.uint8 tensors. Use tf.cast() to convert to tf.float32 later for computation.

3. Batching

After decoding, a single sample is available. To get multiple samples, push them into a new queue for batching:

tf.train.batch(
    tensors,        # list of tensors to batch
    batch_size,     # number of samples per batch
    num_threads=1,  # number of enqueue threads
    capacity=32     # max number of elements in the queue
)

# For shuffled batching:
tf.train.shuffle_batch(...)

Thread Management

The pipeline's queues are managed by tf.train.QueueRunner objects. To start the threads that run the queue operations, use:

coord = tf.train.Coordinator()                    # coordinator for threads
threads = tf.train.start_queue_runners(sess=session, coord=coord)

# Stop gracefully
coord.request_stop()
coord.join(threads)

Image Data Fundamentals

Images are represented as tensors with shape [height, width, channels]. Grayscale images have one channel (single value per pixel). Color images have three channels (RGB). For batches, the shape becomes [batch, height, width, channels].

To standardize image sizes for modeling, use:

tf.image.resize_images(images, size)  # size = [new_height, new_width]

Storage uses uint8 to save space; computation uses float32 for precision.
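The storage difference is easy to quantify with a back-of-the-envelope calculation (plain Python; the batch and image sizes below are just illustrative):

```python
def image_bytes(batch, height, width, channels, bytes_per_value):
    """Memory footprint of an image batch, in bytes."""
    return batch * height * width * channels * bytes_per_value

uint8_size = image_bytes(100, 200, 200, 3, 1)   # stored as uint8
float_size = image_bytes(100, 200, 200, 3, 4)   # computed as float32

print(uint8_size)   # 12000000  (~11.4 MiB)
print(float_size)   # 48000000  (~45.8 MiB)
```

A 4x difference per batch, which is why the cast to float32 is deferred until the data is in the graph.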

Example: Loading Dog Images
import tensorflow as tf
import os

class ImageLoader:
    def __init__(self):
        self.files = os.listdir('./dog')
        self.paths = [os.path.join('./dog/', f) for f in self.files]

    def load_images(self):
        # 1. Filename queue
        queue = tf.train.string_input_producer(self.paths)

        # 2. Read and decode
        reader = tf.WholeFileReader()
        _, raw = reader.read(queue)          # key ignored
        image = tf.image.decode_jpeg(raw, channels=3)   # force 3 channels so set_shape holds

        # Resize to uniform shape [200, 200, 3]
        resized = tf.image.resize_images(image, [200, 200])
        resized.set_shape([200, 200, 3])

        # 3. Batch
        batch = tf.train.batch([resized], batch_size=100, capacity=100)

        with tf.Session() as sess:
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)

            images = sess.run(batch)
            print('Batch shape:', images.shape)

            coord.request_stop()
            coord.join(threads)

if __name__ == '__main__':
    loader = ImageLoader()
    loader.load_images()

Binary Data: CIFAR-10 Example

The CIFAR-10 dataset consists of 60,000 32×32 color images in 10 classes. Each binary file contains 10,000 samples, each composed of 1 byte (label) + 3072 bytes (pixels: 1024 red, then 1024 green, then 1024 blue, row‑major).
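The record layout can be checked in plain Python by parsing a synthetic 3073-byte record; this sketch is independent of TensorFlow, and the function name is made up for illustration:

```python
def parse_cifar_record(record):
    """Split a 3073-byte CIFAR-10 record into (label, R, G, B planes)."""
    assert len(record) == 1 + 3072
    label = record[0]                      # first byte is the class label
    pixels = record[1:]                    # 1024 red, 1024 green, 1024 blue
    r, g, b = pixels[:1024], pixels[1024:2048], pixels[2048:]
    return label, r, g, b

# Synthetic record: label 7, red plane all 1s, green all 2s, blue all 3s
record = bytes([7]) + bytes([1] * 1024) + bytes([2] * 1024) + bytes([3] * 1024)
label, r, g, b = parse_cifar_record(record)
print(label, r[0], g[0], b[0])   # 7 1 2 3
```

This channel-major layout is why the TensorFlow code below reshapes to [channels, height, width] first and then transposes to [height, width, channels].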

Pipeline Implementation
import tensorflow as tf
import os

class CifarDataset:
    def __init__(self):
        self.height = 32
        self.width = 32
        self.channels = 3
        self.image_bytes = self.height * self.width * self.channels
        self.label_bytes = 1
        self.record_bytes = self.label_bytes + self.image_bytes

    def read_binary(self, file_list):
        # 1. Filename queue
        queue = tf.train.string_input_producer(file_list)

        # 2. Read fixed-length records
        reader = tf.FixedLengthRecordReader(self.record_bytes)
        _, record = reader.read(queue)          # key ignored

        # Decode raw bytes
        decoded = tf.decode_raw(record, tf.uint8)

        # Split label and image
        label = tf.slice(decoded, [0], [self.label_bytes])
        pixels = tf.slice(decoded, [self.label_bytes], [self.image_bytes])

        # Reshape to [channels, height, width] then transpose to [h, w, c]
        img_reshaped = tf.reshape(pixels, [self.channels, self.height, self.width])
        img = tf.transpose(img_reshaped, [1, 2, 0])

        # Convert to float
        img_float = tf.cast(img, tf.float32)

        # 3. Batch
        label_batch, image_batch = tf.train.batch(
            [label, img_float], batch_size=100, capacity=100)

        with tf.Session() as sess:
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)

            lbl, imgs = sess.run([label_batch, image_batch])
            print('Labels shape:', lbl.shape)
            print('Images shape:', imgs.shape)

            coord.request_stop()
            coord.join(threads)

if __name__ == '__main__':
    data_dir = './cifar-10-batches-bin'
    files = [os.path.join(data_dir, f) for f in os.listdir(data_dir) if f.endswith('.bin')]
    cifar = CifarDataset()
    cifar.read_binary(files)

TFRecords Format

TFRecords is a binary format that stores data as tf.train.Example protocol buffers. Features and labels live in the same record, so no separate label files are needed, and the format supports efficient sequential reads.

Writing TFRecords
# Example of writing CIFAR-10 data to TFRecords
# (image_batch and label_batch are NumPy arrays already evaluated with sess.run)
with tf.python_io.TFRecordWriter('cifar10.tfrecords') as writer:
    for i in range(100):
        image_bytes = image_batch[i].tostring()
        label_val = int(label_batch[i][0])

        example = tf.train.Example(features=tf.train.Features(feature={
            'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label_val])),
            'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes]))
        }))
        writer.write(example.SerializeToString())

Reading TFRecords
def read_from_tfrecord(self):   # method of CifarDataset, reusing self.height etc.
    # 1. Filename queue
    queue = tf.train.string_input_producer(['cifar10.tfrecords'])

    # 2. Read and parse Example
    reader = tf.TFRecordReader()
    _, serialized = reader.read(queue)
    features = tf.parse_single_example(serialized, features={
        'label': tf.FixedLenFeature([], tf.int64),
        'image': tf.FixedLenFeature([], tf.string)
    })

    image = tf.decode_raw(features['image'], tf.uint8)
    image = tf.reshape(image, [self.height, self.width, self.channels])

    # 3. Batch
    label_batch, image_batch = tf.train.batch(
        [features['label'], image], batch_size=100, capacity=100)

    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        lbl, imgs = sess.run([label_batch, image_batch])
        coord.request_stop()
        coord.join(threads)

Neural Network Basics

An artificial neural network (ANN) mimics biological neural structures. It typically consists of an input layer, one or more hidden layers, and an output layer. Each connection has a weight, and each neuron (except input) applies an activation function. The output layer is often a fully connected layer.

Perceptron (PLA)

The perceptron is the simplest neural unit: it computes a weighted sum of inputs plus bias and passes the result through a step function (sign). It can only solve linearly separable problems.
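A minimal perceptron fits in a few lines of plain Python. The weights below are hand-picked to realize the (linearly separable) AND function, not learned:

```python
def perceptron(inputs, weights, bias):
    """Weighted sum plus bias, passed through a step function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# Hand-picked weights implement logical AND
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron([x1, x2], [1.0, 1.0], -1.5))
# Only (1, 1) fires: the line x1 + x2 = 1.5 separates the two classes
```

XOR, by contrast, has no such separating line, which is the classic limitation that hidden layers and non-linear activations overcome.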

Softmax Regression and Cross-Entropy Loss

For multi‑class classification, the network often uses a softmax output layer to convert logits into probabilities. The loss is the cross‑entropy between the true labels (one‑hot) and the predicted probabilities:

loss_per_sample = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=logits)
mean_loss = tf.reduce_mean(loss_per_sample)
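What that op computes can be reproduced in plain Python for one sample (a numerically naive sketch; the TensorFlow op uses a stabler fused formulation internally):

```python
import math

def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs):
    """-sum(y * log(p)) over classes."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs))

probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy([1, 0, 0], probs)   # true class is index 0
print(round(loss, 3))   # ~0.417: the model is fairly confident and correct
```

The loss shrinks toward 0 as the probability assigned to the true class approaches 1, and grows without bound as it approaches 0.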

MNIST Handwritten Digit Recognition

The MNIST dataset contains 28×28 grayscale images of digits 0‑9 (60,000 training, 10,000 test). Images are flattened into 784‑dimensional vectors. Labels are integers 0‑9, typically one‑hot encoded for a softmax output.

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print('Train shape:', x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
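One-hot encoding in plain Python, for reference (a sketch; in practice tf.one_hot does this inside the graph):

```python
def one_hot(label, num_classes=10):
    """Integer label -> one-hot vector of length num_classes."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec

print(one_hot(3))   # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```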

A simple linear model (softmax regression) can achieve about 92% accuracy. However, linear models cannot handle non‑linear patterns without feature engineering.

Introduction to Convolutional Neural Networks

Convolutional Neural Networks (CNNs) extend traditional MLPs by adding convolutional and pooling layers before the fully connected layers. This enables effective feature extraction from grid‑like data (e.g., images).

Why CNNs?

Traditional fully connected networks ignore spatial structure and have too many parameters for large images. CNNs use local connectivity, weight sharing, and pooling to reduce parameters and capture hierarchical features.

Convolutional Layer

A convolutional layer applies multiple learnable filters (kernels) to the input. Each filter slides across the input with a given stride and optionally zero‑padding to produce a feature map. The output size is determined by:

output_size = (input_size - filter_size + 2 * padding) / stride + 1
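The formula as a small helper, with the division made explicit (plain Python; assumes square inputs and filters, and that the stride divides evenly):

```python
def conv_output_size(input_size, filter_size, padding, stride):
    """(W - F + 2P) / S + 1, for one spatial dimension."""
    return (input_size - filter_size + 2 * padding) // stride + 1

# A 5x5 filter with padding 2 and stride 1 preserves a 32x32 input
print(conv_output_size(32, 5, 2, 1))   # 32
# Without padding the feature map shrinks
print(conv_output_size(32, 5, 0, 1))   # 28
```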

Filters at early layers detect edges and corners; deeper layers combine them into higher‑level concepts.

Pooling Layer

Pooling (e.g., max pooling or average pooling) reduces the spatial dimensions, lowering computational load and providing translation invariance. Common pooling size is 2×2 with stride 2.
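A 2×2 max pool with stride 2, halving each spatial dimension, can be sketched in plain Python on a 2-D list (even dimensions assumed; real implementations operate on 4-D batched tensors):

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list of numbers."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[i][j], feature_map[i][j + 1],
                 feature_map[i + 1][j], feature_map[i + 1][j + 1])
             for j in range(0, cols, 2)]
            for i in range(0, rows, 2)]

fm = [[1, 3, 2, 1],
      [4, 6, 5, 0],
      [7, 2, 9, 8],
      [1, 0, 3, 4]]
print(max_pool_2x2(fm))   # [[6, 5], [7, 9]]
```

Each output value keeps only the strongest activation in its 2×2 window, which is what makes the result robust to small translations.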

Typical CNN Architecture

Input → [Conv + ReLU] → Pooling → [Conv + ReLU] → Pooling → Flatten → Fully Connected → Softmax

This structure has been the foundation for breakthroughs in image classification (e.g., AlexNet, VGG, ResNet).

Tags: tensorflow
