How can TensorFlow and Python be used to download and prepare the CIFAR dataset?


The CIFAR-10 dataset can be downloaded using the ‘load_data’ method of ‘cifar10’, which is present in the Keras ‘datasets’ module. The method downloads the data and returns it already split into a training set and a test set.


We will use the Keras Sequential API, which builds a model as a plain stack of layers, where every layer has exactly one input tensor and one output tensor.

A neural network that contains at least one convolutional layer is known as a convolutional neural network (CNN). A convolutional neural network generally consists of some combination of the following layers:

  • Convolutional layers
  • Pooling layers
  • Dense layers

Convolutional neural networks have produced strong results on specific kinds of problems, such as image recognition.
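The layer types listed above can be combined with the Sequential API roughly as follows. This is only a minimal sketch: the filter count, kernel size and the single Conv/Pool pair are illustrative assumptions, not the architecture used in the referenced tutorial. The input shape (32, 32, 3) matches one CIFAR-10 image and the 10 output units match its 10 classes.

```python
from tensorflow.keras import layers, models

# Assumed, illustrative layer sizes; only the input shape and the
# number of output classes come from the CIFAR-10 dataset itself.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),               # one 32x32 RGB image
    layers.Conv2D(32, (3, 3), activation='relu'),  # convolutional layer
    layers.MaxPooling2D((2, 2)),                   # pooling layer
    layers.Flatten(),                              # 3D feature maps -> 1D vector
    layers.Dense(10),                              # dense layer, one output per class
])

model.summary()
```

Every layer here takes one tensor in and gives one tensor out, which is the case the Sequential API is meant for.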

We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Download CIFAR-10 and unpack it into training and test splits
print("The CIFAR dataset is being downloaded")
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Scale the pixel values from the range 0-255 to the range 0-1
print("The pixel values are normalized to be between 0 and 1")
train_images, test_images = train_images / 255.0, test_images / 255.0

# Class names in label order (label 0 is 'airplane', label 9 is 'truck')
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

Code credit: https://www.tensorflow.org/tutorials/images/cnn

Output

The CIFAR dataset is being downloaded
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 11s 0us/step
The pixel values are normalized to be between 0 and 1

Explanation

  • The CIFAR10 dataset contains 60,000 color images in 10 classes, with 6,000 images in each class.
  • This dataset is divided into 50,000 training images and 10,000 testing images.
  • The classes are mutually exclusive and there is no overlap between them.
  • The dataset is downloaded, and the pixel values are normalized to fall between 0 and 1 by dividing them by 255.
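The normalization step can be checked without downloading anything. The sketch below is an assumption-laden stand-in: ‘fake_images’ and ‘fake_labels’ are hypothetical random arrays that only mimic the layout of the real data (uint8 images of shape 32x32x3, labels of shape (N, 1)). It shows the effect of dividing by 255.0 and how a label is mapped to its class name.

```python
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Hypothetical stand-in data, NOT real CIFAR-10 images or labels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
fake_labels = np.array([[6], [9], [0], [3]])  # shape (N, 1), like the real labels

# Dividing uint8 pixels by 255.0 yields floats in the range 0 to 1
scaled = fake_images / 255.0

# Each label is a length-1 array, so index into it before the lookup
names = [class_names[int(label[0])] for label in fake_labels]

print(scaled.dtype, scaled.min(), scaled.max())
print(names)
```

The same two operations apply unchanged to the ‘train_images’ and ‘train_labels’ arrays returned by ‘load_data’.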

Updated on: 20-Feb-2021
