How can TensorFlow be used to load the flower dataset and work with it?
TensorFlow provides built-in utilities to work with image datasets. The flower dataset contains thousands of flower images organized into 5 classes: daisy, dandelion, roses, sunflowers, and tulips. This dataset is perfect for demonstrating image classification tasks.
Loading the Flower Dataset
First, we need to download and load the flower dataset. TensorFlow's get_file utility downloads and extracts the archive, and image_dataset_from_directory builds a tf.data.Dataset from the resulting directory structure.
import tensorflow as tf
import pathlib
# Download the flower dataset
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)
print("Dataset downloaded and extracted successfully")
print(f"Dataset location: {data_dir}")
Dataset downloaded and extracted successfully
Dataset location: /root/.keras/datasets/flower_photos
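The loader relies entirely on this directory layout: each subdirectory of flower_photos becomes one class, and class names are the subdirectory names in alphabetical order. The following sketch simulates that layout with an empty temporary directory (the folder and file names here are placeholders, not the real downloaded data) to show where the class labels come from.

```python
import pathlib
import tempfile

# Sketch: simulate the extracted flower_photos directory layout.
# image_dataset_from_directory infers one class per subdirectory,
# so the five flower folders below become the five class labels.
root = pathlib.Path(tempfile.mkdtemp()) / "flower_photos"
for flower in ["tulips", "daisy", "roses", "sunflowers", "dandelion"]:
    (root / flower).mkdir(parents=True)
    (root / flower / "example.jpg").touch()  # empty placeholder file

# Class names are the subdirectory names in alphabetical order,
# matching what train_ds.class_names reports later.
class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
print(class_names)
# ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

# Counting files per class mirrors how the loader tallies images.
counts = {name: len(list((root / name).glob("*.jpg"))) for name in class_names}
print(counts)
```

This is why the class names printed later appear alphabetically sorted rather than in download order.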
Setting Up Dataset Parameters
We define the parameters for loading and preprocessing the images. These parameters control the batch size, image dimensions, and data splitting:
import tensorflow as tf
import pathlib
# Download dataset
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)
print("Loading parameters for the loader")
batch_size = 32
img_height = 180
img_width = 180
print("Preprocessing the image dataset using Keras")
print("Splitting dataset into training and validation set")
train_ds = tf.keras.utils.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
val_ds = tf.keras.utils.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
print("Printing the class names present in sub-directories")
class_names = train_ds.class_names
print(class_names)
Loading parameters for the loader
Preprocessing the image dataset using Keras
Splitting dataset into training and validation set
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
Found 3670 files belonging to 5 classes.
Using 734 files for validation.
Printing the class names present in sub-directories
['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
Understanding the Dataset Structure
The validation_split argument reserves 20% of the images for validation and uses the remaining 80% for training. Each class represents a different type of flower, making the dataset well suited to multi-class classification tasks.
Key Parameters
- batch_size: Number of images processed together (32 images per batch)
- img_height, img_width: Standardized image dimensions (180x180 pixels)
- validation_split: Percentage of data reserved for validation (20%)
- seed: Random seed for reproducible results
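As a back-of-envelope check, the training and validation counts printed above follow directly from these parameters. The sketch below reproduces the arithmetic in plain Python (assuming Keras takes the floor of validation_split × total when sizing the validation set, which matches the 734 reported here).

```python
import math

# Reproduce the split and batch counts reported by
# image_dataset_from_directory for the 3,670-image flower dataset.
total_images = 3670
validation_split = 0.2
batch_size = 32

val_count = int(total_images * validation_split)  # floor of 20% of 3670
train_count = total_images - val_count
print(train_count, val_count)  # 2936 734

# Each epoch yields ceil(count / batch_size) batches;
# the final batch may hold fewer than 32 images.
train_batches = math.ceil(train_count / batch_size)
val_batches = math.ceil(val_count / batch_size)
print(train_batches, val_batches)  # 92 23
```

The same seed must be passed to both the training and validation calls so the two subsets are complementary rather than overlapping.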
Dataset Summary
| Component | Count | Purpose |
|---|---|---|
| Total Images | 3,670 | Complete dataset |
| Training Images | 2,936 | Model training |
| Validation Images | 734 | Model evaluation |
| Classes | 5 | Flower categories |
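Each element of the resulting datasets is a pair of tensors: a batch of images and a batch of integer labels. The sketch below (no TensorFlow required) works out the shapes those tensors should have given the parameters above, assuming images are decoded as 3-channel RGB and stored as float32, which is the loader's default behavior.

```python
# Expected per-batch tensor shapes for the loaded flower dataset,
# derived from the loader parameters defined earlier.
batch_size, img_height, img_width, channels = 32, 180, 180, 3

image_batch_shape = (batch_size, img_height, img_width, channels)
label_batch_shape = (batch_size,)
print(image_batch_shape)  # (32, 180, 180, 3)
print(label_batch_shape)  # (32,)

# Rough memory footprint of one float32 image batch: 4 bytes per value.
batch_bytes = batch_size * img_height * img_width * channels * 4
print(f"{batch_bytes / 1e6:.1f} MB per batch")
```

These shapes are what a model's input layer must match, e.g. an input shape of (180, 180, 3) for the examples in this article.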
Conclusion
TensorFlow's image_dataset_from_directory function simplifies loading and preprocessing image datasets. The flower dataset provides a well-structured foundation for image classification projects with 5 distinct flower classes and automatic train-validation splitting.
