NumPy - File Formats Supported
File Formats in NumPy
NumPy can save and load arrays in several file formats. These formats let you store data in a form that is easy to share, read, and process.
When working with large datasets, it is important to choose the appropriate file format for storing your NumPy arrays. The most common formats are:
- .npy: Used for saving a single array with its metadata.
- .npz: A zip archive format used for saving multiple arrays in a single file, optionally compressed.
- Text Files: For human-readable data storage, including CSV and custom text formats.
The .npy Format
The .npy format is the native binary format used by NumPy to store arrays. This format preserves the array's data type and shape information.
Using numpy.save() Function
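The numpy.save() function saves a single NumPy array in the .npy format. Because the file stores the data type and shape alongside the raw data, numpy.load() can restore the array exactly as it was saved. If the file name does not already end in .npy, the extension is appended automatically.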
Example
In this example, we will create a NumPy array and save it to a .npy file −
import numpy as np
# Create a NumPy array
arr = np.array([[1, 2, 3], [4, 5, 6]])
# Save the array to a .npy file
np.save('array_data.npy', arr)
# Load the saved array to verify
loaded_array = np.load('array_data.npy')
print("Loaded Array:\n", loaded_array)
After executing the above code, the output will be −
Loaded Array:
 [[1 2 3]
 [4 5 6]]
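The claim that .npy preserves the data type and shape can be checked directly. The following is a minimal sketch, assuming a temporary directory and a hypothetical file name (typed_array.npy), neither of which is part of the original example −

```python
import os
import tempfile

import numpy as np

# A 2x3 array with a non-default dtype, so the round trip is easy to see
original = np.arange(6, dtype=np.float32).reshape(2, 3)

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "typed_array.npy")
np.save(path, original)

restored = np.load(path)

print(restored.dtype)                      # float32
print(restored.shape)                      # (2, 3)
print(np.array_equal(original, restored))  # True
```

No dtype or shape argument is passed to numpy.load(); both are read from the file header.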
The .npz Format
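The .npz format is a zip archive that stores multiple arrays in one file, with each array kept as a separate .npy entry. It is helpful when working with several arrays, as it allows you to save them together in a single file. The archive is compressed only when it is written with numpy.savez_compressed().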
Using numpy.savez() Function
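The numpy.savez() function saves multiple arrays in a single, uncompressed .npz file. Arrays passed as keyword arguments are stored under those names, while positional arrays are named arr_0, arr_1, and so on. To compress the data within the archive, use the numpy.savez_compressed() function instead.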
Example
In the following example, we will save two arrays into a single .npz file and load them back −
import numpy as np
# Create two arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([[4, 5, 6], [7, 8, 9]])
# Save arrays into a .npz file
np.savez('arrays_data.npz', array1=arr1, array2=arr2)
# Load the arrays from the .npz file
loaded_data = np.load('arrays_data.npz')
print("Loaded array1:\n", loaded_data['array1'])
print("Loaded array2:\n", loaded_data['array2'])
The output will be −
Loaded array1:
 [1 2 3]
Loaded array2:
 [[4 5 6]
 [7 8 9]]
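To see the difference between numpy.savez() and numpy.savez_compressed(), you can compare the sizes of the files they produce. This is a rough sketch, assuming hypothetical file names in a temporary directory and deliberately repetitive data (an array of zeros), which compresses far better than typical real data −

```python
import os
import tempfile

import numpy as np

# Highly repetitive data compresses well; real data may shrink much less
zeros = np.zeros((1000, 1000))
ramp = np.arange(1000)

# Hypothetical file names inside a temporary directory
tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.npz")
packed_path = os.path.join(tmp_dir, "packed.npz")

np.savez(plain_path, zeros=zeros, ramp=ramp)
np.savez_compressed(packed_path, zeros=zeros, ramp=ramp)

plain_size = os.path.getsize(plain_path)
packed_size = os.path.getsize(packed_path)
print("savez:", plain_size, "bytes")
print("savez_compressed:", packed_size, "bytes")

# The object returned by np.load() for an .npz file can be used as a
# context manager, which closes the underlying file when the block ends
with np.load(packed_path) as archive:
    names = sorted(archive.files)
    restored_ramp = archive["ramp"]

print(names)  # ['ramp', 'zeros']
```

Both files are read back the same way with numpy.load(); compression trades some CPU time when saving and loading for a smaller file on disk.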
Text File Formats
Text file formats such as CSV (Comma Separated Values) are popular for storing tabular data. NumPy provides methods for saving and loading arrays to and from text files.
Using numpy.savetxt() Function
The numpy.savetxt() function is used to save arrays to text files. It allows you to specify a delimiter, format, and header.
Example
In this example, we will save a NumPy array to a text file using the numpy.savetxt() function −
import numpy as np
# Create a NumPy array
arr = np.array([[1, 2, 3], [4, 5, 6]])
# Save the array to a text file
np.savetxt('array_data.txt', arr)
# Load the array from the text file to verify
loaded_array = np.loadtxt('array_data.txt')
print("Loaded Array:\n", loaded_array)
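After executing the code, the output will be as follows. The values come back as floats because numpy.savetxt() writes floating-point text by default and numpy.loadtxt() reads into a float array by default −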
Loaded Array:
 [[1. 2. 3.]
 [4. 5. 6.]]
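The delimiter, format, and header options mentioned above can be combined to write a CSV file. The sketch below assumes a hypothetical file name (array_data.csv) in a temporary directory and made-up column names (a, b, c) −

```python
import os
import tempfile

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Hypothetical file name inside a temporary directory
csv_path = os.path.join(tempfile.mkdtemp(), "array_data.csv")

# delimiter="," separates the columns, fmt="%d" writes integers, and the
# header line is prefixed with "# " by default
np.savetxt(csv_path, arr, delimiter=",", fmt="%d", header="a,b,c")

with open(csv_path) as f:
    text = f.read()
print(text)

# loadtxt skips lines that start with "#" by default
loaded = np.loadtxt(csv_path, delimiter=",", dtype=int)
print(loaded)
```

Passing dtype=int to numpy.loadtxt() keeps the values as integers instead of the default floats.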
Using numpy.genfromtxt() Function
The numpy.genfromtxt() function reads text files such as CSV files into NumPy arrays. Unlike numpy.loadtxt(), it can also handle missing values.
Example
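In the example below, we will read the text file saved in the previous example into a NumPy array using the numpy.genfromtxt() function. The values in that file are separated by whitespace, so no delimiter is needed; for a comma-separated file, pass delimiter=',' −
import numpy as np
# Read the whitespace-delimited text file saved earlier
data_from_file = np.genfromtxt('array_data.txt')
# Print the loaded data
print("Data loaded from file:\n", data_from_file)
The output obtained is as follows −
Data loaded from file:
 [[1. 2. 3.]
 [4. 5. 6.]]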
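For an actual comma-separated file, numpy.genfromtxt() needs a delimiter, and its handling of missing values becomes useful. The following sketch uses an in-memory string as a stand-in for a file; the CSV content, including the header row and the empty field, is invented for illustration −

```python
import io

import numpy as np

# Hypothetical CSV content with a header row and one empty field
csv_text = "x,y,z\n1,2,3\n4,,6\n"

# skip_header=1 ignores the column names; the empty field is read as nan
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(data.shape)            # (2, 3)
print(np.isnan(data[1, 1]))  # True
```

The same call works with a file path in place of the io.StringIO object.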
Custom Formats
In some situations, you may need to write or read data in a custom binary format. NumPy provides the numpy.ndarray.tofile() method and the numpy.fromfile() function for handling raw binary data.
Using numpy.ndarray.tofile() Function
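The numpy.ndarray.tofile() function is used to write a NumPy array to a file in a raw format. The file contains only the array's elements; the data type, shape, and byte order are not stored. With the default sep='', the data is written as binary, while a non-empty sep writes the elements as text, formatted according to the format argument.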
Example
In the following example, we will write a NumPy array to a custom binary file using the tofile() function −
import numpy as np
# Create a NumPy array
arr = np.array([1, 2, 3, 4, 5], dtype='int32')
# Write the array to a binary file
arr.tofile('binary_data.dat')
print("Array written to binary file:", arr)
The result produced is as shown below −
Array written to binary file: [1 2 3 4 5]
Using numpy.fromfile() Function
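The numpy.fromfile() function is used to read data from a raw binary file into a NumPy array. Because the file carries no metadata, you must pass the same dtype that was used when writing, and the result is always a one-dimensional array.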
Example
In this example, we will read the custom binary file and load it into a NumPy array using the fromfile() function −
import numpy as np
# Read the binary data from the file
data_from_binary = np.fromfile('binary_data.dat', dtype='int32')
# Print the data loaded from the binary file
print("Array read from binary file:", data_from_binary)
We get the output as shown below −
Array read from binary file: [1 2 3 4 5]
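Since a raw binary file stores no shape or data type, a multi-dimensional array has to be reshaped after reading, and a mismatched dtype silently produces the wrong values. Here is a minimal sketch, assuming a hypothetical file name (matrix.dat) in a temporary directory −

```python
import os
import tempfile

import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# Hypothetical file name inside a temporary directory
raw_path = os.path.join(tempfile.mkdtemp(), "matrix.dat")
matrix.tofile(raw_path)

# The file holds only the raw bytes: 6 values of 4 bytes each
print(os.path.getsize(raw_path))  # 24

# Reading returns a flat array; the original shape must be supplied by hand
flat = np.fromfile(raw_path, dtype=np.int32)
print(flat.shape)  # (6,)
restored = flat.reshape(2, 3)
print(np.array_equal(restored, matrix))  # True

# Reading with a different dtype raises no error but misinterprets the bytes
misread = np.fromfile(raw_path, dtype=np.int16)
print(misread.shape)  # (12,)
```

When the shape and data type need to travel with the data, the .npy format is usually the safer choice.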