The bz2 Module



bzip2 is an open-source algorithm for compressing and decompressing files. Python's bz2 module provides a programmatic interface to it.

The open() function is the primary interface to this module.

open() function

This function opens a bzip2-compressed file and returns a file object. The file can be opened in binary or text mode, for reading, writing or appending. When writing, the compresslevel argument, an integer from 1 to 9 (default 9), controls the level of compression.
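As a minimal sketch of the compresslevel argument, the following writes the same payload at levels 1 and 9 and records the size of each resulting file. The payload, the file names and the use of a temporary directory are illustrative assumptions, not part of the original example.

```python
import bz2
import os
import tempfile

# Illustrative payload: repetitive data, so it compresses well.
payload = b"Welcome to TutorialsPoint\n" * 10000

sizes = {}
with tempfile.TemporaryDirectory() as tmpdir:
    for level in (1, 9):
        path = os.path.join(tmpdir, f"level{level}.bz2")
        # compresslevel must be an integer from 1 to 9 (9 is the default).
        with bz2.open(path, "wb", compresslevel=level) as f:
            f.write(payload)
        sizes[level] = os.path.getsize(path)
        # Reading back returns the original bytes regardless of the level.
        with bz2.open(path, "rb") as f:
            assert f.read() == payload

print(sizes)  # compressed size in bytes for each level
```

The exact sizes depend on the data, so it is worth measuring on a representative sample before choosing a level.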

write() function

When the file is opened for writing (for example in 'w' or 'wb' mode, both of which are binary), this method is available on the file object. In binary mode, it compresses the given bytes and writes them to the file. In text mode ('wt'), the file object is wrapped in a TextIOWrapper object, which encodes the text before it is compressed.

read() function

When the file is opened in read mode, this method reads the file and returns the decompressed data: bytes in binary mode, str in text mode.
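The difference between text and binary mode can be sketched as a small round trip. The file name, the sample text and the temporary directory below are illustrative assumptions.

```python
import bz2
import os
import tempfile

text = "Welcome to TutorialsPoint\nbzip2 text mode\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "notes.txt.bz2")
    # "wt" selects text mode: the str is encoded to bytes, then compressed.
    with bz2.open(path, "wt", encoding="utf-8") as f:
        f.write(text)
    # "rt" decompresses and decodes back to str.
    with bz2.open(path, "rt", encoding="utf-8") as f:
        restored = f.read()
    # "rb" returns the decompressed bytes without decoding them.
    with bz2.open(path, "rb") as f:
        raw = f.read()

print(type(restored).__name__, type(raw).__name__)
```

Passing encoding explicitly avoids depending on the platform's default encoding.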

The following code writes compressed data to a bzip2 file.

import bz2

# Bytes written through the file object are compressed on their way to disk.
data = b'Welcome to TutorialsPoint'
with bz2.open("test.bz2", "wb") as f:
    f.write(data)

This creates the file test.bz2 in the current directory. An unzipping tool will typically show a single file named 'test' in it. To read the uncompressed data back from test.bz2, use the following code:

import bz2

# Reading through the file object returns the decompressed bytes.
with bz2.open("test.bz2", "rb") as f:
    data = f.read()
print(data)  # b'Welcome to TutorialsPoint'

The bz2 module also defines the BZ2File class. A BZ2File object acts as a compressor or a decompressor, depending on the mode parameter passed to the constructor.

BZ2File() constructor

This is the constructor. As with the open() function, the file name is required and the mode is optional, defaulting to read mode. The compression level defaults to 9 and can be any integer from 1 to 9.
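A minimal sketch of using BZ2File directly, once as a compressor and once as a decompressor. The file name, the sample lines, the temporary directory and the choice of level 5 are illustrative assumptions.

```python
import bz2
import os
import tempfile

lines = [b"first line\n", b"second line\n", b"third line\n"]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "lines.bz2")
    # Mode "w" makes the object act as a compressor.
    with bz2.BZ2File(path, "w", compresslevel=5) as f:
        for line in lines:
            f.write(line)
    # The default mode "r" makes it act as a decompressor.
    with bz2.BZ2File(path) as f:
        read_back = f.readlines()

print(read_back)
```

BZ2File always works with bytes; for text, bz2.open() in a text mode is the more convenient entry point.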

BZ2Compressor() class

Calling BZ2Compressor() returns an incremental compressor object. Each call to its compress() method returns a chunk of compressed data, which may be empty while input is still buffered. The chunks can be concatenated and finally written to a bzip2 file.

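flush() method

This method finishes the compression process and returns the compressed data left in the internal buffers, which must be appended to the chunks already returned by compress(). The compressor object cannot be used after flush() has been called.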
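The interplay between compress() and flush() can be sketched with a loop that records the length of every returned piece. The result is checked with the module-level bz2.decompress() function; the variable names and sample chunks are illustrative assumptions.

```python
import bz2

chunks = [b"Hello World", b"How are you?", b"welcome to Python"]

compressor = bz2.BZ2Compressor()
pieces = []
for chunk in chunks:
    # compress() may return b"" while the data is still buffered internally.
    pieces.append(compressor.compress(chunk))
# flush() finishes the stream; the compressor cannot be reused afterwards.
pieces.append(compressor.flush())

compressed = b"".join(pieces)
print([len(p) for p in pieces])  # length of each returned piece

# The joined pieces form one complete bzip2 stream.
assert bz2.decompress(compressed) == b"".join(chunks)
```

Omitting the flush() result would leave the stream truncated, so it should always be appended.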

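BZ2Decompressor() class

Calling BZ2Decompressor() returns an incremental decompressor object. Each call to its decompress() method returns a chunk of decompressed data, and the chunks concatenated together form the uncompressed data. Unlike the compressor, the decompressor has no flush() method.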
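Incremental decompression can be sketched by feeding a compressed stream to the decompressor in small slices, as might happen when data arrives over a network. The payload and the 16-byte slice size are illustrative assumptions; bz2.compress() is used only to produce the input.

```python
import bz2

original = b"Welcome to TutorialsPoint\n" * 1000
compressed = bz2.compress(original)

decompressor = bz2.BZ2Decompressor()
parts = []
chunk_size = 16  # illustrative slice size
for start in range(0, len(compressed), chunk_size):
    # Each call returns whatever output is available so far (possibly b"").
    parts.append(decompressor.decompress(compressed[start:start + chunk_size]))

restored = b"".join(parts)
# eof becomes True once the end-of-stream marker has been reached.
print(decompressor.eof, restored == original)
```

Checking the eof attribute is a simple way to detect a truncated stream.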

The following example compresses each item in a list incrementally and writes the concatenated bytes object to a file. The data is then retrieved with a BZ2Decompressor object.

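import bz2

data = [b'Hello World', b'How are you?', b'welcome to Python']
obj = bz2.BZ2Compressor()
# Compress the items one by one; flush() returns the remaining data.
d1 = obj.compress(data[0])
d2 = obj.compress(data[1])
d3 = obj.compress(data[2])
d4 = obj.flush()

compressedobj = d1 + d2 + d3 + d4
# The data is already compressed, so write it with the built-in open();
# bz2.open() would compress it a second time.
with open("test.bz2", "wb") as f:
    f.write(compressedobj)

To decompress, use the BZ2Decompressor class.

import bz2

obj = bz2.BZ2Decompressor()
# Read the raw compressed bytes with the built-in open().
with open("test.bz2", "rb") as f:
    data = f.read()
print(obj.decompress(data))  # b'Hello WorldHow are you?welcome to Python'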