Protocol Buffers - Introduction

Before we jump into Protocol Buffers, let us briefly review serialization, which is what Protocol Buffers do.

What is Serialization and Deserialization?

Serialization is the process of converting an in-memory object (in any language) into a sequence of bytes that can be stored or transmitted. The destination could be a file on disk, a messaging queue, or a database. The main purpose of serializing an object is to reuse its data and recreate the object on the same or a different machine. Deserialization is the reverse step: it converts the stored bytes back into an object.
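The round trip described above can be sketched in a few lines of Python. This is only an illustration: it uses JSON from the standard library as a stand-in byte format, and the Person class and the serialize/deserialize helper names are hypothetical. Protocol Buffers plays the same role with its own binary format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Person:
    # Hypothetical example type, not part of any Protocol Buffers API.
    name: str
    age: int


def serialize(person: Person) -> bytes:
    """Convert the object into bytes that can be stored or sent."""
    return json.dumps(asdict(person)).encode("utf-8")


def deserialize(data: bytes) -> Person:
    """Rebuild an equivalent object from the stored bytes."""
    return Person(**json.loads(data.decode("utf-8")))


original = Person(name="Alice", age=30)
payload = serialize(original)    # bytes, ready for a file, queue, or socket
restored = deserialize(payload)  # a new object with the same field values
print(payload)
print(restored == original)
```

The restored object is a separate instance with the same data, which is exactly what a receiving service on another machine would end up with.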

Why do we need Serialization and Deserialization?

While there are other use cases, the most basic and important one is transferring object data over a network to a different service or machine, where the object is recreated for further use. Sending object data through an API, a database, or a messaging queue requires converting the object into bytes first. This is where serialization becomes important.

In a microservice architecture, an application is broken down into small services that communicate with each other through messaging queues and APIs. All of this communication happens over a network, which requires frequent conversion of objects to bytes and back to objects. Serialization and deserialization therefore become critical in a distributed environment.

Why Google Protocol Buffers?

Google Protocol Buffers serialize objects into bytes that can be transferred over the network, and deserialize those bytes back into objects. However, other libraries and mechanisms can transfer data as well.

So, what makes Google Protocol Buffers special? Here are some of its important features −

Language independent − Protocol Buffers libraries exist for multiple languages, including Java, Python, and Go. So, an object serialized into bytes by a Java program can be deserialized into a Python object.
Efficient data compaction − In a microservice environment, where many exchanges take place over a network, the data being sent should be as compact as possible. Avoiding superfluous information helps the data transfer quickly. Compact encoding is one of the focus areas of Google Protocol Buffers.
Efficient serialization and deserialization − In a microservice environment, where many exchanges take place over a network, how fast we can serialize and deserialize matters. Google Protocol Buffers are designed to make serializing and deserializing data fast.
Simple to use − The Protocol Buffers tooling auto-generates serialization code (as we will see in the upcoming chapters) and has a versioning scheme that lets the creator of the data and the user of the data work with different versions of the serialization definition.
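To make the compactness point concrete, here is a minimal hand-rolled sketch of how a single integer field can be written in a varint-style binary encoding, compared with the same value as JSON text. It does not use the Protocol Buffers library; the helper names, the field number 1, and the JSON key "id" are assumptions chosen for illustration.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field as a tag (field number, wire type 0) plus value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


# Hypothetical message with a single integer field, number 1, holding 150.
binary_payload = encode_int_field(1, 150)
json_payload = json.dumps({"id": 150}).encode("utf-8")

print(binary_payload.hex(), len(binary_payload))  # 089601 3
print(json_payload, len(json_payload))            # b'{"id": 150}' 11
```

The binary form carries a small field number instead of the field name, which is where most of the saving comes from in this example.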

Protocol Buffers vs Others (XML/JSON/Java serialization)

Let's take a look at how other ways of transferring data over a network stack up against Protocol Buffers.

Feature	Protocol Buffers	JSON	XML
Language independent	Yes	Yes	Yes
Serialized data size	Smallest of the three	Smaller than XML	Largest of the three
Human readable	No, as it uses a binary encoding	Yes, as it uses a text-based format	Yes, as it uses a text-based format
Serialization speed	Fastest of the three	Faster than XML	Slowest of the three
Data type support	Richer than the other two. Supports complex data types such as Any, oneof, etc.	Supports basic data types	Supports basic data types
Support for evolving schema	Yes	No	No
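The schema evolution row can be illustrated with a small sketch. It assumes a simplified message made only of integer fields, each written as a numbered tag followed by a varint value. The helper names and the field numbers (1 for "id", 2 for "age") are hypothetical, and this is not the Protocol Buffers library itself. The idea it shows is that a reader built against an older definition can step over field numbers it does not recognise.

```python
from typing import Dict, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def encode_message(fields: Dict[int, int]) -> bytes:
    """Write each field as a tag (field number, wire type 0) plus a varint value."""
    out = bytearray()
    for number, value in fields.items():
        out += encode_varint(number << 3)
        out += encode_varint(value)
    return bytes(out)


def decode_message(data: bytes, known: Dict[int, str]) -> Dict[str, int]:
    """Read all fields, keeping only the field numbers this reader knows."""
    result: Dict[str, int] = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        if tag & 0x7 != 0:
            raise ValueError("this sketch supports varint fields only")
        value, pos = decode_varint(data, pos)
        name = known.get(tag >> 3)
        if name is not None:  # unknown field numbers are skipped
            result[name] = value
    return result


# A newer writer sends two fields; an older reader only knows field 1.
payload = encode_message({1: 150, 2: 30})
old_view = decode_message(payload, {1: "id"})
new_view = decode_message(payload, {1: "id", 2: "age"})
print(old_view)
print(new_view)
```

Because fields are identified by number rather than by position, adding field 2 does not break the older reader in this sketch; it simply does not see the new value.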
