JSON vs Protocol Buffer Simplified

sakshi chahal
Jun 11, 2019

Data serialization is the process of converting structured data into a format that can be shared or stored and later restored to its original structure. Deserialization is the reverse process.
Data serialization formats include XML, CSV, YAML, JSON, and Protobuf, among others.
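
As a minimal sketch of that round trip, the snippet below serializes a small record to JSON text and restores it using Python's standard `json` module. The `record` contents are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical structured data, used only for illustration.
record = {"id": 7, "name": "Ada", "scores": [9.5, 8.0], "active": True}

# Serialization: structured data -> text that can be stored or sent.
serialized = json.dumps(record)

# Deserialization: text -> structured data with the original shape.
restored = json.loads(serialized)

print(serialized)
print(restored == record)
```

The restored value compares equal to the original, which is the "recovery of its original structure" that the definition describes.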

A Brief Overview:

  • Protocol Buffers, usually referred to as Protobuf, was developed internally by Google to provide a better way than XML to serialize and deserialize data. The focus was on making it simpler, smaller, faster, and more maintainable than XML. It also ended up surpassing JSON, with better performance, better maintainability, and smaller size.
  • JSON (JavaScript Object Notation) is a lightweight data-interchange format and is based on a subset of the JavaScript Programming Language.
    JSON is “self-describing” and easy to understand.

Major Differences:

  • JSON is a platform-independent text data format.
{
    "quiz": {
        "sport": {
            "q1": {
                "question": "Which is correct team name in NBA?",
                "options": [
                    "Golden State Warriros",
                    "Huston Rocket"
                ],
                "answer": "Huston Rocket"
            }
        },
        "maths": {
            "q1": {
                "question": "5 + 7 = ?",
                "options": [
                    "10",
                    "11",
                    "12",
                    "13"
                ],
                "answer": "12"
            },
            "q2": {
                "question": "12 - 8 = ?",
                "options": [
                    "1",
                    "2",
                    "3",
                    "4"
                ],
                "answer": "4"
            }
        }
    }
}
//File source: https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files
  • Protobuf uses a binary message format and lets programmers specify a schema for the data. It also includes a set of rules and tools for defining and exchanging these messages.
    A schema for a particular use of protocol buffers associates data types with field names, using integers to identify each field.
// A simple proto file - Polyline.proto
syntax = "proto2";

message Point {
  required int32 x = 1;
  required int32 y = 2;
  optional string label = 3;
}

message Line {
  required Point start = 1;
  required Point end = 2;
  optional string label = 3;
}

message Polyline {
  repeated Point point = 1;
  optional string label = 2;
}
//File source: https://en.wikipedia.org/wiki/Protocol_Buffers
  • Because JSON is textual, numbers must be converted to and from decimal strings, so integers and floats can be slow to encode and decode. JSON was not designed around numeric data. Comparing strings in JSON can also be slow.
  • Protobuf is faster and easier to bind to objects.
  • JSON is supported by almost all programming languages and is highly popular.
  • Protocol buffers currently support generated code in Java, Python, Objective-C, and C++. With the proto3 language version, one can also work with Dart, Go, Ruby, and C#, with more languages to come.
[Figure: Protobuf binary format serialization. Source: Google]
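
To make the size difference concrete, here is a rough Python sketch that hand-encodes a `Point` message from the schema above in the Protobuf wire format and compares it with compact JSON. This is not the official Protobuf library: the helper names (`encode_varint`, `encode_tag`, `encode_point`) and the sample values are assumptions made for illustration, and the sketch only handles non-negative integers.

```python
import json
from typing import Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_point(x: int, y: int, label: Optional[str] = None) -> bytes:
    """Hand-rolled encoding of the Point message (hypothetical helper)."""
    out = encode_tag(1, 0) + encode_varint(x)       # x: wire type 0 (varint)
    out += encode_tag(2, 0) + encode_varint(y)      # y: wire type 0 (varint)
    if label is not None:
        data = label.encode("utf-8")
        # label: wire type 2 (length-delimited)
        out += encode_tag(3, 2) + encode_varint(len(data)) + data
    return out


point = {"x": 150, "y": 2, "label": "A"}  # sample values, assumed
binary = encode_point(**point)
text = json.dumps(point, separators=(",", ":")).encode("utf-8")

print(binary.hex())            # 08960110021a0141
print(len(binary), len(text))  # 8 27
```

The field names never appear in the binary form, only the integer field numbers, which is where most of the saving comes from in this small example. Real messages and real libraries will show different ratios.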

Advantages of Protobuf:

  • Simpler, faster, smaller in size.
  • RPC support: Server RPC interfaces can be declared as part of protocol files.
  • Structure validation: Because Protobuf has a predefined and, compared to JSON, larger set of data types, messages serialized with Protobuf can be automatically validated by the code responsible for exchanging them.
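
The validation idea can be sketched in Python as follows. This is a simplified, hand-written decoder for the `Point` message, not the code that the Protobuf compiler generates. The `POINT_SCHEMA` table and the function names are hypothetical, and only wire types 0 (varint) and 2 (length-delimited) are handled.

```python
from typing import Dict, Tuple

# field number -> (name, expected wire type, required?), mirroring message Point
POINT_SCHEMA = {1: ("x", 0, True), 2: ("y", 0, True), 3: ("label", 2, False)}


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a base-128 varint at pos; return (value, next position)."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def decode_point(buf: bytes) -> Dict[str, object]:
    """Decode a Point and reject messages that do not match the schema."""
    fields: Dict[str, object] = {}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if field_number in POINT_SCHEMA:
            name, expected_type, _ = POINT_SCHEMA[field_number]
            if wire_type != expected_type:
                raise ValueError(f"field {name!r} has the wrong wire type")
            fields[name] = value
        # Unknown field numbers are skipped in this sketch.
    missing = [name for name, _, required in POINT_SCHEMA.values()
               if required and name not in fields]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    return fields


# Bytes for x = 150, y = 2, label = "A" in the wire format.
print(decode_point(bytes.fromhex("08960110021a0141")))

try:
    decode_point(bytes.fromhex("089601"))  # only x is present
except ValueError as err:
    print("rejected:", err)
```

A plain JSON parser, by contrast, would accept any syntactically valid object, and checking for required fields or types would be left to application code or a separate schema tool.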

Why use JSON, aka disadvantages of Protobuf?

  • Not human-readable: JSON, exchanged as text and with a simple structure, is easy for humans to read and analyze. This is not the case with a binary format. [There are now ways to make Protobuf human-readable too, though.]
  • Fewer resources and less support: You won’t find as many resources about using and developing with Protobuf (do not expect very detailed documentation or many blog posts).
  • Smaller community: Probably the root cause of the previous disadvantage. On Stack Overflow, for example, you will find roughly 1,500 questions tagged with Protobuf, while JSON has more than 180 thousand questions on the same platform.
