Introduction to Serialization

Introduction

Serialization is the process of converting an in-memory object into a sequence of bytes so that it can be stored or transmitted over a network and reconstructed later, possibly by a different program or on a different machine.

Deserialization is the reverse operation: reconstructing the original object in memory from its serialized data.

Many object-oriented programming languages support serialization natively, including:

  • Java

  • Ruby

  • Python

  • PHP

  • C#


PHP Serialization

Example of serializing an array in PHP:

php -a
php > $original_data = array("HTB", 123, 7.77);
php > $serialized_data = serialize($original_data);
php > echo $serialized_data;
a:3:{i:0;s:3:"HTB";i:1;i:123;i:2;d:7.77;}
php > $reconstructed_data = unserialize($serialized_data);
php > var_dump($reconstructed_data);
array(3) {
  [0]=>
  string(3) "HTB"
  [1]=>
  int(123)
  [2]=>
  float(7.77)
}

Understanding PHP Serialized Format
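
The serialized string shown above follows a compact, type-prefixed layout: `a:3:{...}` is an array with three entries, `i:0;` is an integer, `s:3:"HTB";` is a string preceded by its length, and `d:7.77;` is a float. As a rough illustration of how those pieces fit together, here is a minimal Python sketch of a reader for that layout. The function name `parse_php` is hypothetical, only the `N`, `b`, `i`, `d`, `s`, and `a` types are handled, and string lengths are treated as character counts, which is only safe for ASCII data (PHP counts bytes).

```python
def parse_php(data: str, pos: int = 0):
    """Parse one PHP-serialized value starting at pos.

    Returns (value, next_pos). Illustrative only: objects, references,
    and multi-byte strings are not supported.
    """
    kind = data[pos]
    if kind == "N":                      # N;  (null)
        return None, pos + 2
    if kind in "ibd":                    # i:<int>;  b:<0|1>;  d:<float>;
        end = data.index(";", pos)
        raw = data[pos + 2:end]
        convert = {"i": int, "d": float, "b": lambda s: s == "1"}[kind]
        return convert(raw), end + 1
    if kind == "s":                      # s:<length>:"<text>";
        colon = data.index(":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                # skip the colon and opening quote
        return data[start:start + length], start + length + 2
    if kind == "a":                      # a:<count>:{<key><value>...}
        colon = data.index(":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                  # skip the colon and opening brace
        result = {}
        for _ in range(count):
            key, pos = parse_php(data, pos)
            value, pos = parse_php(data, pos)
            result[key] = value
        return result, pos + 1           # skip the closing brace
    raise ValueError(f"unsupported type marker {kind!r} at offset {pos}")


serialized = 'a:3:{i:0;s:3:"HTB";i:1;i:123;i:2;d:7.77;}'
parsed, consumed = parse_php(serialized)
print(parsed)
```

A PHP array is an ordered map, so the sketch returns a Python `dict` keyed by the array indexes rather than a list.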


Python Serialization (Pickle)

Multiple libraries implement serialization in Python:

  • pickle (native, part of the standard library)

  • PyYAML

  • jsonpickle
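
For comparison with the PHP session earlier, the same round trip can be sketched with the native `pickle` module. The variable names simply mirror the PHP example, and protocol 0 is chosen here only because its output is mostly printable ASCII.

```python
import pickle

original_data = ["HTB", 123, 7.77]

# Serialize to bytes using the oldest, text-based protocol.
serialized_data = pickle.dumps(original_data, protocol=0)
print(serialized_data)

# Deserialize the bytes back into an equivalent Python object.
reconstructed_data = pickle.loads(serialized_data)
print(reconstructed_data)
```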

Understanding Pickle Format

A pickle is a program for a virtual Pickle Machine (PM). The PM contains:

  • Stack - Last-In-First-Out (LIFO) data structure

  • Memo - Long-term memory for tracking already-seen objects
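
To make the stack and memo concrete, the following is a toy interpreter for a small subset of protocol 0 opcodes. It is a sketch for illustration, not how the `pickle` module is implemented: the name `run_toy_pm` is hypothetical, only the opcodes needed for a flat list of strings, integers, and floats are handled, and booleans (which protocol 0 also encodes with `I`) are not distinguished from integers.

```python
import pickle

MARK = object()  # sentinel pushed on the stack by the "(" opcode


def run_toy_pm(data: bytes):
    """Run a tiny subset of protocol 0 opcodes; return (result, memo)."""
    stack, memo = [], {}
    pos = 0

    def read_line() -> bytes:
        # Protocol 0 arguments are newline-terminated text.
        nonlocal pos
        end = data.index(b"\n", pos)
        line = data[pos:end]
        pos = end + 1
        return line

    while True:
        op = data[pos:pos + 1]
        pos += 1
        if op == b"(":                      # MARK: push a marker
            stack.append(MARK)
        elif op == b"l":                    # LIST: build a list from items above the marker
            items = []
            while stack[-1] is not MARK:
                items.append(stack.pop())
            stack.pop()                     # discard the marker
            stack.append(items[::-1])
        elif op == b"p":                    # PUT: remember the top of the stack in the memo
            memo[int(read_line())] = stack[-1]
        elif op == b"g":                    # GET: push a previously memoized object
            stack.append(memo[int(read_line())])
        elif op == b"V":                    # UNICODE: push a string
            stack.append(read_line().decode("raw-unicode-escape"))
        elif op == b"I":                    # INT: push an integer
            stack.append(int(read_line()))
        elif op == b"F":                    # FLOAT: push a float
            stack.append(float(read_line()))
        elif op == b"a":                    # APPEND: pop an item, append it to the list below
            item = stack.pop()
            stack[-1].append(item)
        elif op == b".":                    # STOP: the top of the stack is the result
            return stack.pop(), memo
        else:
            raise ValueError(f"opcode {op!r} is outside this toy subset")


payload = pickle.dumps(["HTB", 123, 7.77], protocol=0)
result, memo = run_toy_pm(payload)
print(result)
print(memo)
```

The stack holds intermediate values only while the object is being built, whereas the memo keeps references to objects that a later `GET` could reuse, which is how shared or recursive structures are represented.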

Pickle Opcodes Breakdown
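
One way to inspect the opcodes in a given pickle is the standard library's `pickletools` module. The sketch below uses `pickletools.genops` to list each instruction in a protocol 0 pickle; the sample list is the same illustrative data as in the PHP example, and the column layout is arbitrary.

```python
import pickle
import pickletools

payload = pickle.dumps(["HTB", 123, 7.77], protocol=0)

# genops yields (opcode_info, argument, byte_offset) for each instruction.
opcode_rows = [
    (pos, op.code, op.name, arg)
    for op, arg, pos in pickletools.genops(payload)
]

for pos, code, name, arg in opcode_rows:
    print(f"{pos:>3}  {code!r:<5} {name:<8} {arg!r}")
```

`pickletools.dis(payload)` prints a similar, more detailed disassembly in one call.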


Quick Reference

PHP Serialization

Python Pickle (Protocol 0)
