Why Is Serialization Needed?
In my last blog post I discussed what serialization is and covered some of its basics. In this post we address the following question:
Why serialization is needed:
In non-object-oriented languages, data is typically stored in memory as a pattern of bytes that ‘makes sense’ without reference to anything else. For example, a bunch of shapes in a graphics editor might simply have all their points stored consecutively. In such a program, writing the contents of all the arrays to disk might yield a file which, when read back into those arrays, reproduces the original data. So in non-object-oriented languages, for some limited scenarios, serialization can be straightforward and needs no special mechanism or format.
In object-oriented languages, many objects hold references to other objects. Merely storing the contents of in-memory data structures will not be useful, because a reference to object #24601 says nothing about what that object represents. An object-oriented system may be able to do a pretty good job of figuring out what the in-memory data “means” and converting it automatically to a sensible format, but it cannot always tell the difference between references that point to the same object and references that point to separate objects which happen to have matching contents. It’s thus often necessary to help the system out when converting objects to a raw stream of bits.
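To make the "same object" versus "objects that happen to match" distinction concrete, here is a minimal sketch in Python. The dictionary and its keys are made up for illustration. It compares pickle, which tracks object identity within a single dump, with JSON, which stores only values.

```python
import json
import pickle

# Two references to the same list, plus a separate list with equal contents.
shared = [1, 2, 3]
lookalike = [1, 2, 3]
document = {"a": shared, "b": shared, "c": lookalike}

# pickle remembers which objects it has already written during one dump,
# so "a" and "b" come back as one object and "c" stays a separate one.
via_pickle = pickle.loads(pickle.dumps(document))
pickle_keeps_sharing = via_pickle["a"] is via_pickle["b"]
pickle_keeps_distinct = via_pickle["a"] is not via_pickle["c"]

# JSON writes values only, so every reference is restored as its own copy.
via_json = json.loads(json.dumps(document))
json_keeps_sharing = via_json["a"] is via_json["b"]

print(pickle_keeps_sharing, pickle_keeps_distinct, json_keeps_sharing)
```

With the JSON round trip the values still compare equal, but the fact that "a" and "b" were one object is lost. Whether that matters depends on the application, which is why a serializer sometimes needs hints from the programmer.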
Some further points to consider:
- Binary representations may differ between architectures, compilers and even versions of the same compiler. There’s no guarantee that what system A sees as a signed integer will be seen the same way on system B. Byte ordering, word length, struct padding and so on turn into hard-to-debug problems if you don’t properly define the protocol or file format used to exchange the data.
- Serializing the data structure in an architecture-independent format means that we do not suffer from differences in byte ordering or memory layout, or from the different ways programming languages represent data structures.
- Keeping information in memory is great, but sooner or later your users will shut your application down. If the data that was in memory needs to survive that, you will have to write it to a file or some other persistent store at some point.
- To discourage competitors from making compatible products, publishers of proprietary software often keep the details of their programs’ serialization formats a trade secret. Some deliberately obfuscate or even encrypt the serialized data. Yet, interoperability requires that applications be able to understand each other’s serialization formats. Therefore, remote method call architectures such as CORBA define their serialization formats in detail.
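- Writing crucial data to disk as plain text is risky, because anyone with access to the file can open it and read your data. A binary serialization format makes casual reading harder, which reduces this risk to a certain extent, but it is obfuscation rather than real protection; truly sensitive data should still be encrypted.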
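The byte-ordering and padding points in the list above can be seen with a small sketch using Python's struct module. The value 0x01020304 and the "one byte plus one int" record layout are arbitrary choices for illustration.

```python
import struct

value = 0x01020304

# The same 32-bit integer written with the two common byte orders.
little = struct.pack("<i", value)  # least significant byte first
big = struct.pack(">i", value)     # most significant byte first

# Reading big-endian bytes as if they were little-endian silently
# produces a different number instead of raising an error.
misread = struct.unpack("<i", big)[0]

# Reading with the byte order that was used for writing restores the value.
roundtrip = struct.unpack(">i", big)[0]

# A one-byte field followed by a 4-byte int: a fixed, explicit layout has
# no padding, while the native layout may insert padding for alignment.
packed_size = struct.calcsize("<bi")
native_size = struct.calcsize("@bi")

print(little.hex(), big.hex(), misread, roundtrip, packed_size, native_size)
```

Pinning the byte order and layout in the format string is one simple way to define the file format instead of inheriting whatever the current machine and compiler happen to use.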
Steps in the Serialization Process in C#
- A check is made to determine whether the formatter has a surrogate selector. If it does, the selector is asked whether it handles objects of the given type. If the selector handles the object type, ISerializationSurrogate.GetObjectData is called on the surrogate that the selector returns.
- If there is no surrogate selector or if it does not handle the object type, a check is made to determine whether the object is marked with the Serializable attribute. If the object is not, a SerializationException is thrown.
- If the object is marked appropriately, check whether the object implements the ISerializable interface. If the object does, GetObjectData is called on the object.
- If the object does not implement ISerializable, the default serialization policy is used, serializing all fields not marked as NonSerialized.
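The order of these four checks can be sketched as a small model. This is written in Python purely to show the decision flow; it is not the .NET API. The function choose_strategy, the class attributes serializable and non_serialized, the get_object_data method and the sample classes are all hypothetical stand-ins for the surrogate selector, the Serializable attribute, NonSerialized and ISerializable.GetObjectData.

```python
class SerializationError(Exception):
    """Hypothetical stand-in for .NET's SerializationException."""


def choose_strategy(obj, surrogates=None):
    """Return (strategy, data) by applying the four checks in order."""
    surrogates = surrogates or {}
    kind = type(obj)
    # Step 1: a surrogate registered for this type takes priority.
    if kind in surrogates:
        return "surrogate", surrogates[kind](obj)
    # Step 2: without a surrogate, the type must be marked serializable.
    if not getattr(kind, "serializable", False):
        raise SerializationError(f"{kind.__name__} is not marked serializable")
    # Step 3: the object supplies its own data if it has a custom hook.
    hook = getattr(obj, "get_object_data", None)
    if callable(hook):
        return "custom", hook()
    # Step 4: default policy, every field except those marked non-serialized.
    skipped = getattr(kind, "non_serialized", ())
    return "default", {k: v for k, v in vars(obj).items() if k not in skipped}


class Session:  # marked serializable, one field excluded
    serializable = True
    non_serialized = ("token",)

    def __init__(self, user, token):
        self.user = user
        self.token = token


class Money:  # marked serializable, with a custom hook
    serializable = True

    def __init__(self, cents):
        self.cents = cents

    def get_object_data(self):
        return {"amount": f"{self.cents / 100:.2f}"}


class Socket:  # not marked serializable at all
    pass


print(choose_strategy(Session("ann", "s3cret")))
print(choose_strategy(Money(1250)))
print(choose_strategy(Socket(), {Socket: lambda s: {"kind": "socket"}}))
```

Because the surrogate check comes first, the unmarked Socket type can still be serialized when a surrogate is registered for it; without one, the same call raises the error from step 2.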
Useful links on Serialization: