Steve Holstad's "the bright lights"

"Just because your voice reaches halfway around the world doesn't mean you are wiser than when it reached only to the end of the bar." - Edward R. Murrow

Remoting De-bloating with .Net 2.0

The .Net 2.0 framework has baked in some new remoting capabilities that, in my opinion, no remoting solution should reach production without. Today I'm focusing on the new binary serialization format available for datasets.

In a nutshell, remoting datasets can be a cumbersome and bandwidth-costly task, but there are some very valid real-world scenarios where it is the most appropriate solution: for example, if RowState and row validation info are required at the remote server, or if data transformation will be handled by a server-side application. Regardless of your reasons, if a dataset is crossing the cloud, we need to find ways to improve performance and limit how much bandwidth the process monopolizes.

I'm going to use an example of a dataset with 60,000 records to illustrate the impact these enhancements can have on a remotable dataset. As the Cutting Edge article below states, binary serialization makes little difference for small datasets, but because the remoting classes depend on the ReadXml method, performance degrades sharply as the size of the dataset grows.

The setup:

A standard, strongly typed dataset containing ~60,000 records.

The goal:

Transport via remoting using as little bandwidth as possible.

Step 1: Regular XML serialization

By default, remoting uses SOAP to serialize a dataset for transport. I've used ds.WriteXml() to write the dataset to a file as a stand-in for that output.

Result:

Output as XML: 30,372 KB
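For anyone who wants to reproduce a measurement like this, here is a minimal sketch. It assumes a small untyped stand-in dataset (the BuildSample helper, the "Items" table, and the temp file name are made up for illustration, not the typed dataset used above), so the number it prints will not match mine.

```csharp
using System;
using System.Data;
using System.IO;

static class XmlSizeSketch
{
    // Hypothetical stand-in for the real typed dataset: one table, two columns.
    public static DataSet BuildSample(int rows)
    {
        DataSet ds = new DataSet("Sample");
        DataTable table = ds.Tables.Add("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < rows; i++)
        {
            table.Rows.Add(i, "Item " + i);
        }
        return ds;
    }

    // Writes the dataset as XML to the given path and returns the file size in bytes.
    public static long XmlFileSize(DataSet ds, string path)
    {
        ds.WriteXml(path);
        return new FileInfo(path).Length;
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "dataset.xml");
        long bytes = XmlFileSize(BuildSample(60000), path);
        Console.WriteLine("Output as XML: {0} KB", bytes / 1024);
    }
}
```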

Step 2: Remoting with binary format, SerializationFormat.Xml

Make sure your remoting app is using binary formatting by setting the formatter ref in the app config to "binary":
<system.runtime.remoting><application><channels><channel><clientProviders><formatter ref="binary"/>.....
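Spelled out, a client-side config along these lines might look like the fragment below. This is only a sketch: the channel ref of "http" is my assumption for illustration, and your own file will carry whatever other channel and application settings you already have.

```xml
<configuration>
  <system.runtime.remoting>
    <application>
      <channels>
        <!-- "http" is an assumed channel for this sketch; use your own channel here -->
        <channel ref="http">
          <clientProviders>
            <!-- switch the client formatter from the default to binary -->
            <formatter ref="binary" />
          </clientProviders>
        </channel>
      </channels>
    </application>
  </system.runtime.remoting>
</configuration>
```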

One thing that surprised me was that applying this value doesn't affect how the dataset itself is serialized; it simply encodes the XML-formatted dataset as binary. This reduces the file size and improves performance, so now we're getting somewhere...

Result:

XML serialization to binary stream: 25,481 KB

Step 3: Fall in love with SerializationFormat.Binary

Before the dataset is remoted, set the dataset's RemotingFormat property to SerializationFormat.Binary. Bam, that's it, and take a look at the size difference! This format still keeps the XML tags as is, but the data contained in the nodes is serialized as binary, and the effect is impressive:

Result:

Binary serialization to binary stream: 12,385 KB!
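To compare Step 2 and Step 3 yourself, something like the following sketch should do. It serializes the same dataset with a BinaryFormatter under each SerializationFormat value and reports the stream length. The BuildSample helper and its table are hypothetical stand-ins for a real typed dataset, and it assumes the .NET Framework (BinaryFormatter is obsolete on newer .NET runtimes). Your numbers will depend on your data, and on very small datasets the difference may be negligible.

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

static class RemotingFormatSketch
{
    // Hypothetical stand-in for the real typed dataset.
    public static DataSet BuildSample(int rows)
    {
        DataSet ds = new DataSet("Sample");
        DataTable table = ds.Tables.Add("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < rows; i++)
        {
            table.Rows.Add(i, "Item " + i);
        }
        return ds;
    }

    // Sets RemotingFormat, serializes with a BinaryFormatter, and returns the size in bytes.
    public static long SerializedSize(DataSet ds, SerializationFormat format)
    {
        ds.RemotingFormat = format;
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, ds);
            return stream.Length;
        }
    }

    static void Main()
    {
        DataSet ds = BuildSample(60000);
        long xmlBytes = SerializedSize(ds, SerializationFormat.Xml);
        long binaryBytes = SerializedSize(ds, SerializationFormat.Binary);
        Console.WriteLine("SerializationFormat.Xml:    {0} KB", xmlBytes / 1024);
        Console.WriteLine("SerializationFormat.Binary: {0} KB", binaryBytes / 1024);
    }
}
```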

Step 4: Compress the binary output

For many applications, what we've accomplished so far may be enough. For even more performance, you can apply a slightly more involved process, and the result is an efficient, tiny remotable object.

Peter Bromberg's article shows a nice example of how to apply compression. Basically, you set your dataset's RemotingFormat to SerializationFormat.Binary, serialize the dataset with a BinaryFormatter into a MemoryStream, and retrieve the resulting byte array. You then write that byte array through a System.IO.Compression.DeflateStream, which gives you a condensed byte array of your original dataset. Keep in mind that with this method you'll now want to remote this byte array instead of a dataset object. At the server, you'll have to implement custom code to decompress the byte array and deserialize it back into your original dataset. This is basically what the .Net remoting framework is doing behind the curtain; you've just added an extra layer of performance-enhancing functionality.

Result:

Binary serialization to compressed binary stream: 2,436 KB. That's an incredible reduction in size!
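The compress and decompress steps can be sketched roughly as follows. This is my own minimal take on the approach described above, not the code from Peter Bromberg's article. The class and method names are hypothetical, and it assumes the .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization.Formatters.Binary;

static class CompressedDataSetSketch
{
    // Client side: binary-serialize the dataset, then deflate the resulting bytes.
    public static byte[] Compress(DataSet ds)
    {
        ds.RemotingFormat = SerializationFormat.Binary;
        byte[] raw;
        using (MemoryStream serialized = new MemoryStream())
        {
            new BinaryFormatter().Serialize(serialized, ds);
            raw = serialized.ToArray();
        }
        using (MemoryStream output = new MemoryStream())
        {
            // The DeflateStream must be closed before reading the output so all data is flushed.
            using (DeflateStream deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Server side: inflate the bytes, then deserialize them back into a DataSet.
    public static DataSet Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (DeflateStream inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (MemoryStream raw = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = inflate.Read(buffer, 0, buffer.Length)) > 0)
            {
                raw.Write(buffer, 0, read);
            }
            raw.Position = 0;
            return (DataSet)new BinaryFormatter().Deserialize(raw);
        }
    }

    static void Main()
    {
        DataSet ds = new DataSet("Sample");
        DataTable table = ds.Tables.Add("Items");
        table.Columns.Add("Id", typeof(int));
        for (int i = 0; i < 1000; i++) table.Rows.Add(i);

        byte[] payload = Compress(ds);          // this byte array is what you would remote
        DataSet restored = Decompress(payload); // done by your custom code at the server
        Console.WriteLine("Compressed: {0} bytes, rows restored: {1}",
            payload.Length, restored.Tables["Items"].Rows.Count);
    }
}
```

With this shape, the remoted method takes and returns byte[] rather than a DataSet, so both ends need to agree on the Compress/Decompress pair.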



Some articles we've used to research remoting performance (thanks to egorny):

MSDN Cutting Edge remoting article: http://msdn.microsoft.com/msdnmag/issues/04/10/CuttingEdge/

ADO.NET 2.0 DataSet as a Self-Contained Compressed Remoteable Object: Peter Bromberg: http://www.eggheadcafe.com/articles/20060220.asp

Binary Serialization of a DataSet - ADO.NET 2.0: http://www.knowdotnet.com/articles/binaryserialization.html
