Editorials

Use Azure TableBatchOperation to Increase Performance

Azure Table Storage is one of Microsoft’s solutions for storing objects in the Cloud. From the client’s perspective you work directly with objects, which keeps your code easy to understand; the client library does the heavy lifting of serializing those objects for transport over the network.

Objects are stored in a hierarchical structure as follows:

StorageEndPoint
  Table
    Partition
      Key

When you save data to Azure, you:

  1. Open a connection to the StorageEndPoint
  2. Get a reference to the Table
  3. Create the entity to be saved (it must inherit from the SDK’s table entity base class); when you create it, you specify the Partition and Key values.
  4. Save the entity to the Table (a minimal sketch of these steps follows)
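
Here is a minimal sketch of those four steps using the classic .NET storage client (Microsoft.WindowsAzure.Storage). The connection string, table name, and CustomerEntity type are placeholders chosen for illustration, not part of any particular application.

    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;

    // Step 3: the entity to be saved inherits from TableEntity, which
    // carries the PartitionKey and RowKey ("Key") values.
    public class CustomerEntity : TableEntity
    {
        public CustomerEntity() { }

        public CustomerEntity(string partition, string key)
        {
            PartitionKey = partition;
            RowKey = key;
        }

        public string Email { get; set; }
    }

    public static class SingleInsertExample
    {
        public static void Run(string connectionString)
        {
            // Step 1: open a connection to the StorageEndPoint.
            CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
            CloudTableClient client = account.CreateCloudTableClient();

            // Step 2: get a reference to the Table.
            CloudTable table = client.GetTableReference("customers");
            table.CreateIfNotExists();

            // Step 3: create the entity, specifying the Partition and Key.
            var customer = new CustomerEntity("Smith", "Ben") { Email = "ben@example.com" };

            // Step 4: save the entity to the table.
            table.Execute(TableOperation.Insert(customer));
        }
    }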

You can follow these steps for each object you save to Azure. However, as an optimization, Microsoft allows a batch of up to 100 entities, with a combined payload no larger than 4 MB, to be saved in a single operation. If your data fits within these limits, the batch process is much faster than saving each item one at a time.
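
As a sketch, the same insert performed as a batch might look like the following; it reuses the CustomerEntity type from the sketch above and assumes a CloudTable reference has already been obtained.

    using Microsoft.WindowsAzure.Storage.Table;

    public static class BatchInsertExample
    {
        public static void Run(CloudTable table)
        {
            var batch = new TableBatchOperation();

            // All entities in a single batch share the same Partition value.
            for (int i = 0; i < 100; i++)
            {
                batch.Insert(new CustomerEntity("Smith", "Customer" + i));
            }

            // One round trip to the service instead of 100.
            table.ExecuteBatch(batch);
        }
    }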

There are some restrictions. All of the objects in a batch must belong to the same StorageEndPoint, Table, and Partition. The Table requirement makes sense, because you have to have a reference to the table in order to save a batch of data. The single-Partition requirement exists because the Partition value is used to shard data across multiple servers for performance.
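
If your data spans more than one Partition, or more than 100 entities, one way to work within these restrictions is to group the entities by Partition and then chunk each group into batches of at most 100. The sketch below illustrates the idea; the method and variable names are my own and not from any particular library.

    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.WindowsAzure.Storage.Table;

    public static class PartitionedBatchExample
    {
        public static void SaveAll(CloudTable table, IEnumerable<CustomerEntity> customers)
        {
            // A batch may only target one Partition, so group first.
            foreach (var partitionGroup in customers.GroupBy(c => c.PartitionKey))
            {
                var entities = partitionGroup.ToList();

                // A batch may contain at most 100 entities, so chunk each group.
                for (int i = 0; i < entities.Count; i += 100)
                {
                    var batch = new TableBatchOperation();
                    foreach (var entity in entities.Skip(i).Take(100))
                    {
                        batch.Insert(entity);
                    }
                    table.ExecuteBatch(batch);
                }
            }
        }
    }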

If you are interested in using the batch method for Table Storage, there is an example available at www.windowsazure.com/en-us/develop/net/how-to-guides/table-services/#insert-batch.

I have found that the batch method yields a many-fold increase in performance.

Are you looking into Azure services yet? Which ones are you using or exploring? Are you considering any other cloud service? Please drop me a note to help direct future editorials. You are always invited to add comments below as well.

Cheers,

Ben