Editorials

Azure Cosmos DB

Today I wanted to give a shout out to Azure Cosmos DB. It is a unique set of data persistence tools wrapped into a single system, and it addresses a large number of the difficult persistence problems faced by most applications. Cosmos DB is an extension of the Azure DocumentDB service, with a lot more capabilities.

There are a number of factors influencing the selection of a data engine. Here are a few problems Cosmos DB has been designed to resolve:

  • Performance in the amount of time required to perform a request
  • Availability in the redundancy of storage
  • Durability in the ability to establish the necessary consistency model for your application
  • Cost of ownership
  • Ease of management
  • Client connectivity for your programming environment
  • Binding Service Level Agreement

Azure Cosmos DB addresses each of the factors in a unique way. Availability and performance are both addressed in the way data is persisted. You automatically have redundancy in any data store you configure in Azure. With Cosmos DB, you have the ability to distribute your data in many different global data centers. Data can be partitioned across your sites for priority, and replicated across sites for failover. Reading from the closest store to your application can speed up read actions. Writes hit the closest store, and replicate to the remote stores.

One of the many complaints about NoSQL from the SQL community is the typically weak data consistency model of “Eventually Consistent”. This means stale data could be returned to your application until the data finishes replicating to all physical data stores. Cosmos DB has taken a different approach. It offers five consistency levels to choose from. You can choose anything from the strong consistency found in most SQL engines to Eventually Consistent, with three other options in between (Bounded Staleness, Session, and Consistent Prefix). You will need to select the model that best fits your data access needs. As you probably know, the stricter your consistency model, the lower your performance per transaction. Even that difference depends on factors such as the number of locations where your data must be replicated.
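To make the trade-off a little more concrete, here is a minimal sketch in Python that models the five levels as an ordered scale and picks the weakest one that still meets a set of requirements. It does not call the Cosmos DB SDK. The `pick_level` helper and its flag names are hypothetical, invented for this illustration, and the one-flag-per-level mapping is a simplification of the real guarantees.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """The five Cosmos DB consistency levels, weakest (0) to strongest (4)."""
    EVENTUAL = 0
    CONSISTENT_PREFIX = 1
    SESSION = 2
    BOUNDED_STALENESS = 3
    STRONG = 4


def pick_level(*, must_read_latest: bool = False,
               staleness_must_be_bounded: bool = False,
               must_read_own_writes: bool = False,
               must_see_writes_in_order: bool = False) -> ConsistencyLevel:
    """Hypothetical helper: return the weakest level that satisfies the flags.

    Checks run from the strictest requirement down, so the first match
    is the least strict level that still covers everything requested.
    """
    if must_read_latest:
        return ConsistencyLevel.STRONG
    if staleness_must_be_bounded:
        return ConsistencyLevel.BOUNDED_STALENESS
    if must_read_own_writes:
        return ConsistencyLevel.SESSION
    if must_see_writes_in_order:
        return ConsistencyLevel.CONSISTENT_PREFIX
    return ConsistencyLevel.EVENTUAL


if __name__ == "__main__":
    # A shopping cart usually only needs each user to see their own writes.
    print(pick_level(must_read_own_writes=True).name)
```

Choosing the weakest level that works is the point of the exercise: every step up the scale buys a stronger guarantee and tends to cost you latency or throughput.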

The cost of ownership is something for you to determine for yourself. It is like any hosted software as a service. To compare costs fairly, you must compare not only the cost of a software license, but also everything else you would have to duplicate in some fashion to match the SLA being provided. Consider the number of sites, the redundancy of hardware, the redundancy of network access, power supply, server oversight and management, failover, upgrade management, and 24/7 monitoring. Yes, you can often host things yourself, but when you factor in all of the costs, you usually save money hosting on your own only in those situations where you don’t need all of the capabilities provided by software as a service.

I’ll take a look at some of the other features of Cosmos DB tomorrow. There are some key features that will take more space than I have today.

See Introduction to Azure Cosmos DB for more information.

Cheers,

Ben