Editorials

Readers Notes on Unique Keys for Multiple DataCenters

Readers Notes on Unique Keys for Multiple DataCenters
Today we have some responses from folks having to implement multiple datacenters with unique keys.

Giles:
First, thanks for all the good articles you provide every day!

Using a GUID to identify a record can produce some problems if you don’t take care of the function used to generate the GUID!

From SQL Server 2005, you have two functions to generate a GUID: NEWID() and NEWSEQUENTIALID(). The NEWID() function generates a random GUID, while the NEWSEQUENTIALID() function generates a sequential id as the name implies! Both uses the MAC address of the first Network Card to generates the GUID, we will see later why this can be useful.

Let me provide you two tables using each of them:

CREATE TABLE [ProductsWithNEWID] (
[ProductGUID] [uniqueidentifier] NOT NULL DEFAULT NEWID(),
[Name] [nvarchar] (255) NOT NULL
CONSTRAINT [PK_ProductsWithNEWID] PRIMARY KEY CLUSTERED
(
[ProductGUID]
)
)

CREATE TABLE [ProductsWithNEWSEQUENTIALID] (
[ProductGUID] [uniqueidentifier] NOT NULL DEFAULT NEWSEQUENTIALID(),
[Name] [nvarchar] (255) NOT NULL
CONSTRAINT [PK_ProductsWithNEWSEQUENTIALID] PRIMARY KEY CLUSTERED
(
[ProductGUID]
)
)
-- Insert a few records in both tables
INSERT INTO [ProductsWithNEWID] ([Name]) VALUES (N'First Product')
INSERT INTO [ProductsWithNEWID] ([Name]) VALUES (N'Second Product')
INSERT INTO [ProductsWithNEWID] ([Name]) VALUES (N'Third Product')
INSERT INTO [ProductsWithNEWSEQUENTIALID] ([Name]) VALUES (N'First Product')
INSERT INTO [ProductsWithNEWSEQUENTIALID] ([Name]) VALUES (N'Second Product')
INSERT INTO [ProductsWithNEWSEQUENTIALID] ([Name]) VALUES (N'Third Product')

Now, we select records from both tables:

SELECT * FROM [ProductsWithNEWID]

ProductGUID Name
26467C28-E618-48AD-A3F3-263CB5E001DE Second Product
EFB24C95-E4EE-41EB-A563-BD8B6777632E Third Product
B5140037-397A-4C8C-8959-DB96D91A88ED First Product

SELECT * FROM [ProductsWithNEWSEQUENTIALID]

ProductGUID Name
B80B5B2C-CC3F-E211-9C96-00269A26B6CC First Product
B90B5B2C-CC3F-E211-9C96-00269A26B6CC Second Product
BA0B5B2C-CC3F-E211-9C96-00269A26B6CC Third Product

As you can see, the first table uses random values for the ProductGUID column, while the second table uses sequential values (B8, B9, BA at the begining).

The major drawback with NEWID for a primary key is the random nature of the generated values, because of this your PK clustered index will be highly fragmented!

That’s why you should always uses NEWSEQUENTIALID when your GUID column is a primary key.

Moreover using NEWSEQUENTIALID, the network card address is in plain text at the end of the GUID (here 00269A26B6CC is my network card address). Using TSQL functions to extract the right part of the GUID like SUBSTRING( CAST( [ProductGUID] AS char(36) ), 25, 12 ), you will be able to find from which location your GUID have been generated! With a table containing all possible network card addresses and a link to another table containing informations from each location, you could even group your selections by location! 🙂

Bret:
Hi Ben,

I’ve been following your column in the SSWUG newsletter for a few years now and consider it can’t miss reading each week. The topic are so relevant to DBAs and developers alike. Thanks for keeping me up to date with industry trends.

I got the Tuesday newsletter where you introduce the topic of replicating to a central database and asked for submissions. This is actually an issue I am dealing with at work right now, and was looking forward to getting some new ideas. However, the Wednesday edition never arrived in my mailbox…somehow it just didn’t get to me, but wondering if there is a place where I can view your column from yesterday?

Not sure if you covered this idea yesterday, but there is another option I thought of when tasked with this problem at work a few months ago. In short, you determine how many data centers (or DB instances) you need to keep unique. In our case, we have 4 major data centers, and I wanted to add a buffer to account for up to 20 data centers in the future. So I set the identity STEP to be 20 in all data-centers. Then I assigned each data-center an incremental starting value. Data-center A starts the identities at 100,001, Data-center B starts at 100,002, and so forth. Because of the globalized step value, you are creating globally unique identities, for up to 20 data-centers. Obviously the constraint here in my example is the pre-determination of step; adding a 21st data-center would be a big problem. But I think this could be a valid solution for some cases.

I have been asked a few times lately if past editorials are available. You can find historical newsletters on our web site at www.sswug.org. Scroll to the bottom of the page and click on Editorials. You can now view the latest editorial online. If you wish to find other topics, enter your search text above and it will search the archives for other articles with your areas of interest.

Another question that has been asked is why I only share the first name of those who write in with responses. Using only the first name grants some people a degree of anonymity and encourages people to be more direct in their statements. When you respond with information and ask in your email, I can share your whole name should you wish. I am glad to provide attribution for your comments.

Cheers,

Ben

$$SWYNK$$

Featured Article(s)
Accessing Oracle XML DB Repository Programmatically
In a nutshell, Oracle XML DB Repository is an Oracle Database feature, which can be used to store the database�s content in hierarchical structures, like in a conventional file system, using File/Folder/URL metaphor. In this article, you�ll look at how to use use Oracle XML DB Resource APIs to access and manipulate XML DB Repository resources from within your PL/SQL code.When to hire?
In today’s tight economy, many employers are under financial strain. They need additional staff and resources to get the products and services out the door, so that they can expand and sell more products. On the other hand, ‘right now’ money is tight to hire new employees. Because of this dilemma, many businesses wait much too long to hire and fail because of this.

Featured White Paper(s)
Encryption & Key Management for Microsoft SQL Server 2008/2012
Simplify encryption and key management on your SQL Server. Data thieves are targeting SMB companies because of their inadequa… (read more)