
Merge Comments

Today I had a number of comments regarding an SQL merge process.

Jason:

If you do the update first, the update has X rows in the table to evaluate in its join. When you then delete, you still have X rows to join against.

I believe you said you do the insert after the update so the update has fewer rows to join on. By the same reasoning, it would seem we should do the deletes first, the updates second, and the inserts third.

Editor:

I could go with that. This is really a micro-optimization, and unless the sets are quite large the impact is negligible. However, there is a good probability that in many cases the table being merged into your permanent storage will not be indexed. So, keeping the steps efficient may make a difference on a heavily used database.

I think for me the key is to do the insert step last; whether updates or deletes come first is less important. The deletes reduce the number of rows in the permanent table, so there are fewer rows to ignore while doing the join for the update, and I can see some value there. Again, the value depends on the number of rows in both the new set table and the permanent table. If you are merging thirty rows into a table containing two million rows, the issue is less significant. If you are merging ten thousand rows into a table with two million rows, the impact will most likely be higher. If you are merging one million rows into a table with one million rows, the impact is significant.
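The delete → update → insert ordering discussed above can be sketched outside SQL Server as well. The following is a minimal illustration using SQLite; the table names (perm, newset), the key column, and the sample data are all hypothetical stand-ins for the permanent table and the incoming set.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE perm   (keyn INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE newset (keyn INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO perm   VALUES (1, 'old'), (2, 'stale'), (3, 'gone');
    INSERT INTO newset VALUES (1, 'old'), (2, 'fresh'), (4, 'brand new');
""")

# Step 1: delete permanent rows with no match in the new set (removes key 3),
# shrinking the permanent table before the update has to join against it.
cur.execute("""DELETE FROM perm
               WHERE NOT EXISTS (SELECT 1 FROM newset n WHERE n.keyn = perm.keyn)""")

# Step 2: update the surviving rows from the new set (key 2 changes value).
cur.execute("""UPDATE perm
               SET val = (SELECT n.val FROM newset n WHERE n.keyn = perm.keyn)""")

# Step 3: insert rows that exist only in the new set (adds key 4) — doing this
# last keeps the freshly inserted rows out of the earlier joins.
cur.execute("""INSERT INTO perm
               SELECT * FROM newset n
               WHERE NOT EXISTS (SELECT 1 FROM perm p WHERE p.keyn = n.keyn)""")

print(cur.execute("SELECT keyn, val FROM perm ORDER BY keyn").fetchall())
# → [(1, 'old'), (2, 'fresh'), (4, 'brand new')]
```

The same NOT EXISTS correlated-subquery shape works for the delete and insert steps in T-SQL, as Bryan's examples below show.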

Bryan Writes:
We’ve used incremental update methodology extensively in multiple SQL Server versions over the years. We’ve found correlated sub-queries to typically outperform just about anything else we try…


DELETE P
FROM PermTable AS P
WHERE NOT EXISTS (SELECT 1 FROM NewSet AS N
                  WHERE P.Keyn = N.Keyn)

INSERT PermTable (col list)
SELECT N.Col List
FROM NewSet AS N
WHERE NOT EXISTS (SELECT 1 FROM PermTable AS P
                  WHERE P.Keyn = N.Keyn)


Does this tactic work equally well on our DB2 platforms? No! Your examples look similar to what our DB2 specialists did to facilitate incremental updates on their servers. Even so, our DB2 platforms consistently failed to meet load-window requirements, no matter how much hardware or how many consulting dollars we threw at the problem to satisfy political motivations. In the end, SQL Server’s ability to deliver on our SLA requirements drove us back to that platform for essentially all data warehousing efforts in our company.

Dana Writes:
In the past we used SCOPE_IDENTITY() when adding records to a chain of tables. We needed to include the key value generated for Table1 as a foreign key value in Table2. We’d insert the record into Table1, then use SCOPE_IDENTITY() to capture the value for the insert into Table2. To accomplish this for multiple records we used a cursor, because SCOPE_IDENTITY() returns the identity for only one record at a time. Then along came the OUTPUT clause, which let us capture the values to be used as foreign keys for a whole set of records. That was an exciting option until we came upon a situation where a value needed to be put into Table2 that was not included in Table1, so it was not part of the Inserted.* values. The MERGE statement overcomes that barrier: not only do you have access to the inserted values, but also to any of the values from your source query. We’ve not yet implemented this, but look forward to doing so in the future.
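For readers outside SQL Server, the per-row key-capture pattern Dana describes can be sketched with SQLite, where cursor.lastrowid plays roughly the role SCOPE_IDENTITY() plays in T-SQL. The table and column names here are made up for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         table1_id INTEGER REFERENCES table1(id),
                         detail TEXT);
""")

# One record at a time: insert the parent row, capture its generated key,
# then use that key as the foreign key for the child row. This mirrors the
# cursor-based SCOPE_IDENTITY() loop Dana describes; a set-based approach
# (OUTPUT or MERGE ... OUTPUT in SQL Server) avoids the row-by-row loop.
for name, detail in [("alpha", "first detail"), ("beta", "second detail")]:
    cur.execute("INSERT INTO table1 (name) VALUES (?)", (name,))
    parent_id = cur.lastrowid  # SQLite's analogue of SCOPE_IDENTITY()
    cur.execute("INSERT INTO table2 (table1_id, detail) VALUES (?, ?)",
                (parent_id, detail))

print(cur.execute(
    "SELECT t1.name, t2.detail FROM table2 t2 "
    "JOIN table1 t1 ON t1.id = t2.table1_id ORDER BY t2.id"
).fetchall())
# → [('alpha', 'first detail'), ('beta', 'second detail')]
```

Note that this sketch shows the limitation, not the fix: capturing a source-query value alongside the generated key in a single set-based statement is exactly what MERGE with its OUTPUT clause adds in SQL Server.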

If you have thoughts you’d like to share please send them to btaylor@sswug.org.

Cheers

Ben

SSWUGtv
Special Edition: Leon Guzenda from Objectivity DB joins us to talk about Big Data, some real applications for it, and the history behind this kind of massive data processing. Really interesting applications of this technology.
Watch the Show


Featured Article(s)
Creating a No-Excuse Zone – Part 2
We all have "excuses" as part of our normal, default speaking and thinking patterns. It’s normal. It’s human. We often entertain ourselves with stories of how we got to where we "don’t want to be." We do not readily acknowledge these stories as "excuses," or as blaming something or someone external to ourselves for our predicament; we’re simply "explaining" ourselves. But the longer we stay in the "explaining" stage of the current situation, the longer we are stalled and not making forward progress.

Featured White Paper(s)
Encryption & Key Management for Microsoft SQL Server 2008
Written by Townsend Security

Simplify encryption and key management on … (read more)