Reader Response – Null and MOD
Last time I introduced the topic of null from a different light. Usually, those opposed to the use of null, are approaching it from a perspective of “Why store nothing if nothing is what you really have.”
I submitted a practical reason; if you allow null in your data, you are required to have test conditions taking the possibility of null into account. Many times we don’t test for null.
I also gave a challenge for how you would determine if rows in a sequence are even or odd without using the traditional MOD function.
In SQL Server a common way to determine odd or even is similar to the following:
IsEven = CASE WHEN anynumber % 2 = 0 THEN 1 ELSE 0 END
or
CONVERT(BIT, anynumber % 2)
since a converting any number to a bit other than 0 results in 1.
One way you accomplish the same thing is to do a bitwise comparison of the value 1 to any number.
IsEven = anynumber & 1
If you have done any assembler, you can see pretty quickly that this option performs the calculation with a lot less overhead on the CPU. One of our readers was familiar with this technique using C#. It turns out the same syntax works in SQL Server as well.
With that out of the way, here are some readers responses to all of the above.
Karen:
One that is similar to using MOD that I have used is FLOOR (or CEILING) as follows (where NumCol is the numeric column being evaluated):
Case when FLOOR(NumCol/2.0)=NumCol/2.0 then 1 when FLOOR(NumCol/2.0)<> NumCol/2.0 then 0 end as EvenFlag
Note that this case statement results in a NULL value rather than 0 if numcol is NULL thereby addressing the NULL issue you raised in the introductory paragraphs of your newsletter!
Also, I have occasionally used this same type of function to select a sampling of rows from a table where an increasing value in the denominator gives you a smaller subset – as opposed to 2.0 which gives about ½ the rows if you have a somewhat randomly assigned numeric column. Actually if you have a truly randomly assigned numeric unique row identifier (or using row_number()) you can obtain a ‘random sample’ of a table this way.
Jason:
There are two ways I know of to test for even/odd. The first is MOD, of course. MOD in SQL is a function so that’s probably a fairly slow way to do it. If you are using C# or Transact-SQL then you can use the modulo(%) operator which can be implemented as a single inline CPU instruction. This may be more than one instruction on some RISC processors.
The other, also available in C# and Transact-SQL but probably not all other SQL implementations, is the bitwise AND(&) operator. The expression ‘value & 1’ will result in 1 for odd and 0 for even since all odd numbers in binary have a 1 in the least significant bit. This should also execute in a single instruction on x86 processors and it’s more likely to be a single instruction on RISC processors as it’s a less complicated and more common operation than integer division with a remainder.
Depending on the processor, its optimizations and the structure of the code, the bitwise AND could execute faster even though they are both single instructions.
Whether or not any given SQL server actually does those in a single instruction is another matter entirely. Still, I’d be very surprised if the operators didn’t perform considerably better than the function.
Derrick:
I’m curious about a couple of things in your post.
1) When the original procedure was written were NULL values taken into consideration?
2) In converting to C# did you solely use C# or was a portion of the procedure still written in SQL?
I do SQL development and Administration and really curious to know if one can use another the programming language vice SQL to get the same results. In much of my research SQL is still more efficient for most operations. Thirdly, I have to ask, you stated, “The problem, in C#, was completely impossible to define.”, did you ever get the issue resolved and if so how?
Thanks for the article it was very interesting and forced me to dig into some of the things that I have done. You are right because error trapping and consideration for NULL values have more often than not been performed during testing which means it was not formally considered initially.
Richard:
Excellent topic, a can of worms indeed… I’ve had this same discussion more than a few times with my favorite web, C++ developers. The contention between accuracy vs easy to program. In your few paragraphs you hit on some good points but being a DBA working with code writers I had to play out these scenarios in concrete terms to get the real issues on the table for discussion.
In general I find NULL values to be very useful for at least two reasons.
The first is practical and from my point of view unavoidable, the second I believe provides more accuracy and insight to the dataset.
A) Some data types cannot and should not default to ZERO or BLANK. Sure a text field can be set to " " but not a DateTime field. If a table storing data on people, for example, asks for a Date of Birth, I cannot justify ANY value if it was not collected or entered. A web developer once asked "Can’t we just make it 1900-1-1?" A little convenience in the near term, while introducing more confusion, bad data results and extra coding on the analysis end to try and avoid these issues later.
B) I try to avoid even using ZERO as the default value for an INT field.
For example, lets say I create a database for the US Census bureau. And one of the questions might be "How many children do you have living with you now?" Respondent A replies "0" i.e. they have no children living with them now, while respondent B does not answer the question but actually has 4 children. To me, 0 and NULL have two different meanings in this case. NULL does not equal 0, and 0 should not be equal to NULL.
As for the case logic in your example it seems a little too brief to me… something like this would be more accurate, flexible and not much more work:
case when @VAR1 < @VAR2 then @VAR1
when @VAR1 > @VAR2 then @VAR2
when @VAR1 is not null then @VAR1
when @VAR2 is not null then @VAR2
else 0 end
Price:
I enjoyed your discussion about Null values and your problem in dealing with it in C#. I write and sell code for use with the Microsoft Dynamics product and a lot of my code uses the SQL DataReader. Some rows in the Dynamics SQL tables may have null values in some of their columns, even though they are not supposed to be that way.
Thanks for all the great feedback. Some have asked for clarification regarding the difference between the SQL and the C# implementations I referenced previously. Here is the issue in a nutshell.
C# supports nullable types. For example, in C# you can have an int data type or a nullable int data type (int?). I was working with fees. Sometimes, if one particular item doesn’t have a fee, then nothing is stored in the database, resulting in null. In this case, our business rule says the fee is zero. For this reason, the C# program does not use NULL, it uses a scalar variable and defaults null database values to zero.
The SQL stored procedure did allow null values. Using the formula
CASE WHEN @Value1 < @Value2 THEN @Value1 ELSE @Value2 END
If @Value1 was null it returned the ELSE branch of the case statement, instead of returning @Value1.
The procedure was fixed by using ISNULL(@Value1, 0) instead.
This was unique to a specific situation. As described by some of our other readers today, there are many times when null really is the correct value in keeping with your business rules.
Feel free to send your comments or suggestions to btaylor@sswug.org. And thanks for taking my little SQL Challenge.
Cheers,
Ben
$$SWYNK$$
Featured Article(s)
Elemental MDX: Relative Member Functions (Part 3)
Part 3: BI Architect Bill Pearson continues his Elemental MDX Series with an introduction to some "major players" in the "relative" member functions. In this article, we will examine the .CurrentMember, .PrevMember and .NextMember functions. In addition, we will introduce calculated members and named sets as a preview to their examination in future articles.
Featured White Paper(s)
Query Tuning Strategies for Microsoft SQL Server
Written by Quest Software
When you’re looking for a reliable tool to diagnose … (read more)