Why NULL?
Let me open a can of worms here. Should you use/allow null in your databases? Personally, I’m very flexible on the use of null. I’m not going to move to 4th or 5th normal form in an OLTP database just to get rid of null values; although I’m not against the process either.
What I am finding, however, is that many designs that allow null don’t take into consideration the ramifications of the practice. Here’s a good real world example.
Lately I have been working on converting a TSQL process written in stored procedures into C#. One of my final acceptance steps is to run the query with a number of scenarios, confirming that the C# results are the same as those produced by the TSQL Stored Procedures.
There were two different variant bugs in the comparison. If I fixed the first bug, it introduced the second bug. When I fixed the second bug, it re-introduced the first bug. The problem, in C#, was completely impossible to define. Why, because C#, the way I had written the program, did not allow NULL.
When you perform operations on null in SQL Server, by default the result is NULL. Consider this case statement: CASE WHEN @Val1 < @Val2 THEN @Val1 ELSE @Val2 END.
If @Val1 is null or @Val2 is null the result is false because NULL can not really be evaluated. The CASE statement will always return @Val2 if either value is NULL. In contrast, using datatypes in C# that are not nullable, we translated @Val1 and @Val2 to zero if it is NULL (the appropriate business rule). As a result, in certain cases we returned @VAL2 where @Val1 was the correct choice..
The stored procedure had been used for a while in production systems, and the inaccurate results were unknown. It was the equivalent of trapping and swallowing an error rather than letting the program expose the problem.
What’s the lesson learned?
- If you use or allow nulls you have additional tests you must perform assuring accurate results. You can’t just assume that a working query returns the correct results.
- Ask yourself if you are allowing NULL values when they are not necessary. In my example the business rule was not that missing data was NULL, but rather that missing data was ZERO. The use of ZERO and a non-nullable data type would have reduced the complexity and assured more accurate results.
Are there other lessons that could be brought out? Share your thoughts or experiences to btaylor@sswug.org to post in future newsletters.
SQL Challenge
When returning lists of data in a sequence, we often wish to determine if a row is ODD or EVEN. Typically this is performed using a MOD function. If the sequence MOD 2 = 0 then it is an even number, else it is an odd number. What other way can you test for even/odd without using a MOD function?
Readers Comments:
LINQ – Include as Native language in SQL Server
David:
I have to admit I don’t understand why LINQ is such a dividing topic.
It is, as you state, syntactic sugar on top of SQL, and the underlying concepts are the same, so a DBA should have no trouble understanding what a LINQ query does nor should a dev familiar with LINQ have a problem reading SQL.
A point in favor of LINQ is the more regular syntax versus SQL’s FORTRAN-like syntax where all four CRUD operations have slightly different orderings.
A point in favor of SQL is stored procs to not require recompilation of the query on access. Though on that front, SQL Server could copy PostgreSQL’s “prepared statements”, which allow the query to exist in the source code with a unique identifier and PostgreSQL just builds a stored procedure for that query the next time it sees that unique identifier asked for again.
I am approaching this comparison from the perspective of a 1:1 App:DB relationship. If a single DB is being used by several apps and/or ad-hoc reporting services, then you have to use stored procs to maintain the DB’s integrity, but couldn’t those stored procs also be written with the LINQ syntax if so desired?
I suppose I want to say, from a primarily-dev perspective to DBAs: welcome to our world. Programming languages, patterns, libraries, and so on are in a constant 3-5 year flux for devs, while the database world has only recently begun to be shaken up by Non-Relational DBs and Non-SQL Relational DB syntaxes. (There have been additions/updates to the SQL syntax and improved techniques, but they’ve always been a refinement of the relational SQL paradigm.)
Greg:
LINQ serves no purpose and should just be dropped. Some of the programmers here like to use LINQ and I challenged them to do something with LINQ that I could not do without LINQ and I could do it with less code and faster performance. I proved this with the examples they sent me. An aspect of complexity is having multiple ways to accomplish the same task. LINQ simply adds complexity to an already bloated language.
Richard:
Interesting discussion…Should we teach developers SQL?
Hmmm………….
To my mind Linq breaks a fundamental tenet of good database design, which is that developers know pretty much squat about good database design.
A developers job is (generally speaking, at least in the context of my business) to collect the data and present it to the database so a transaction can be enacted.
So, from my point of view, when speaking to developers, I express my requirements in terms of my starting point for the data and the transformation I wish it to undergo. For example to book an order, we start with maybe a UID for a new record, query our customer table to uniquely identify who we are doing the booking for, collecting some details from the product table, adding a quantity, then committing a record to the db in the appropriate place to record the transaction.
Now I don’t care how the developer does that as long as they take the data I present them with (i.e. I specify the source data in terms of SQL views or queries) and the data that is returned (i.e. their form returns x,y,z to a procedure I have written which writes to the DB). What I care about is the atomicity of the transaction and the security on the database, which is why I always insist on a three tier database design, nothing else makes sense to me. I don’t care about UI and I’m not overly concerned with implementation details (well, only insofar as the developer needs to do it well).
I haven’t met too many developers who are “good” at SQL, but most of them can unambiguously understand and fulfil requirements when I present them with the data transforms that I want in SQL.
As such I’ve yet to see the point of Linq, a developer using Linq is a bit like a finance guy building a “database” in Excel, i.e. a frigging stupid idea.
Best regards
Llewellyn:
Llewellyn here, I gave the Entity Framework (EF) talk with Lynn Langit, and just wanted to clarify a couple of details.
First, the real beauty of using something like EF (or any object relational mapping (ORM)) is that you don’t use LINQ for updates, inserts or deletes. That is all handled by the .Save command. As a developer I always knew a bunch of T-SQL for getting stuff out of the database, but always got fuzzy on the syntax for saving things. As such I tried to keep it simple. insert or update statement off of a identity primary key. even with that it was still fuzzy. It’s nice not to have to keep track of those any more.
Second, you mentioned the "universality of SQL". Saying that it’s cross database companies, while LINQ is microsoft only…. This is kinda true. I mean to say I could write T-SQL in java or php to other databases, and it’s mainly the same.
However, while LINQ might be for .Net only, it has a very different universality. I can do LINQ statements against my Database, but I can also do LINQ against just regular .Net object no database at all. But it doesn’t stop there, I can do LINQ against; Mouse and keyboard events (Reactive Framework Rx) Odata web services sources REST JSON Objects Sharepoint WMI LDAP (Active Directory) XML Amazon Excel Lucene Javascript and many more….
As a developer this universality is much more valuable to me than being able to move databases or programming languages.
Alexander:
My view is that LINQ is a useful option that can allow developers to code is a way that they are accustom to. And if it can do everything that SQL can do then why from a pure developers view point would they want or need to know SQL.
However, this assumes that the developer works for one client or like minded clients that develop in the same way all the time. As I’m sure you are aware all clients are different even if they are separated by team name only. I’ve worked for a client that used .Net to develop old style procedure code applications and another client that took advantage of the latest technology and techniques that were available. No programme or project that I’ve been on has been the same. Therefore, in my opinion you have to understand SQL because it is the common language across all database platforms and you may find yourself on a project that expects you to create database objects that your programs need to consume.
Another reason to know SQL is the fact that in today’s market you are very lucky if you can remain solely a developer. Most likely you’ll have towear a number hats in your career and knowing SQL is an important tool in your toolbox.
Lastly, I remember being introduced to software engineering tools starting in the early 90s. They promised to remove the need for understanding the underlying programming language but the theory wasn’t as close to reality as the salesman suggested when the client bought the tools.
Brad:
I agree that LINQ has its place as an abstraction layer between the database and the data access layer. Objects are nice to use because just about everyone has trained on them. All the folks most concerned about building business systems think in terms of objects – real-world, useful, or fanciful.
The part that has been missed is the major business objective of all code-generating frameworks: faster/cheaper delivery of product to the customer because the code generator writes a bunch of stuff for you. Code generators are just big old wizards. When they are useful, use one when they are not useful don’t use one.
IMO:
Over the past 25+ years of coding various OSs’ & languages, I have used many code (Application, Component, Reporting, …) generators from Big Iron to mid-size to the current web tooling.
Like any other code-monkey, I look at the generated code & think, that’s nice, but why in the world…? Then I have to fix it because it does not work the way I have to have it work to produce the product.
Two lessons I have learned:
A. K.I.S.S
B. Use Screwdriver on Screw, Use Hammer on Nail.
SQL Server is an RDBMS, not a code generator. The design intent of the two different tools are, unsurprisingly, different.
Putting a code generator into SQL Core Services does not meet the intended goal of providing a RDBMS. (Rule A violation)
Another rule I have learned the hard way: Don’t mix different bleep together just because it may be convenient, look good, or sound like a cool idea after your 4th shot of Jack.
All you wind up with is a banana-smelling, lumpy bleep pie with tiny marshmallows, washers and asteroids. Or a hangover. (Rule A and B violations).
That’s all for today.
Cheers,
Ben
$$SWYNK$$
Featured Article(s)
Tackling a Long-running Query with Management Data Warehouse
I shall show you how Management Data Warehouse can be put to effective use if you are doing post-mortem analysis of a performance issue on your SQL Server instance.
Featured White Paper(s)
Query Tuning Strategies for Microsoft SQL Server
Written by Quest Software
When you’re looking for a reliable tool to diagnose … (read more)