Editorials

The SQL “LIKE” Operator

Today I was discussing with a colleague how we might implement a search against a database containing a table of phrases. We want to allow a user to specify a list of words that may be found in a list of phrases. One search method would determine those records where one or more of the search words were found. Another method would find those records where all of the search records were found. A third option might find all the records where all of the search words were found in the order where they were entered.

As developers are want to do, we started evaluating the different ways we could implement this, and how an SQL query might be framed. Using Linq Expressions we could build an expression tree with all of the search words, and join expressions with an Or, or an And. The expression would search our phrase table, using the dot net Contains method. By using Linq expressions, these dot net Expressions get altered at run time into native SQL. So, an expression that searches my term table for records that TermString.Contains(“SSWUG”) will be translated into WHERE TermString LIKE ‘%SSWUG%’.

Now I have come to the key point of my editorial for today. Using .Contains is a perfectly legitimate way to write your code. However, if your table is very large, regardless of how this term column may be indexed, using LIKE ‘%SSWUG%’ could perform rather poorly. The LIKE operation in SQL Server has different behaviors depending on how it is written. If it starts with a wild card, it is not able to take advantage of an index build on the search column in the database. If, instead, you use the dot net operation .StartsWith, the translated SQL would be WHERE TermString LIKE ‘SSWUG%’. Notice it does not begin with a wild card.

Well, that’s nice, and everything. But, it doesn’t produce the logic necessary to filter data the way I need. In that case, one option available to you is to use the Full Text Search capability built into SQL Server. Using Full Text, SQL Server will take each row in my list of terms, and parse out each individual word, creating an index on those words, associating each word with the rows containing them. Using full text capabilities, I can now have an index on every word contained in the table of terms, without parsing and maintaining that list myself.

So, be aware that when you use the LIKE operator, you have a lot of power. However, that power may come with a price of lowered performance.

Cheers,

Ben