Null represents the absence of a value in a database column. Null means no value at all, and in that sense null can be thought of as nothing. Should you fear the nothing? Yes, indeed! Because nulls lead to three-valued logic, which is more like a three-headed monster because of all the unintended, counterintuitive, and often just plain wrong results it can cause.
Following is a contrived example showing how nulls can lead to counterintuitive results. Execute the query and check the number of rows that are returned. Compare to the total number of products. Evidently there is a large quantity of products that are simultaneously not red and not some other color. How can this be?
SELECT Name, Color FROM Production.Product WHERE Color = 'Red' OR Color <> 'Red'
The answer is that close to half the products lack a color specification. Their color is null. Those products do of course have a color. How could they not? But their colors are not captured in the database. When the color of a product is not known, it is not possible to know whether that unknown color is red, or whether it is green, or blue, or silver. It is likewise not possible to know whether that unknown color is not red, or not green, or not blue, or not silver. Neither condition in the WHERE clause can be determined to be true, and the row is thrown out.
Here is where you have to pause and think carefully about the business context surrounding a given query. Why is the query being executed? What is the business intent? What columns might be null, and how do you want those null values to affect the query output? Properly dealing with nulls is more than a technical matter, and it’s important to make the time to understand the business ramifications.
Predicates for Nulls
It’s pretty obvious that my example query is intended to return all rows. We can make the query do that by adding an IS NULL predicate:
SELECT Name, Color FROM Production.Product WHERE Color = 'Red' OR Color <> 'Red' OR Color IS NULL
There’s also an IS NOT NULL predicate that we can use to return all products having a known color:
SELECT Name, Color FROM Production.Product WHERE Color IS NOT NULL
Always think about whether any column you mention in a WHERE clause can be null. Some columns won’t ever be null due to database constraints of one sort or another. But whenever a column has the potential to be null, think about whether adding an IS NULL or IS NOT NULL predicate is the right thing to do in order to generate the intended business result.
Speaking of business results, let’s look at a more nuanced query. Because nuance is where nulls become dangerous. Here is a simple query for all the parts that are not red:
SELECT Name, Color FROM Production.Product WHERE Color <> 'Red'
Should we add an IS NULL predicate to this query? You can’t answer that question without understanding the business context. You might need to pick up the phone and call someone. When you make that call, you might find that the business user consuming the query output has never considered whether “not red” should include products for which color is unknown. Time may be needed to explore the business intent, and you could end up on a fascinating journey of discovery alongside the user as you learn more about the business while the user learns more about the data.
Whatever happens, don’t just toss in a NOT NULL predicate without knowing why you are doing so. Likewise, don’t omit a NOT NULL predicate without knowing why you are doing so. Know the result you want for the business, and write predicates to get that result.
Whenever a column might be null, you simply must stop and take time to think through the implications and write your query to deliver sensible results. Think intentionally about what nulls mean for each column mentioned in the query. Only then can you begin to be certain that you're delivering the results your business needs.