A comparison to sliced bread is a rhetorical device evoking the wonder of some new invention or feature in comparison to what went before. Sliced bread revolutionized the making of lunches and toast for breakfast. It meant so much to harried families that a World War II attempt to ban the slicing of bread to save on resources was opposed by New York City Mayor LaGuardia and lasted a mere 49 days from January 18, 1943 until being rescinded on March 8 that same year.
There is sliced bread in SQL.
You'll find it in the form of window functions, which are sometimes in the world of Oracle Database referred to as analytic functions. If you can write a join and compute sums and averages, then drop everything and make window functions the very next thing you learn in SQL. Yes, they are that important.
Here's an example business question involving the Adventure Works database:
For which products does our most recent change in cost represent a decrease in cost?
Following is a painful query to answer this question. I wrote it without benefit of window functions, and the result is a doubly-nested set of correlated subqueries.
SELECT pch.ProductID, pch.StartDate, pch.StandardCost FROM Production.ProductCostHistory AS pch WHERE pch.EndDate IS NULL AND pch.StandardCost < ( SELECT StandardCost FROM Production.ProductCostHistory AS pch_prior WHERE pch.ProductID = pch_prior.ProductID AND pch_prior.EndDate = ( SELECT MAX(EndDate) FROM Production.ProductListPriceHistory AS pch_prior_max WHERE pch_prior.ProductID = pch_prior_max.ProductID ) )
I used to enjoy writing this type of query in the way that one might enjoy putting together a jigsaw puzzle. And so much the better if a few self-joins were involved! But today I'd rather just grab some sliced bread out of the bag and be done with it. Here's the rewrite using a window function:
SELECT pch.ProductID, pch.StartDate, pch.StandardCost FROM ( SELECT ProductId, StartDate, EndDate, StandardCost, ISNULL(EndDate, GETDATE()) AS NoNullDate, LAG(StandardCost, 1) OVER ( PARTITION BY ProductID ORDER BY ISNULL(EndDate, GETDATE()) ) AS PastCost FROM Production.ProductCostHistory ) AS pch WHERE pch.EndDate IS NULL AND pch.StandardCost < pch.PastCost
Execute both queries against Adventure Works if you are able. Here are some things to try while you are doing that:
- Execute the subqueries independently, like you might do when debugging a query. This is not so easily done in the first query, but from the window function query it's a simple copy/paste operation to execute the inner query by itself.
- Execute the window function subquery and notice how it returns all the data needed to answer the business question at hand. Then the outer query filters that data for the answer.
- Add the previous cost to the final output. This is dead-simple to do in the window function solution, and not so easy at all in the double-nested-correlated-subquery solution.
It's simple to show the previous cost in the output from the window function solution. Just add one column to the SELECT list and job done:
SELECT pch.ProductID, pch.StartDate, pch.StandardCost, pch.PastCost
I don't want to bother with how to get the same output from the other query. Window functions make the job so much easier that it's not cost-effective to spend time on alternative solutions.
Recognizing when to use window functions is useful. Train yourself to look for the following indicators of when window functions can save you time and help create easier to understand and better performing queries:
- Detail and aggregate data needs to be compared, or both returned by the same query.
- Data from different rows needs to be compared, or returned together.
- Rows must be ranked against other rows in the result set.
- Running computations such as running sums or a running average are needed.
Learning the execution model is helpful too. Following are a some resources to help you grasp the conceptual details of how window functions work and how to think about using them:
- My Interlude post that is part of my series on Window functions shows conceptually how they are executed in the context of a query.
- Kathi Kellenberger's book, Expert T-SQL Window Functions in SQL Server is my top choice on the topic, and the many five-star reviews on Amazon back me up on that.
No one would run a restaurant or any other food operation today without readily available sliced bread. Likewise, no one should be running database queries without an understanding of window functions and when to use them. Learn them and use them. They are the best thing in SQL since sliced bread.