Subquery Madness!

Shards flew everywhere. Coffee ran down the wall. I stood there, my heart pounding, my adrenaline pumping, a bit astonished at what I'd done.

Jonathan, WTF! What was that noise?

Uh, boss, it was, uh, my coffee cup smashing against the wall.

Your coffee cu... Hey! What happened to your keyboard?

Sorry. I guess I sorta smashed it with my fist.

Smashed it! What're you crazy? You trying to get fired or something?

Yes boss. I mean, no boss. Sorry boss.

In my office Gennick. Now!

Yes sir.

Sit down Jonathan. Relax a bit. You want a coffee? No, never mind, you've had enough coffee for today. Why don't you just pull yourself together and tell me what this is all about.

Well boss, you see, we have this table with a character column that contains numeric data for some rows and character data for others, and...

What! You mixed numeric and character data in the same column? Why in blazes did you do that? What were you thinking? You know better than that.

It wasn't me boss, honest! It was those consultants you hired. They did it.

Sounds like what Adam said after he ate the fruit.

Yeah, it's an inherited trait, I guess.

So it is Jonathan, so it is, and for all of us. Well, let's get back on track here. Tell me more about this mixed-up, character/numeric column.

I'm creating a report that depends on the numeric data in that column. I need to list all occurrences of non-zero data. To isolate those rows with numeric data in the column, I wrote the following subquery:

     SQL> SELECT flag, TO_NUMBER(num) num
        2 FROM subtest
        3 WHERE flag IN ('A', 'C');
     F NUM
     - ----------
     A 100

Then, I use that subquery as the rowsource for my main, report query:

     SQL> SELECT *
        2 FROM ( SELECT flag, TO_NUMBER(num) num
        3 FROM subtest
        4 WHERE flag IN ('A', 'C') )

        5 WHERE num > 0;

     ORA-01722: invalid number

Only it doesn't work! Boss, you can see that there's no non-numeric data flowing into the main query, and yet I still get an "invalid number" error. I don't understand. I'm so frustrated, I could just...

I can see you're frustrated. Otherwise you wouldn't have thrown that coffee cup. Jonathan, you've fallen into the trap of thinking procedurally, and you know SQL isn't procedural.

No, that can't be boss. I'm not thinking procedurally here at all.

Sure you are. But you're doing it in such a subtle way that you don't even recognize it.

How so, boss?

Let's back up a bit Jonathan. You know, like everyone else around here, that SQL is free to evaluate predicates in any order, right? For example:

     WHERE TO_NUMBER(num) > 0 AND flag IN ('A', 'C')

Sure boss, I know that either the greater-than (>) condition or the IN predicate can be executed first.

Right Jonathan. And you also accept, don't you, that using parentheses in a WHERE clause doesn't change anything, right? As in:

     WHERE (TO_NUMBER(num) > 0 AND (flag = 'A' OR flag = 'C'))

Sure boss, I understand this too. The parentheses specify the order of precedence for evaluating the Boolean expression, but the database engine is still free to evaluate TO_NUMBER first, or to evaluate the nested, OR condition first.

Right Jonathan, right. Let's look at a more complicated case. Let's add a hypothetical join to your subquery:

     SELECT flag, TO_NUMBER(num) num
     FROM subtest JOIN other_table
     ON =
     WHERE subtest.flag IN ('A', 'C');

Now, Jonathan, you do know, don't you, that the database engine might choose to evaluate the join condition first, or it might choose to eliminate rows from subtest first, by evaluating the condition on subtest.flag? In other words Jonathan, the entire join does not have to take place before the query begins evaluating rows against the WHERE clause predicates. Furthermore, and this is important, it's entirely possible for rows that would not pass the JOIN conditions to be tested against the WHERE conditions.

Sure boss, I understand all that. I've no problem with that behavior. I must admit though, that last bit you said is interesting. I never thought about joins quite that way before. The join is the rowsource to the query, and yet the WHERE clause might touch rows that are not returned by the join.

Exactly! And it's the same with subqueries. You're upset because your outer query is "seeing" a row that is not returned by your subquery. Isn't that right?

Spot on boss. And it still bothers me. Joe Celko, a past member of the ANSI SQL standards committee, tells me that parentheses, such as are around my subquery, make a difference. Joe says:

In Standard SQL, the scoping rules are probably what will determine the action. In the following, Foo and Bar are both exposed to the outer query.

     SELECT *
     FROM Foo
     ON Foo.x = Bar.y
     WHERE Foo.a = 42;

However, in this next version, Foo and Bar are hidden from the outer query and the derived table Foobar is seen:

     SELECT *
     FROM (Foo
     ON Foo.x = Bar.y) AS Foobar (a, x, y)
     WHERE Foobar.a = 42;

This time, the engine has to act as if Foobar is materialized.

And that last sentence is important boss, that "the engine has to act as if Foobar is materialized."

Hmmm... Interesting point Jonathan, but I'm not sure I agree with Joe this time. Bear with me for a moment. It is true, Jonathan, that when you select from a table, as in:

     SELECT flag, TO_NUMBER(num)
     FROM subtest
     WHERE flag IN ('A', 'C');

that the database engine will not apply your predicates to rows from some other table. There's no need to. All the possible rows are in subtest, and the database has no reason to look in any other table. However, let's change the situation. Let's look again at the query you are trying to execute:

     SELECT *
     FROM ( SELECT flag, TO_NUMBER(num) num
            FROM subtest
            WHERE flag IN ('A', 'C') )
     WHERE num > 0;

What's the rowsource of this query Jonathan?

Why, boss, it's the table returned by the subquery.

Yes, it is. And where do the rows in that table come from Jonathan?

Why, from the subtest table.

Exactly! And the database engine isn't stupid Jonathan. It knows where those rows ultimately come from.

So the subquery is not opaque then?

Opaque? As in not "seen" from the main query? That's a good word Jonathan, but no, the subquery is not opaque. The optimizer "sees" the subquery, and knows that it must ultimately look at rows from the subtest table. Knowing that, the optimizer is free to test those rows against your num > 0 predicate first, if that's deemed more efficient.

So you think that's what's happening boss?

Almost certainly, at least from a conceptual standpoint. Tom Kyte explains it well in his AskTom response to a question about invalid number errors.

If you get right down to what our database engine is doing, Jonathan, in this specific case the optimizer is probably rewriting your query to merge the subquery into the main query, and the result probably looks like the following:

     SELECT flag, TO_NUMBER(num) num
     FROM subtest
     WHERE TO_NUMBER(num) > 0 AND flag IN ('A', 'C');

But boss, is that really correct according to the SQL standard? Phil Shaw, a longtime member of both the ANSI and ISO SQL standards committees, speaks to this very point:

Merging the main and subqueries isn't the right thing to do when the subquery is filtering rows that would raise an exception in the main query. SQL has always specified that the FROM, WHERE, GROUP BY, HAVING, and SELECT clauses are effectively evaluated in that order (see the SQL Foundation, section 7.4, GR1).

Furthermore boss, Peter Gulutzan, Software Architect at MySQL, also points to the 2003 ANSI SQL standard, Part 2, Foundation, Section 7.4, which defines a table expression as follows:

     <table expression> ::=
     <from clause>
     [ <where clause> ]
     [ <group by clause> ]
     [ <having clause> ]
     [ <window clause> ]

And boss, under "General Rules", in paragraph 1, you'll find:

If all optional clauses are omitted, then the result of the <table expression> is the same as the result of the <from clause>. Otherwise, each specified clause is applied to the result of the previously specified clause and the result of the <table expression> is the result of the application of the last specified clause.

Look boss, at the second sentence in this paragraph. Doesn't it clearly state that the <where clause> must be applied to the *results* of the <from clause>? Hence, wouldn't it be incorrect to apply a WHERE clause predicate to a row not returned by a FROM-clause subquery?

Wow! Jonathan, I wish all my developers would dig into an issue like you do. You've raised an excellent point here, and I don't know how to respond. Clearly, our own database vendor sees things differently from Phil and Peter. I don't know how to reconcile our database's behavior with the section of the standard you've just quoted.

But Jonathan, I can tell you that there is yet another viewpoint that you haven't thought about yet.

There is boss? A third viewpoint? On this same issue? Tell me boss. I'm all ears.

Jonathan, if you talk to Dan Tow, who wrote the very excellent "SQL Tuning" book from O'Reilly Media, you'll find that he takes a very set-oriented approach, "holding that the intermediate steps should be irrelevant to the functional result, which certainly includes the generation, or not, of an error (though the intermediate steps are inevitably relevant to the performance)":

When an RDBMS generates an error evaluating some function (dividing by zero, converting something that doesn't convert,...) instead of instantly returning an error, it should just silently *mark* the row as a potentially-error-generating row and continue processing. If the row is *later* discarded (perhaps after some joins and filtering on the joined-table columns, or just evaluating some more conditions on the same table) because some WHERE-clause condition (which is evaluated unambiguously *without* getting an error) is false for that row, then the RDBMS "forgets" the error - I reason that since the error was generated on a row the application wasn't asking for, anyway, it is irrelevant to the query.

Hey, boss, that's neat! Dan sidesteps the issue completely. That's a unique viewpoint I haven't heard anywhere else.

Yes it is Jonathan. Dan's a deep-thinker when it comes to SQL. He'll open your mind to new ideas, that's for sure. By the way, related to the issue we've been discussing, one of my colleagues made the following comment the other day:

From a usability standpoint, it would be really nice if the optimizer could notice that a predicate may introduce conversion errors that are not apparent in the query text and warn the query writer.

Sure boss. I can see where such a warning would be helpful. It's all too easy to write query such as I did, walk away believing it to be correct, only to later find that it fails as a result of an execution plan change, perhaps because the DBA chose to compute fresh statistics.

Absolutely, Jonathan. If there was a potential land mine in your SQL, you'd want to know sooner rather than later.

I'm beginning to see now, boss, that the proper mental model for SELECT statements is something along these lines:

A SELECT statement describes the state of the result set when it is returned to you.

And furthermore, at least the way that our vendor looks at things, any explicit or implicit function that is applied to a column in a predicate may be applied to any value in that column, for any row in the *ultimate* source table.

Jonathan, I do believe you're catching on.

So boss, getting back to my query, I began by writing this:

     SELECT *
     FROM ( SELECT flag, TO_NUMBER(num) num
            FROM subtest
            WHERE flag IN ('A', 'C') )
     WHERE num > 0;

And the optimizer rewrote my query by merging the subquery into the main query:

     SELECT flag, TO_NUMBER(num) num
     FROM subtest
     WHERE TO_NUMBER(num) > 0 AND flag IN ('A', 'C');

Jonathan, yes, that's undoubtedly what happened.

It's all so clear now boss. The optimizer's rewrite of my query is ok, at least in the eyes of our particular vendor, because, conceptually, num > 0 can be executed on any row in subtest, and num > 0 is equivalent to TO_NUMBER(num) > 0.

Precisely! Jonathan. And you received an error when that predicate was applied to a character-string value that could not be interpreted as a number. You need to take another look at your num > 0 predicate, and you need to write it in such a way that it can be evaluated successfully even on non-numeric column values.

Boss, this has been great. Very enlightening. I feel much better now.

Good! Why don't you clean up that broken coffee mug and requisition a new keyboard. And no more coffee for the rest of the day, ok?

Sure boss. Thanks boss. No more coffee. Got it.


My sincere thanks to all the many SQL experts both on and off the Oracle-L list who took the time to discuss a very subtle aspect of FROM-clause subqueries with me. Thanks also to those SQL experts who allowed me to quote them in this article. The conversation between me and my boss is fictional. The quotes from Phil Shaw, Peter Gulutzan, Dan Tow, and Joe Celko are all real, and are used here with permission.

Follow-up and Related Articles

Chris Date wrote two follow-up pieces in which he comments on this article: A Cure for Madness, and More on Cure for Madness. These were originally published on Fabian Pascal's Database Debunking site. They now appear in Chris's book with Apress: Date on Database: Writings 2000-2006.

Dan Tow wrote the article Wrong-Errors Bugs: A New Class of Bug? for the O'Reilly Network, laying out his thinking about how intermediate results should be handled.

Example Data

You can use the following SQL statements to create and populate the subtest table used by the example queries in this article:

DROP TABLE subtest;
CREATE TABLE subtest (
   flag VARCHAR2(1),
   num VARCHAR2(3));
INSERT INTO subtest VALUES ('A', '100');
INSERT INTO subtest VALUES ('B', 'xxx');


Be sure to analyze the table. Without statistics, the query that fails in my article may well succeed. In my own testing, the lack of statistics consistently leads Oracle Database 10g to choose an execution plan that avoids the numeric conversion error.


SQLJonathan GennickSQL