Which Table, Which Column?

When I first learned SQL, I was enthralled by the idea that I could just declare what data I wanted and let the database software figure out how best to get that data. Let the computer do the work, right? That same attitude also influenced the way in which I wrote SQL statements: when I wrote a query involving multiple tables, I figured it was the database's job to sort out which column belonged to which table. For example:

     SELECT emp_nm, dept_nm 
     FROM emp_mst, dept_mst
     WHERE emp_dept_id = dept_id;

The database has all that metadata, so let the database figure out that emp_nm and emp_dept_id are columns of emp_mst, and that dept_id and dept_nm are columns of dept_mst. In the arrogance and ignorance of my youth this attitude made perfect sense, but I've since learned humility, and I've learned to use table aliases consistently too.

Many potential problems lurk when you do not fully-qualify column names using either table names or table aliases. In this article I'm going to focus on just one such problem recently brought to my attention by a perplexed reader. To begin with, I have two tables, emp_mst and dept_mst, with the following columns and data:

     SQL> SELECT * FROM emp_mst;

         EMP_ID EMP_NM                         EMP_DEPT_ID
     ---------- ------------------------------ -----------
              1 Jonathan Gennick                         1
              2 Mayank Agarwal                           2

     SQL> SELECT * FROM dept_mst;

        DEPT_ID DEPT_NM                        DEPT_MGR_ID
     ---------- ------------------------------ -----------
              1 Accounts Payable                         1
              2 Data Processing                          0

Let's say I wish to write a query that returns emp_mst data for all department managers. My CEO needs this information for a presentation he's making to the board of directors, and he asked me for it at the very last minute, so I'm a bit distracted with other thoughts, but this is a simple enough query, so I sit down and quickly write the following:

     SELECT * 
     FROM emp_mst
     WHERE emp_id IN (
          SELECT emp_id 
          FROM dept_mst);

This looks good. I'll test my new query by executing it in SQL*Plus:

     SQL> SELECT * 
       2       FROM emp_mst
       3       WHERE emp_id IN (
       4            SELECT emp_id 
       5            FROM dept_mst);

         EMP_ID EMP_NM
     ---------- ------------------------------
              1 Jonathan Gennick
              2 Mayank Agarwal

Looks good! The query works. I quickly run the query again to generate a printed report, hand that off to my CEO, and away he goes to the board meeting to give his presentation. On his way out the door he gives me an "atta-boy" for getting the report done so quickly; I go home and sleep that night with visions of promotions, raises, and year-end bonuses dancing in my head.

The very next day I arrive at work only to get chewed out by my boss, and then later by the CEO himself. I gave the CEO a report full of bad data, and as a result his presentation to the board of directors was a complete flop. I'm embarrassed and humiliated. I feel like a two-bit junior programmer fresh from college, and I think I can forget about that promotion.

What happened? What went wrong? In my hurry to generate the report for my CEO, I'd mistakenly referenced the wrong column in my subquery. I'd intended to write the following subquery to return a list of manager IDs from the dept_mst table:

     SELECT dept_mgr_id FROM dept_mst

Instead, being a bit distracted, I'd written the following, incorrect subquery:

     SELECT emp_id FROM dept_mst

Realizing now that I'd written emp_id where I'd intended to write dept_mgr_id, I sit in my cube, nursing my now-cold cup of coffee, wondering how I could have made such an elementary blunder. Then I begin to wonder why my query even worked to begin with. After all, I selected a column from dept_mst that didn't exist in the dept_mst table. Why oh why did the database return results? What business did the database have executing what was obviously a bad query? I set my coffee cup aside, type in my subquery, and get the following results:

     SQL> SELECT emp_id FROM dept_mst;
     SELECT emp_id FROM dept_mst
     ERROR at line 1:
     ORA-00904: "EMP_ID": invalid identifier

Sure enough, the database recognized that the emp_id column didn't exist in the dept_mst table and returned a corresponding error message. Well <dirty words>! Why did the same SELECT not return an error when used as a subquery when I generated that report for my CEO? Why did the database not catch my mistake when it counted? Why indeed? The answer to these questions lies in an understanding of the ANSI SQL standard's prescribed behavior for resolving column names used in an SQL statement. When we don't explicitly link column names to their corresponding table, the database software must, in accordance with the ANSI SQL standard, attempt to "divine" our intent. In the case of the query which caused me so much grief, the database did the following:

    1. Checked dept_mst, the table listed in the subquery's FROM clause, for an emp_id column. No emp_id column was found in dept_mst, so the database continued its search.

    2. Checked emp_mst, found an emp_id column there, and consequently presumed that I was referencing that column, the emp_mst.emp_id column, from my subquery.

The database had no way of knowing that I intended to write dept_mgr_id in the subquery rather than emp_id. My intent was to write a non-correlated subquery, executed just once, that returned a list of department manager IDs from the dept_mst table. Because I didn't use table aliases to fully-qualify my column names, the database interpreted my subquery as a correlated subquery linked to the outer query via the emp_id column. The resulting subquery was executed once for each row retrieved by the outer query, and generated erroneous results. Consequently, my outer query also returned erroneous results. The outer query retrieved all records from emp_mst. For the first record retrieved, emp_id was 1, so the following version of the subquery was executed:

     SELECT 1 FROM dept_mst

For the second emp_mst record, emp_id was 2, so the following subquery was executed:

     SELECT 2 FROM dept_mst

In each case, the subquery was selecting the current emp_id value from dept_mst, so of course emp_id was in the list of values returned by the subquery. Duh! For each row returned by the outer query, the IN predicate returned TRUE, so the query returned all rows from emp_mst when my intent was to return rows only for department managers.

Understanding now why my query executed and returned incorrect data rather than generating an error message, I pour myself another cup of coffee and sit back to ponder what I can do in the future to reduce the risk of ever making such a mistake again. I come up with two practices to adopt:

    1. Test each subquery independently of the main query

    2. Use table aliases to fully qualify column names in a query

My subquery was very simple, so simple that I didn't see the need to test it independently of the main query. Not testing it independently was a mistake. Had I been a bit more rigorous in my approach, I would have tested the subquery apart from the main query, and the resulting error message would have clued me in to my blunder. In the future, I'll take care to avoid such "shortcuts".

The other practice I resolve to adopt is to use table aliases in any query involving more than one table to explicitly declare my intentions as to which column is associated with which table. I intended, in my subquery, to select an employee ID column from the dept_mst table. The problem was that my memory failed me, and I thought emp_id was the correct column name. Had I used table aliases to make clear my intentions, my query would have failed straight-away:

     SQL> SELECT * 
       2  FROM emp_mst e
       3  WHERE e.emp_id IN (
       4     SELECT d.emp_id
       5     FROM dept_mst d);
     SELECT d.emp_id
     ERROR at line 4:
     ORA-00904: "D"."EMP_ID": invalid identifier

The alias in front of emp_id in the subquery removes all doubt as to which table I am selecting from, and the error message makes it painfully clear that I used the wrong column name. Had I encountered this error, I would have immediately realized my mistake, and would have used the following, correct query to generate that report for my CEO:

     SELECT * 
     FROM emp_mst e
     WHERE e.emp_id IN (
        SELECT d.dept_mgr_id
        FROM dept_mst d);

As I sip the last of my coffee, I'm inspired with the idea of writing a story about my experience and posting it on the company Intranet for the benefit of my fellow developers. My boss reads the story and tells his boss. Word gets up to my CEO, who appreciates that I can learn from my mistakes, and he is especially pleased that I'm sharing my newfound knowledge. He even takes me to lunch. I drift off to sleep that night (again) with visions of promotions and bonuses dancing in my head...