If you’ve done any amount of SQL development, you’ve probably seen an
EXISTS clause. It might have been part of an
IF statement, or a
WHERE clause, but I’m sure you’ve seen it, and maybe used it yourself. One thing you’ll notice is that all developers have their own way of writing it. Some developers swear by
... EXISTS( SELECT 1 ..., another might use
... EXISTS( SELECT TOP(1) * .... So what’s the deal? Which method is best?
First things first, lets see some of the different variations of the
EXISTS clause, then we’ll take a look at the execution plans and see what’s going on behind the scenes.
*** For these demo scripts we’ll be using the AdventureWorks database.
Ex. 1: SELECT *
Ex. 2: SELECT TOP(1) *
Ex. 3: SELECT TOP(1) with defined column
Ex. 4: SELECT 1
I’m sure there are more examples, but this will be enough for our purposes here. Let’s take a look at the plans and io stats for these queries now and see how each performed.
Below is the plan for the first query in our list. Nothing too exciting here:
|Scan Count||Logical Reads||Physical Reads|
The interesting thing happens when we take a look at the other query plans. Because of optimizations to the
EXISTS clause, all 4 of these queries have an identical execution plan (other that the statement text included in the plan XML of course). From these tests it would seem that the optimizer doesn’t care what you select, it’s just looking at the query predicates to determine if something exists.
That sounds fine and good, but what if we make things a little more complicated? Here we are going to create a function that does some pointless work:
To test this function I ran the following:
On my system this ran for 60 seconds before I stopped it. So what happens if we add this to our
EXISTS? What if we also add a CPU heavy
Lets see the plan:
Notice anything familiar? Again we have the exact same plan. If you run this script yourself you’ll see that it also takes the same amount of time to execute as the rest.
To take this a little further, lets try something we know shouldn’t work (Thanks Kenneth Fischer for this great example!):
Again, same plan, same execution time, and no error. This example just further illustrates that the optimizer doesn’t care what you are selecting in an
EXISTS clause, it only cares about the predicate.
We looked at a few variations of the
EXISTS clause, and then saw how SQL Server handles them. Due to some nice optimizations SQL seems to do the least amount of work possible to check if something exists. This means that it is really up to you and your team to decide which is the most readable.
Personally I like to avoid
* whenever I can, even if it doesn’t cause a performance hit, so I tend to use the syntax in example 4
... EXISTS( SELECT 1 ....