The Beatles versus the Stones

19 years ago

Here’s a nice brain teaser, just before the end of the year. Despite the title, it is related to SQL Server, not to music!

A common misconception amongst SQL Server users is that a clustered index on a table will ensure that data is returned in the order implied by that index. I have lost count of the number of times I had to disprove this notion.

Of course, there are many cases where the rows returned by a query will be in the order of the clustered index. Here’s a quick illustration, using the lineup that The Beatles had during most of the 60s.

CREATE TABLE Beatles

(LastName varchar(20) NOT NULL PRIMARY KEY CLUSTERED,

FirstName varchar(20) NOT NULL UNIQUE NONCLUSTERED);

INSERT INTO Beatles (LastName, FirstName)

SELECT ‘Lennon’, ‘John’

UNION ALL

SELECT ‘McCartney’, ‘Paul’

UNION ALL

SELECT ‘Harrison’, ‘George’

UNION ALL

SELECT ‘Starr’, ‘Ringo’;

SELECT LastName FROM Beatles;

DROP TABLE Beatles;

LastName

——————–

Harrison

Lennon

McCartney

Starr

The results of this query are in alphabetical order of last name, the column used in the clustered index. Apparently, this is one of the very many cases where the order of the rows is implied by the clustered index, allowing the misconception that this is always the case to spread even further.

But an interesting thing happens if I use the exact same table definition to old and query the lineup of that other famous rock group of the 60s:

CREATE TABLE Stones

(LastName varchar(20) NOT NULL PRIMARY KEY CLUSTERED,

FirstName varchar(20) NOT NULL UNIQUE NONCLUSTERED);

INSERT INTO Stones (LastName, FirstName)

SELECT ‘Jagger’, ‘Mick’

UNION ALL

SELECT ‘Jones’, ‘Brian’

UNION ALL

SELECT ‘Richards’, ‘Keith’

UNION ALL

SELECT ‘Watts’, ‘Charlie’

UNION ALL

SELECT ‘Wyman’, ‘Bill’;

SELECT LastName FROM Stones;

DROP TABLE Stones;

LastName

——————–

Wyman

Jones

Watts

Richards

Jagger

In this case, the names are returned in random order. That makes this a great example to really disprove the notion of a clustered index guaranteeing any output order. (Dare I say that we now finally have solid proof that the Stones are better than the Beatles? Or will that make me subject to loads of flames?)

What’s intriguing in this case is the difference in behaviour for the two examples. Apart from the table name, the two code snippets are exactly the same – and even renaming the tables won’t change the results. So here’s the brain teaser that I’ll leave you to ponder over your glass of champagne: what is the reason that the Beatles are, but the Stones are not returned in clustered index order?

I’ll post the answer in a few days.

On a more personal note, I want to apologize for not posting any new stuff during the last two months. I still have some good ideas in my scratchpad, but I need some time to polish them up to blog quality – and time is the one thing I have been lacking for the past two months. The bad news is that I will probably be short on time for the next month as well, but things are looking more sunny after that.

And with this being my last post of the year, I’ll also grab this opportunity to wish all readers of sqlblog.com a very great 2007, with lots of love and luck in your personal lifes, and lots of interesting SQL challenges and enticing performance gains at work.

Hugo Kornelis

Hugo Kornelis is an established SQL Server Community expert. He writes, blogs, speaks, tech edits, and researches, focusing mostly on SQL Server performance and execution plans. He was the technical editor for the third edition of Grant Fritchey's "SQL Server Execution Plans". In 2018 he started a project to document all behaviour of SQL Server execution plans at his website, "the SQL Server Execution Plan Reference" (https://sqlserverfast.com/epr) When not working for the community, Hugo is busy at his day job: freelance database developer/consultant. Hugo has over 20 years of experience on SQL Server in various roles. He has a strong database design background but has since specialized into query tuning and execution plans.

The table scan from hell

The Beatles versus the Stones – the explanation

No results found.

23 Comments. Leave new

cinahcaM madA
December 31, 2006 18:31

Nice one, Hugo — that is a great example! Took me a moment to figure out, but it makes perfect sense. Have a great New Years!

Loading...

Reply
Kalen Delaney
December 31, 2006 19:18

This is great! I’ll admit, I didn’t figure it out by just looking at the code. I had to run it, and start doing some of my usual query analysis, but I got it pretty quickly after that. I’d love to use this as an example in my classes, if that’s ok with you? It makes me wish I was still a college teacher, and I could actually use this as a pop quiz. 😉

Loading...

Reply
Denis The SQl Menace
December 31, 2006 20:02

Oh yes, I remember this thread “insert into tbl1 select * from tbl2order by field1 doesnt work!”

from the microsoft.public.sqlserver.programming newsgroup(http://tinyurl.com/y8umhm)

There were 121 messages in total going back and forth until Hugo set the op straight

Denis

Denis

—

Edited by admin to remove long link –

Loading...

Reply
AaronBertrand
December 31, 2006 23:49

The answer isn’t too obvious, unless you’ve been dealing with this misconception a lot. 🙂 It’s a very good, succinct example that I’m sure will be referenced in many such discussions in the future.

Loading...

Reply
Dave Markle
January 1, 2007 16:56

Awesome. 3 minutes! Let me pat myself on the back… There. Done. I feel a lot better.

Honestly, I haven’t been fighting this misconception much. I find myself fighting far too many "wtfs" — subqueries like "(SELECT TOP 100 PERCENT FROM foo ORDER BY bar)". I am a bit shocked every time I see that one. And the fact that it usually seems to "work". And the fact that the SQL 2000 EM query designer generates these sorts of hideous abominations.

Loading...

Reply
Hugo Kornelis
January 2, 2007 00:56

Kalen: Of course you can use this in your classes. The pop quiz idea is closer than you may think. I had intended to demonstrate these queries at the end of a session I delivered last month and give a prize to the first to correctly explain the behaviour, but I didn’t have enough time left 🙁 That’s when I decided to post it here instead.

Dave: Allow me to add my back-pattings as well. Figuring this out in just 3 minutes is really good! re "TOP 100 PERCENT … ORDER BY": you’ll probably stop complaining about EM’s view designer adding this once you find out that SSMS’ view designer does the same – even though the "trick" doesn’t even work anymore on SQL Server 2005!

Loading...

Reply
Dave Markle
January 2, 2007 04:07

Wow. Thanks Hugo! A truly disturbing revelation indeed. Since it’s so totally off-topic, I did some more investigation of it on my blog (http://executioniseverything.blogspot.com/2007/01/order-by-in-views-sql-server-2005.html). If you play with the numbers a bit in the TOP clause, it really makes you wonder what the "proper" behavior should be!

Loading...

Reply
Hugo Kornelis
January 2, 2007 16:05

Hi Dave,

The "proper" behaviour is to use the ORDER BY only to determine which rows are included in the TOP xxx subcollection. A view can’t be ordered, by definition, so SQL Server is free to return rows in any order it sees fit.

SSMS should not even allow you to attempt to define an order for a view. I have submitted a bug for this to Connect: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=249248.

Loading...

Reply
Denis The SQL Menace
January 2, 2007 18:53

Yes it does make sense but I think (some) people will say

No problem I will just add

where lastname > ”;

and that fixes that ‘problem’

Loading...

Reply
Ahmed Charles
January 3, 2007 22:10

Well, I thought I’d try to understand why the optimizer does what it does and came up with the following:

CREATE TABLE Stones

(LastName varchar(20) NOT NULL PRIMARY KEY CLUSTERED,

FirstName varchar(20),

LastNameAgain AS CAST(LastName AS char(1597)));

CREATE UNIQUE INDEX IX_FirstName ON Stones(FirstName) INCLUDE (LastNameAgain);

In SQL Server 2005 (9.00.3033.00), the adding the char(1597) column to the nonclustered index, but not the clustered index changes the behavior. A char(1596) column doesn’t. I guess this approach is another way to "solve the problem" without requiring that you change all your queries. 🙂

Loading...

Reply
Hugo Kornelis
January 4, 2007 01:37

Denis, Ahmed,
For the record – there is no “problem”, so nothing needs to be fixed. The query has no ORDER BY, so any ordering in the returned data is correct. My post is just about the strange observation that the Beatles “happen to be” returned in clustered index order, while the Stones are not.
I’ll get back to the reasons why Denis’ where clause and Ahmed’s (very impressive!! – kudos for finding that) computed column change the behaviour of the Stones at a later time. I can’t get into that right now without giving away the answer, and I’ll postpone that a bit, so that new readers can scratch their heads a bit (and others can continue to tear out what’s left of their hair <evil grin>).

Loading...

Reply
Ahmed Charles
January 4, 2007 02:52

I agree, there is no "problem" (unfortunately sarcasm is fairly hard to express in text). I also agree that people should scratch there heads for awhile, though I’m sure it’ll make sense in the end.

Loading...

Reply
Chuck Boyce
January 5, 2007 01:22

Hugo, my friend,

There’s simply no comparison between the Beatles and the Stones. I love the Stones, btw…"Jumping Jack Flash", "Happy", "Miss You". The Stones rock, man.

But the Beatles??? Are you kidding me???????

"Dear Prudence"

"Rain"

"Baby, You’re a Rich man"

That awesome Rickenbacker John Lennon played?

The Stones are at the top of the 20th century and certainly one of the best bands of the 60’s and 70’s, but the Beatles are the most significant Pop group in a millennia.

Loading...

Reply
Hugo Kornelis
January 5, 2007 02:21

Hi Chuck,

Heheh! I expected comments llike yours – in fact, I am surprised I didn’t get a big flame war about which band was better within hours of posting <g>.

I won’t get into this debate. In fact, I even carefully avoided any indication that either band might be better or worse, or even that either behaviour of any of the tables might be better or worse!

Loading...

Reply
ShimMaster
January 16, 2007 18:01

Results are being returned in order by the non clustered index.

Remove the non clustered index and the results are returned in the correct order.

Optimizer chooses to do an index scan when clustered index only exists and via non clustered index when both indexes exist?

Very interesting.

Loading...

Reply
Hugo Kornelis
January 16, 2007 22:21

Hi ShimMaster,

Congratulations! You figured out the correct answer.

But I must object to the usage of the words "correct order" when refering to a query that has no ORDER BY. All orders are correct – they are only different (or rather, they <b>appear</b> to be different).

I tried to publish a new blog post with a detailed explanations and some follow-up to the observations made by Denis and Ahmed, but experienced some technical difficulties. I’ll try again later today.

Loading...

Reply
Pollus
December 28, 2007 21:12

Very nice article ! Short and using a nice title !

Loading...

Reply
Sanchay
March 21, 2013 10:27

Got it 1 minute, but don’t know the reason for behavior, why its considering FirstName at all?

Loading...

Reply
Maniac
November 21, 2014 15:55

This fixes the problem:

SELECT LastName FROM Stones WITH (Index(1));

Loading...

Reply
How about #temp table?
April 5, 2016 21:30

if I already have Stones table and I copy data I need into temp table, will the data in temp table always sorted

CREATE TABLE Stones

(LastName varchar(20) NOT NULL PRIMARY KEY CLUSTERED,

FirstName varchar(20) NOT NULL UNIQUE NONCLUSTERED);

INSERT INTO Stones (LastName, FirstName)

SELECT ‘Jagger’, ‘Mick’

UNION ALL

SELECT ‘Jones’, ‘Brian’

UNION ALL

SELECT ‘Richards’, ‘Keith’

UNION ALL

SELECT ‘Watts’, ‘Charlie’

UNION ALL

SELECT ‘Wyman’, ‘Bill’;

go

SELECT LastName

INTO #MyTemp

FROM Stones Order by LastName

SELECT * from #MyTemp –will this data always be ordered?

DROP TABLE #MyTemp

Also, is the order guaranteed with SELECT LastName FROM Stones WITH (Index(1))"?

Thanks

Loading...

Reply
Hugo Kornelis
April 10, 2016 15:13

Hi, "How about #temp table?"!

Your question:

> SELECT * from #MyTemp –will this data always be ordered?

has a simple answer: No. It might be, but there is no guarantee. If you want to be sure that data is returned in any specific order, then you will always have to add an ORDER BY clause. Nothing else will ever provide guarantees.

Loading...

Reply
Andy Hogg
December 31, 2016 19:20

Great post.

Disappointing to see that the ability to create a supposedly ordered view is still present within the view designer in the latest version of SSMS.

I see the comments in your Connect item.

It doesn’t look like there is any appetite on Microsoft’s part to address this.

Loading...

Reply
Pablone
February 15, 2017 18:38

In the first query you don’t know if it is ordering ascending by first or last name because in both cases the result is the same. In the second query it is not random, it seems it is ordering for first name, the non clustered index. It seems in both cases it is ordering ascending for the non-clustered index, but why it is not using the clusted index?

Loading...

Reply

The Beatles versus the Stones

The Beatles versus the Stones

Like this:

Related Posts

23 Comments. Leave new

Leave a ReplyCancel reply

Hugo Kornelis

Subscribe via Email

Meta

Archives

The Beatles versus the Stones

The Beatles versus the Stones

Share this:

Like this:

Related Posts

23 Comments. Leave new

Leave a ReplyCancel reply

Hugo Kornelis

Subscribe via Email

Meta

Archives