18 May 2007 11:37
tonyrogerson
Using TOP and ORDER BY with UNION / UNION ALL
Got asked this from a friend, I have a column treeID that has values 1 - 5 and for each value I require 20 random rows, so 20 for 1, 20 for 2 etc... Is there a query that will do this other than calling the stored procedure 5 times?
First thought is to use 5 queries and a UNION ALL, however the syntax does not support it ->
select top 20 somedata
from #blah
order by newid()
union all
select top 20 somedata = somedata + 1
from #blah
order by newid()
Gives this error ->
Msg 156, Level 15, State 1, Line 4
Incorrect syntax near the keyword 'union'.
To get round this problem we can use derived tables, think of these as a self-contained table, a bit like a temporary table within the query.
select somedata
from (
select top 20 somedata
from #blah
order by newid() ) as d
union all
select *
from (
select top 20 somedata = somedata + 1
from #blah
order by newid() ) as d
The NEWID() gives a different GUID per row, unlike CURRENT_TIMESTAMP for instance which is evaluated once regardless of how many rows (you always get the same value), NEWID() evaluates for every row.
You can probably do something with a Common Table Expression as well, any takers?
Congrats to Wanderer (see comments) on the first to give me a CTE example, nice!...
WITH Top20Blah (SomeDataOut) as (
select top 20 somedata
from #blah
order by newid() )
select 1, *
from Top20Blah
union all
select 2, *
from Top20Blah
union all
select 3, *
from Top20Blah
union all
select 4, *
from Top20Blah
union all
select 5, *
from Top20Blah
It just gets better, here is a CROSS APPLY example (see comments below) from Adam Machanic, I never even thought about CROSS APPLY - perfect use for it.
select p.somedata, x.treeid
from ( select 1
union all
select 2
union all
select 3
union all
select 4
union all
select 5
) x (treeid)
cross apply (
select top 20 somedata
from #blah
where treeid = x.treeid
order by checksum(newid())
) p
Got another I've just worked out myself...
select *
from (
select somedata,
treeid,
row_number() over ( partition by treeid order by newid() ) as rownumber
from #blah
) as r
where r.rownumber <= 20
order by treeid
Filed under: SQL Server