When can you use select *

Alex has a great post questioning whether you should ever go against the "never use select * " mantra.

I for one think Alex has a valid point. The only point against select * is that you have to look elsewhere to find out the columns that are being returned.

However if you already have the list of columns in your query then its just a cut and paste to put them in the top level select.

and finally its only going to save you a few seconds. Far more than its taken to read Alex's post and for me to write this one. So just be a good citizen of planet SQL and "don't use select *".



-
Published 22 October 2007 13:15 by simonsabin

Comments

22 October 2007 14:05 by DB

# re: When can you use select *

Whilst we are at it , when we present production code snipits - why not utilise all best practices ? (Object owner qualification, aliasing etc).

[Agree: select * - is not a great idea, in any circumstance]

22 October 2007 15:41 by mgroves

# re: When can you use select *

I wouldn't say it's <em>never</em> a good idea.  But I think it's very, very rare when it is a good idea.

22 October 2007 16:36 by Jamie Thomson

# re: When can you use select *

I wholeheartedly agree. "SELECT *" has no business being in delivered code.

-Jamie

22 October 2007 18:07 by Alex_Kuznetsov

# re: When can you use select *

Well if the half life of your code is just a month or two, your copy and paste approach is perfectly acceptable, why not.

However if you consider maintenance, copy and paste approach starts looking less optimal.

One year later it will take you more than just a few seconds to verify that two column lists are identical.

Also if you ever need to modify your column list, you'll have to do it in more than one place, and that is prone to errors.

So your original few seconds might cause much higher maintenance costs later.

This is why in most modern programming approaches "cut-and-paste" is frowned upon.

This is why at the time of this writing many C#/C++/Java programmers prefer to reuse code rather than cut-and-paste it.

So from where I sit it looks like sometimes "being a good SQL citizen" means writing a difficult-to-maintain code.

What for?

23 October 2007 00:34 by Scott Frigard

# re: When can you use select *

WTF? As a contractor the Select * has been one the greatest banes of my life. Too many times I am faced with code embedded in applications designed to live for a couple months, but are still lingering around years later with the dreaded *. The database must change, but wait, can't do it 'cause it will break production code. So we end up creating other objects that hold these new attributes or some other kludge. Why? Because some developer couldn't be bothered with putting in the list of attributes they need. And don't talk to me about where clauses.

23 October 2007 12:15 by GrumpyOldDBA

# re: When can you use select *

totally agree with scott

23 October 2007 14:38 by Alex_Kuznetsov

# re: When can you use select *

I am not completely sure if anybody has read my original post, so let me repeat it:

If you google up 'never use SELECT * in production code' you will get many many hits.

Clearly in many cases using SELECT * makes your code vulnerable to changes in underlying table(s) and as such should be avoided.

Yet I don't think the common rule of thumb 'never use SELECT * in production code' should be used

without thinking, I don't think it should be blindly applied in all the situations.

For instanse, consider a request to display top five sales for every region and a simple query that satisfies it

SELECT SalesAmount, RegionName, LongListOfOtherColumns

FROM(

SELECT SalesAmount, RegionName, LongListOfOtherColumns,

 ROW_NUMBER() OVER(PARTITION BY RegionName ORDER BY SalesAmount DESC) AS rn

FROM Sales.Sales

) t WHERE rn < 6

Does repeating SalesAmount, RegionName, LongListOfOtherColumns twice make your code nay better/safer/readable?

I don't think so. Consider the following alternative:

SELECT *

FROM(

SELECT SalesAmount, RegionName, LongListOfOtherColumns,

 ROW_NUMBER() OVER(PARTITION BY RegionName ORDER BY SalesAmount DESC) AS rn

FROM Sales.Sales

) t WHERE rn < 6

It is shorter, easier to maintain, and just as robust as the original query.

You have explicitly listed the columns

in your subquery, so you are already insulated from any changes in Sales.Sales table.

You don't need an additional layer of protection.

Repeating the list of columns twice only makes your code more prone to errors.

Scott, without arguing that SELECT * has been the greatest bane of your life, can you explain why your rule of thumb applies _in_this_particular_case_.

23 October 2007 20:31 by Scott Frigard

# re: When can you use select *

Actually I did read your comment and I did understand what you meant. I will concede that the risk is low when using Select * in conjunction with a derived table. That doesn't mean it should be done. On this subject I am very passionate. Easy and safe doesn’t make it right.

One of the keys to having maintainable code is having code that very readable. This means taking all the steps necessary to help those poor souls who might inherit this code. This included always listing the attributes return in the select statement, formatting the code and avoiding other similar shortcuts.

I was recently faced with attempting to update an application intended to only be used for six to nine months that had been in place for over five years and was going to stay in place for another two years. This code made great use of Select *, Order by attribute position (not attribute name) and other interesting techniques. In addition, the previous developer wrote each SQL call as one long line because it took up less space in his IDE.

I was faced with the requirement of adding attributes to the tables, but couldn’t because it broke the code. Require to rename some attributes, but couldn’t because it broke the code. In the end I ended up renaming many table and creating a large number of views to support the legacy code. At this point I was able to enhance the database to support all the changes required. Why did I go through all this work, because it was easier to just use that d**m Select *.

I take pride in that all my code is documented, consistently formatted and written in such a manner as to make it as easy as possible for the next developer who has to work on this code. I can honestly say that when I come back to a previous location or talk to developers who have worked on something I have produced, that they never have a bad thing to say about my code. That’s worth its weight in gold when you are a contractor.

24 October 2007 11:37 by DamianMulvena

# re: When can you use select *

I thought it was preferable to use select * within

if exists (select * ...)

As I understand it, using a column list here might force a particular branch in the optimiser, while * leaves the optimiser to choose the best option.

Does anyone know the relative differences of "select *" vs "select 1" in this construct?

30 December 2007 06:31 by Mike C.

# re: When can you use select *

Look at the following code snippet for 5 seconds, and tell us exactly what is being returned by the outer SELECT query:

SELECT *

FROM(

SELECT SalesOrderID, OrderDate, (SELECT CustomerType FROM Sales.Customer WHERE Sales.Customer.CustomerID = Sales.SalesOrderHeader.CustomerID) AS CustomerType, (SELECT LastName + ', ' + FirstName FROM Person.Contact WHERE Person.Contact.ContactID = Sales.SalesOrderHeader.ContactID) AS CustomerName, (SELECT Name FROM Person.StateProvince WHERE Person.StateProvince.StateProvinceID = (SELECT StateProvinceID FROM Person.Address WHERE Person.Address.AddressID = Sales.SalesOrderHeader.ShipToAddressID) ) AS StateProvinceName, (SELECT Name FROM Person.CountryRegion WHERE Person.CountryRegion.CountryRegionCode = ( SELECT CountryRegionCode FROM Person.StateProvince WHERE Person.StateProvince.StateProvinceID = ( SELECT StateProvinceID FROM Person.Address WHERE Person.Address.AddressID = Sales.SalesOrderHeader.ShipToAddressID )  ) )AS CountryRegion,

ROW_NUMBER() OVER(PARTITION BY TerritoryID ORDER BY SubTotal DESC) AS rn

FROM Sales.SalesOrderHeader

) t WHERE rn < 6;