20 June 2006 10:41 tonyrogerson

Turning stuff into "Camel Case" without loops

My good friend Barry Dorrans IM'd me asking if I had a function to convert text into Camel Case but he found one before I could reply, anyway until 2005 we've had to pretty much resort to loops or cursors - but no longer! We can now take advantage of [TEXT()], PATH('') and FOR XML and do it in a single SELECT statement in a set based fashion!

We do need a sequence table for this but it can be a permanent table instead of the table variable i'm using just to show the action...

First we need our sequence table, again - probably best this being permanent as you don't want to build this each time...

set nocount on

declare @seq table (
   seq int not null primary
key
)

declare @i int
set
@i = 1
while @i <= 50
begin
   insert @seq values( @i
)
   
set @i = @i + 1

end

Now we can get on and do the logic, notice its really only a single statement, to break this down I will first demonstrate it using a local variable... 

declare @word varchar(100)

set @word = 'tony rogerson'

select camelcase = replace( word, '&#x20;', ' ' )
from
(
   select case when seq = 1 or substring( @word, seq-1, 1 ) = ' ' then upper( substring( @word, seq, 1 )
)
                     
else substring( @word, seq, 1 ) 
            end as [text()]
   from @seq
   where seq <= len( @word
)
   
for xml path( ''
)
      
) as t( word )

Thats it! If you have a table you want to CamelCase then we simply do this...

declare @names table (
   
word varchar(50) not
null

)

insert @names ( word ) values( 'tony rogerson' )
insert @names ( word ) values( 'barry dorrans'
)
insert @names ( word ) values( 'trevor dwyer'
)
insert @names ( word ) values( 'simon sabin' )

select camelcase = replace( 
         ( select case when seq = 1 or substring( n.word, seq-1, 1 ) = ' ' then upper( substring( n.word, seq, 1 )
)
                             
else substring( n.word, seq, 1 ) 
                     end as [text()]
            from @seq
            where seq <= len( n.word
)
            
for xml path( ''
)
            )
      
, '&#x20;', ' ' )

from @names n

So, what are we doing here? Well, we are taking advantage of the new [text()] feature of the FOR XML that allows values concatenation, unfortunetly it turns the single space into a control tag hence I do a replace on &#x20 to a single space.

Elegant in its simplicity and will scale extremely well - and of course, no need for a UDF anymore!!

Ok, so after Colin Leversuch-Roberts beating me up over not support '-' and 'von' I've expanded the logic thus...

declare @names table (
   word varchar(50) not
null

)

declare @breaks table (
   break_on_character char(1) not
null

)

declare @exclude table (
   
subtext varchar(20) not
null

)

insert @names ( word ) values( 'colin leversuch-roberts being a pain lol' )
insert @names ( word ) values( 'jim von trapp' )

insert @breaks ( break_on_character ) values( ' ' )
insert @breaks ( break_on_character ) values( '-' )

insert @exclude ( subtext ) values( ' von' )

select CamelCase = replace( 
      
( select case when seq =
                              or ( substring( n.word, seq-1, 1 ) IN ( select break_on_character from @breaks ) 
                                 and not exists ( select
*
                                                         
from @exclude e
                                                         where subtext = substring( n.word, seq-1, len( subtext ) )

                                    ) )
                           
then upper( substring( n.word, seq, 1 )
)
                           
else substring( n.word, seq, 1 ) 
                  end as [text()]
   from @seq
   where seq <= len( n.word
)
   for xml path( '' )

   )
   , '&#x20;', ' ' )

from @names n

Make sure you check out my more recent entry on removing the control characters, e.g. &amp; and &#x20; from the output instead of relying on lots of REPLACE's -> http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/07/06/871.aspx

 

Filed under:

Comments

# re: Turning stuff into &quot;Camel Case&quot; without loops

20 June 2006 13:42 by Colin Leversuch-Roberts

those of us with double barrelled names will be disapointed!!

Colin Leversuch-roberts  !!!

# re: Turning stuff into &quot;Camel Case&quot; without loops

20 June 2006 15:09 by tonyrogerson

I think this will do the trick, of course you could have a table of control characters and cross join to that as well if you don't want to hard code the IN.

declare @names table (
   word    varchar(50) not null
)
insert @names ( word ) values( 'colin leversuch-roberts being a pain lol' )

select camelcase = replace(
       (   select case when seq = 1 or substring( n.word, seq-1, 1 ) IN ( ' ', '-' ) then upper( substring( n.word, seq, 1 ) )
                       else substring( n.word, seq, 1 )
                  end as [text()]
           from @seq
           where seq <= len( n.word )
           for xml path( '' )
           )
       , '&#x20;', ' ' )
from @names n

# re: Turning stuff into &quot;Camel Case&quot; without loops

20 June 2006 17:09 by Colin Leversuch-Roberts

my work colleague Jim von Trapp feels a little let down though!!! - sorry !!

# re: Turning stuff into &quot;Camel Case&quot; without loops

20 June 2006 18:04 by tonyrogerson

There you go Colin - fully dynamic now and supports von and probably other combinations.

# re: Turning stuff into &quot;Camel Case&quot; without loops

21 June 2006 17:33 by Colin Leversuch-Roberts

Maria von Trapp and all the little von Trapps thank you, along with myself - I actually sort of think for simplicity of coding a function still might be best - but I guess it all depends upon circumstance, I also wonder if C# would do it better ?

# re: Turning stuff into &quot;Camel Case&quot; without loops

22 June 2006 08:48 by tonyrogerson

A function or CLR function would be procedural whereas the solution talked about here is set based, so, you'd need to call the function and do the looping for each row in your recordset - I'll do some benchmarking to see the differences and report back.

Itzik Ben-Gan rightly points out I'm missing an ORDER BY on the FOR XML subquery to make sure the concatenation takes place in the correct order.

# re: Turning stuff into &quot;Camel Case&quot; without loops

30 June 2006 13:19 by LearnSqlServer.com

A little more work is required on this to make it fully "fit" but what an awesome solution - thanks for sharing, Tony :)  I mean it - so creative :)

Here's my implementation as a UDF. I've added a few things here and there, mostly the LOWER() function in case the user passes in 'TONY ROGERSON RULES' and an argument that optionally removes all spaces (to return 'TonyRogersonRules'). I've also done what we all do: formatted it for my own style! I use (as I'm sure most of us do) a permanent Sequences table and I've included the creation script at the top.

SET NOCOUNT ON
CREATE TABLE SequenceNumbers (SeqNo INT IDENTITY(1,1) PRIMARY KEY)
GO
WHILE ISNULL(IDENT_CURRENT('SequenceNumbers'), 0) < 8000
INSERT SequenceNumbers DEFAULT VALUES
GO
CREATE FUNCTION CamelCase (
@InputString NVARCHAR(1024)
, @StripOutAllSpaces BIT = 0
)
RETURNS NVARCHAR(1024)
AS
BEGIN
-- Formula written by Tony Rogerson: http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/06/20/832.aspx

DECLARE @InputStringInTableFormat TABLE ( InputString VARCHAR(50) NOT NULL )
DECLARE @Delimiters TABLE ( Delimiter char(1) NOT NULL )
DECLARE @InputStringsToExclude TABLE ( ExcludedWord VARCHAR(20) NOT NULL )
DECLARE @CamelCase NVARCHAR(1024)

INSERT @InputStringInTableFormat VALUES (@InputString)

-- Define any breaking characters (any character after these characters will be converted to upper case)
INSERT @Delimiters ( Delimiter )
SELECT ' ' UNION
SELECT '-' UNION
SELECT ''''

-- Define any InputStrings that should not be put in CamelCase
-- Insert a space preceeding the InputString
INSERT @InputStringsToExclude ( ExcludedWord )
SELECT ' von' UNION
SELECT ' van' UNION
SELECT ' to'

SELECT @CamelCase = REPLACE (
 ( SELECT CASE WHEN SeqNo = 1
OR ( SUBSTRING ( n.InputString, SeqNo-1, 1 ) IN ( SELECT Delimiter FROM @Delimiters )
AND NOT EXISTS ( SELECT *
FROM @InputStringsToExclude e
WHERE ExcludedWord = SUBSTRING( n.InputString, SeqNo-1, LEN( ExcludedWord ) )
)
)
THEN UPPER( SUBSTRING( n.InputString, SeqNo, 1 ) )
ELSE LOWER(SUBSTRING( n.InputString, SeqNo, 1 ))
END AS [text()]
  FROM SequenceNumbers
  WHERE SeqNo <= LEN( n.InputString )
  ORDER BY SeqNo
  FOR XML PATH( '' )
 ) , '&#x20;', ' ' ) -- the control character in XML
FROM @InputStringInTableFormat n

RETURN CASE @StripOutAllSpaces WHEN 1 THEN REPLACE(@CamelCase, ' ', '') ELSE @CamelCase END
END
GO
SELECT dbo.CamelCase('TONY ROGERSON RULES', 0), dbo.CamelCase('TONY ROGERSON RULES', 1)

# re: Turning stuff into &quot;Camel Case&quot; without loops

30 June 2006 13:21 by LearnSqlServer.com

I do think that this would be better done as a CLR function though. It certainly seems more logical to do this in C# than SQL. I'm anxious to see what your performance benchmarks show.

Scott Whigham

# re: Turning stuff into &quot;Camel Case&quot; without loops

30 June 2006 13:22 by LearnSqlServer.com

Dang - it stripped off all my formatting lol

# re: Turning stuff into &quot;Camel Case&quot; without loops

02 July 2006 09:10 by tonyrogerson

By using UDF and CLR you move away from a set orientated solution and as such performance will drop.

The solution I give can be used within a set, in our instance @names - this can be a huge table; with the UDF approach and CLR you'd need to call the linear procedure for every row you process which would be very time consuimg.

I've done a test using your UDF against the above query on the same set of data using dbcc dropcleanbuffers and dbcc freeproccache between iterations on a machine that would otherwise be idle, the UDF approach took 690 seconds whereas the set way took just 192 seconds, it would be interesting to see what effect CLR makes in this quarter.