Just stuff it!

Just stuff it!

I’m sure you’ve all heard it, and probably even said it, many times: “SQL Server sucks at string manipulation”. And with good reason – it is true. But not quite as true as many seem to believe.

 

I notice that many people who complain about SQL Server lacking string manipulation are themselves unaware of the string functions that SQL Server does have. Most know LIKE, LEFT, RIGHT, and SUBSTRING. Some also know CHARINDEX, maybe even REPLACE. But how many of you know and use PATINDEX, REVERSE, QUOTENAME, and LTRIM, to name just a few?

 

The string function that appears to be the most overlooked has to be STUFF. And yet, this function can prove to be an invaluable tool. I’ll give a short description first, then show some examples of how STUFF can be used for string manipulation that would otherwise be much harder to achieve.

 

What STUFF does, basically, is cut a specified portion from a string and replace it with a new string. The replacement string can be shorter, longer, or exactly as long as the part that has been cut out. The STUFF function takes four arguments, none of which is optional. The first is of character data type and specifies the input string; the second and third are integers specifying the starting position and length of the substring to remove, and the fourth is the replacement string. Here are some examples to illustrate the use of STUFF:

 

DECLARE @Test varchar(15)

SET @Test = ‘Hugo Kornelis’

SELECT STUFF (@Test, 1, 4, ‘I am’),   — Basic usage example

       STUFF (@Test, 6, 0, ‘SQL ‘),   — Replaced string can be empty

       STUFF (@Test, 2, 10, ”)       — Replacement string can be empty

 

                                   

————— ——————- ——

I am Kornelis   Hugo SQL Kornelis   His

 

 

The first real world example of using STUFF is based on a newsgroup question. The question was how to replace a substring of a string. Here’s the simplified and non-working example given by the poster (based on the pubs sample database):

 

UPDATE authors

SET    SUBSTRING(phone,5,3) = ‘888’

WHERE  SUBSTRING(phone,5,3) = ‘826’;

 

One of the regulars in the group replied with this suggestion:

 

UPDATE authors

SET    phone = REPLACE(phone, ‘826’, ‘888’)

WHERE  SUBSTRING(phone,5,3) = ‘826’;

 

That looks clean and tidy, and runs fine on the data in pubs – but what if one of the authors in pubs has phone number ‘801 826-0826’? Indeed, it would be changed to ‘801 888-0888’ instead of ‘801 888-0826’, as requested.

 

Another way to do this that I often see recommended (though not in this particular thread) is to use LEFT and SUBSTRING to cut the string in pieces, then mend them together after making the change. Like this:

 

UPDATE authors

SET    phone = LEFT(phone, 4) + ‘888’ + SUBSTRING(phone, 8, LEN(phone) – 7)

WHERE  SUBSTRING(phone,5,3) = ‘826’;

 

That works – but ugh!, how ugly. This would be a lot easier with STUFF:

 

UPDATE authors

SET    phone = STUFF(phone, 5, 3, ‘888’)

WHERE  SUBSTRING(phone,5,3) = ‘826’;

 

 

Here’s another example. A common question in the newsgroups is how to find individual parts in a string. For instance, a bulk import results in a table of names in the form “Lastname, Firstname”. How to get individual Lastname and Firstname values from this?

For the first part of the string (Lastname in this case), this is easy. Use POSINDEX to find the starting position of the separator, use that to calculate the argument for the LEFT function.

The second part is harder. Answers typically given in the groups to these questions use techniques such as finding the starting position, calculating remaining length and feeding those as parameters to the SUBSTRING (version 1 below); using REVERSE to change the task to another task of finding the first part in a string (version 2 below) or using RIGHT instead of LEFT, with again the use of REVERSE to calculate length (version 3). However, with STUFF (version 4), this becomes much easier! See version 4, below:

 

CREATE TABLE BadData (FullName varchar(20) NOT NULL);

INSERT INTO BadData (FullName)

SELECT ‘Clinton, Bill’ UNION ALL

SELECT ‘Johnson, Lyndon’ UNION ALL

SELECT ‘Bush, George’;

— Version 1, using SUBSTRING

SELECT LEFT(FullName, CHARINDEX(‘, ‘, FullName) – 1) AS LastName,

       SUBSTRING(FullName,

                 CHARINDEX(‘, ‘, FullName) + 2,

                 LEN(FullName) – CHARINDEX(‘, ‘, FullName) – 1) AS FirstName

FROM   BadData

— Version 2, using REVERSE, LEFT, and REVERSE again

SELECT LEFT(FullName, CHARINDEX(‘, ‘, FullName) – 1) AS LastName,

       REVERSE(LEFT(REVERSE(FullName),

              CHARINDEX(‘ ,’, REVERSE(FullName)) – 1)) AS FirstName

FROM   BadData

— Version 3, using RIGHT and REVERSE

SELECT LEFT(FullName, CHARINDEX(‘, ‘, FullName) – 1) AS LastName,

       RIGHT(FullName, CHARINDEX(‘ ,’, REVERSE(FullName)) – 1) AS FirstName

FROM   BadData

— Version 4, using STUFF

SELECT LEFT(FullName, CHARINDEX(‘, ‘, FullName) – 1) AS LastName,

       STUFF(FullName, 1, CHARINDEX(‘, ‘, FullName) + 1, ”) AS FirstName

FROM   BadData

 

You might say that there’s not actually that much difference between versions 3 and 4 above, so why am I so enthusiastic about STUFF? To see that, let’s take this to the next level. What if the names in the imported data are of the form “Lastname, Firstname MiddleInitial”, with MiddleInitial being optional? Below, you’ll find the best I was able to do without STUFF, and than a version with STUFF – still not pretty, but not quite as awful as the first version, I’d say. If anyone is able to write a shorter version, I’d love to hear it!

 

CREATE TABLE BadData (FullName varchar(20) NOT NULL);

INSERT INTO BadData (FullName)

SELECT ‘Clinton, Bill’ UNION ALL

SELECT ‘Johnson, Lyndon, B.’ UNION ALL

SELECT ‘Bush, George, H.W.’;

— Version 1, without STUFF

SELECT FullName,

       LEFT(FullName, CHARINDEX(‘, ‘, FullName) – 1) AS LastName,

       SUBSTRING(FullName,

                 CHARINDEX(‘, ‘, FullName) + 2,

                 CHARINDEX(‘, ‘, FullName + ‘, ‘,

                           CHARINDEX(‘, ‘, FullName) + 2)

               – CHARINDEX(‘, ‘, FullName) – 2) AS FirstName,

       CASE WHEN FullName LIKE ‘%, %, %’

            THEN RIGHT(FullName, CHARINDEX(‘ ,’, REVERSE(FullName)) – 1)

            ELSE ” END AS MiddleInitial

FROM   BadData

— Version 2, with STUFF

SELECT FullName,

       LEFT(FullName, CHARINDEX(‘, ‘, FullName) – 1) AS LastName,

       STUFF(LEFT(FullName, CHARINDEX(‘, ‘, FullName + ‘, ‘,

                                      CHARINDEX(‘, ‘, FullName) + 2) – 1),

             1, CHARINDEX(‘, ‘, FullName) + 1, ”) AS FirstName,

       STUFF(FullName, 1,

             CHARINDEX(‘, ‘, FullName + ‘, ‘,

                       CHARINDEX(‘, ‘, FullName) + 2), ”) AS MiddleInitial

FROM   BadData

 

As you have seen, learning to use STUFF when appropriate can make hard tasks easy, complex queries simple, and extremely complicated queries somewhat less complicated. This string function is a tool that every SQL Server developer should know. So the next time someone complains how SQL Server is severely lacking adequate string manipulation tools, you know what to do – just tell’m to stuff it!

The prime number challenge – great waste of time!
The table scan from hell

Related Posts

No results found.

12 Comments. Leave new

  • Reply
  • Denis the SQL Menace
    October 13, 2006 17:08

    — Version 5, using PARSENAME
    SELECT LTRIM(PARSENAME(REPLACE(FullName,’,’,’.’),2))AS LastName,
    LTRIM(PARSENAME(REPLACE(FullName,’,’,’.’),1))AS FirstName
    FROM   BadData

    Denis

    Reply
  • Denis the SQL Menace
    October 13, 2006 17:23

    hugi,

    I didn’t see the second part (3 names) before, here is a solution for that too by using PARSENAME

    select PARSENAME(FullName,NameLen+1) LastName,
    PARSENAME(FullName,NameLen) FirstName,
    COALESCE(PARSENAME(FullName,NameLen-1),”) as MiddleInitial
    from(
    select len(FullName) -len(replace(FullName,’,’,”)) as NameLen,REPLACE(REPLACE(FullName,’.’,’~’),’, ‘,’.’) as FullName
    from BadData) x

    Reply
  • Denis the SQL Menace
    October 13, 2006 17:27

    Argh, who put the i next to the o

    Hugo I meant to say Hugo not hugi, please don’t call me Denise now  😉

    Reply
  • Denis the SQL Menace
    October 13, 2006 17:40

    Small problem, I forgot to replace ~ with . in the result

    Here it is, final version (I promise)

    SELECT PARSENAME(FullName,NameLen+1) AS LastName,
    PARSENAME(FullName,NameLen) AS FirstName,
    COALESCE(REPLACE(PARSENAME(FullName,NameLen-1),’~’,’.’),”) AS MiddleInitial
    FROM(
    SELECT LEN(FullName) -LEN(REPLACE(FullName,’,’,”)) AS NameLen,
    REPLACE(REPLACE(FullName,’.’,’~’),’, ‘,’.’) AS FullName
    FROM BadData) x

    316 characters (according to edit plus)

    Reply
  • Peter W. DeBetta
    October 13, 2006 18:34

    Denis,
    Nice touch using ParseName. I’m jealous I didn’t think of it first!

    Worth metioning, however, is that you’re are limited to a 4 items being parsed. ParseName can take a string with the following formats – ‘a’, ‘a.b’, ‘a.b.c’, ‘a.b.c.d’ – but cannot parse ‘a.b.c.d.e’

    So for a name with only three parts (last, first, middle) it works great, but for a 5 part name (prefix/title, first, middle, last, suffix) such as ‘Dr., John, Q., Public, III’, the technique could no longer be used, as Parsename returns NULL if the string being parsed has more than four items.

    Reply
  • Denis the SQL Menace
    October 13, 2006 18:44

    Peter, you can’t change requirements on me, this is not work  😉

    But you are right,  PARSENAME is used to split (SERVERNAME,DBNAME, ObjectOwner, Object)

    Quite handy with IP adresses also

    for 5 part we can always do a join with a number table, something like

    DECLARE @chvGroupNumbers VARCHAR(1000)
    SELECT @chvGroupNumbers =’1,4,77,88,4546,234,2,3,54,87,9,6,4,36,6,9,9,6,4,4,68,9,0,5,3,2,’

    SELECT SUBSTRING(‘,’ + @chvGroupNumbers + ‘,’, NumberID + 1,
    CHARINDEX(‘,’, ‘,’ + @chvGroupNumbers + ‘,’, NumberID + 1) – NumberID -1)AS Value
    FROM NumberPivot
    WHERE NumberID <= LEN(‘,’ + @chvGroupNumbers + ‘,’) – 1
    AND SUBSTRING(‘,’ + @chvGroupNumbers + ‘,’, NumberID, 1) = ‘,’
    GO

    Or make a UDF

    Denis

    Reply
  • Hugo Kornelis
    October 14, 2006 20:01

    please don’t call me Denise now

    I won’t, Denos. 😛

    Nice solution using PARSENAME – kudos!

    I also liked the extended version using a numbers table, but this one can be simplified to
    SELECT SUBSTRING(@chvGroupNumbers + ‘,’,
                    Number,
                    CHARINDEX(‘,’, @chvGroupNumbers + ‘,’, Number + 1) – Number) AS Value
    FROM   dbo.Numbers
    WHERE  Number <= LEN(@chvGroupNumbers + ‘,’)
    AND    SUBSTRING(‘,’ + @chvGroupNumbers + ‘,’, Number, 1) = ‘,’

    But of course, you’ll have to add a ROW_NUMBER() to preserve the order of the elements, otherwise we’d end up switching first and last names or day and month numbers and looking all American <g>

    SELECT SUBSTRING(@chvGroupNumbers + ‘,’,
                    Number,
                    CHARINDEX(‘,’, @chvGroupNumbers + ‘,’, Number + 1) – Number) AS Value,
          ROW_NUMBER() OVER (ORDER BY Number) AS Position
    FROM   dbo.Numbers
    WHERE  Number <= LEN(@chvGroupNumbers + ‘,’)
    AND    SUBSTRING(‘,’ + @chvGroupNumbers + ‘,’, Number, 1) = ‘,’

    Reply
  • ok hugo, here s one more use of stuff (and PATINDEX) .

    if there is a need to remove a character of defined length at the end of a string ,you could put to use the following:

    lets say @str = someone and something and

    select stuff(@str,PATINDEX(‘%and’,@str),3,”)

    here the result would be

    ‘someone and something’

    cool uh?!

    Reply
  • Now guys …….anyone willing to help……………its kinda simple……..but my head is kinda numb………..so here s the question……………..

    I need to make a report in the following manner below

    year | RG1 | RG2            | value

    07     Plan    Executed        3

    07     Plan    Abandoned     1

    07     Plan    Complete       1

    this is to be done using sql 2000 (wihtout making use of reporting service)

    So who’s got the answer????

    Reply
  • Hugo Kornelis
    December 24, 2007 17:57

    Hi Daniels,

    Sorry for the late reply. This is quite an old post, so I missed the new comment.

    To answer your question, I need to know what the data you start with looks like. So I suggest that you post CREATE TABLE statements for the tbales involved (with all constraints, properties, and indexes included), INSERT statements with some well-chosen rows of sample data, and expected output.

    However, don’t post it here. Post it to a newsgroup, for instance microsoft.public.sqlserver.programming. There are many people in that group, always willing to help you solve such tasks, so you’ll get a reply much sooner than when you post here! 🙂

    Happy holidays!

    Reply
  • Ronald Beuker
    May 25, 2008 19:33

    Great article Hugo, thx! There’s just a small error: POSINDEX should be CHARINDEX 🙂

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close