SQL Substring

SQL Substring

This BlogPost was written by Andre Speek. Click here for other BlogPosts.

There is a nice function in SQL to get a specific part of a string, called Substring. It comes in handy when you need to extract a value that is between 2 pre-defined characters. But the use of the Substrinf function might seem a bit useless at first glance.

In this post I will show you how you can use the Substring function to do just that. All you need to do is make the Substring function a little bit more dynamic.

The basic use of Substring.

The Substring function requires 3 parameters: The text as a string, the start position and the length. Pretty straight forward, so it seems.

So let's say we have this text: "Hello World!" If we want to extract the word World from this string, we just need the start position and the length. Now that is easy! World is at position 7 in the string. And World has 5 characters.

So we could extract that word with this example:

declare @input nvarchar(100) = 'Hello World!'
select substring(@input, 7, 5)

That wasn't too difficult, was it?

But unfortunately it is never this easy. Most of the time the length of the text will vary, so it is impossible to predict the start position or the length.

So to bring order into the chaos, some ground rules need to be set here.

A common practice when a certain value needs to be extracted from a string is to specify an opening and closing character. For this example we will use this text: Send an update on Project [PRJ001] a.s.a.p

Now we want to extract the Project Number which is in this case PRJ001. A string of 6 characters, but for other Projects the length might vary.

First find the positions

Ok, now we want to get that substring, but for that we need a starting position and a length. But all we know is that it is between the square brackets. So let's see at what positions they are. And for that we can use the Charindex function:

declare @input nvarchar(1000) = 'Send an update on Project [PRJ001] a.s.a.p'

select charindex('[', @input)
select charindex(']', @input)

Now do some basic calculations

With this we can find out that we need what is between 27 and 34. That way we can calculate the length: 34 - 27 = 7. Plus 1 because we want to include the closing bracket as well.

That will give you this query:

declare @input nvarchar(1000) = 'Send an update on Project [PRJ0010] a.s.a.p

select substring(
charindex('[', @input),
charindex(']', @input) - charindex('[', @input) + len(']')

As you can see I did not use +1 for that extra character. Instead I used its length, which happens to be 1. But more on that later on.

If you try this query, you will see it returns [PRJ001]. Now if you would change that to a longer or shorter Project Number, it will still return the correct value. But that value will be between the square brackets.

To remove those, we could put 2 Replace functions around it. But that becomes messy, so let's do it the correct way. So start 1 position later and make the length 2 positions shorter:

declare @input nvarchar(1000) = 'Send an update on Project [PRJ0010] a.s.a.p'

select substring(
charindex('[', @input) + len('['),
charindex(']', @input) - charindex('[', @input) - len('[')

Again I did not us 1 and 2, but rather the length of the 2 designated characters. Here is why I did that.

Using Tags to encapsulate the value

Another way of encapsulating the value, especialy from an HTML text of XML file, is to use Tags. For example, we need to find the first word that is cursive so between <i> and </i>. Or someone might have thought up an unknown Tag like <project> and </project>. You can do that but it will not be visible to the user who reads the HTML text on screen.

Let's see how we can extract the value when the <project> and </project> Tags are used:

declare @input nvarchar(1000) =
'Send an update on Project <project>PRJ0010</project> a.s.a.p'

select substring(
charindex('<project>', @input) + len('<project>'),
charindex('</project>',@input) - charindex('<project>', @input) - len('<project>')

As you can see, this is exactly the same query as before. I just replaced the square brackets by the Tags.

Now it becomes clear why I did not put in the 1 and 2 as fixed lengths. This way, the whole query is dynamic.

That's it for today!

So here is a nice trick you can use to extract a specific value from a string. Provided that you know what is right in front of it and what is right behind it. And it is completely dynamic.

So this could make a nice SQL Function, don't you think?

Yeah, it could be. But you would need some Dynamic SQL for this. And that is one for a later post!

Do you want to comment on this post? Leave a response here and I will get in touch with you!

You can leave a comment or send a question to andre@andrespeek.com. Let me know if you got inspired and stay safe and healthy always!

An error has occurred. This application may no longer respond until reloaded. Reload 🗙