SQL NTile()
Some of the common questions I get when building reports have to do with ranking.
For example, show me a Top 10 list of the Customers with the most Revenue.
If you are a bit familiar with SQL Queries, that is an easy one.
But what if a more complex ranking is required?
Like split the Customers into 5 Categories based on Revenue and then rank the Categories.
SQL has a great build in Function for that.
Let's have a look at NTile.
When to use this
The NTile Function when you want to group the results based on a Column or Columns.
An example would be to split Customers into High, Medium and Low Revenue Customers based on Revenue.
Or to rank students into a grade A to F based on their results.
There are many other examples you can think of.
And the syntax is pretty simple, so it is easy to use.
Let's create some Test Data
To show some examples, let's first create a Table and insert some data:
CustomerNr int identity primary key,
Name nvarchar(100),
Year int,
Revenue float
)
insert into CustomerRevenues
(Name, Year, Revenue)
Values
('Customer A', 2023, 25500),
('Customer B', 2023, 33000),
('Customer C', 2023, 37000),
('Customer D', 2023, 49500),
('Customer E', 2023, 29000),
('Customer F', 2023, 21500),
('Customer G', 2023, 48700),
('Customer H', 2023, 63500),
('Customer I', 2023, 57500),
('Customer J', 2023, 39500),
('Customer A', 2022, 45500),
('Customer B', 2022, 63000),
('Customer C', 2022, 57000),
('Customer D', 2022, 89500),
('Customer E', 2022, 59000),
('Customer F', 2022, 49500),
('Customer G', 2022, 68700),
('Customer H', 2022, 93500),
('Customer I', 2022, 97500),
('Customer J', 2022, 69500)
NTile is a Window Function like you would use for Lag ∧ Lead or Running Totals. If you are familiar with those, you will be familiar with the basic syntax:
The [groups] is the number in which you want to divide the results. So if you want to group the results in High, Medium and Low Revenue Customers, you would use ntile(3). Would you rather go to 4 groups? No worries, just change it to ntile(4).
A working Example
First let's see what is in the Table you created with the above Query:
from CustomerRevenues
As you can see I put in 10 Customers with a random Revenue for both 2022 and 2023. Now if we would want to see the Customers divided into 3 groups based on their Revenue, we can use this one:
ntile(3) over (partition by Year order by Revenue desc) as Ranking
from CustomerRevenues
order by Year, Revenue desc
Pretty easy, huh? Told you so!
The NTile Function magially sums up the Revenue, divides it by the given number to create the corresponding Ranges.
And then assigns the Rows to one of these Ranges.
If you do not want to split it up by Year, but instead want an overall ranking?
Just leave out the Partition By, and use a Group By:
ntile(3) over (order by sum(Revenue) desc) as Ranking
from CustomerRevenues
group by Name
order by sum(Revenue) desc
Or you could use a Common Table Expression and a Case Statement to give a name to the Ranking:
select *,
ntile(3) over
(partition by Year order by Revenue desc)
as Ranking
from CustomerRevenues
)
select Year,
case Ranking
when 1 then 'High'
when 2 then 'Medium'
when 3 then 'Low'
end as Ranking,
count(*) as Customers,
sum(Revenue) as TotalRevenue
from Years
group by Year, Ranking
That's it for Today!
As you can see, the NTile Function is a nice way to divide the results into any number of groups you want, and assign a Ranking to it.
There are many cases in which this Function is very useful, and it is fairly simple to use as well.
And since it is a Window Function, the overall performance is pretty good.
So, there you have it. A nice little Function you can add to your Toolbox.
Enjoy, and let me know in what cases you found it to be useful!
You can leave a comment or send a question to andre@andrespeek.com. Let me know if you got inspired and stay safe and healthy always!