Removing Function Calls for Better Performance in SQL Server

By:   |   Comments (11)   |   Related: > Performance Tuning


Problem

Everyone knows it's best not to use function calls in your WHERE clause as it will affect your performance (if you did not know this already, this article will convince you not to), but what about using them say in the SELECT list. This tip will look at how removing function calls from the SELECT list can dramatically improve performance, especially when returning larger data sets i.e. search query results.

Solution

Sample SQL Server Table and Functions

For this example we will setup two sample tables and two functions that access these tables. As for populating the tables you will have to use a tool like Visual Studio to populate them with some reasonably realistic data so as not to skew the results. In my case I populated each table with about 200,000 records. One thing to note with these examples is that there is a requirement for the functions to only return one record per sale, even if there is actually multiple buyers.

Here is the DDL code:

-- Table creation logic
CREATE TABLE [dbo].[CarSale](
 [CarSaleID] [int] IDENTITY(1,1) NOT NULL,
 [PurchaseDate] [smalldatetime] NOT NULL,
    CONSTRAINT [PK_CarSale] PRIMARY KEY CLUSTERED ([CarSaleID] ASC)
);
CREATE TABLE [dbo].[Buyer](
 [BuyerID] [int] IDENTITY(1,1) NOT NULL,
 [CarSaleID] [int] NOT NULL,
 [LastName] [varchar](50) NULL,
 [FirstName] [varchar](100) NULL,
 [CompanyName] [varchar](200) NULL,
 CONSTRAINT [PK_Buyer] PRIMARY KEY NONCLUSTERED ([BuyerID] ASC)
);
ALTER TABLE [dbo].[Buyer]  WITH CHECK ADD CONSTRAINT [FK_Buyer_CarSale] FOREIGN KEY([CarSaleID])
REFERENCES [dbo].[CarSale] ([CarSaleID]) ON UPDATE CASCADE ON DELETE CASCADE;
CREATE CLUSTERED INDEX [IX_Buyer_CarSalelID] ON [dbo].[Buyer]([CarSaleID] ASC);
-- Function creation logic
CREATE FUNCTION [dbo].[fnGetBuyerFirstName]
(@CarSaleID INT)
RETURNS VARCHAR (500)
AS
BEGIN
RETURN (SELECT Top 1 FirstName
FROM Buyer
WHERE CarSaleID= @CarSaleID
ORDER BY BuyerID)
END
GO
CREATE FUNCTION [dbo].[fnGetBuyerLastName]
(@CarSaleID INT)
RETURNS VARCHAR (500)
AS
BEGIN
RETURN (SELECT Top 1 coalesce(LastName,CompanyName)
FROM Buyer
WHERE CarSaleID= @CarSaleID
ORDER BY BuyerID)
END
GO

Original Query Example

Our original query is pretty straightforward. It does a simple SELECT and uses two functions to retrieve the first and last name from the other table.

Here is the code:

SELECT cs.PurchaseDate,
       dbo.fnGetBuyerFirstName(cs.CarSaleID),
       dbo.fnGetBuyerLastName(cs.CarSaleID)
FROM CarSale cs
ORDER BY CarSaleID;

Looking at the explain plan for our original query we can see that for each record returned by the query we are going to call these functions and basically do a lookup on the Buyer table twice (once for each function). This is not going to be very efficient if the CarSale table contains a large number of records.

Even if we restrict our query to a very specific WHERE clause and retrieve only one record and check the explain plan for this updated query, we still see that it's going to have to do two lookups on the Buyer table.

Below is the code for the updated query:

SELECT cs.PurchaseDate,
       dbo.fnGetBuyerFirstName(cs.CarSaleID),
       dbo.fnGetBuyerLastName(cs.CarSaleID)
FROM CarSale cs
WHERE CarSaleID=5
ORDER BY CarSaleID;

One thing to note here is that this is the best case scenario as this query is only going to return one record. Any query with a wider WHERE clause returning more data will become slower as the data set returned grows.

Query Update to Remove Function Calls Example

Now let's try removing the function calls and getting our query result using a table join and see what happens. Here are both new queries, the first without a WHERE clause and the second with one.

SELECT cs.PurchaseDate,FirstName,LastName
FROM CarSale cs 
INNER JOIN (SELECT CarSaleID,MIN(BuyerID) AS SingleBuyerID FROM Buyer GROUP BY CarSaleID) m2 
          ON cs.CarSaleID=m2.CarSaleID
INNER JOIN Buyer m ON m2.CarSaleID=cs.CarSaleID AND m2.SingleBuyerID=m.BuyerID
ORDER BY cs.CarSaleID;

SELECT cs.PurchaseDate,FirstName,LastName FROM CarSale cs INNER JOIN (SELECT CarSaleID,MIN(BuyerID) AS SingleBuyerID FROM Buyer GROUP BY CarSaleID) m2 ON cs.CarSaleID=m2.CarSaleID INNER JOIN Buyer m ON m2.CarSaleID=cs.CarSaleID AND m2.SingleBuyerID=m.BuyerID WHERE cs.CarSaleID=5 ORDER BY cs.CarSaleID;

Looking at the explain plans for the query without the WHERE clause and the query with the WHERE clause we can see that in both cases the new explain plans should perform better since we no longer have to perform a lookup for each record returned by the query. This is now handled by the MERGE join.

To confirm this and see just how much improvement we get from removing the function calls I ran a SQL Profiler trace and ran each of these queries. Here are the results:

Query WHERE Clause CPU (ms) Reads Writes Duration
Original NO 10734 1239655 0 25879
YES 0 9 0 0
No Function Call NO 578 16337 0 2457
YES 0 11 0 0

Looking at the SQL Profiler trace results we can see that by removing the function calls we get a considerable benefit when the result set of the query is large in all resource areas, read, cpu and total duration. When just returning a single record though the original query actually performs slightly better. Again, this WHERE clause is a best case scenario, as the data set returned grows the query without function calls will start to outperform the one with them.

Final Update Adding Common Table Expression (CTE) Example

Because we have this requirement from the function for only returning a single buyer we can use a CTE to further optimize this query.

Here is the code both with and without the WHERE clause:

WITH summary AS (SELECT CarSaleID,            
                        BuyerID,           
                        FirstName,
                        LastName,
                        ROW_NUMBER() OVER(PARTITION BY CarSaleID ORDER BY CarSaleID) AS rk      
                 FROM Buyer) 
SELECT PurchaseDate,s.FirstName,s.LastName FROM CarSale cs INNER JOIN summary s  
ON s.CarSaleID=cs.CarSaleID WHERE s.rk = 1;

WITH summary AS (SELECT CarSaleID, BuyerID, FirstName, LastName, ROW_NUMBER() OVER(PARTITION BY CarSaleID ORDER BY CarSaleID) AS rk FROM Buyer) SELECT PurchaseDate,s.FirstName,s.LastName FROM CarSale cs INNER JOIN summary s ON s.CarSaleID=cs.CarSaleID WHERE s.rk = 1 AND cs.CarSaleID=5;

After generating the explain plans for both of these queries, no WHERE clause and WHERE clause, it's easy to see how much more streamlined these two queries are compared to the originals. As well, looking at the trace results below we can see that this version of the query outperforms all previous examples both with and without the WHERE clause.

Query WHERE Clause CPU (ms) Reads Writes Duration
No Function Call add WITH statement NO 266 15796 0 1931
YES 0 6 0 0

Summary

I would agree with anyone that wanted to argue the first example is much easier to read and understand, but given the performance improvements we get by removing these function calls I think it's pretty obvious that it's worth eliminating them and saving on valuable CPU and I/O cycles.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Ben Snaidero Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Friday, August 9, 2013 - 1:44:15 PM - Jeremy Kadlec Back To Top (26195)

Ben,

Congrats on your 25th tip!

Thank you,
Jeremy Kadlec
Community Co-Leader


Wednesday, July 11, 2012 - 10:50:55 PM - 8080_Diver Back To Top (18456)

While the difference between 1/2 second and 10 seconds may not sound like a whole lot, what happens whan that particular query gets executed a few thousand times a day?  Since I work in the world of ETL where we are dealing with, quite literally, millions of rows being processed, if I have a function call that ads a few unnecessary microseconds to each row's processing time, then I will opt for the more efficient, faster code almost every time.  The function call code may look "cleaner" but it hides what is happening in the function.  That means that, unless you also know the code in the function, you don't really know what the query is doing.

I fully understand the need for maintenance and readability, to say nothing of the Business Logic issue; however, I have usually found ways around the issue of "hrd-coding" vs function calls.  If nothing else, I have been known to parameterize the queries and to break them into component sections that are stored in a database and pulled together with another query on the database.  That, too, sounds complicated but it allows the precise calculations to be changed as needed (as long as the "contract" of inputs and outputs is not violated).

 


Wednesday, July 11, 2012 - 12:02:06 PM - Jeremy Kadlec Back To Top (18449)

Dom,

I am one of the founders of MSSQLTips.com and moderate the posts on the tips.

Thank you,
Jeremy Kadlec


Wednesday, July 11, 2012 - 11:16:22 AM - Ben Snaidero Back To Top (18448)

To answer a few of the questions above.

The example was more just to illustrate how removing a function call can speed up a query.  The example was meant to be simple so as to get the point across more easily.  Real world examples would definitely be more complicated and in most cases no two would be the same (as Ralph had an example doing currency conversion to human readable words, I've never had to do anything like that).

draggs, while I agree with your statement completely we all know that in the real world best practices are not always followed and sometimes you inherit someones elses mess.

Thanks for reading

Ben.


Wednesday, July 11, 2012 - 9:16:14 AM - draggs Back To Top (18444)

This is would hardcode business logic into every statement, heaven forbid a logic change. This would be a nightmare to update or find every place it is embedded.

 


Wednesday, July 11, 2012 - 9:11:04 AM - Dom Back To Top (18442)

You have to balance clean code with fast code.  I generally go for clean code, unless it results in very slow execution time.  A change from 10 seconds to half-a-second is impressive, and understanding why is certainly educational, but in the real world?  No, the function call version is much cleaner and easier to understand, and besides, functions are re-usable.

BTW, why are the posts being answered by Jeremy Kadlec when the tip was written by Ben Snaidero?


Wednesday, July 11, 2012 - 4:48:44 AM - Allan Hansen Back To Top (18435)

I do use function calls in the WHERE or SELECT at times, simply because - while it might be bad in that scenario - the alternative can be even worse.

Using a CTE can provide an even worse execution (memory issues, tempdb etc); if the code is reused in a number of procedures, it's nice to have it isolated in one place (object orientated mindset); if the code is complex it might not be easily wrapped in other objects and so on.....

So as with almost everything - it depends fully on the situation.


Tuesday, July 10, 2012 - 3:53:52 PM - Jeremy Kadlec Back To Top (18427)

Ralph,

Thanks for the feedback.  Actually this was put together to show how this could cause issues with function calls as well as showing beginners things that should not be done.  I agree there are several different examples that could have been used to show the issues with using functions.

Thank you,
Jeremy Kadlec


Tuesday, July 10, 2012 - 3:00:36 PM - Ralph Wilson Back To Top (18426)

I don't think that I have ever encountered the use of a UDF to obtain the first and/or last name from some table.  So, I am not sure how common it might be to make this sort of correction to actual queries.

I was hoping to see something along the lines of dealing with a UDF for converting currency ammounts to human-consumable words.  I know of a specific instance where such a UDF is being called something like 26 times in the course of importing data into a database.  The results of the UDF are needed in order to provide them to a Web Site (without causing delays for the Web Site); however, the currency values are requied in order to provide not only the basis of the verbage but also the basis for calculations (e.g. taxes, discounts, etc.).

Although this is a nice article, it would have been better, IMHO, if it had provided a real-life example that wasn't based upon what I would consider to be a basic "SQL 101" coding bad habit.


Tuesday, July 10, 2012 - 11:48:53 AM - Jeremy Kadlec Back To Top (18425)

Dom,

The tip has been updated.

Thank you,
Jeremy Kadlec


Tuesday, July 10, 2012 - 9:46:31 AM - Dom Back To Top (18422)

Good article.  But I think in the first code-listing, you meant to give a CREATE FUNCTION for fnGetBuyerFirstName.  Instead, you just gave two copies of fnGetBuyerLastName.















get free sql tips
agree to terms