Seeing some odd behavior. I know that UDFs are generally not a good development approach; however, in this case we are using them appropriately. They contain only math functions, are expected to be called for every row, and do not perform any select statements. We see the following symptoms:
- Inexplicable hang, where the query will never return. We have a case where it is only 2400 rows and thus even the most suboptimal UDF should return.
- SQL Server reports no activity, no locks, and no I/O while it is hanging, just a RUNNABLE process.
- We can cancel the query.
- It inexplicably affects some data sets and not others despite being the same size, shape, and range of values.
- Bypassing the UDF and putting the same math into the query works (i.e. the math itself doesn't appear to cause a hang)
- NOW for the truly bizarre symptom -- if we put an ORDER BY clause at the end of the query, it no longer hangs. So it is obviously not UDF performance related as an ORDER BY should make performance worse, not better.
- Nothing strange in the query plan. There is a table scan but we expect the table to be scanned as this is basically an ETL step and expected to execute for every row.
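For anyone trying to reproduce the "no activity, no locks, no I/O" observation, here is a minimal sketch of how the hanging session could be inspected from a second connection. It uses the standard DMVs sys.dm_exec_requests and sys.dm_tran_locks; the session id 53 is a hypothetical placeholder, not a value from our system.

```sql
-- Run from a second session while the query appears hung.
-- 53 is a hypothetical session id; substitute the id of the hanging session.
DECLARE @spid int = 53;

-- Request state: status, waits, and CPU / I/O counters.
SELECT r.session_id, r.status, r.command, r.blocking_session_id,
       r.wait_type, r.wait_time, r.last_wait_type,
       r.cpu_time, r.reads, r.writes, r.logical_reads
FROM sys.dm_exec_requests AS r
WHERE r.session_id = @spid;

-- Locks held or requested by that session.
SELECT l.resource_type, l.request_mode, l.request_status
FROM sys.dm_tran_locks AS l
WHERE l.request_session_id = @spid;
```

Running the first query a few times in a row may help tell the cases apart: if cpu_time keeps climbing while reads and logical_reads stay flat, the request is probably burning CPU rather than waiting on anything.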
This is a machine learning application where UDFs are generated for each machine learning job. The reason for the UDFs is to make a significant amount of dynamic SQL easier to implement (the symptoms are present even without dynamic SQL, so dynamic SQL is not the cause). So while we could use a different approach by embedding the functions into the dynamic SQL instead of using UDFs, we would much prefer UDFs. I don't accept that UDFs are completely useless, particularly in our case where they are math only.
Here is an example of one such function:
CREATE FUNCTION [dbo].[MyFunction] ( @x numeric(21,7) )
RETURNS numeric(21,7)
AS
BEGIN
    DECLARE @fx numeric(21,7)
    select @fx = exp(@x)-1
    RETURN @fx
END
GO
The above hangs even if I change the select to: select @fx = @x. Further, I can put exp(value)-1 directly into the query and it runs fine, so there appears to be no data value issue.
Using the above UDF in this query will, depending on the dataset, cause a hang:
select PredictedValue, dbo.MyFunction(PredictedValue) from MyTable
But this one runs fine (only change is ORDER BY clause added):
select PredictedValue, dbo.MyFunction(PredictedValue) from MyTable order by PredictedValue
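For comparison, the inline form mentioned earlier (the same math written directly in the query, with no UDF and no ORDER BY) would look roughly like the sketch below. It uses the same table and column names as the examples above and does not hang on the data sets where the UDF version does.

```sql
-- Same expression as the UDF body, evaluated inline instead of through dbo.MyFunction.
select PredictedValue, exp(PredictedValue) - 1 from MyTable
```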
SQL Server Version:
Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64)
Oct 19 2012 13:38:57
Copyright (c) Microsoft Corporation
Enterprise Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor)