The code I am using:

    create table largecsv (val nchar(1000))

    insert into largecsv(val) values ('aaa,bbb,ccc,ddd,eee,aaa,bbb,ccc,ddd,eee,aaa,bbb,ccc,ddd,eee')

    select * from dbo.SqlFunction1('aaa,bbb,ccc,ddd,eee,aaa,bbb,ccc,ddd,eee,aaa,bbb,ccc,ddd,eee') -- 23%
    select * from csvtotable('aaa,bbb,ccc,ddd,eee,aaa,bbb,ccc,ddd,eee,aaa,bbb,ccc,ddd,eee',',') -- 77%

    select F.Val from largecsv v CROSS APPLY dbo.SqlFunction1(val) F --72%
    select F.RESULT from largecsv CROSS APPLY csvtotable(val,',') F -- 28%
`csvtotable` comes from http://www.codeproject.com/Tips/625872/Convert-a-CSV-delimited-string-to-table-column-in, where I changed the `BIGINT` to `nchar(1000)`:
    CREATE FUNCTION [dbo].[CSVtoTable]
    (
        @LIST nchar(1000),
        @Delimeter varchar(10)
    )
    RETURNS @RET1 TABLE (RESULT nchar(1000))
    AS
    BEGIN
        DECLARE @RET TABLE(RESULT nchar(1000))

        IF LTRIM(RTRIM(@LIST))='' RETURN

        DECLARE @START BIGINT
        DECLARE @LASTSTART BIGINT
        SET @LASTSTART=0
        SET @START=CHARINDEX(@Delimeter,@LIST,0)

        IF @START=0
            INSERT INTO @RET VALUES(SUBSTRING(@LIST,0,LEN(@LIST)+1))

        WHILE(@START >0)
        BEGIN
            INSERT INTO @RET VALUES(SUBSTRING(@LIST,@LASTSTART,@START-@LASTSTART))
            SET @LASTSTART=@START+1
            SET @START=CHARINDEX(@Delimeter,@LIST,@START+1)
            IF(@START=0)
                INSERT INTO @RET VALUES(SUBSTRING(@LIST,@LASTSTART,LEN(@LIST)+1))
        END

        INSERT INTO @RET1 SELECT * FROM @RET
        RETURN
    END
and `SqlFunction1` is a CLR function with the definition:

    [SqlFunction(Name="SqlFunction1",FillRowMethodName = "FillRow",TableDefinition="val nvarchar(1000)")]
    public static IEnumerable SqlFunction1(SqlString val)
    {
        string[] splitStr = val.Value.Split(',');
        return splitStr;
    }

    private static void FillRow(Object obj, out SqlString str)
    {
        str = new SqlString((string)obj);
    }
Why is it that when I pass a static string, I get significantly better performance with the CLR function, but if I do a cross apply, even on a table with just a single row, I get much better performance with the TSQL counterpart?
The relative execution cost is:
Static string: CLR: 23% (Query 1), TSQL: 77% (Query 2)
Cross apply from table with single row: CLR: 72% (Query 3), TSQL: 28% (Query 4)
This is running on SQL Server 2012.
Looking at the execution plan, with the cross apply there is a Nested Loops join that has a relative cost of 49% for the CLR UDF, but 0% for the SQL UDF.