SELECT *
FROM fn_results() fn
JOIN other_table o ON o.pkey = fn.keyfield
5. Creating and Using CLR Functions
Prior to SQL Server 2005, the only way to extend the functionality of SQL
Server beyond what was available using the T-SQL language was to create
extended stored procedures or Component Object Model (COM) components.
The main problem with these types of extensions was that if not written
very carefully, they could have an adverse impact on the reliability and
security of SQL Server. For example, extended stored procedures are
implemented as DLLs that run in the same memory space as SQL Server. An
access violation raised in a poorly written extended stored procedure
could crash SQL Server itself.
In addition, neither extended stored procedures nor
COM components let you create user-defined functions in a
programming language other than T-SQL, which has a
limited command set for operations such as complex string comparison
and manipulation and complex numeric computations.
In SQL Server 2008, you can write custom
user-defined functions in any Microsoft .NET Framework programming
language, such as Microsoft Visual Basic .NET or Microsoft Visual C#.
SQL Server supports both scalar and table-valued CLR functions, as well
as CLR user-defined aggregate functions. These extensions written in the
CLR are much more secure and reliable than extended stored procedures
or COM components.
Note
The CLR function examples presented in the following
sections are provided as illustrations only. The sample code will not
execute successfully because the underlying CLR assemblies have not been
provided.
Adding CLR Functions to a Database
If you’ve already created and compiled a CLR
function, your next task is to install that CLR function in the
database. The first step in this process is to copy the .NET assembly to
a location that SQL Server can access, and then you need to load it
into SQL Server by creating an assembly. The syntax for the CREATE ASSEMBLY command is as follows:
CREATE ASSEMBLY AssemblyName [AUTHORIZATION Owner_name]
FROM { <client_assembly_specifier> | <assembly_bits> [ ,...n ] }
[ WITH PERMISSION_SET = { SAFE | EXTERNAL_ACCESS | UNSAFE } ]
AssemblyName is the name of the assembly. client_assembly_specifier
specifies the local path or network location where the assembly being
uploaded is located, and also the manifest filename that corresponds to
the assembly. It can be expressed as a fixed string or an expression
evaluating to a fixed string, with variables. The path can be a local
path, but often the path is a network share. assembly_bits is the list of binary values that make up the assembly and its dependent assemblies.
The WITH clause is optional, and it defaults to SAFE. Marking an assembly with the SAFE
permission set indicates that no external resources (for example, the
Registry, Web services, file I/O) are going to be accessed.
The CREATE ASSEMBLY command fails if the assembly is marked as SAFE and references assemblies like System.IO.
Also, if anything causes a permission demand for
executing similar operations, an exception is thrown at runtime.
Marking an assembly with the EXTERNAL_ACCESS permission set tells SQL Server that it will use resources such as networking, files, and so forth. Assemblies such as System.Web.Services (but not System.Web) can be referenced with this set. To create an EXTERNAL_ACCESS assembly, the creator must have EXTERNAL ACCESS ASSEMBLY permission.
Marking an assembly with the UNSAFE
permission set tells SQL Server that not only might external resources
be used, but unmanaged code may be invoked from managed code. An UNSAFE assembly can potentially undermine the security of either SQL Server or the CLR. Only members of the sysadmin role can create UNSAFE assemblies.
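As a rough illustration of how the permission sets appear in practice, the following sketch grants the server-level permission and then loads an assembly with EXTERNAL_ACCESS. The login name, assembly name, and network path are hypothetical placeholders, so the script will not run as written.

```sql
-- Hypothetical names throughout: the login, the assembly, and the share path
-- are placeholders, not objects supplied with this chapter.
USE master
GO
-- EXTERNAL ACCESS ASSEMBLY is a server-level permission, so it is granted
-- to a login while connected to master.
GRANT EXTERNAL ACCESS ASSEMBLY TO [MYDOMAIN\clr_deployer]
GO
USE bigpubs2008
GO
-- Omitting the WITH clause would load the assembly as SAFE instead.
-- Depending on configuration, SQL Server may also require the database to be
-- marked TRUSTWORTHY or the assembly to be signed; that setup is not shown.
CREATE ASSEMBLY fn_FileTools
FROM '\\buildserver\assemblies\fn_FileTools.dll'
WITH PERMISSION_SET = EXTERNAL_ACCESS
GO
```

Requesting the least privileged permission set that the code actually needs keeps the extra grants, and the associated risk, to a minimum.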
After the assembly is created, the next step is to
associate the method within the assembly with a user-defined function.
You do this with the CREATE FUNCTION command, using the following syntax:
CREATE FUNCTION [ schema_name. ] function_name
( [ { @parameter_name [AS] [ schema_name.]scalar_datatype [ = default ] }
[ ,...n ] ] )
RETURNS { return_data_type | TABLE ( { column_name data_type } [ ,...n ] ) }
[ WITH { [ , RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ]
[ , EXECUTE_AS_Clause ] } ]
[ AS ] EXTERNAL NAME assembly_name.class_name.method_name
After creating the CLR function successfully, you can
use it just as you would a T-SQL function. The following example shows
how to manually deploy a table-valued CLR function:
CREATE ASSEMBLY fn_EventLog
FROM 'F:\assemblies\fn_EventLog\fn_eventlog.dll'
WITH PERMISSION_SET = SAFE
GO
CREATE FUNCTION ShowEventLog(@logname nvarchar(100))
RETURNS TABLE (logTime datetime,
Message nvarchar(4000),
Category nvarchar(4000),
InstanceId bigint)
AS
EXTERNAL NAME fn_EventLog.TabularEventLog.InitMethod
GO
SELECT * FROM dbo.ShowEventLog(N'System') as T
go
Note
The preceding examples show the steps involved in manually registering an
assembly and creating a CLR function. If you use Visual Studio’s Deploy feature, the CREATE/ALTER ASSEMBLY and CREATE FUNCTION
commands are issued automatically by Visual Studio.
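After a manual deployment, it can help to confirm how the function is bound and to know the order in which the pieces are removed. The following is a minimal sketch that assumes the fn_EventLog assembly and the ShowEventLog function from the earlier listing exist and that the function was created in the dbo schema. Like the other CLR examples, it depends on an assembly that is not provided.

```sql
-- CLR integration is disabled by default; CLR functions cannot be executed
-- until it is enabled at the server level.
EXEC sp_configure 'clr enabled', 1
RECONFIGURE
GO

-- Show the permission set of the assembly and the class and method
-- that each dependent module is bound to.
SELECT a.name AS assembly_name,
       a.permission_set_desc,
       OBJECT_NAME(m.object_id) AS function_name,
       m.assembly_class,
       m.assembly_method
FROM sys.assemblies a
JOIN sys.assembly_modules m ON m.assembly_id = a.assembly_id
WHERE a.name = N'fn_EventLog'
GO

-- Drop the function before the assembly; an assembly cannot be dropped
-- while a function still references it.
DROP FUNCTION dbo.ShowEventLog
GO
DROP ASSEMBLY fn_EventLog
GO
```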
Deciding Between Using T-SQL or CLR Functions
One question that often comes up regarding
user-defined functions is whether it’s better to develop functions in
T-SQL or in the CLR. The answer really depends on the situation and what
the function will be doing.
The general rule of thumb is that if the function
will be performing data access or large set-oriented operations with
little or no complex procedural logic, it’s better to create that
function in T-SQL to get the best performance. The reason is that T-SQL
works more closely with the data and doesn’t require multiple
transitions between the CLR and SQL Server engine.
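For instance, a function that does nothing but data access is a natural fit for a T-SQL inline table-valued function. The sketch below assumes the authors and titleauthor tables of the bigpubs2008 sample database live in the dbo schema; the function name fn_AuthorsForTitle and the varchar(6) parameter type are illustrative assumptions.

```sql
USE bigpubs2008
GO
-- Hypothetical function name; the parameter type is assumed to match
-- the title_id column.
CREATE FUNCTION dbo.fn_AuthorsForTitle (@title_id varchar(6))
RETURNS TABLE
AS
RETURN
(
    -- Pure set-based data access with no procedural logic
    SELECT a.au_id, a.au_lname, a.au_fname
    FROM dbo.authors a
    JOIN dbo.titleauthor ta ON ta.au_id = a.au_id
    WHERE ta.title_id = @title_id
)
GO
SELECT * FROM dbo.fn_AuthorsForTitle('TC7777')
GO
```

Because the function body is a single SELECT, the optimizer can expand it into the calling query, and no transition into the CLR is involved.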
On the other hand, most benchmarks have shown that
the CLR performs better than T-SQL for functions that require a high
level of computation or text manipulation. The CLR offers much richer
APIs that provide capabilities not available in T-SQL for operations
such as text manipulation, cryptography, I/O operations, data
formatting, and invoking of web services. For example, T-SQL provides
only rudimentary string manipulation capabilities, whereas the .NET
Framework supports capabilities such as regular expressions, which are
much more powerful for pattern matching and replacement than the T-SQL replace() function.
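To give a sense of what this looks like from the T-SQL side, the following sketch registers and calls a regular-expression matching function. It assumes a hypothetical assembly named StringUtils that has already been loaded with CREATE ASSEMBLY and that exposes a static method UserDefinedFunctions.RegexIsMatch wrapping the .NET Regex class; none of these names come from the sample code for this chapter, so the script will not run as written.

```sql
-- Assumption: the StringUtils assembly and its RegexIsMatch method are
-- hypothetical and must exist before this statement can succeed.
CREATE FUNCTION dbo.fn_RegexIsMatch
    (@input nvarchar(4000), @pattern nvarchar(4000))
RETURNS bit
WITH RETURNS NULL ON NULL INPUT
AS EXTERNAL NAME StringUtils.UserDefinedFunctions.RegexIsMatch
GO

-- Last names beginning with Mc, Mac, or O' followed by a capital letter.
-- A single LIKE pattern cannot express this alternation.
SELECT au_lname
FROM dbo.authors
WHERE dbo.fn_RegexIsMatch(au_lname, N'^(Mc|Mac|O'')[A-Z]') = 1
GO
```

The equivalent T-SQL would need several LIKE predicates combined with OR, and more involved patterns quickly become impractical to express that way.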
Another good candidate for CLR functions is
user-defined aggregate functions. User-defined aggregate functions
cannot be defined in T-SQL. To compute an aggregate value over a group
in T-SQL, you would have to retrieve the values as a result set and then
enumerate over the result set, using a cursor to generate the
aggregate. This results in slow and complicated code. With CLR
user-defined aggregate functions, you need to implement the code only
for the accumulation logic. The query processor manages the iteration,
and any user-defined aggregates referenced by the query are
automatically accumulated and returned with the query result set. This
approach can be orders of magnitude faster than using cursors, and it is
comparable to using SQL Server built-in aggregate functions. For
example, the following shows how you might use a user-defined aggregate
function that aggregates all the authors for a specific title_id into a comma-separated list:
use bigpubs2008
go
SELECT t.Title_ID, dbo.CommaList(a.au_lname) as AuthorNames
FROM Authors a
JOIN titleauthor ta on a.au_id = ta.au_id
JOIN Titles t on ta.title_id = t.title_id
GROUP BY t.title_id
having count(*) > 2
go
Title_ID AuthorNames
-------- ---------------------------------------------------------------------
TC7777 O'Leary, Gringlesby, Yokomoto
Note
The preceding example will not execute successfully because we have not created the CommaList() CLR function. It is provided merely as an example showing how such a function could be used if it were created.
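To make the comparison concrete, the following sketch shows both sides. The first batch is the kind of cursor loop described above, building the list for a single title (the title_id value is taken from the sample output). The last batch shows how a CLR aggregate might be registered with CREATE AGGREGATE; the assembly name AuthorAggregates is hypothetical, so that batch fails unless such an assembly has been loaded.

```sql
USE bigpubs2008
GO
-- Cursor-based concatenation for one title
DECLARE @names nvarchar(4000), @lname nvarchar(40)
SET @names = N''

DECLARE author_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT a.au_lname
    FROM authors a
    JOIN titleauthor ta ON a.au_id = ta.au_id
    WHERE ta.title_id = 'TC7777'

OPEN author_cur
FETCH NEXT FROM author_cur INTO @lname
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Add a separator before every name except the first
    SET @names = @names
               + CASE WHEN @names = N'' THEN N'' ELSE N', ' END
               + @lname
    FETCH NEXT FROM author_cur INTO @lname
END
CLOSE author_cur
DEALLOCATE author_cur

SELECT @names AS AuthorNames
GO

-- Registering the CLR aggregate instead (hypothetical assembly name;
-- CommaList is assumed to be the name of the class inside it)
CREATE AGGREGATE dbo.CommaList (@value nvarchar(4000))
RETURNS nvarchar(4000)
EXTERNAL NAME AuthorAggregates.CommaList
GO
```

The cursor version handles only one title per run, and extending it to every title requires an outer loop or a second cursor. The aggregate version needs nothing beyond the GROUP BY query shown earlier.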
In a nutshell, performance tests have
generally shown that T-SQL performs better for standard CRUD
(create, read, update, delete) operations, whereas CLR code performs
better for complex math, string manipulation, and other tasks that go
beyond data access.