3. FileStream data
Offering the advantages of both file system and in-database storage is the FileStream data type, introduced in SQL Server 2008.
Overview
FileStream provides these advantages:
BLOBs can be stored in the file system. The size of each BLOB is limited only by the size of the NTFS volume. This overcomes the 2GB limit of previous in-database BLOB storage techniques, which prevented SQL Server from storing certain BLOB types such as large video files.
Full transactional consistency exists between the BLOB and the database record to which it's attached.
BLOBs are included in backup and restore operations.
BLOB objects are accessible via both T-SQL and NTFS streaming APIs.
Superior streaming performance is provided for large BLOB types such as MPEG video.
The Windows system cache is used for caching the BLOB data, thus freeing up the SQL Server buffer cache required for previous in-database BLOB storage techniques.
FileStream data combines the transactional strength of SQL Server with the file management and streaming performance strengths of NTFS. Further, the ability to place FileStream BLOBs on separate, NTFS-compressed volumes provides opportunities to significantly lower overall storage costs.
Unfortunately, there are some limitations with FileStream data, which we'll come to shortly. In the meantime, let's run through the process of enabling and using FileStream.
Enabling filestream
One of the steps involved in installing SQL Server 2008 is choosing whether to enable FileStream data. Once SQL Server is installed, you can enable or disable FileStream using SQL Server Configuration Manager. Just right-click the SQL Server service for a selected instance and choose Properties, and then select the FILESTREAM tab (as shown in figure 3). Here you can enable FileStream for T-SQL access and optionally for file I/O streaming access.
Once enabled through Configuration Manager (or as part of the initial installation), the SQL Server instance must then be configured as a secondary step using the sp_configure command. For example, to configure an instance for both T-SQL and Windows streaming access:
-- Enable FileStream Access for both T-SQL and Windows Streaming
EXEC sp_configure 'filestream access level', 2
GO
RECONFIGURE
GO
Here, we used 2 as the parameter value. 1 will enable FileStream access for T-SQL only, and 0 will disable FileStream for the instance. Let's take a look now at the process of creating a table containing FileStream data.
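After running RECONFIGURE, it can be worth confirming that the setting took effect. The following is a minimal sketch that relies on the SERVERPROPERTY function and on calling sp_configure without a value; the column aliases are illustrative only. If the effective level is lower than the configured level, the usual cause is that FileStream hasn't been enabled in Configuration Manager.

```sql
-- Show the configured and running values for the option
EXEC sp_configure 'filestream access level';
GO

-- Compare the level requested via sp_configure with the level
-- actually in effect, and show the Windows share used for streaming.
-- Levels: 0 = disabled, 1 = T-SQL only, 2 = T-SQL and Windows streaming
SELECT SERVERPROPERTY('FilestreamConfiguredLevel') AS ConfiguredLevel
     , SERVERPROPERTY('FilestreamEffectiveLevel')  AS EffectiveLevel
     , SERVERPROPERTY('FilestreamShareName')       AS ShareName;
GO
```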
Using filestream
When creating a database containing FileStream data, the first thing we need to do is ensure there is a FileStream filegroup. In our next example, we'll create the database with a SalesFileStreamFG filegroup by specifying CONTAINS FILESTREAM. We also use a directory name (G:\FSDATA\SALES in this example) to specify the location of the FileStream data. For optimal performance and minimal fragmentation, disks storing FileStream data should be formatted with a 64K allocation unit size, and be placed on disk(s) separate from both data and transaction log files.
-- Create a database with a FILESTREAM filegroup
CREATE DATABASE [SALES] ON PRIMARY
( NAME = Sales1
, FILENAME = 'M:\MSSQL\Data\salesData.mdf'
)
, FILEGROUP [SalesFileStreamFG] CONTAINS FILESTREAM
( NAME = Sales2
, FILENAME = 'G:\FSDATA\SALES'
)
LOG ON
( NAME = SalesLog
, FILENAME = 'L:\MSSQL\Data\salesLog.ldf'
)
GO
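To check that the filegroup was created as intended, the catalog views can be queried from within the new database. This is a sketch only; it assumes the SALES database above was created successfully, and the exact type_desc strings returned may vary by version.

```sql
USE Sales;
GO

-- List the filegroups; the FileStream filegroup is reported with a
-- type_desc different from the ordinary row-data filegroups
SELECT name, type_desc
FROM sys.filegroups;

-- List the files; Sales2 should point at the G:\FSDATA\SALES directory
-- rather than at a single data file
SELECT name, type_desc, physical_name
FROM sys.database_files;
GO
```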
Next up, we'll create a table containing a column that will store FileStream data. In this example, the Photo column is declared with the varbinary(max) data type and the FILESTREAM attribute. Note that we're adding a UNIQUEIDENTIFIER column to the table with the ROWGUIDCOL attribute and marking it as UNIQUE. Such columns are mandatory for tables containing FileStream data. Also note the use of the FILESTREAM_ON clause, which specifies the filegroup to use for FileStream data.
-- Create a table with a FILESTREAM column
CREATE TABLE Sales.dbo.Customer
(
[CustomerId] INT IDENTITY(1,1) PRIMARY KEY
, [DOB] DATETIME NULL
, [Photo] VARBINARY(MAX) FILESTREAM NULL
, [CGUID] UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL UNIQUE DEFAULT NEWID()
) FILESTREAM_ON [SalesFileStreamFG];
GO
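A quick way to confirm which columns carry the FILESTREAM and ROWGUIDCOL attributes is to query sys.columns. The sketch below assumes the Customer table above exists in the Sales database.

```sql
USE Sales;
GO

-- is_filestream = 1 marks the FileStream column (Photo);
-- is_rowguidcol = 1 marks the mandatory ROWGUIDCOL column (CGUID)
SELECT c.name
     , TYPE_NAME(c.user_type_id) AS DataType
     , c.is_filestream
     , c.is_rowguidcol
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID('dbo.Customer');
GO
```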
At this point, we're ready to insert data into the column. For the purposes of this example, we'll insert a simple text fragment. A more realistic example would be an application that allows a user to specify a local JPEG image that would be streamed into the column:
INSERT INTO Sales.dbo.Customer (DOB, Photo)
VALUES ('21 Jan 1975', CAST ('{Photo}' as varbinary(max)));
GO
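The row can be read back with ordinary T-SQL, and a file on disk can be loaded in a similar way. The following is a hedged sketch rather than a recommended loading technique: the path C:\Temp\customer1.jpg is hypothetical, and OPENROWSET(BULK ...) requires that the SQL Server service account can read the file.

```sql
-- Read back the text fragment; casting the bytes to VARCHAR(MAX)
-- should return the original '{Photo}' string
SELECT CustomerId
     , DOB
     , CAST(Photo AS VARCHAR(MAX)) AS PhotoText
     , DATALENGTH(Photo)           AS PhotoBytes
FROM Sales.dbo.Customer;
GO

-- Load an image file into the FileStream column.
-- ASSUMPTION: C:\Temp\customer1.jpg is a hypothetical path used
-- purely for illustration.
INSERT INTO Sales.dbo.Customer (DOB, Photo)
SELECT '21 Jan 1975', img.BulkColumn
FROM OPENROWSET(BULK N'C:\Temp\customer1.jpg', SINGLE_BLOB) AS img;
GO
```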
After inserting this record, inspection of the file system directory specified for the FileStream filegroup will reveal something similar to that shown in figure 4.
As you can see in figure 4, there is no obvious correlation between database records and FileStream file or directory names. It's not the intention of FileStream to enable direct access to the resulting FileStream data using Windows Explorer. The important thing is that SQL Server maintains transactional consistency with the data and includes it in backup and restore commands.
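Rather than browsing the directory, a streaming client asks SQL Server for a logical path and a transaction context, and then opens the BLOB through the streaming API. The sketch below shows only the T-SQL side of that exchange; it assumes the instance is configured for Windows streaming access and that the row inserted earlier received a CustomerId of 1.

```sql
BEGIN TRANSACTION;

-- PathName() returns the logical path to the BLOB, and
-- GET_FILESTREAM_TRANSACTION_CONTEXT() returns a token tying the
-- file access to this transaction
SELECT Photo.PathName()                     AS PhotoPath
     , GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
FROM Sales.dbo.Customer
WHERE CustomerId = 1;  -- ASSUMPTION: ID of the row inserted earlier

-- A client application would open and stream the BLOB at this point,
-- passing both values to the streaming API, before the commit below

COMMIT TRANSACTION;
GO
```

Because the file access takes place inside the transaction, the streamed reads and writes remain consistent with the database record.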
As mentioned earlier, there are some limitations with the FileStream data type that you should consider before implementing it.
Filestream limitations
Despite the obvious advantages covered earlier, FileStream has some restrictions that limit its use as a BLOB storage technique:
Database mirroring can't be enabled on databases containing FileStream data.
Database snapshots aren't capable of including FileStream data. You can create a snapshot of a database containing FileStream data, but only if you exclude the FileStream filegroup.
FileStream data can't be encrypted; a database that uses transparent data encryption won't encrypt the FileStream data.
Depending on the BLOB size and update pattern, you may achieve better performance by storing the BLOB inside the database, particularly for BLOBs smaller than 1MB and when partial updates are required (for example, when you're updating a small section of a large document).
Of these limitations, perhaps the biggest is the inability to use database mirroring on databases containing FileStream data. In such cases, alternate BLOB storage techniques such as those covered earlier in this section are required.
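When weighing the size-related limitation, it can help to profile the BLOBs already stored. The query below is an illustrative sketch against the example Customer table; the 1MB threshold (1,048,576 bytes) comes from the guideline above, and the column aliases are arbitrary.

```sql
-- Profile BLOB sizes to see how many fall under the 1MB guideline
SELECT COUNT(*)                                   AS TotalBlobs
     , SUM(CASE WHEN DATALENGTH(Photo) < 1048576
                THEN 1 ELSE 0 END)                AS BlobsUnder1MB
     , AVG(CAST(DATALENGTH(Photo) AS BIGINT))     AS AvgBytes
     , MAX(DATALENGTH(Photo))                     AS MaxBytes
FROM Sales.dbo.Customer
WHERE Photo IS NOT NULL;
GO
```

If most BLOBs fall well under the threshold, in-database storage may be the better fit for that table.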
Despite its limitations, FileStream is a powerful new feature introduced in SQL Server 2008. The same can be said for data compression, our next topic.