A table
is defined as a collection of columns. Each column represents an
attribute of the database table and has characteristics that define its
scope and the type of data it can contain. In defining a column, you
must assign a name and a data type. For consistency and readability, the
column names should adhere to a naming convention that you define for
your environment. Naming conventions often use a set of standard
suffixes that indicate the type of data the column will contain. For
example, you can add the Date suffix to a column name (for example, OrderDate) to identify it as a column that contains date/time data, or you can add the suffix ID (for example, PrinterID) to indicate that the column contains a unique identifier.
When creating and naming columns, you need to keep the following restrictions in mind:
You can define up to 1,024 columns (nonsparse + computed) for each table. This number is increased to 30,000 columns if the table has a defined column set using sparse columns.
Column names must be unique within a table.
A row can hold a maximum of 8,060 bytes. Some data types can be stored off the 8KB data page to allow a row to exceed this limit.
A data type must be assigned to each column.
These restrictions
provide a framework for a column definition. The next consideration in
defining a column is the data type. The following section discusses the
various data types.
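As a minimal sketch of these rules, the following statement defines a small table whose column names follow the suffix convention described earlier. The table dbo.PrinterOrder and all of its columns are hypothetical names chosen for illustration; they do not come from a sample database.

```sql
-- Hypothetical table; names and sizes are for illustration only.
CREATE TABLE dbo.PrinterOrder
(
    PrinterID   int          NOT NULL,    -- ID suffix: identifier column
    OrderDate   datetime     NOT NULL,    -- Date suffix: date/time data
    PrinterName varchar(50)  NOT NULL,    -- every column needs a name and a data type
    OrderNote   varchar(200) SPARSE NULL  -- a sparse column must allow NULL
);
```

Adding a second column named PrinterID to this definition would cause the statement to fail, because column names must be unique within a table.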
Data Types
SQL Server 2008 has an extensive list of data types to choose from, including some that are new to SQL Server 2008. New data types include date, time, datetime2, datetimeoffset, hierarchyid, geography, and geometry. (FILESTREAM, also new, is not a data type in its own right; it is a storage attribute that can be applied to varbinary(max) columns.) Each data type is geared toward a specific type of data that will be stored in the column. Table 1 provides a complete list of the data types available in SQL Server 2008.
Table 1. Table Data Types
Data Type | Range/Description | Storage |
---|---|---|
bigint | −2^63 (−9,223,372,036,854,775,808) to 2^63−1 (9,223,372,036,854,775,807) | 8 bytes |
binary(n) | Fixed-length binary data with a length of n bytes | The number of bytes defined by n, up to 8,000 |
bit | An integer data type that can take a value of 1, 0, or NULL | 1 byte for every eight columns that are defined as bit on the table |
char(n) | Up to 8,000 characters | 1 byte per character |
date | 0001-01-01 through 9999-12-31 | 3 bytes |
datetime | January 1, 1753, through December 31, 9999; accurate to 3.33 milliseconds | 8 bytes |
datetime2 | January 1, 0001, through December 31, 9999; accurate to 100 nanoseconds | 6 to 8 bytes, depending on the precision |
datetimeoffset | January 1, 0001, through December 31, 9999; includes a time zone offset | 8 to 10 bytes, depending on the precision |
decimal(p,s) | −10^38+1 to 10^38−1 | Based on the precision |
float | −1.79E+308 to −2.23E−308, 0, and 2.23E−308 to 1.79E+308 | 4 or 8 bytes, depending on the size of the mantissa |
geography | .NET CLR data type representing round-earth data such as GPS latitude and longitude coordinates | |
geometry | .NET CLR data type representing data in a Euclidean (flat) coordinate system | |
hierarchyid | User-defined nodes and levels | Up to 892 bytes |
image | Variable-length binary data | Up to 2^31−1 (2,147,483,647) bytes |
int | −2^31 (−2,147,483,648) to 2^31−1 (2,147,483,647) | 4 bytes |
money | −922,337,203,685,477.5808 to 922,337,203,685,477.5807 | 8 bytes |
nchar(n) | Up to 4,000 Unicode characters | 2 bytes per character |
ntext | Up to 2^30−1 (1,073,741,823) characters | Two times the number of characters entered |
numeric(p,s) | −10^38+1 through 10^38−1 | Based on the precision |
nvarchar(n) | Up to 4,000 Unicode characters | Two times the number of characters entered |
nvarchar(max) | Unicode characters up to the maximum storage capacity of 2^30−1 characters | Two times the number of characters entered plus 2 bytes |
real | −3.40E+38 to −1.18E−38, 0, and 1.18E−38 to 3.40E+38 | 4 bytes |
smalldatetime | January 1, 1900, through June 6, 2079; accurate to 1 minute | 4 bytes |
smallint | −2^15 (−32,768) to 2^15−1 (32,767) | 2 bytes |
smallmoney | −214,748.3648 to 214,748.3647 | 4 bytes |
sql_variant | A data type that stores values of various SQL Server 2008–supported data types, except text, ntext, image, timestamp, and sql_variant | Up to 8,016 bytes |
text | Up to 2^31−1 (2,147,483,647) characters | Up to 2,147,483,647 bytes |
time | 00:00:00.0000000 to 23:59:59.9999999 | 3 to 5 bytes, depending on the precision |
timestamp/rowversion | Automatically generated, unique binary numbers within a database; generally used for version stamping rows | 8 bytes |
tinyint | 0 to 255 | 1 byte |
uniqueidentifier | A 16-byte globally unique identifier (GUID) | 16 bytes |
varbinary(n) | Variable-length binary data with a maximum length of n bytes | The number of bytes entered, up to 8,000 |
varbinary(max) | Binary data up to the maximum storage capacity | The number of bytes entered plus 2 bytes, up to 2^31−1 bytes |
varchar(n) | Up to 8,000 characters | 1 byte per character |
varchar(max) | Non-Unicode characters up to the maximum storage capacity | 1 byte per character; maximum 2^31−1 bytes |
xml | XML instances or a variable of XML type | Up to 2GB |
The data type you select is important because it provides scope for the column. For example, if you define a column as type int,
you can be assured that only integer data will be stored in the column
and that character data will not be allowed. The advantages of data
typing are fairly obvious but sometimes overlooked.
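The following sketch illustrates this scoping, assuming a throwaway temporary table named #TypeDemo (a hypothetical name). A value that cannot be converted to int is rejected, and the TRY...CATCH block simply reports the resulting error message.

```sql
-- #TypeDemo is a hypothetical temporary table used only for this demonstration.
CREATE TABLE #TypeDemo (Quantity int NOT NULL);

INSERT INTO #TypeDemo (Quantity) VALUES (42);         -- integer data is accepted

BEGIN TRY
    INSERT INTO #TypeDemo (Quantity) VALUES ('abc');  -- cannot be converted to int
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS ConversionError;        -- report why the insert failed
END CATCH;

DROP TABLE #TypeDemo;
```

Note that a character string containing only digits, such as '42', is implicitly converted to int and accepted; the column still ends up holding integer data only.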
You should avoid defining most of your columns with a single data type, such as varchar. The visual tools provide a great
way for you to select a data type: you simply select a data type from a
drop-down selection box that lists the available data types.
Tip
The Object Explorer has a categorized list of all the system data types. To get to it, you open the Programmability node under your database and then expand the Types node. You then see a node named System Data Types that lists all the data type categories, including Exact Numerics, Approximate Numerics, and Date and Time.
The data types for each category are listed under each category node.
If you mouse over the particular data type, you see a brief description,
including the valid range of values.
Several data types in
SQL Server 2008 deserve special attention. Some of these data types are
new to SQL Server 2008 and some of them were introduced in SQL Server
2005. The following sections discuss these data types.
New Date/Time Data Types
Several
new date/time data types were added in SQL Server 2008. These data
types were added to enhance SQL Server’s date/time capabilities. The date and time data types were added to separate these two date/time components. The date data type contains only the month, day, and year components, whereas the time data type contains only the time components. The separation of date and time was planned for SQL Server 2005 but never made it to the final release.
The precision and scale of date/time data types have been expanded in SQL Server 2008 as well. The datetime2 data type is similar to the datetime data type, but it has a larger range of dates (January 1, 0001, through December 31, 9999), and the time portion of this data type contains fractional seconds with up to seven digits of precision. The datetime data type is accurate only to within 3.33 milliseconds, whereas the new datetime2 data type is accurate to 100 nanoseconds.
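To make the difference in precision concrete, here is a short sketch that stores one value as datetime2(7) and then casts it to the other date/time types. The variable name @dt2 and the sample timestamp are arbitrary choices for illustration.

```sql
-- Sample value with seven fractional-second digits.
DECLARE @dt2 datetime2(7) = '2009-07-08 11:33:22.1234567';

SELECT @dt2                   AS AsDatetime2,  -- keeps all seven fractional digits
       CAST(@dt2 AS datetime) AS AsDatetime,   -- rounded to datetime's 3.33 ms increments
       CAST(@dt2 AS date)     AS DateOnly,     -- date component only
       CAST(@dt2 AS time(7))  AS TimeOnly;     -- time component only
```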
Finally, SQL Server 2008 introduces time zone support in a new data type named datetimeoffset. This data type has precision in fractional seconds (like datetime2), but it also contains an extra date/time component that defines the time zone offset for the date. The time zone offset is a sign followed by two digits that represent the offset hours and two digits that represent the offset minutes. The offset expresses how far the stored local time is from Coordinated Universal Time (UTC). The following example shows how this new data type can be used:
select CAST('2009-07-08 11:33:22.1234567-04:00' AS datetimeoffset(7))
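As a further sketch, the built-in SWITCHOFFSET function and the TZoffset date part can be used to inspect the offset component. The variable name @local is an assumption made for this example.

```sql
DECLARE @local datetimeoffset(7) = '2009-07-08 11:33:22.1234567-04:00';

SELECT SWITCHOFFSET(@local, '+00:00') AS UtcEquivalent,      -- same instant, expressed at UTC
       DATEPART(TZoffset, @local)     AS OffsetMinutes,      -- offset in minutes (-240 here)
       CAST(@local AS datetime2(7))   AS LocalWithoutOffset; -- local date/time, offset dropped
```

Because 11:33 at an offset of -04:00 is four hours behind UTC, the first column shows 15:33 with an offset of +00:00.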
The xml Data Type
The xml data type
(introduced in SQL Server 2005) enables you to store XML documents and
XML fragments in a SQL Server database. (An XML fragment is an XML
instance that is missing a single top-level element.)
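The distinction between a document and a fragment can be sketched with two untyped xml variables. The element names and variable names below are invented for illustration.

```sql
-- A well-formed document: exactly one top-level element.
DECLARE @doc  xml = '<order id="1"><item>Bracket</item></order>';
-- A fragment: two top-level elements, which an untyped xml value also accepts.
DECLARE @frag xml = '<item>Bracket</item><item>Reflector</item>';

SELECT @doc.value('(/order/@id)[1]', 'int') AS OrderId,     -- extract a scalar value
       @frag.query('/item[2]')              AS SecondItem;  -- extract an XML node
```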
The hierarchyid Data Type
The hierarchyid data type is new in SQL Server 2008. The hierarchyid data type is a variable-length system data type used to represent a position in a tree hierarchy. A column of type hierarchyid does not automatically represent a tree. It is up to the application to generate and assign hierarchyid
values in such a way that the desired relationship between rows is
reflected in the values.
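The following sketch shows one way an application might generate hierarchyid values, using the type's GetRoot, GetDescendant, GetLevel, IsDescendantOf, and ToString methods. The variable names are hypothetical.

```sql
DECLARE @root   hierarchyid = hierarchyid::GetRoot();            -- '/'
DECLARE @child1 hierarchyid = @root.GetDescendant(NULL, NULL);   -- first child: '/1/'
DECLARE @child2 hierarchyid = @root.GetDescendant(@child1, NULL);-- next sibling: '/2/'

SELECT @root.ToString()              AS RootPath,
       @child1.ToString()            AS FirstChild,
       @child2.ToString()            AS SecondChild,
       @child2.GetLevel()            AS ChildLevel,       -- depth below the root
       @child2.IsDescendantOf(@root) AS IsUnderRoot;      -- 1 when the node is in the subtree
```

Nothing in the data type prevents two rows from receiving the same value or a child from being stored without its parent, which is why the generation logic remains the application's responsibility.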
Spatial Data Types
SQL Server 2008
introduces support for storing geographical data with the inclusion of
new spatial data types. Spatial data types provide a comprehensive,
high-performance, and extensible data storage solution for spatial data,
enabling organizations of any scale to integrate geospatial features
into their applications and services.
Spatial data types can be used to store and manipulate location-based information and come in the form of two new data types: geography and geometry. The geography data type is a .NET CLR data type that provides a storage structure for geodetic data, sometimes referred to as round earth data because it assumes a roughly spherical model of the world. It provides a storage structure for spatial data that is defined by latitude and longitude coordinates using an industry standard ellipsoid such as WGS84, the reference system used by Global Positioning System (GPS) applications. The geometry data type is a .NET CLR data type that supports planar data, which assumes a flat projection and is therefore sometimes called flat earth. geometry data is represented as points, lines, and polygons on a flat surface, such as maps and interior floor plans where the curvature of the earth does not need to be taken into account.
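A brief sketch of both types follows. The coordinates are approximate and purely illustrative, the variable names are hypothetical, and SRID 4326 identifies the WGS84 reference system.

```sql
-- geography: points given as latitude, longitude, SRID.
DECLARE @pointA geography = geography::Point(47.6062, -122.3321, 4326);
DECLARE @pointB geography = geography::Point(45.5152, -122.6784, 4326);

SELECT @pointA.STDistance(@pointB) AS DistanceInMeters;  -- distance over the ellipsoid

-- geometry: a 10 x 5 rectangle on a flat plane (SRID 0, no real-world units).
DECLARE @room geometry =
    geometry::STGeomFromText('POLYGON((0 0, 10 0, 10 5, 0 5, 0 0))', 0);

SELECT @room.STArea() AS FloorArea;                      -- 50 square units
```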
Large-Value Data Types
Three large-value data types added in SQL Server 2005 allow you to store a significant amount of data in a single column. They allow you to store up to 2^31−1 bytes of data, which amounts to 2^31−1 non-Unicode characters or 2^30−1 Unicode characters. All these data types have the (max) designator: varchar(max), nvarchar(max), and varbinary(max). The varchar, nvarchar, and varbinary data types were available prior to SQL Server 2005, but the max parameter gave these types additional scope.
The great thing about these
data types is that they are much easier to work with than large object
(LOB) data types. LOB data types (which include text, ntext, and image)
require special programming when retrieving and storing data. The
large-value data types do not have these restrictions. They can be used
much like their smaller counterparts varchar(n), nvarchar(n), and varbinary(n) that are defined without the max keyword. So if you want to select data from a varchar(max) column, you can simply execute a SELECT statement against it, regardless of the amount of data stored in it. Consider, for example, the following SELECT statement, executed against a varchar(max) column named DocumentSummary in the AdventureWorks2008.Production.Document table:
select Title, substring(DocumentSummary,1,30) 'DocumentSummary'
from production.document
where LEFT(DocumentSummary,30) like 'Reflector%'
/* results from previous select statement
Title DocumentSummary
-------------------------------------------------- ------------------------------
Front Reflector Bracket Installation Reflectors are vital safety co
*/
This works fine with the varchar(max) column, but the LEFT function used in the WHERE clause would cause an error if the column were a text column instead.
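As another small sketch, ordinary string functions also work on a varchar(max) variable that holds far more than 8,000 characters. The variable name @big is an assumption; the CAST matters because REPLICATE truncates its result at 8,000 bytes unless its input is a large-value type.

```sql
-- Build a 100,000-character string; the CAST keeps REPLICATE from truncating.
DECLARE @big varchar(max) = REPLICATE(CAST('x' AS varchar(max)), 100000);

SELECT LEN(@big)            AS CharacterCount,  -- 100000
       LEFT(@big, 5)        AS FirstFive,       -- 'xxxxx'
       CHARINDEX('x', @big) AS FirstMatch;      -- 1
```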
The large-value data types can be stored in the data row or on separate data pages, based on the setting of the sp_tableoption 'large value types out of row' option. If the option is set to OFF (the default), a value of up to 8,000 bytes is stored in the actual data row as long as it fits in the row; a value that does not fit is moved off the row and replaced by a pointer. If the option is set to ON, data for these columns is always stored on separate data pages, and only a 16-byte pointer is kept in the data row. The actual location of the column data is transparent to any user accessing the table.
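The option can be set and checked as sketched below. The example assumes the Production.Document table mentioned earlier; any table with large-value columns would do, and you would normally set the option back to 'OFF' after experimenting on a sample database.

```sql
-- Store large-value columns of this table out of row.
EXEC sp_tableoption 'Production.Document', 'large value types out of row', 'ON';

-- Confirm the current setting (1 = out of row, 0 = in row when possible).
SELECT name, large_value_types_out_of_row
FROM sys.tables
WHERE object_id = OBJECT_ID('Production.Document');
```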
Large Row Support
In
SQL Server 2000, there was a strict limit of 8,060 bytes that could be
stored in a single row. If the total amount of data exceeded this limit,
the update or insert would fail. Enhancements were made in SQL Server
2005 to dynamically manage rows that exceed the 8,060-byte limit. This
dynamic behavior is designed for columns that are defined as varchar, nvarchar, varbinary, or sql_variant.
If the values in these columns cause the total size of the row to go
beyond the 8,060-byte limit, SQL Server moves one or more of the
variable-length columns to pages in the ROW_OVERFLOW_DATA
allocation unit. A pointer to this separate storage location, rather
than the actual data, is kept in the data row. If the data row shrinks
below the 8,060-byte limit at a later time, SQL Server dynamically moves
the data from the ROW_OVERFLOW_DATA allocation unit back into the data page.
The following example creates a table that has columns that could exceed the 8,060-byte limit, with a total of 9,000 characters:
CREATE TABLE t1
(col1 varchar(4000), col2 varchar(5000))
insert t1
select replicate('x', 4000),replicate('x', 5000)
If you execute the CREATE TABLE
statement, you do not get any warning message related to the 8,060-byte
limit. After the table is created, you can execute an insert into the
table that exceeds the 8,060-byte limit. The insert succeeds, and the
dynamic allocation previously described is handled automatically.
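One way to observe this behavior is to list the allocation units that belong to the table, as in the following sketch. It assumes that t1 was created in the dbo schema; after the insert, the result should include a ROW_OVERFLOW_DATA allocation unit with pages in use.

```sql
-- List the allocation units for dbo.t1 and how many pages each one uses.
SELECT au.type_desc, au.total_pages, au.used_pages
FROM sys.partitions AS p
JOIN sys.allocation_units AS au
  ON au.container_id = CASE WHEN au.type = 2      -- LOB_DATA maps to partition_id
                            THEN p.partition_id
                            ELSE p.hobt_id        -- IN_ROW_DATA and ROW_OVERFLOW_DATA
                       END
WHERE p.object_id = OBJECT_ID('dbo.t1');
```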
User-Defined Data Types
User-defined data types allow
you to create custom data types that are based on the existing system
data types. These data types are also called alias data types
in SQL Server 2008. You create a user-defined data type and give it a
unique name that you can then use in the definitions of tables. For
example, you can create a user-defined data type named ShortDescription, defined as varchar(20), and assign it to any column. This promotes data type consistency across your tables.
You can create user-defined data types by using T-SQL in a couple of different ways: with the sp_addtype system stored procedure or with the CREATE TYPE command. The sp_addtype system stored procedure is slated to be removed in a future version of SQL Server, so using the CREATE TYPE command is preferred. The following example shows how to create the ShortDescription user-defined data type:
CREATE TYPE [dbo].[ShortDescription] FROM [varchar](20) NOT NULL
After a user-defined data type
is created, you can use it in the definition of tables. The following is
an example of a table created with the new ShortDescription user-defined data type:
CREATE TABLE [dbo].CodeTable
(TableId int identity,
TableDesc ShortDescription)
When you look at the definition of the CodeTable table in Object Explorer, you see the TableDesc column displayed with the ShortDescription data type as well as the underlying data type varchar(20).
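The same information is available from T-SQL. As a sketch, the following query reads the sys.types catalog view to show the alias type alongside its underlying system type; the column aliases are arbitrary.

```sql
-- Show the alias type, its base system type, length, and nullability.
SELECT t.name                      AS AliasType,
       TYPE_NAME(t.system_type_id) AS BaseType,
       t.max_length,
       t.is_nullable
FROM sys.types AS t
WHERE t.is_user_defined = 1
  AND t.name = 'ShortDescription';
```

Because the type was created with NOT NULL, is_nullable is 0, and a column such as TableDesc that does not specify its own nullability inherits that setting.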
You can use the Object Explorer to create user-defined data types as well. To do so, you expand the Programmability node under your database, expand the Types node, and then right-click the User-Defined Data Types node. Then you choose the New User-Defined Data Type option, and you can create a new user-defined data type through a friendly GUI screen. If you create a user-defined data type in the model database, this user-defined data type is also created in any newly created database.
CLR User-Defined Types
SQL Server 2008 continues
support for user-defined types (UDTs) implemented with the Microsoft
.NET Framework common language runtime (CLR). CLR UDTs enable you to
extend the type system of the database and also enable you to define
complex structured types.
A UDT may be simple or
structured and of any degree of complexity. A UDT can encapsulate
complex, user-defined behaviors. You can use CLR UDTs in all contexts
where you can use a system type in SQL Server, including in columns in
tables, in variables in batches, in functions or stored procedures, as
arguments of functions or stored procedures, or as return values from
functions.
A UDT must first be implemented
as a managed class or structure in any one of the CLR languages and
compiled into a .NET Framework assembly. You can then register it with
SQL Server by using the CREATE ASSEMBLY command, as in the following example:
CREATE ASSEMBLY latlong FROM 'c:\samplepath\latlong.dll'
After registering the assembly, you can create the CLR UDTs by using a variation of the CREATE TYPE command shown previously:
CREATE TYPE latitude EXTERNAL NAME latlong.latitude
CREATE TYPE longitude EXTERNAL NAME latlong.longitude
After a CLR UDT is created, you can use it in the definition of tables. The following example shows a table created with the new latitude and longitude UDTs:
CREATE TABLE [dbo].StoreLocation
(StoreID int NOT NULL,
StoreLatitude latitude,
StoreLongitude longitude)
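Two follow-up steps are often useful at this point, sketched here under the assumption that the latlong assembly and its types were registered as shown above. User CLR code runs only when the server-level 'clr enabled' option is turned on, and the sys.assembly_types catalog view shows which assembly and class back each CLR UDT.

```sql
-- Allow user CLR code to execute on this instance (requires appropriate permissions).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

-- List CLR UDTs together with the assembly and class that implement them.
SELECT at.name           AS TypeName,
       a.name            AS AssemblyName,
       at.assembly_class AS ClassName
FROM sys.assembly_types AS at
JOIN sys.assemblies AS a
  ON a.assembly_id = at.assembly_id;
```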