Microsoft Dynamics AX 2009 : The Database Layer - Unicode Support

8/22/2013 9:34:22 AM

In Dynamics AX 2009, the application runtime fully supports Unicode and multiple-locale input and output without the risk of data loss. Versions prior to Dynamics AX 4.0 could store data in the database as Unicode data and handled Asian characters in double-byte character sets, but the application runtime didn’t support Unicode or characters from multiple code pages. Any given installation supported only one character set, because data written to the database in one character set might not be converted correctly into another character set. Data could be lost when incorrectly converted data was eventually written back to the database.

This problem was eliminated in Dynamics AX 4.0, but developers and users of Dynamics AX 2009 should still be aware that Unicode support doesn’t imply multiple-locale sorting and comparison, nor other features such as multiple time zone functionality or multiple country-specific functionality.

Databases

The Dynamics AX application runtime supports only Unicode data types in the database, so all data persists in the N-prefixed versions of the data types in SQL Server and Oracle. These are the NVARCHAR and NTEXT data types in SQL Server and the NVARCHAR2 and NCLOB data types in Oracle. When you upgrade to Dynamics AX 2009 from versions prior to Dynamics AX 4.0, the conversion from non-Unicode to Unicode is handled as part of the upgrade process.

Note

Although the upgrade process handles the conversion of text stored in VARCHAR, TEXT, and the equivalent Oracle data types, text could still be stored in fields of type container, which persists in columns in the database of type IMAGE in SQL Server and BLOB in Oracle. These values are not converted during the upgrade process, but the Dynamics AX application runtime converts non-Unicode data to Unicode data when the values are read from the database and extracted from the container field.

SQL Server 2005 and SQL Server 2008 store Unicode data using the UCS-2 encoding scheme, and Oracle Database 10g stores Unicode data using the UTF-16 encoding scheme. A Unicode character generally occupies 2 bytes, but in special cases it occupies 4 bytes. The disk space required to store the database is therefore higher for a Dynamics AX 2009 installation than for installations of versions prior to Dynamics AX 4.0, given the same number of tables and the same amount of data. The required disk space isn’t doubled, however, because only string data is affected by the conversion to Unicode; the int, real, date, and container data types don’t consume additional space in the database.
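The effect on string storage can be estimated with a short sketch. The Python below is illustrative only and is not part of Dynamics AX. It compares the encoded size of the same text in a single-byte code page and in UTF-16. The choice of cp1252 as the earlier code page and the helper name storage_bytes are assumptions made for the example.

```python
# Illustrative only: compare the bytes needed for the same text in a
# single-byte code page (cp1252, an assumed example) and in UTF-16.

def storage_bytes(text: str) -> dict:
    """Return the encoded size of text, in bytes, for both encodings."""
    return {
        "cp1252": len(text.encode("cp1252")),
        # "utf-16-le" writes no byte order mark, so only character data is counted.
        "utf-16": len(text.encode("utf-16-le")),
    }

sizes = storage_bytes("Contoso Ltd")
print(sizes)
```

For text that fits in a single-byte code page, the string data doubles in size, while numeric and date columns are unaffected, which is why the database as a whole grows by less than a factor of two.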

As the amount of space needed to store the data increases, so does the time required to read and write data because more bytes have to be read and written. Obviously, the size of packages sent between the client tier and the server tier, and on to the database tier, is affected as well.

When you create the database to be used for the Dynamics AX installation, you can specify a collation. Collation determines the sorting order for data retrieved from the database and the comparison rules used when searching for the data.

Note

Although SQL Server 2005, SQL Server 2008, and Oracle Database 10g support the specification of collations at lower levels than the database instance (such as at the column level), the Dynamics AX application runtime does not.

Because the collation is specified at the database instance level, the Dynamics AX application runtime supports sorting using the collation setting only; it doesn’t support sorting using a different locale. Dynamics AX supports input and output according to multiple locales, but not sorting and comparison according to multiple locales.

Application Runtime

The Dynamics AX application runtime supports Unicode through the use of UTF-16 encoding, which is also the primary encoding scheme used by Windows 2000, Windows XP, Windows Vista, Windows Server 2003, and Windows Server 2008. The use of UTF-16 encoding makes the Dynamics AX application surrogate-aware; it can handle more than 65,536 Unicode characters, which is the maximum number of Unicode characters supported by the UCS-2 encoding scheme. Dynamics AX generally uses only 2 bytes to store a Unicode character, but it uses 4 bytes when it needs to store a supplementary Unicode character. Each supplementary character is stored as a surrogate pair of two 2-byte code units. An example of a supplementary character is the treble clef music symbol shown in Figure 1. The treble clef symbol has the Unicode code point 01D11E expressed as a hexadecimal number.
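The surrogate-pair arithmetic defined by UTF-16 can be sketched as follows. This is plain Python rather than anything from the Dynamics AX runtime, and the function name to_surrogate_pair is hypothetical. The example uses the treble clef, which the Unicode charts list as MUSICAL SYMBOL G CLEF at U+1D11E.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low (trail) surrogate
    return high, low

high, low = to_surrogate_pair(0x1D11E)
print(f"{high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder (big-endian, no byte order mark).
encoded = "\U0001D11E".encode("utf-16-be")
print(encoded.hex().upper())
```

Both lines print the same four bytes, two per surrogate, which is the 4-byte storage described above.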

Figure 1. Example of a supplementary character

Although the application runtime uses UTF-16 encoding and the SQL Server back-end database uses UCS-2 encoding, you won’t experience loss of data because the SQL Server database is surrogate safe; it stores a Unicode character occupying 4 bytes of data as two unknown 2-byte Unicode characters. It retrieves the character in this manner as well, and returns it intact to the application runtime.

The maximum string length of a table field is, however, passed directly as the string length to use when creating the NVARCHAR type column in the database. A string field with a maximum length of 10 characters results in a new column in the SQL Server database with a maximum length of 10 double bytes. A maximum length of 10, therefore, doesn’t necessarily mean that the field can contain 10 Unicode characters. For example, a string field can store a maximum of 5 treble clef symbols, with each occupying 4 bytes, totaling 20 bytes, which is equivalent to the maximum length of 10 double bytes declared for the column in the database. No problems result, though, because the expected use of supplementary characters is minimal, especially in an application such as Dynamics AX 2009. Supplementary characters are currently used, for example, for mathematical symbols, music symbols, and rare Han characters.
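The capacity arithmetic can be checked with a small sketch. It is Python written for illustration, not runtime or database code. The helpers utf16_units and fits_in_column are hypothetical, and the column limit is assumed to be counted in 2-byte units as described above.

```python
def utf16_units(text: str) -> int:
    """Number of 2-byte code units ('double bytes') that text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2

def fits_in_column(text: str, max_length: int) -> bool:
    """True if text fits a column whose limit is counted in 2-byte units."""
    return utf16_units(text) <= max_length

G_CLEF = "\U0001D11E"  # a supplementary character: one character, two code units

print(fits_in_column("A" * 10, 10))     # ten ordinary characters use 10 units
print(fits_in_column(G_CLEF * 5, 10))   # five supplementary characters use 10 units
print(fits_in_column(G_CLEF * 6, 10))   # six would need 12 units
```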

The Dynamics AX application runtime also supports the use of temporary tables that are stored either in memory or in files. The temporary tables use an indexed sequential access method (ISAM)–based architecture, which doesn’t support the specific setting of collations, so data stored in temporary tables is sorted in a locale-invariant and case-insensitive manner. The indexes on the temporary tables behave in the same way, so searching for data in a temporary table is also locale invariant and case insensitive.

The application runtime also performs string comparisons in a locale-invariant and case-insensitive manner. However, some string functions, such as strlwr and strupr, use the user’s locale.

Important

String comparison was changed slightly in Dynamics AX 4.0. Dynamics AX 2009 ignores case when comparing strings, but it doesn’t ignore diacritics, meaning that the letter A is different from the letter Ä. The versions prior to Dynamics AX 4.0 ignored most, but not all, diacritics. For example, the letter A was equal to Ä, but not equal to Å.
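The difference between ignoring case and ignoring diacritics can be illustrated with a sketch. It is a Python approximation, not the comparison routine used by the application runtime, and both function names are hypothetical. The diacritic-insensitive variant is a generic one that also treats Å as A, so it does not reproduce the exact exceptions of the versions prior to Dynamics AX 4.0.

```python
import unicodedata

def equal_ignore_case(a: str, b: str) -> bool:
    """Case-insensitive but diacritic-sensitive comparison."""
    return a.casefold() == b.casefold()

def equal_ignore_case_and_diacritics(a: str, b: str) -> bool:
    """Generic comparison that also drops combining marks (accents, rings, umlauts)."""
    def strip(text: str) -> str:
        decomposed = unicodedata.normalize("NFD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return strip(a).casefold() == strip(b).casefold()

print(equal_ignore_case("axapta", "AXAPTA"))           # case is ignored
print(equal_ignore_case("A", "Ä"))                     # diacritics are not
print(equal_ignore_case_and_diacritics("A", "Ä"))      # generic variant ignores both
```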

MorphX Development Environment

The MorphX development environment also supports Unicode. You can write X++ code and define metadata that contains Unicode characters. However, the names of elements defined in the Data Dictionary and the names of variables declared in X++ must conform to the ASCII character set. The remaining metadata and language elements allow the use of all Unicode characters, so you can use Unicode characters in X++ comments and in string constants in both X++ and metadata.

All strings and string functions in X++ support Unicode characters, so the strlen function returns the number of Unicode characters in a string, not the number of bytes or double bytes used to store the Unicode characters. Therefore, a string that contains only the treble clef symbol, as shown earlier, has a string length of 1 rather than 2, even though it uses 2 double bytes to store the single Unicode character.

Important

Because SQL Server stores Unicode characters using UCS-2 encoding, it could return a different value when using the LEN function in Transact-SQL (T-SQL). A column that contains a single treble clef symbol stored by the Dynamics AX application would return a length of 2 when using the LEN function because the treble clef symbol is stored as two unknown Unicode characters in the database. The Dynamics AX application runtime doesn’t use or expose the LEN function, so this behavior isn’t an issue for users of the Dynamics AX application; an issue arises only if the database is accessed directly from other programs or if direct SQL statements are written from within X++, thereby circumventing the database access layer.
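The two ways of counting can be compared side by side in a sketch. It is written in Python for illustration and calls neither strlen nor LEN. The function names are hypothetical, and the code-unit count is only a stand-in for how a UCS-2 store would measure the same string.

```python
def length_in_characters(text: str) -> int:
    """Count Unicode characters (code points), as described for strlen in X++."""
    return len(text)

def length_in_code_units(text: str) -> int:
    """Count 2-byte code units, as a UCS-2 store would see the same string."""
    return len(text.encode("utf-16-le")) // 2

for sample in ["AX", "\U0001D11E", "AX\U0001D11E"]:
    print(length_in_characters(sample), length_in_code_units(sample))
```

The two counts agree for strings that contain no supplementary characters and differ by one for each supplementary character present.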

Files

Dynamics AX 2009 supports reading, creation, and writing of Unicode files. All text files written by the Dynamics AX application runtime are created as Unicode files, and all text files that are part of the Dynamics AX installation are Unicode files. The application runtime also supports reading of non-Unicode files.

Two file I/O classes exist that allow you to implement X++ code that reads and writes Unicode text files: TextIO and CommaTextIO. These classes are equivalent to the AsciiIO and CommaIO ASCII character set classes. You should use these classes instead of the ASCII file I/O classes to avoid losing data when writing to files. However, you might encounter scenarios in which market, legal, or proprietary requirements demand the use of the ASCII file I/O classes.
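The risk of data loss can be demonstrated with a round trip through each kind of encoding. The sketch below is Python and only stands in for the idea; it does not use TextIO or AsciiIO, and replacing unencodable characters with a question mark is an assumption about how a lossy writer might behave, not documented behavior of the ASCII file I/O classes.

```python
def round_trip(text: str, encoding: str) -> str:
    """Encode text as a file writer would, then decode it as a reader would."""
    data = text.encode(encoding, errors="replace")  # unencodable characters become '?'
    return data.decode(encoding)

original = "Å, Ä and \U0001D11E"

ascii_result = round_trip(original, "ascii")
unicode_result = round_trip(original, "utf-16")

print(ascii_result)                  # non-ASCII characters are lost
print(unicode_result == original)    # the Unicode round trip preserves the text
```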

DLLs and COM Components

All areas of Dynamics AX 2009 that use DLLs and COM components use the Unicode-enabled versions of the DLLs. For example, the createFile method in the WinApi class now calls the CreateFileW implementation of the createFile function rather than the CreateFileA implementation, because CreateFileW supports Unicode and CreateFileA supports ANSI. When you pass parameters to DLL functions from X++ code, define a parameter as ExtTypes::WString to pass Unicode characters; a parameter defined as ExtTypes::String expects non-Unicode characters.

The Binary helper class used for COM interoperability and DLL function calls has also been changed. A wString function is available to support Unicode characters to complement the existing string function.
