Unicode Support in Oracle9i Database
Topics
New Unicode features in Oracle9i
Length Semantics Support in Oracle9i
• New length semantics
– CHAR ( size [BYTE | CHAR] )
– VARCHAR2 ( size [BYTE | CHAR] )
• Conforms to the ANSI SQL standard
– The standard defines size in characters, but most vendors implemented it in bytes
• Meets customer requirements
– Portable database schema
– Character set independent
– Same data size across server, client, and middle tier
– Easy migration to Unicode support
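The gap between byte and character length can be sketched outside the database. The snippet below is a minimal illustration in Python, not Oracle code: the helper name `char_and_byte_length` is hypothetical, and Python's `len` counts code points, which is only an approximation of how a given database character set counts characters.

```python
from typing import Tuple


def char_and_byte_length(text: str, encoding: str = "utf-8") -> Tuple[int, int]:
    """Return (number of code points, number of encoded bytes) for text."""
    return len(text), len(text.encode(encoding))


# The same six-character limit needs very different byte budgets.
for sample in ["database", "Straße", "データベース"]:
    chars, nbytes = char_and_byte_length(sample)
    print(f"{sample!r}: {chars} characters, {nbytes} bytes in UTF-8")
```

A column sized in bytes would have to be widened whenever the character set changes, while a column sized in characters keeps the same declared length.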
Character Semantics Support in 9i - Cont.
• UTF-16 semantics
– UTF8 encodes each supplementary character as a surrogate pair, three bytes per surrogate (six bytes in total)
– It has the same semantics as UTF-16, so varchar2(10 char) matches wchar(10)
– It has the same binary sorting order as UTF-16
• UTF-32 semantics
– AL32UTF8 follows the UTF-8 standard by encoding each supplementary character in 4 bytes
– It has the same code point semantics as UTF-32 and the same binary sorting order
• Conversion between UTF8 and AL32UTF8
– AL32UTF8 can be used on the client for UTF-8 compliance
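The two encodings of a supplementary character, and the resulting sort orders, can be reproduced in a few lines of Python. This is a sketch under the assumption that the surrogate-pair form described for UTF8 matches what Unicode calls CESU-8; the function names are hypothetical and nothing here calls Oracle.

```python
def encode_standard_utf8(text: str) -> bytes:
    """Standard UTF-8: a supplementary character takes 4 bytes."""
    return text.encode("utf-8")


def encode_surrogate_pair_utf8(text: str) -> bytes:
    """Encode each UTF-16 code unit separately as 3-byte UTF-8 (CESU-8 style)."""
    utf16 = text.encode("utf-16-be")
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    # "surrogatepass" lets lone surrogates through the UTF-8 encoder.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


bmp = "\uFFFD"        # a BMP character above the surrogate range
supp = "\U00010000"   # the first supplementary character

print(len(encode_standard_utf8(supp)), len(encode_surrogate_pair_utf8(supp)))

# Binary order of the surrogate-pair form follows UTF-16 ...
print(encode_surrogate_pair_utf8(supp) < encode_surrogate_pair_utf8(bmp),
      supp.encode("utf-16-be") < bmp.encode("utf-16-be"))
# ... while standard UTF-8 follows UTF-32 (code point) order.
print(encode_standard_utf8(bmp) < encode_standard_utf8(supp),
      bmp.encode("utf-32-be") < supp.encode("utf-32-be"))
```

The two forms differ only for supplementary characters; BMP text encodes to identical bytes in both.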
Reliable Unicode Data Type Support
Inter-operability With Other Data Types
• Explicit Conversion Functions
- TO_NCHAR()
- TO_CHAR()
- ROWIDTONCHAR()
- CHARTOROWID()
- TO_CLOB()
- TO_NCLOB()
- TO_NUMBER()
- TO_DATE()
- TO_TIMESTAMP()
- TO_TIMESTAMP_TZ()
- TO_YMINTERVAL()
…...
Inter-operability - cont.
• Implicit Conversion
- Between NCHAR and CHAR types
- Between NCHAR and NUMBER, DATE, ROWID, RAW, CLOB, etc.
• Conversion Direction:
- INSERT, SELECT INTO, UPDATE, and assignment operations: convert to the target type
- Comparison and concatenation: convert SQL CHAR to SQL NCHAR to avoid data loss
- SQL functions: convert to the type of the first string parameter
• Makes migration to SQL NCHAR much easier
Data Loss Exception Handling
• NLS Parameter:
- NLS_NCHAR_CONV_EXCP
- Can be changed dynamically in each session
- Effective for both explicit and implicit conversions
• Trade-off: smoothness of operation vs. accuracy of operation
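The trade-off between smooth and accurate conversion can be illustrated by analogy with Python's codec error handlers. This is only an analogy: it assumes NLS_NCHAR_CONV_EXCP acts as a switch between raising an error on data loss and silently substituting a replacement character, and the `convert` helper and its `raise_on_loss` flag are hypothetical names, not Oracle APIs.

```python
def convert(text: str, target_charset: str, raise_on_loss: bool) -> bytes:
    """Convert text to target_charset, either failing or substituting on loss."""
    errors = "strict" if raise_on_loss else "replace"
    return text.encode(target_charset, errors=errors)


unicode_value = "Größe 日本"

# Smooth: the operation succeeds, but unmappable characters become "?".
print(convert(unicode_value, "latin-1", raise_on_loss=False))

# Accurate: the operation fails instead of losing data.
try:
    convert(unicode_value, "latin-1", raise_on_loss=True)
except UnicodeEncodeError as exc:
    print("conversion rejected:", exc.reason)
```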
SQL Unicode String Processing
• Same level of support as CHAR
- NCHAR can be used the same way as CHAR
• SQL function support for NCHAR
- SUBSTR, LENGTH, INSTR, LIKE, CONCAT, LPAD/RPAD, LTRIM, RTRIM, NLSSORT, NLS_UPPER, NLS_LOWER, etc.
- UNISTR, ASCIISTR
• Mixed type arguments
- CONCAT(nchar, char): the result type is based on the first string parameter
• Easy programming
Unicode Database vs. Unicode Data Type
NCHAR Choice Between UTF8 and AL16UTF16
• UTF-8
- ASCII compatible
- Internet friendly: HTML, XML etc.
- More space efficient for western languages
• UTF-16
- More space efficient for Asian languages
- Faster in string processing
- Supported by Java, Windows, etc.
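The space trade-off between the two encodings can be checked with a small measurement. The sketch below uses Python's standard codecs as stand-ins for the database character sets; the helper name `encoded_sizes` and the sample strings are illustrative only.

```python
from typing import Dict


def encoded_sizes(text: str) -> Dict[str, int]:
    """Return the encoded size in bytes of text in UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # Use the big-endian codec so no byte order mark is counted.
        "utf-16": len(text.encode("utf-16-be")),
    }


print(encoded_sizes("Unicode"))      # ASCII text: 1 byte vs 2 bytes per character
print(encoded_sizes("ユニコード"))   # Japanese text: 3 bytes vs 2 bytes per character
```

For mixed data, the better choice depends on the proportion of ASCII to Asian-script text actually stored.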
Programming Interfaces
• OCI Unicode Support
- Support UTF-16 bind/define buffers
- Unicode metadata, SQL_TEXT, and error messages through the mode parameter
- Unicode interface support independent of the server or client character set
- Character length semantics
• PL/SQL
• Pro*C/C++: Unicode support through UCHAR,
UVARCHAR
• JDBC
• ODBC/OLEDB
Migration, Conversion and Compatibility
• Old NCHAR to 9i NCHAR migration
• Migration to Unicode Columns
ALTER TABLE tname MODIFY (col NCHAR(n))
• Character length semantics
- Database schema
ALTER TABLE tname MODIFY (col CHAR(n CHAR))
- Modify the application to stay in sync with the database length semantics
Summary