You are on page 1of 30

Globalization

Copyright 2008, Oracle. All rights reserved.

Objectives
After completing this lesson, you should be able to:
Determine a correct database character set that meets your
business requirements
Obtain globalization support configuration information
Customize language-dependent behavior for the database
and individual sessions
Specify different linguistic sorts for queries
Retrieve data that matches a search string ignoring case or
accent differences

20 - 2

Copyright 2008, Oracle. All rights reserved.

Globalization Support Features

20 - 3

Language support
Territory support
Character set support
Linguistic sorting
Message support
Date and time formats
Numeric formats
Monetary formats

French
data

Copyright 2008, Oracle. All rights reserved.

Japanese
data

What Every DBA Needs to Know

20 - 4

What is a character set?


How are character sets used?
Problems to avoid
Choosing your character set
Obtaining character set information
Specifying language-dependent behavior
Using linguistic searching and sorting
Using data conversion

Copyright 2008, Oracle. All rights reserved.

What Is a Character Set?


The Oracle database supports different classes of characterencoding schemes:
Single-byte character sets
7-bit
8-bit

Multibyte character sets, including Unicode

20 - 5

Copyright 2008, Oracle. All rights reserved.

Understanding Unicode
AL32UTF8

63
C3 91
74

AL16UTF16

Supplementary
characters

EE AA 9E
F0 9D 84 9E
64
C3 B6
D0 A4

0063
00E1

0074
A89E
D834 DD1E
0064
00F6
0424

Encoding: Representing characters with byte sequences

20 - 7

Copyright 2008, Oracle. All rights reserved.

How Are Character Sets Used?


Oracle Net compares the client NLS_LANG setting to the
character set on the server.
If needed, conversion occurs automatically and
transparently.

NLS_LANG
Oracle Net
Client

20 - 9

Server

Copyright 2008, Oracle. All rights reserved.

Problems to Avoid
Example:

NLS_LANG:
AL32UTF8

Client
Windows English
Code page: WE8MSWIN1252

Oracle Net
Server
Database character set:
AL32UTF8

No conversion occurs, because it does not seem to be


required.
Issue: Invalid data are entered into the database.

20 - 10

Copyright 2008, Oracle. All rights reserved.

Another Sample Problem

CREATE DATABASE ...


CHARACTER SET US7ASCII
NATIONAL CHARACTER SET
UTF8 ...

% export NLS_LANG=SIMPLIFIED
CHINESE_HONG KONG.ZHS16GBK

20 - 11

Copyright 2008, Oracle. All rights reserved.

Choosing Your Character Set


Trade-offs to consider
Choosing the correct character set that meets your business
requirements now and in the future
Specifying the character set
Changing the character set after database creation

20 - 12

Copyright 2008, Oracle. All rights reserved.

Database Character Sets and


National Character Sets
Database Character Sets

National Character Sets

Defined at creation time

Defined at creation time

Cannot be changed without


re-creation (exceptions in certain
configurations)

Can be exchanged

Store data columns of type CHAR, Store data columns of type


VARCHAR2, CLOB, LONG
NCHAR, NVARCHAR2, NCLOB
Can store varying-width character Can store Unicode using either
AL16UTF16 or UTF8
sets

20 - 13

Copyright 2008, Oracle. All rights reserved.

Obtaining Character Set Information

SQL> SELECT parameter, value


2
FROM nls_database_parameters
3
WHERE parameter LIKE '%CHARACTERSET%';
PARAMETER
VALUE
----------------------- ------------NLS_CHARACTERSET
WE8ISO8859P1
NLS_NCHAR_CHARACTERSET
AL16UTF16
2 rows selected.

20 - 14

Copyright 2008, Oracle. All rights reserved.

Specifying Language-Dependent Behavior

Initialization parameters for the database server


Environment variables for the clients
ALTER SESSION command
SQL function

SELECT sysdate FROM dual;


20 - 15

Copyright 2008, Oracle. All rights reserved.

Specifying Language-Dependent
Behavior for the Session
Specify the locale behavior with the NLS_LANG environment
variable:
Language
Territory
Character set
NLS_LANG=FRENCH_CANADA.WE8ISO8859P1

Set other NLS environment variables to:


Override database initialization parameter settings for all
sessions
Customize the locale behavior
Change the default location of the NLS library files

20 - 16

Copyright 2008, Oracle. All rights reserved.

Language-Dependent and Territory-Dependent


Parameters
Parameter

Default Values

NLS_LANGUAGE
NLS_DATE_LANGUAGE
NLS_SORT

AMERICAN
AMERICAN
BINARY

NLS_TERRITORY
NLS_CURRENCY
NLS_DUAL_CURRENCY
NLS_ISO_CURRENCY
NLS_DATE_FORMAT
NLS_NUMERIC_CHARACTERS
NLS_TIMESTAMP_FORMAT
NLS_TIMESTAMP_TZ_FORMAT

AMERICA
$
$
AMERICA
DD-MON-RR
.,
DD-MON-RRHH.MI.SSXFF AM
DD-MON-RRHH.MI.SSXFF AM TZR

20 - 17

Copyright 2008, Oracle. All rights reserved.

Specifying Language-Dependent Behavior


Using NLS parameters in SQL functions:
ALTER SESSION SET NLS_DATE_FORMAT='DD.MM.YYYY';
DBMS_SESSION.SET_NLS('NLS_DATE_FORMAT',
'''DD.MM.YYYY''') ;

SELECT TO_CHAR(hire_date,'DD.Mon.YYYY',
'NLS_DATE_LANGUAGE=FRENCH')
FROM employees
WHERE hire_date > '01-JAN-2000';

20 - 19

Copyright 2008, Oracle. All rights reserved.

Linguistic Searching and Sorting


Sort order can be affected by:
Case-sensitivity
Diacritics or accent characters
Combination of characters that is treated as a single
character
Phonetics or character appearance
Cultural preferences

20 - 20

Copyright 2008, Oracle. All rights reserved.

Linguistic Searching and Sorting


Three types of sorting:
Binary sorting
Sorted according to the binary values of the encoded
characters

Monolingual linguistic sorting


A two-pass sort based on a characters assigned major and
minor values

Multilingual linguistic sorting


Based on the ISO standard (ISO 14651), and the Unicode 3.2
Standard for multilingual collation
Ordered by the number of strokes, PinYin, or radicals for
Chinese characters

20 - 21

Copyright 2008, Oracle. All rights reserved.

Using Linguistic Searching and Sorting


You can specify the type of sort used for character data with
the:
NLS_SORT parameter
Default value derived from the NLS_LANG environment
variable, if set
Can be specified for the session, client, or server

NLSSORT function
Defines the sorting method at the query level

20 - 22

Copyright 2008, Oracle. All rights reserved.

Case-Insensitive and Accent-Insensitive


Search and Sort
Specify the linguistic name:
NLS_SORT = <NLS_sort_name>[_AI | _CI]

Examples:
NLS_SORT = FRENCH_M_AI
NLS_SORT = XGERMAN_CI

Specify the sort action for WHERE clauses and PL/SQL


blocks:
NLS_COMP = BINARY | ANSI

Useful for migrated databases

20 - 24

Copyright 2008, Oracle. All rights reserved.

Support in SQL and Functions


The following SQL clauses support NLS_SORT and
NLS_COMP settings:

WHERE
ORDER BY
START WITH
HAVING
IN/NOT IN
BETWEEN
CASE-WHEN

The NLSSORT() function supports the case-insensitive and


accent-insensitive functionality.

20 - 25

Copyright 2008, Oracle. All rights reserved.

Linguistic Index Support


Create an index on linguistically sorted values.
Rapidly query data without having to specify ORDER BY
clause and NLSSORT:
CREATE INDEX list_word ON
list (NLSSORT(word, 'NLS_SORT=French_M'));
SELECT word FROM list;

Set the NLS_SORT parameter to match the linguistic


definition that you want to use for the linguistic sort when
creating the index.

20 - 26

Copyright 2008, Oracle. All rights reserved.

Customizing Linguistic
Searching and Sorting
You can customize linguistic sorting for:
Ignorable characters
Contracting or expanding characters
Special combination letters or special letters
Expanding characters or special letters
Special uppercase and lowercase letters
Context-sensitive characters
Reverse secondary sorting
Canonical equivalence

20 - 27

Copyright 2008, Oracle. All rights reserved.

Implicit Conversion
Between CLOB and NCLOB
Transparent implicit conversion is supported in:
SQL IN and OUT bind variables for query and DML

PL/SQL functions and procedure parameter passing


PL/SQL variable assignment

20 - 29

Copyright 2008, Oracle. All rights reserved.

NLS Data Conversion with Oracle Utilities


Multiple data conversions can take place when data is
exported from one database and imported into another if the
same character sets are not used.
External tables use the NLS settings on the server for
determining the data character set.
SQL*Loader:
Conventional path: Data is converted into the session
character set specified by NLS_LANG.
Direct path: Data is converted using client-side directives.

20 - 30

Copyright 2008, Oracle. All rights reserved.

NLS Data Conversion with Data Pump


Data Pump Export always saves data in the same character
set as the database from which the data originates.
Data Pump Import converts the data to the character set of
the target database, if needed.
The Data Pump log file is written in the language specified
by NLS_LANG for the session that started Data Pump.

20 - 32

Copyright 2008, Oracle. All rights reserved.

Language and Character Set File Scanner


(LCSSCAN)

Character set

20 - 33

Copyright 2008, Oracle. All rights reserved.

Setting the Database Time Zone


The current time zone in the database is determined by the
following:
The SET TIME_ZONE clause of the CREATE DATABASE
statement
The time zone of the operating system on the database
server host
The time zone specified by the ALTER SESSION SET
TIME_ZONE command
CREATE DATABASE ... SET TIME_ZONE='-04:00';

20 - 34

Copyright 2008, Oracle. All rights reserved.

Summary
In this lesson, you should have learned how to:
Determine a correct database character set that meets your
business requirements
Obtain globalization support configuration information
Customize language-dependent behavior for the database
and individual sessions
Specify different linguistic sorts for queries
Retrieve data that matches a search string ignoring case or
accent differences

20 - 35

Copyright 2008, Oracle. All rights reserved.

Practice 20 Overview:
Using Globalization Support
This practice covers the following topics:
Determining the database character set
Setting the NLS_SORT variable

20 - 36

Copyright 2008, Oracle. All rights reserved.

You might also like