Character set Conversion  

Posted by Mawahid

We are now changing NLS_CHARACTERSET to AL32UTF8 in our case.
For reference please check Doc ID - 260192.1
Before starting this operation, take a complete database cold backup.
1.       Cross-Check existing NLS character of the database.

SELECT value FROM NLS_DATABASE_PARAMETERS WHERE parameter =  'NLS_CHARACTERSET';

The NLS_CHARACTERSET is defining the characterset of the CHAR, VARCHAR2, LONG and CLOB datatypes.

2.       Do NOT use Expdp/Impdp when going to (AL32)UTF8 or an other multibyte characterset on ALL 10g versions lower then 10.2.0.4 (including 10.1.0.5). Also 11.1.0.6 is affected.

3.       Check for any Invalid Objects.

SELECT owner,
object_name,
object_type,
status
FROM dba_objects
WHERE status ='INVALID’;

If there are any invalid objects run utlrp.sql

                SQL> @?/rdbms/admin/utlrp.sql

If there are any left after running utlrp.sql then please manually resolve

4.       Check for Orphaned Datapump master tables (10g and up)

SELECT o.status,
o.object_id,
o.object_type,
o.owner
||'.'
||object_name "OWNER.OBJECT"
FROM dba_objects o,
dba_datapump_jobs j
WHERE o.owner =j.owner_name
AND o.object_name=j.job_name
AND j.job_name NOT LIKE 'BIN$%'
ORDER BY 4,2;
                If there are any, then refer Doc ID - 336014.1 for cleaning up Orphaned Datapump Jobs.
5.       Leftover Temporary tables using CHAR semantics.

SELECT C.owner
||'.'
|| C.table_name
||'.'
|| C.column_name
||' ('
|| C.data_type
||' '
|| C.char_length
||' CHAR)'
FROM all_tab_columns C
WHERE C.char_used = 'C'
AND C.table_name IN
(SELECT table_name FROM dba_tables WHERE TEMPORARY='Y'
)
AND C.data_type IN ('VARCHAR2', 'CHAR')
ORDER BY 1;
These tables may give ERROR during Alter database Charter Set or Csalter
ORA-00604: error occurred at recursive SQL level 1
ORA-14450: attempt to access a transactional temp table already in use.

6.       Check the Source database for "Lossy" (invalid code points in the current source character set).

csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y FROMCHAR=<current NLS_CHARACTERSET> TOCHAR=<current NLS_CHARACTERSET> LOG=dbcheck CAPTURE=N ARRAY=1000000 PROCESS=2

eg: 

csscan userid=\'mshaik as sysdba\' FULL=Y FROMCHAR=WE8ISO8859P1 TOCHAR=WE8ISO8859P1 LOG=dbcheck CAPTURE=N ARRAY=1000000 PROCESS=5


If csscan fails with ORA-00904: "CNVTYPE": invalid identifier, then try to re-install csscan (@?/rdbms/admin/csminst.sql)
                Always run Csscan connecting with a 'sysdba' connection/user, do not use "system" or "csmig" user.

              If there is any "Lossy" data then those rows contain code points that are not currently defined correctly and they should be cleared up before you can continue.
7.       Once Csscan is actually scanning the tables v$session_longops can be used to see the progress of scans of big tables.

SELECT target,
TO_CHAR(start_time,'HH24:MI:SS - DD-MM-YY'),
time_remaining,
sofar,
totalwork,
sid,
serial#,
opname
FROM v$session_longops
WHERE sid IN
(SELECT sid FROM v$session WHERE upper(program) LIKE 'CSSCAN%'
)
AND sofar < totalwork
ORDER BY start_time;

8.       Csscan will create 3 files :

dbcheck.out a log of the output of csscan
dbcheck.txt a Database Scan Summary Report
dbcheck.err contains the rowid's of the Lossy rows reported in dbcheck.txt (if any).

This is to check if all data is stored correctly in the current character set. Because the TOCHAR and FROMCHAR character sets as the same there cannot be any "Convertible" or "Truncation" data reported in dbcheck.txt.

9.       Now, Check for "Convertible" and "Truncation" data when going to our desired characterset i.e, AL32UTF8 in our case

csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y FROMCHAR=<current NLS_CHARACTERSET> TOCHAR= TOCHAR=AL32UTF8 LOG=TOUTF8 CAPTURE=Y ARRAY=1000000 PROCESS=2

There should be NO entries under "Lossy" in “.txt” file.
Please check Doc ID - 260192.1 for dealing with truncation or convertible data.
10.   In our case, there were few tables reported under Lossy. Therefore took the export of those few tables and truncated them. Also check point 14th.

11.   Run again csscan described in step 9, to ensure there is no lossy data reported.

12.   Before using Csalter / Alter Database Character Set, check the database for

a)      Partitions using CHAR semantics:

SELECT C.owner,
C.table_name,
C.column_name,
C.data_type,
C.char_length
FROM all_tab_columns C,
all_tables T
WHERE C.owner = T.owner
AND C.table_name = T.table_name
AND C.char_used = 'C'
AND T.PARTITIONED ='YES'
AND C.table_name NOT IN
(SELECT table_name FROM all_external_tables
)
AND C.data_type IN ('VARCHAR2', 'CHAR')
ORDER BY 1,2 ;
If this select gives rows back then the change to AL32UTF8 will fail with "ORA-14265: data type or length of a table subpartitioning column may not be changed" or " ORA-14060: data type or length of a table partitioning column may not be changed" if those columns using CHAR semantics are used as partitioning key or subpartition key seen the change to AL32UTF8 will adapt the actual byte length of CHAR semantic columns ( data_lengh in all_tab_columns).
If there are columns using CHAR semantics used as partitioning key or subpartition key the partitioned tables need to be exported, dropped and , after the change, imported. An other solution is to temporarily go back to BYTE semantics for these columns before the change to AL32UTF8 and then go back to CHAR semantics afterwards..
Note 330964.1 provides more background on this issue.
b)      Functional indexes on CHAR semantics columns

SELECT OWNER,
INDEX_NAME,
TABLE_OWNER,
TABLE_NAME,
STATUS,
INDEX_TYPE,
FUNCIDX_STATUS
FROM DBA_INDEXES
WHERE INDEX_TYPE LIKE 'FUNCTION-BASED%'
AND TABLE_NAME IN
(SELECT UNIQUE (table_name) FROM dba_tab_columns WHERE char_used ='C'
)
ORDER BY 1,2 ;
If this gives rows back then the change to AL32UTF8 will fail with "ORA-30556: functional index is defined on the column to be modified" or with "ORA-02262: ORA-904 occurs while type-checking column default value expression" . If there are functional indexes on columns using CHAR semantics (this is including Nchar, Nvarchar2 columns) the index need to be dropped and recreated after the change.
Note that a disable will not be enough. The DDL of all those indexes can be found using:
                        conn / AS sysdba
SET LONG 2000000
EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM,'STORAGE',false);
SELECT DBMS_METADATA.GET_DDL('INDEX',u.index_name,u.owner)
FROM DBA_INDEXES u WHERE u.INDEX_TYPE LIKE  'FUNCTION-BASED%' AND u.TABLE_NAME IN (SELECT UNIQUE (x.TABLE_NAME) FROM DBA_TAB_COLUMNS x WHERE x.char_used ='C' );
EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM,'DEFAULT')
/
c)       SYSTIMESTAMP in the DEFAULT value clause for tables using CHAR semantics

conn / AS sysdba
SET serveroutput ON
BEGIN
FOR rec IN
(SELECT OWNER,
TABLE_NAME,
COLUMN_NAME,
DATA_DEFAULT
FROM dba_tab_columns
WHERE CHAR_USED='C'
)
LOOP
IF UPPER(rec.DATA_DEFAULT) LIKE '%TIMESTAMP%' THEN
DBMS_OUTPUT.PUT_LINE(rec.OWNER ||'.'|| rec.TABLE_NAME ||'.'|| rec. COLUMN_NAME);
END IF;
END LOOP;
END;
/
This will give ORA-604 error occurred at recursive SQL level %s , ORA-1866 the datetime class is invalid during the change to AL32UTF8. The workaround is to temporary change affected tables to use a DEFAULT NULL clause eg: ALTER TABLE tab MODIFY ( col ... DEFAULT NULL NOT NULL ); After the character set change the default clause can be restored.
d)      Clusters using CHAR semantics

SELECT OWNER,
OBJECT_NAME
FROM ALL_OBJECTS
WHERE OBJECT_TYPE = 'CLUSTER'
AND OBJECT_NAME IN
(SELECT UNIQUE (TABLE_NAME) FROM DBA_TAB_COLUMNS WHERE char_used ='C'
)
ORDER BY 1,2
/
If this gives rows back then the change will fail with "ORA-01447: ALTER TABLE does not operate on clustered columns". Those clusters need to be dropped and recreated after the change.
e)      Unused columns using CHAR semantics

SELECT OWNER,
TABLE_NAME
FROM DBA_UNUSED_COL_TABS
WHERE TABLE_NAME IN
(SELECT UNIQUE (TABLE_NAME) FROM DBA_TAB_COLUMNS WHERE char_used ='C'
)
ORDER BY 1,2
/
Unused columns using CHAR semantics will give an ORA-00604: error occurred at recursive SQL level 1 with an "ORA-00904: "SYS_C00002_09031813:50:03$": invalid identifier". Note that the "SYS_C00002_09031813:50:03$" will change for each column. These unused columns need to be dropped.
                SQL> ALTER TABLE table_name DROP UNUSED COLUMNS;
f)       Ensure we have enough tablespace space available.

g)      Check if there are any objects in recyclebin.

SQL> PURGE DBA_RECYCLEBIN;
                This will remove unneeded objects and otherwise during CSALTER an ORA-38301 will be seen.
h)      Check if the compatible parameter is set to your base version

SQL>  show parameter compatible;
                Do not try to migrate for example an 10g database with compatible=9.2
13.   If everything is ready Run Csscan again as final check.

For 10g and up the Csscan output needs to be

* "Changeless" for all CHAR VARCHAR2, and LONG data (Data Dictionary and User/Application data )
* "Changeless" for all User/Application data CLOB
* "Changeless" and/or "Convertible" for all Data Dictionary CLOB

And in order to run Csalter you need to see in the “.txt” file under [Scan Summary] these 2 messages:

******************************************************************************
                All character type application data remain the same in the new character set
and
                The data dictionary can be safely migrated using the CSALTER script
******************************************************************************

14.   Summary of steps needed to use Alter Database Character Set / Csalter:

For 10g and up:

a)      Export all the "Convertible" User/Application Data (make sure that the character set part of the NLS_LANG is set to the current database character set during the export session)

b)      Fix any non-CLOB Data Dictionary "Convertible" using Note 258904.1, All "9i only" fixes in Note 258904.1 Convertible data in Data Dictionary: Workarounds when changing character set should NOT be done in 10g and up - those are sys-CLOB objects handled by Csalter.

c)       Truncate the exported "Convertible" User/Application Data tables.

d)      Run csscan to verify you only have "convertible" CLOB in the Data Dictionary and all other data is "Changeless".

15.   Running Csalter/Alter Database Character Set

For 10g and up

Csalter.plb needs to be used within 7 days after the Csscan run, otherwise you will get a 'The CSSCAN result has expired' message.

Shutdown the listener and any application that connects locally to the database.
There should be only ONE connection the database during the WHOLE time and that's the sqlplus session where the change is done. RAC systems need to be started as single instance.

If more then one session is connected Csalter will fail and report "Sorry only one session is allowed to run this script"
Run in sqlplus connected as "/ AS SYSDBA":

Set job_queue_processes to zero

SQL> Shutdown immediate;
SQL> startup restrict
SQL> SPOOL Nswitch.log
-- do this alter system or you might run into "ORA-22839: Direct updates on SYS_NC columns are disallowed"
-- This is only needed in 11.1.0.6, fixed in 11.1.0.7, not applicable to 10.2 or lower

SQL> ALTER SYSTEM SET EVENTS '22838 TRACE NAME CONTEXT LEVEL 1,FOREVER';

Now, run csalter script,

SQL> @?/rdbms/admin/csalter.plb
SQL> Shutdown immediate;
SQL> startup

and the database will be AL32UTF8.

16.   Now import the lossy tables truncated in our above step.

Issues Faced:
In our case, we encountered import failed error for one of the table, thus have changed semantics/datatype of table at which error occurred to byte and imported and later changed to original.

This entry was posted on Wednesday, December 8, 2010 at 11:42 AM . You can follow any responses to this entry through the comments feed .

0 comments

Post a Comment