Database and Datascience

lundi 8 juillet 2019

Oracle Flex... - Partie 1

Qu'est-ce que Oracle Flex Cluster ?

Monday, July 8, 2019

4:33 PM

Oracle Flex Cluster est une nouvelle fonctionnalité introduite dans la version 12c, pour fournir une infrastructure de Cluster évolutive et a haute disponibilité pour le Cloud de bases de donnes et d'applications. Dans les anciennes versions d'Oracle, le nombre de nœud peur aller jusqu'à 64. Quelques larges cluster allant jusqu'à 100 nœuds ont été tout de déployés. Oracle Flex Cluster est constitué de deux types de nœuds : les nœuds Hub et les nœuds Leaf.

Les nœuds Hub sont similaires aux nœuds classiques, c'est-à-dire qu'ils sont étroitement connectés et ont un accès direct à un stockage partagé.
Les nœuds Leaf sont différents des Noeux standards en ce sens qu'ils ne nécessitent pas d'accès direct au stockage partage, mais demandent des données via les nœuds Hub.

Les nœuds Hub peuvent exister dans une configuration Flex Cluster sans avoir de nœuds Leaf, en tant que membre du cluster, mais les nœuds Leaf doivent être membre d'un cluster comprenant au moins un nœud Hub.

Dans Oracle Flex ASM, seul un nombre restreint de nœuds exécutent des instances Oracle ASM. Ces instances ASM sont connectées par toutes les instances de bases de données du cluster, pour fournir un accès au stockage à ces derniers. Lorsqu'une instance échoue, une instance de bases de données connectant l'instance ASM défaillante bascule vers une autre instance ASM. Oracle Flex ASM est une option dans Oracle 12c ASM que vous pouvez activer ou non. Si un cluster standard est déjà installé, vous pouvez activer ou désactiver Flex ASM à tout moment.

Avec Oracle 12c, les instances des bases de données et les instances ASM, peuvent désormais se trouver sur des serveurs différents. Par conséquent, les instances ASM nécessitent un fichier de mot de passe pour authentifier une instance de bases de données, se connectant a une instance ASM du cluster. Pour la plupart des configurations, un fichier de mot de passe ASM par défaut est créé lors de l'installation et stocke dans une groupes de disques ASM.

Les Avantages d'Oracle Flex Cluster ?

Oracle Clusterware 11gR2 et les versions antérieures, ne prennent en charge que l'architecture de cluster étroitement connectée:

Chaque nœud du cluster est connecté à d'autres nœuds via l'interconnexion privée

Chaque nœud du cluster est directement connecte au stockage partagé.

Le problème dans ce type de cluster est que l'interconnexion connecte chaque paire de nœuds et chaque nœud est connecte au stockage partage. Ainsi un cluster à nœuds aura chemins d'interconnexion possibles pour les échanges de donnees entre noeuds et chemins possibles d'acces au stockage partagé. Ce qui signifie par exemple, que si on un a un RAC de 16noeuds, on aura 120 chemis d'interconnexion differents et 16 connexions au stockage. Ca reste gérable. Si nous voulons dimensionner un cluster a une portee beacoup plus grande, par exemple 100 noeuds, on a aura 4950 chemins d'interconnexion et 100 connexions au stockage; La complexite du nombre de chemin d'interconnexion et de stockage rend le cluster difficile a gerer.

Oracle Flex Cluster augmente l'évolutivité des clusters. Dans un cluster de 16 nœuds ou nous avons configure 4 nœuds Hub et 12 nœuds Leaf, avec 3 nœuds Leaf par nœud Hub, le nombre d'interconnexions privées entre 4 nœuds Hub est de 6 et le nombre de connexions entre les nœuds Leaf et leurs nœuds Hub est de 12. Cela donne le nombre total d'interconnexions a 18, compare à 120 dans un cluster standard à 16 nœuds.

Deux impacts fondamentaux découlent de cette architecture :

Limiter la taille des nœuds Hub permettant de réduire les contentions liées à l'OCR et aux disques VOTANTS
Réduire les Hearbeats d'échange de trafic réseau entre les nœuds Seuls les nœuds Hub ont un accès direct à l'OCR et disques votants.

Autres Avantages…

Parmi d'autres avantages on peut citer :

Le nombre maximal de groupes de disques ASM est passé de 63 à 511.
Oracle Flex ASM prend en charge des tailles de LUN plus grandes pour les clients Oracle Database 12c (augmentés à 32 PB).
Il existe maintenant une commande pour renommer un disque ASM dans un groupe de disques.
Toutes les instances d'un cluster ASM s'assurent d'exécuter la même version de code en validant le niveau de correctif à travers le cluster.

dimanche 30 juin 2019

How to Fix ORA-600 [12700] Internal Error Step By Step

######################################################
How to Fix ORA-600 [12700] Internal Error Step By Step
######################################################

Eg. ORA-600 appears in AlertSID.log file :

ORA-00600: internal error code, arguments: [12700], [18], [416476427], [15], [83969031], [25], [], []
ORA-00600: internal error code, arguments: [12700], [18], [416476597], [50], [83969031], [25], [], []
ORA-00600: internal error code, arguments: [12700], [18], [415794709], [9], [176160842], [25], [], []

Error Format: ORA-600 [12700] [a] [b] [c] [d] [e]
--------------------------------------------------------------
From Oracle 9i, there are up to five additional arguments:

Arg [a] Data Object ID

Arg [b] Table Relative Data Block Address (RDBA)

Arg [c] Row Slot Number of the row in the data block

Arg [d] If present - Index Relative Data Block Address (RDBA)

Arg [e] If present - Index Row Source Class

1. Identify the trace file containing the ora-600 error
------------------------------------------------------------------------------------------------

ORA-00600: internal error code, arguments: [12700], [18], [416476427], [15], [83969031], [25], [], []
Sat Jun 22 10:12:15 2019
Errors in file /.../..../diag/rdbms/DB/SID/xxxx_ora_YYYY.trc:

2. Identify the possibly corrupted data object
-----------------------------------------------------------------------------------------------
In the trace file, find ORA-600 Error :

ORA-00600: internal error code, arguments: [12700], [18], [416476597], [50], [83969031], [25], [], []
Current SQL statement for this session:
select u2.name, o2.name from ind$ i, obj$ o1, obj$ o2,    user$ u1, user$ u2 where    i.bo# = o1.obj# and o1.owner#=u1.user# and i.obj#=o2.obj#    and o2.owner#
=u2.user# and u1.name=:1 and o1.name=:2 and i.type#=9

ORA-00600: internal error code, arguments: [12700], [18], [416476427], [15], [83969031], [25], [], []
Current SQL statement for this session:
select obj#,type#,ctime,mtime,stime,status,dataobj#,flags,oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null an
d linkname is null and subname is null

******************************************************************
In Our case we have both ind$ and obj$ objects that are affected.
******************************************************************

3. Convert the ora-600 Oracle arguments into some meaningful information
----------------------------------------------------------------------------------------------------

SQL> set serveroutput on

SQL> execute oerr12700( 18,416476427,15 )
ORA-600 [12700] [18],[416476427],[15]
--------------------------------------------------
there is an index pointing to a row in SYS.OBJ$
row is slot 20 in file 205 block 860
one index entry is pointing to ROWID='AAAAASADNAAAANcAAU'
--------------------------------------------------
You may want to check the integrity of SYS.OBJ$
executing :
dbv file=/..../..../oradata/system001.dbf
blocksize=8192 start=860 end=860
--------------------------------------------------
IF dbv does not show any corruption, you can try to
find the corrupted indexes using the queries proposed
by the procedure oerr12700diag(18,416476427,15)
-------------------------------------------------------

PL/SQL procedure successfully completed.

**************************************************************************
This means there is one index pointing to a non-existent row in SYS.OBJ$
The row is supposed to be in file=205,block=860 , slot=416476427.
Index entry is pointing to ROWID 'AAAAASADNAAAANcAAU'.
**************************************************************************

Another way to identify the possible corrupted data object is to select from
DBA_OBJECTS :

SELECT owner, object_name, object_type, object_id, data_object_id
FROM dba_objects
WHERE data_object_id = 18;

SQL> r
1 SELECT owner, object_name, object_type, object_id, data_object_id
2 FROM dba_objects
3* WHERE data_object_id = 18

OWNER            OBJECT_NAME    OBJECT_TYPE    OBJECT_ID     DATA_OBJECT_ID
------------    --------------    ------------    ------------    --------------
SYS        OBJ$        TABLE            18             18

4. Check if the data object is corrupted or not
-----------------------------------------------------------------------------------------------
At this point, the way to check if the data object is corrupted or not
is to do an ANALYZE TABLE <table_name> VALIDATE STRUCTURE

SQL> ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE

Table analyzed.

SQ
SQL> ANALYZE TABLE sys.ind$ VALIDATE STRUCTURE;

Table analyzed.

If you can not afford to lock the table in exclusive mode, you can
try to check the block reported by the PL/SQL script above using "dbv" :

dbv file=<where-the-data-is> blocksize=<db_block_size> start=<block> end=<block>

Note: <block> is the block number showed by oerr12700 procedure.

If dbv does not return any error message and "Pages Failing" is 0,
then the data is NOT corrupted, so the corruption must be in the indexes.

---------------------------------------------------------------------------------------------------------
$dbv file=/......./oradata/system001.dbf blocksize=8192 start=860 end=860

DBVERIFY: Release xxxxxxx- Production on Fri Jun 21 23:17:04 2019

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

DBVERIFY - Verification starting : FILE = /.../system001.dbf

DBVERIFY - Verification complete

Total Pages Examined         : 1
Total Pages Processed (Data) : 1
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 0
Total Pages Processed (Seg) : 0
Total Pages Failing   (Seg) : 0
Total Pages Empty            : 0
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 10794338932012 (2513.1086117164)

5. Identify the possibly corrupted indexes
-------------------------------------------------------------------------------------------------

At this point you should be confident there is no data corruption.

The problem must be with some corrupted index.

Table identified above can have several indexes.

One way of finding the corrupted indexes is to run the command
"ANALYZE TABLE <tname> VALIDATE STRUCTURE CASCADE" and look at the
trace file it generates when it finds an error.

SQL> ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

the ANALYZE TABLE ... command will lock your table in exclusive mode,
and you may want to check the indexes without locking anything.

In this case you can try to identify the corrupted indexes using the
procedure "oerr12700diag".

SQL> execute oerr12700diag( 18,416476427,15)
--------------------------------------------------
IF dbv did not show any corruption, you can try to
find the corrupted indexes using following queries:
-------------------------------------------------------
If a query returns "no rows selected" index is sane
If a query returns AAAAASABjAAEu0LAAP index is corrupted
..................................................
.
To test SYS.OBJ$ indexes
.
.
To test INDEX I_OBJ1 you run :
.
select rowid "I_OBJ1 corrupted!"
from
(SELECT /*+ INDEX_FFS(OBJ$,I_OBJ1) */
OBJ#,rowid from SYS.OBJ$ where OBJ#=OBJ#)
where rowid='AAAAASABjAAEu0LAAP';
.
To test INDEX I_OBJ2 you run :
.
select rowid "I_OBJ2 corrupted!"
from
(SELECT /*+ INDEX_FFS(OBJ$,I_OBJ2) */
OWNER#,rowid from SYS.OBJ$ where OWNER#=OWNER#)
where rowid='AAAAASABjAAEu0LAAP';
.
To test INDEX I_OBJ3 you run :
.
select rowid "I_OBJ3 corrupted!"
from
(SELECT /*+ INDEX_FFS(OBJ$,I_OBJ3) */
OID$,rowid from SYS.OBJ$ where OID$=OID$)
where rowid='AAAAASABjAAEu0LAAP';

PL/SQL procedure successfully completed.

SQL>

SQL>
1 select rowid "I_OBJ1 corrupted!"
2 from
3 (SELECT /*+ INDEX_FFS(OBJ$,I_OBJ1) */
4 OBJ#,rowid from SYS.OBJ$ where OBJ#=OBJ#)
5* where rowid='AAAAASADNAAAANcAAU'
select rowid "I_OBJ1 corrupted!"
*
ERROR at line 1:
ORA-08103: object no longer exists

SQL> select rowid "I_OBJ2 corrupted!"
2 from
3 (SELECT /*+ INDEX_FFS(OBJ$,I_OBJ2) */
4 OWNER#,rowid from SYS.OBJ$ where OWNER#=OWNER#)
5 where rowid='AAAAASADNAAAANcAAU';

no rows selected

SQL> select rowid "I_OBJ3 corrupted!"
from
2    3 (SELECT /*+ INDEX_FFS(OBJ$,I_OBJ3) */
4 OID$,rowid from SYS.OBJ$ where OID$=OID$)
5 where rowid='AAAAASADNAAAANcAAU';

no rows selected

----------------------------------------------------------
select * /*+ full (tablename) */ from tablename
where rowid='AAAAASADNAAAANcAAU'

We can conclude that the corrupted index is I_OBJ1.

To double check, you can see if table contains a row with the ROWID
reported by "oerr12700" procedure :

SQL> select * /*+ full (tablename) */ from tablename
   2 where rowid='AAAAASADNAAAANcAAU'

SQL> select * /*+ full (sys.obj$) */ from sys.obj$
2 where rowid='AAAAASADNAAAANcAAU';

no rows selected

Since it's Dictionary Index owned by SYS, You may have an error when trying to drop and recreate index

SQL> drop index sys.I_OBJ1;
drop index sys.I_OBJ1
               *
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered

Data Dictionary objects may be not be allowed to be altered by a DDL. Data Dictionary tables stored in
table sys.bootstrap$ are not allowed to be altered with a DDL (drop, alter index, etc) as error ORA-701 is produced.

Since this index is part of sys.bootstrap$ and can not be modified, following are the 2 options suggested by Oracle Support:

A) Supported option is to recreate the database using expdp/create/impdp with or without TTS option
(TTS would take less time due to no physical import of data/creation of indexes).

B) Unsupported option does consist of an update of the DD (DataDictionary) which requires careful testing.
This is an undocumented/unsupported option which does consist of an update of the DD (DataDictionary).

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

mercredi 15 mai 2019

Manually Rebalancing Disk Groups

How to manually Rebalancing Disk Groups

You can manually rebalance the files in a disk group using the REBALANCE clause of the ALTER DISKGROUP statement. This would normally not be required, because Oracle ASM automatically rebalances disk groups when their configuration changes. You might want to do a manual rebalance operation to control the speed of what would otherwise be an automatic rebalance operation.

So, when a disk is dropped, the disk group is supposed to rebalance by moving all of the file extents from the dropped disk to other disks in the disk group. A drop disk operation might fail if not enough space is available on the other disks. The best approach is to perform both the add and drop operation with the same ALTER DISKGROUP statement. This has the benefit of rebalancing data extents only one time and ensuring that there is enough space for the rebalance operation to succeed.

Practice:

We have an existing DATA_DG disk group and we need to rebalance all the data from original ASM disks to the new one and then, drop the former ASM disks.

1. To know the name and path to new ASM disk, you can use the following query:

SELECT name, path from v$asm_disk

WHERE header_status='FORMER';

Depending of ASM_DISKSTRING parameter, you can have different outputs:

NAME PATH

------------------------ ----------------------------------------------------

NEW_DATA_DG_DISK01 /dev/oracleasm/disks/DATADG_DISK01

NEW_DATA_DG_DISK02 /dev/oracleasm/disks/DATADG_DISK02

In this case ASM_DISKTRING parameter is set as /dev/oracleasm/disks/*

NAME PATH

------------------------ ----------------------------------------------------

NEW_DATA_DG_DISK01 ORCL:DATADG_DISK01

NEW_DATA_DG_DISK02 ORCL:DATADG_DISK02

In this case ASM_DISKTRING parameter is set as ORCL:*

2. Now, you have to retrieve the name of ASM disks for the existing disk group.

SELECT a.name from v$asm_disk a, v$asm_diskgroup b

WHERE a.group_number=b.group_number

AND b.name='DATA_DG';

The output can be:

NAME

------------------------

DATA_DG_01

DATA_DG_02

DATA_DG_03

Note: You must that the total size of the new ASM disks is equal or greater than the old one.

3. The syntax to ADD and DROP of disks using ALTER DISKGROUP command is as below :

ALTER diskgroup DATA_DG add disk '/dev/oracleasm/disks/DATADG_DISK01' '/dev/oracleasm/disks/DATADG_DISK02' drop disk DATA_DG_01, DATA_DG_02, DATA_DG_03 rebalance power 10;

Diskgroup altered.

ALTER diskgroup DATA_DG add disk 'ORCL:DATADG_DISK01' 'ORCL:DATADG_DISK02' drop disk DATA_DG_01, DATA_DG_02, DATA_DG_03 rebalance power 10;

Diskgroup altered.

This operation can take time depending of the total size of data to move. You can monitor your activity by querying v$asm_operation view.

SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA PASS STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE CON_ID

------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- -------------------------------------------- ----------

3 REBAL COMPACT WAIT 10 10 0 0 0 0 0

3 REBAL REBALANCE RUN 10 10 3799 1629533 18397 88 0

3 REBAL REBUILD DONE 10 10 0 0 0 0 0

When the operation is completed, the v$asm_operation view is empty and you can check in ASM alert.log file :

NOTE: F1X0 on disk 1 (fmt 1) relocated at fcn 0.716238: AU 0 -> AU 2

NOTE: 05/14/19 17:22:49 DATA_DG.F1X0 copy 1 relocating from 0:2 to 1:2 at FCN 0.716238

2019-05-14T18:19:53.112880+01:00

NOTE: stopping process ARB0

NOTE: stopping process ARBA

NOTE: Starting expel slave for group 2/0x2b706c92 (DATA_DG)

2019-05-14T18:19:53.168804+01:00

NOTE: Group Block outside rolling migration privileged region

NOTE: requesting all-instance membership refresh for group=2

2019-05-14T18:19:53.214540+01:00

SUCCESS: rebalance completed for group 2/0x2b706c92 (DATA_DG)

2019-05-14T18:19:54.277827+01:00

GMON updating for reconfiguration, group 2 at 39 for pid 49, osid 18501

2019-05-14T18:19:54.280704+01:00