Wednesday, December 08, 2010

Oracle11gR2 RAC: Adding Database Nodes to an Existing Cluster

Environment:
  • Oracle RAC Database 11.2.0.1
  • Oracle Grid Infrastructure 11.2.0.1
  • Non-GNS
  • OEL 5.5 or SLES 11.1 (both x86_64)

Adding Nodes to a RAC Administrator-Managed Database

Cloning to Extend an Oracle RAC Database (approximately 20 minutes or less, depending on the environment)

Phase I - Extending Oracle Clusterware to a new cluster node

1. Make physical connections, and install the OS.

Warning: Follow article “11GR2 GRID INFRASTRUCTURE INSTALLATION FAILS WHEN RUNNING ROOT.SH ON NODE 2 OF RAC [ID 1059847.1]” to ensure successful completion of root.sh!! This is also pointed out in the internal Installation Guide.


2. Create the Oracle accounts, and set up SSH user equivalency between the new node and the existing cluster nodes.

Warning: Follow article "How To Configure SSH for a RAC Installation [ID 300548.1]" for the correct SSH setup procedure! EACH NODE MUST BE VISITED TO ENSURE THE NEW NODE IS ADDED TO ITS KNOWN_HOSTS FILE!! This is also mentioned in the internal Installation Guide.
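A minimal sanity check (a sketch only; 'ucstst12' is an existing node and 'ucstst13' is the new node, matching the example host names in the logs further down; substitute your own). From every node, as both the 'grid' and 'oracle' users, each of the following must return the date without a password or host-key prompt:

$> ssh ucstst13 date
$> ssh ucstst13.mydomain.com date
$> ssh ucstst12 date

If a command asks to accept a host key, answer 'yes' once so the entry lands in ~/.ssh/known_hosts, then rerun it to confirm it completes silently.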


3. Verify the requirements for cluster node addition using the Cluster Verification Utility (CVU). From an existing cluster node (as ‘grid’ user):

$> $GI_HOME/bin/cluvfy stage -post hwos -n <new_node> -verbose


4. Compare an existing node (reference node) with the new node(s) to be added (as ‘grid’ user):

$> $GI_HOME/bin/cluvfy comp peer -refnode <existing_node> -n <new_node> -orainv oinstall -osdba dba -verbose


5. Verify the integrity of the cluster and new node by running from an existing cluster node (as ‘grid’ user):

$GI_HOME/bin/cluvfy stage -pre nodeadd -n <new_node> -fixup -verbose
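For example, with a new node named ucstst13 (a hypothetical invocation matching the host names used in the logs below; substitute your own node name):

$> $GI_HOME/bin/cluvfy stage -pre nodeadd -n ucstst13 -fixup -verbose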


6. Add the new node by running the following from an existing cluster node (as ‘grid’ user):

a. Not using GNS

$GI_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={<new_node>}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={<new_node_vip>}"

b. Using GNS

$GI_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={<new_node>}"

Run the root scripts when prompted.
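For example, on a non-GNS cluster adding node ucstst13 with the VIP name ucstst13-vip (illustrative names only; use your own host and VIP names):

$> $GI_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={ucstst13}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={ucstst13-vip}"

When prompted, run orainstRoot.sh (if presented) and root.sh on the new node as the 'root' user.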

POSSIBLE ERROR:
/oracle/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /oracle/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2010-08-11 16:12:19: Parsing the host name
2010-08-11 16:12:19: Checking for super user privileges
2010-08-11 16:12:19: User has super user privileges
Using configuration parameter file: /oracle/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
-ksh: line 1: /bin/env: not found
/oracle/app/11.2.0/grid/bin/cluutil -sourcefile /etc/oracle/ocr.loc -sourcenode ucstst12 -destfile /oracle/app/11.2.0/grid/srvm/admin/ocrloc.tmp -nodelist ucstst12 ... failed
Unable to copy OCR locations
validateOCR failed for +OCR_VOTE at /oracle/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 7979.

CAUSE:
SSH User Equivalency is not properly setup.

SOLUTION:
[1] Correctly setup SSH user equivalency.

[2] Deconfigure cluster on new node:

#> /oracle/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force

[3] Rerun root.sh

POSSIBLE ERROR:
PRCR-1013 : Failed to start resource ora.LISTENER.lsnr
PRCR-1064 : Failed to start resource ora.LISTENER.lsnr on node ucstst13
CRS-2662: Resource 'ora.LISTENER.lsnr' is disabled on server 'ucstst13'

start listener on node=ucstst13 ... failed
Configure Oracle Grid Infrastructure for a Cluster ... failed

CAUSE:
The node being added was previously a member of the cluster. Clusterware remembers that the node was a member, and that the last recorded status of its listener was 'disabled'.

SOLUTION:
Referencing "Bare Metal Restore Procedure for Compute Nodes on an Exadata Environment" (Doc ID 1084360.1), run the following as the 'root' user from the node being added to enable and start the local listener (this completes the operation):

ucstst13:/oracle/app/11.2.0/grid/bin # /oracle/app/11.2.0/grid/bin/srvctl enable listener -l <listener_name> -n <node_name>
ucstst13:/oracle/app/11.2.0/grid/bin # /oracle/app/11.2.0/grid/bin/srvctl start listener -l <listener_name> -n <node_name>


7. Verify that the new node has been added to the cluster (as ‘grid’ user):

$GI_HOME/bin/cluvfy stage -post nodeadd -n <new_node> -verbose
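It can also help to confirm that the new node and its Clusterware resources are ONLINE (a quick supplementary check, not a replacement for cluvfy):

$> $GI_HOME/bin/olsnodes -n -s -t
$> $GI_HOME/bin/crsctl stat res -t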


Phase II - Extending Oracle Database RAC to new cluster node

8. Run the 'addNode.sh' script from an existing node in the cluster as the 'oracle' user:

$> $ORACLE_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={newnode1,…newnodeX}"


9. On the new node run the ‘root.sh’ script as the ‘root’ user as prompted.


10. Set ORACLE_HOME and ensure you are logged in as the 'oracle' user. Verify that the permissions on the 'oracle' executable are 6751; if they are not, run the following as the 'root' user:

cd $ORACLE_HOME/bin
chgrp asmadmin oracle
chmod 6751 oracle
ls -l oracle

OR

as 'grid' user: $GI_HOME/bin/setasmgidwrap o=$RDBMS_HOME/bin/oracle


11. On any existing node, run DBCA ($ORACLE_HOME/bin/dbca) to add/create the new instance (as ‘oracle’ user):

$ORACLE_HOME/bin/dbca -silent -addInstance -nodeList <new_node> -gdbName <global_db_name> -instanceName <instance_name> -sysDBAUserName sys -sysDBAPassword <sys_password>

NOTE: Ensure the command is run from an existing node that has the same amount of memory as, or less than, the new node; otherwise the command will fail with an insufficient-memory error for the new instance. Also check the DBCA log file for actual success, since it can differ from what is displayed on the screen.
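A sample invocation (a sketch using the example names that appear in the logs below: database 'racdb' and node 'ucstst13'; the instance name 'racdb5' is hypothetical):

$> $ORACLE_HOME/bin/dbca -silent -addInstance -nodeList ucstst13 -gdbName racdb -instanceName racdb5 -sysDBAUserName sys -sysDBAPassword <sys_password>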

POSSIBLE ERROR:
DBCA logs with error below (screen indicates success):
Adding instance
DBCA_PROGRESS : 1%
DBCA_PROGRESS : 2%
DBCA_PROGRESS : 6%
DBCA_PROGRESS : 13%
DBCA_PROGRESS : 20%
DBCA_PROGRESS : 26%
DBCA_PROGRESS : 33%
DBCA_PROGRESS : 40%
DBCA_PROGRESS : 46%
DBCA_PROGRESS : 53%
DBCA_PROGRESS : 66%
Completing instance management.
DBCA_PROGRESS : 76%
PRCR-1013 : Failed to start resource ora.racdb.db
PRCR-1064 : Failed to start resource ora.racdb.db on node ucstst13
ORA-01031: insufficient privileges
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Process ID: 0
Session ID: 0 Serial number: 0

ORA-01031: insufficient privileges
ORA-01031: insufficient privileges
CRS-2674: Start of 'ora.racdb.db' on 'ucstst13' failed
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Process ID: 0
Session ID: 0 Serial number: 0

ORA-01031: insufficient privileges
ORA-01031: insufficient privileges

DBCA_PROGRESS : 100%

CAUSE:
The permissions and/or ownership on the ‘$RDBMS_HOME/bin/oracle’ binary are incorrect. References: “ORA-15183 Unable to Create Database on Server using 11.2 ASM and Grid Infrastructure [ID 1054033.1]”, “Incorrect Ownership and Permission after Relinking or Patching 11gR2 Grid Infrastructure [ID 1083982.1]”.

SOLUTION:
As root user:
cd $ORACLE_HOME/bin
chgrp asmadmin oracle
chmod 6751 oracle
ls -l oracle

Ensure the ownership and permission are now like:
-rwsr-s--x 1 oratest asmadmin

Warning: Whenever a patch is applied to the database ORACLE_HOME, ensure the above ownership and permissions are corrected again after patching.


12. Following the above, EM DB Console may not be configured correctly for the new node(s). The GUI does not show the new nodes; however, the EM agent does report them when the configuration is queried from the command line:

**************** Current Configuration ****************
INSTANCE NODE DBCONTROL_UPLOAD_HOST
---------- ---------- ---------------------

racdb racnode10 racnode10.mydomain.com
racdb racnode11 racnode10.mydomain.com
racdb racnode12 racnode10.mydomain.com
racdb racnode13 racnode10.mydomain.com
racdb racnode14 racnode10.mydomain.com

Also, the listener is not configured as running from the GI_HOME, so it will incorrectly appear as down in DB Console. To correctly configure the instance in DB Console:

a. Delete the instance from EM DB Console:

$RDBMS_HOME/bin/emca -deleteInst db

b. Create the EM Agent directories on the new node for all the nodes in the cluster (including the new node) as follows:

mkdir -p $RDBMS_HOME/racnode10_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode11_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode12_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode13_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode10_racdb/sysman/emd
mkdir -p $RDBMS_HOME/racnode11_racdb/sysman/emd
mkdir -p $RDBMS_HOME/racnode12_racdb/sysman/emd
mkdir -p $RDBMS_HOME/racnode13_racdb/sysman/emd

c. Re-add the instance:

$RDBMS_HOME/bin/emca -addInst db

To address the listener incorrectly showing as down in DB Console:

a. On each node, as the 'oracle' user, edit the file '$RDBMS_HOME/<node>_<dbname>/sysman/config/targets.xml' (for example, racnode13_racdb).

b. Change the "ListenerOraDir" entry for the node to match the ‘$GI_HOME/network/admin’ location.

c. Save the file and restart the DB Console.
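A quick way to locate and verify the entry on each node (illustrative only; the '<node>_<dbname>' directory and GI path shown here are examples taken from this post):

$> grep ListenerOraDir $RDBMS_HOME/racnode13_racdb/sysman/config/targets.xml

The VALUE of that property should point at the Grid Infrastructure listener directory, for example /oracle/app/11.2.0/grid/network/admin.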


13. Verify the administrator privileges on the new node by running on existing node:

$ORACLE_HOME/bin/cluvfy comp admprv -o db_config -d $ORACLE_HOME -n <new_node> -verbose


14. For an Admin-Managed database, add the new instance(s) to existing Services or create additional Services. For a Policy-Managed database, verify the instance has been added to an existing server pool.
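A hedged example for each case (the service name 'oltp' and the instance names are hypothetical; 'racdb' matches the database name used elsewhere in this post).

For an Admin-Managed database, either create a new service that includes the new instance, or add the new instance to an existing service's preferred list:

$> srvctl add service -d racdb -s oltp -r "racdb1,racdb2,racdb5"
$> srvctl modify service -d racdb -s oltp -n -i "racdb1,racdb2,racdb5"

For a Policy-Managed database, confirm the server pool now contains the new server:

$> srvctl config srvpool
$> srvctl status srvpool -a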

15. Set up OCM in the cloned homes (both GI and RDBMS):

a. Delete all subdirectories under '$ORACLE_HOME/ccr/hosts' to remove the previously configured host:

$> rm -rf $ORACLE_HOME/ccr/hosts/*

b. Move (do not copy) the following file within the Oracle Home:

$> mv $ORACLE_HOME/ccr/inventory/core.jar $ORACLE_HOME/ccr/inventory/pending/core.jar

c. Configure OCM for the cloned home on the new node:

$> $ORACLE_HOME/ccr/bin/configCCR -a
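To confirm OCM is configured and collecting on the new node (a quick check; assumes OCM is used in connected mode):

$> $ORACLE_HOME/ccr/bin/emCCR status
$> $ORACLE_HOME/ccr/bin/emCCR collect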



Adding Nodes to a RAC Policy-Managed Database
When adding a node to a cluster running a Policy-Managed database, Oracle Clusterware tries to start the new instance before the cloning procedure completes. Use the following steps to add the node:

1. Run the ‘addNode.sh’ script as the ‘grid’ user for the Oracle Grid Infrastructure for a cluster to add the new node (similar to step 6 above). DO NOT run the root scripts when prompted; you will run them later.


2. Install the Oracle RAC database software using the software-only installation, cloning, or 'addNode.sh' (same as step 8 above) method. If using the software-only installation, ensure the Oracle binary is linked with the Oracle RAC option.


3. Complete the root script actions for the Database home, similar to steps 9 - 10 above.


4. Complete the root script actions for the Oracle Clusterware home and then finish the installation, similar to step 6 above.


5. Verify that EM DB Console is fully operational, similar to step 12 above.


6. Complete the configuration for OCM, similar to step 15 above.
