The Lazy Tech's Blog

Wednesday, June 29, 2011

Ensure CELL_OFFLOAD_PROCESSING is FALSE in Non-Exadata

A colleague of mine recently brought to my attention a curious wait event that they were experiencing for 'ASM file metadata operation' which occurs when doing operations such as a DROP TABLESPACE, or in this case when using Data Pump. Though the server was basically not doing anything the CPU usage was 100% and there was the strange wait event on 'ASM file metadata operation' with high waits on 'ksv master wait'. So to Oracle Support they went via opening an SR.

It seems that upon applying a PSU to 11.2.0.2 (as of this writing PSU 2 is that latest so likely this will not occur in PSU 3 but that is only my guess) the parameter cell_offload_processing is set to TRUE. In an Exadata environment this is the appropriate setting, however, in a non-Exadata environment, which it was in this case, this causes performance issues to arise as processes on the RDBMS side await on a reply from the ASM side which is trying to delivery smart-scan results.

The quick fix is of course to simply reset the parameter to FALSE, i.e. 'ALTER SYSTEM SET cell_offload_processing = FALSE'. If you prefer you can instead apply patch 11800170. Per the MOS note, "High 'ksv master wait' And 'ASM File Metadata Operation' Waits In Non-Exadata 11g [ID 1308282.1]" this issue is fixed in 11.2.0.3 and 12.1.

Of course 11.2.0.3 is the likely next patch set which is not yet released but is likely in testing. I've been seeing this 12.1 designation on MOS recently which is a good indication that this is the next version for the database, which is likely in early testing. I think that is an entry for a good blog, namely what features I'd like to see in the next database release, and what is likely to be included by Oracle. Attending OpenWorld 2011 would certainly help with that posting, come on Oracle, where is my Blogger approval ;-).

Monday, June 27, 2011

Oracle OpenWorld 2011

I recently signed up for Oracle OpenWorld 2011 (Blogger registration) just last week. So now I am awaiting to be approved (or not as the case may be). I found out about this additional registration process last year when a friend of mine got through.The requirements are pretty basic, i.e. a blog about Oracle products and/or solutions and updates within the last month. Your application will then get sent through to Oracle Corporate Communications team for approval. My friend signed up again, so we are both hoping to get approved and attend together though with a new baby on the way he is more likely to not attend.

I've been to two OpenWorld events (both in San Francisco), and in all honesty it should be a must visit event for anyone in the industry (at least once). The full attendance fee might be a bit prohibitive for some at over $2,000 (for regular folk) but is well worth the investment, which is exactly how it should be seen. But, in case you don't have the budget for the full attendance there are a few options for attending such as the Blogger registration, or a Discover pass ($50 - $125 depending on registration date) to name a few. I'm not too keen on the Discover pass since you don't get to attend any sessions (or the parties!), but it is an entry way to the other invaluable opportunities such a networking.You can find a comparison of the different packages on the Oracle OpenWorld site here. If you do get approved, there is still of course, the travel and accommodation considerations which many book a full year in advance to avoid the rise in prices, and also availability. My first attendance was on a pass from our Oracle solution provider in 2006, with hotel and flight covered by the company. I was lucky enough to find a surprisingly good motel just 3 blocks away which is walking distance and where you want to be unless you like commuting. Last year I stayed in the Oakland area with a friend and had to commute via a mile walk, train, and then 2 blocks. Lots of good exercise but with all the walking you will do at OpenWorld, your feet will not be thanking you regardless of good shoes!

If you think selling your attendance is difficult (so you can get the company to pay for something to help you out, even if it is just for your time), think again. Oracle is very thoughtful in helping you to write a return on investment letter to your managers or whomever. You can find a link here to that helpful page. I have similar points:

- great educational opportunity to learn about Oracle products and solutions from experts, actual customers with experience, and Oracle themselves. You do need to plan your sessions based on your level of expertise and what you are trying to get out of your trip in order to get the full benefit. From my experience too many of the sessions are not appropriately labeled for the correct expertise level and their contents, so I often find myself moving from room to room within the same time period to catch something of interest based on my schedule (which is why I register for a few concurrent sessions).

- networking with peers, experts, and Oracle staff from all areas including development, support, consulting, managerial, and 'C' levels. This is a big sell since you can get really good contacts within Oracle to aid in addressing immediate or future issues such as the developers that work on GoldenGate, Enterprise Manager, ASM, or the Oracle kernel.

- contact existing and new vendors. While you do get the usual song and dance, there are really good contacts to be made as some vendors (mostly the larger companies) bring out some of their hidden internal talent to answer questions. Seeing the various offerings, especially for something you need but didn't know a company existed that offered a solution is golden.

Of course, the parties are incredible, and the overall experience overwhelming. I do think that for anyone that has ever attended the feedback is the same. You need to know what you want from the event, plan accordingly, and send enough people to get the job done. At least 3 people should be sent, as only one will simply not work unless your needs are really narrow. There is simple too much ground to cover for one person.

The event is from Oct 2 - 6 at the Moscone Center in San Francisco. Do yourself a favor and register now to attend. You will not regret the experience.

Monday, June 13, 2011

RMAN is not removing my logs, why?

I'm not going to go into the details and various scenarios of RMAN and archive logs not being removed. The purpose of this entry is (other than a filler) to provide a head's up, if you will, as to what you are likely to encounter now in an 11g environment using RMAN and GoldenGate (or Streams). I've made mention in my prior post on the relationship between GoldenGate and Streams here.

So it seems that again, due to our usage of GoldenGate, that Streams is also being enabled under the covers. It seems also that there is a 6 hour delay in archive logs being removed via RMAN even if they are not required! Once again, doing the requisite research, we found that this is actually expected behavior. RMAN is looking at the MIN_REQUIRED_CAPTURE_CHANGE# in v$database and not at DBA_CAPTURE which only gets updated every 6 hours. There is a workaround which involves forcefully deleting the archive logs using an RMAN statement similar to:

delete noprompt force archivelog all completed before 'sysdate-10/1440';

I'm not sure why the change in behavior which now leads to this situation but I can't say were too thrilled. Seems like a bug or perhaps not yet fully thought out or completed perhaps? In any case, we've got the backup scripts and everything thought out so no harm, no foul. Moving on with the project...

Oh, you can reference the MOS article: "RMAN-08137 When deleting archivelogs even when Streams CAPTURE does not require them [ID 1079953.1]"

Oracle GoldenGate and Streams Integration?

We are still doing our Exadata migration, apologies for not posting more on this as it progresses but I find it hard to determine what is, and is not acceptable to blog (confidentially, NDA and all that stuff). I'm also quite busy and hardly find the time to blog generally. In any case, we are now on the 3rd environment, that being production support or QA, and have noticed that even though we do not have any Streams configured, we are seeing miscellaneous Streams events as one of the top wait events. It turns out that is not only expected but can be regarded as an idle wait. This happens because we are running GoldenGate Extract with the THREADOPTIONS DBLOGREADER parameter which enables the use of the OCIPOGG module. The wait (OTHER) is being reported while GoldenGate waits for additional logfile data to be made available.

So it would seem with the latest GoldenGate (11.1.1.1) that there is some integration with Streams. This makes sense since they are both Oracle products and per Oracle's Goldengate Statement of Direction GoldenGate is the strategic direction where the best of Streams will be integrated into GoldenGate.

For further information please reference MOS article: "Why do I see Streams Miscellaneous Event in AWR as a top event when Goldengate Extract is running [ID 1317122.1]".

Monday, April 18, 2011

Exadata Series - All Exadata Machines crashed!

On Friday morning we woke to a nasty bit of news. That is, all 4 of our Exadata machines were down! That's right, development/test, production support (QA), production and production prime (standby) all crashed! Talk about Murphy's Law.

The background
We are still doing dry run migrations and testing so fortunately they are not live yet, but it does not give the customer a warm and fuzzy. What if we were live!? So of course questions started being asked, and flags went up as to if this platform migration was a mistake, and we had to do damage control. It did not help matters that we had just signed a managed services contract and it took so longer for notification (almost 24 hours) and resolution (another 6+ hours). In defense, the contract was only signed a few days to a week before this incident and Grid Control was not yet available due to timing and some build issues (but we had signed, so where is our SLA?).

The potential impact was it was to be our final test migration before development cut-over. If the machine was down our already tight timelines would be severely impacted. Fortunately, service was restored and we got through the migration successfully. So all is well that ends well.

So what happened?
It seems the IB network was fluctuating which meant that all communication over this network was effectively down. That of course means no RAC interconnect, no ASM, no Storage Servers, nothing. Given the only thing in common between the 4 machines was the IB network which are connected by each machines IB spine switch and no changes were recently made, our theory was that it had to either be a firmware issue on the IB switches, or a configuration issue (still within the IB network) which finally caught up to us.

I wont get into the details here (again), just thought I'd share what happened, the workaround and the solution. As it turns out we were pretty accurate in our theory. When you have connected machines the spine switch should be configured as the subnet manager (and not the leaf switch), but there is also a firmware bug that arises when you have connected machines which causes a panic. See MOS bug ID 10165319 (unfortunately this is an internal bug). Since we were already at the latest firmware version (1.1.3-2), we will have to wait until 1.3.3-2 is available which is maybe another 2 weeks (reference MOS article "Database Machine and Exadata Storage Server 11g Release 2 (11.2) Supported Versions (Doc ID 888828.1)"). Version 1.3.3-1 would also solve our issue but that was pulled back by Oracle to fix a few more items. We've not made a decision yet but I suspect we will wait until 1.3.3-2 is GA instead of applying 1.3.3-1. In the interim, we will need to monitor the logs and reboot the spine switch before it crashes all the machines again. Oh, I think this will be fun times.

As I've mentioned we did get everything up and running (rebooted each switch, then the Storage Servers, then the DB nodes) so we did get to do another dry run migration (in record time), and without further incident.

Just a note
Things are different if you have only a single machine, two to three machines, and 4 or more machines connected (we fell into the later). For a single machine no issues, this will not happen. For two to three connected machines ensure the spine switches are set as subnet manager with priority 8 and leafs with priority 5, not set as the subnet managers. For 4 or more connected machines ensure the spines are set as the subnet manager (priority 8), and the leafs set to priority 5 with the subnet manager process disabled.

Update note:
I've updated this post as upon re-reading I found too much was said concerning the incident and not the actual issue and root/case. Also, the new firmware (1.3.3-2) has been made available so you can get it on MOS (patch 12373676 which also requires 11891229). Note 888828.1 has been updated to reflect this new version as well.

Monday, February 21, 2011

Exadata Series - Performance Comparison

So how does Exadata compare?

A lot of people have asked this question and in truth although your mileage will vary, Exadata is a definite screamer and should solve most performance challenges. But there is more to compare than just performance! A lot needs to be taken into consideration and I highly recommend considering all the variables before making any decisions since Exadata is very expensive and ROI needs to be proven. Some criteria to include:

Solution Maturity
Pricing (CAPEX and OPEX)
Performance (OLTP, Warehousing, Analytics)
Scalability
Availability
Backup/Recovery Options
Migration Options
Management Options
Integration Requirements
Virtualization Options
Tiering Options
Provisioning Options
Training Requirement
Professional Services Available
Refresh/Replication Options
Cooling
Floor space
Networking

Now going back to the 'U' platform, it is very impressive and when paired with the appropriate storage solution is a strong value proposition. Storage features such as snap and clone can provide instant alternate environments, or backup/recovery options. Such options are not available for Exadata. Some will tell you that Data Guard combined with Flashback does the same thing but there are substantial differences (for example, instant snap or clone vs. build and setup the Data Guard environment). You must determine for yourself what solution will best address your business needs now and in the future.

What were the platforms?

Again, without getting into too much detail (for reasons previously mentioned), lets call use the following:

Platform P5 w/CX type storage (baseline)
Platform P7 w/DS type storage
Platform 'U' blades w/CX type storage - Note that the storage was suboptimal as it was our sandbox storage environment with a known bottleneck in the number of connections; the hardware/software stack was also not optimized for Oracle
Platform HDS w/AMS type storage
Oracle Exadata v2 (1/4 rack w/High Performance disks)

Knowing we were looking at Exadata, the other vendors took the approach of matching Exadata in terms of price/performance taking into consideration the cores and license costs. This was especially relevant to platform P7 since their core factor is 1.0 whereas Intel (used by Exadata and platform 'U') is 0.5. As expected, in the pure CPU processing area, platform P7 was the most efficient CPU. This of course resulted in the vendor (as they knew going in) being able to use less processors to match, or more accurately beat, the Intel processors and hence making the core factor a non-issue. For example, using 6 x P7 cores bested 12 x Intel Nehalem cores. It will be argued of course that Exadata has 96 cores for the DB nodes + 168 cores for the Storage nodes (in a full rack) since processing will also be done by the Storage Servers. That is a valid argument except were the storage servers are not involved which depends a lot on your workload. It must be noted that platform 'U' did quite well given its degraded setup, even besting Exadata in a few individual tests (a real testament to the the platform).

For testing we devised 5 test cases consisting of the same 12 unit tests (i.e. loads, stats collection, and queries):

T1: "As is", i.e. just run without any changes
T2: Hints dropped
T3: Hints & indexes dropped
T4: Same as T3 but using compression (HCC for Exadata & AC for other platforms)
T5: Same as T3 but with artificially induced load on the CPUs (at 100%)

Testing was done off-site by the respective vendors, except platform 'U' which was done on-site by myself. Oracle apparently has a policy against making available performance data so I'd recommend this be discussed upfront if you want access to the AWR and other such information for review. We were unaware of this policy going into the tests and were told the AWR was not captured. As we persisted the explanation changed into it being "company confidential", and recently into such information is not generally made available.

I also recommend ensuring the appropriate Oracle resources are made available. We were less than impressed with the Oracle team running the POC as given the collective resources of Oracle at their disposal it took them until the next day to realize the Exadata machine was improperly configured, had a failed Flash card, and also how to use Data Pump (we had to help them here). Just getting our data (less than 4TB) inside the machine was taking over 4+ hours (operation was killed) until each of these issues were addressed. The load time was still unimpressive though our contention was that the machine was still less than ideally configured:

Exadata: less than 3 hours
Platform 'U': 1 hour 45 minutes
Platform P7: 16 minutes

To share some other performance numbers (ordered by overall best time improvement) see below. Note that I've combined the platform 'U' and 'HDS' results since HDS was the storage piece and U was the compute piece.

T1: P7 (9x), Exadata (7x), U/HDS (4x)
T2: Exadata (13x), P7 (10x), U/HDS (9x)
T3: P7 (21x), Exadata (11x), U/HDS (8x)
T4: P7 (16x), Exadata (12x), U/HDS (8x)
T5: U/HDS (17x), P7 (9x), Exadata (6x)

Of note is that we had a particularly nasty load test with which all the platforms had trouble, so much so in fact that in the case of T1 none of the systems managed to complete the test in a time which bested the baseline (unit tests were stopped once past the baseline based on our discretion).

Curiously, the Exadata DB nodes were 100% CPU utilized by two concurrent IAS streams in T5, while for the other platforms a more artificial stress was required using a CPU heavy PL/SQL function (a session per CPU thread). We found this quite strange given the similar processing power between the Exadata DB nodes and the other Intel platforms, though as we were not given access to any data we were unable to get any answers.

Monday, February 14, 2011

Exadata Series - The beginning

As with any story there is always a beginning, and so it is with this Exadata project. I would say it all started around Q1 2010, when... well, lets just say we decided to look at what our next generation platform would be for our enterprise databases.

The Current Platform

Our existing platform is IBM Power5 with micro-partitions. These servers are very reliable, hold tons of memory, and provide lots of processing power. Although the performance of the Power5 has been eclipsed by Intel's Nehalem and even further by IBM's Power7 (the current generation Power), they are even now more than capable of supporting any workload. Without disclosing any major details which may get me into trouble with my current employer, lets just say there were a few such machines, well populated for CPU, and RAM and various I/O and HBA cards. Capacity on Demand (CoD) also being utilized when required. That is one of the main benefits of the Power platform IMHO, being able to pay as you grow as well as pay as you need/use.

These machines hosted multiple databases (a few sharing the same partition), for all the various environments (i.e. development up to production). Our storage platform was mid-tier (like CLARiion CX3/CX4) of which ~80TB was for the 20+ production Oracle databases to be migrated to Exadata. Of course the storage platform was general purpose so there were PB being used for various purposes (including other Oracle databases).

Database workload was no different from others being a mixture of OLTP, warehousing (reporting), and analytics (the usual 60/40, or 70/30). The database sizes ranged from ~100GB up to 11+TB, with the main distribution in the mid GB range. The larger ones being the most business critical (aren't they always?).

Step 1: Lets test out RAC first

Being as we had a fair number of databases, and the direction was to consolidate, our thinking was to leverage Real Application Clusters (RAC) with Services to provide some separation between workloads and/or applications, provide scalability, and improve availability. So we decided to first test out RAC using 11.2.0.1 on commodity and see how our workload functions in such an environment as a first step. My understanding was that previous tests with 10.1 RAC did not go so well as our applications were not RAC-friendly.

At the time we were finishing up testing on platform 'U' (a relatively new blade server platform) for some other virtualization stuff, so we decided to extend that hardware loan and re-use it for our RAC testing. The virtualization tests went well and we had already decided to use it going forward so it was advantageous to minimize platforms (save costs and all that) should we decide to go with a commodity solution. The RAC tests went well, especially the actual platform 'U' testing. There are real advantages to stateless computing, i.e. using server profiles with a SAN-booted OS image and being able to move from one blade server to another with minimal downtime.

Much later on, we ran into some problems, similar to others, with using 11.2.0.2 due to the multicast requirements for HAIP. I unfortunately did not have time to resolve this problem with the server and networking guys as I was only given an hour. In any case this was a non-issue as we simply used 11.2.0.1. In upcoming entries I'll share some interesting performance comparisons for our workload.

Update note:
Seems there are a few workarounds to addressing the multicast-case issues I faced on platform 'U':

an updated firmware
set the virtual IO card to promiscuous mode
turn of filtering in the Linux OS

New postings (and welcome Exadata)

Once again, it is time for some blog posts. I find it quite hard to blog even though I really should do more (for various reasons). There is just so much already out there on the internet for what I want to say, and then there is the time it takes to do a reasonably good blog posting. I really respect all who do it regularly while having a full time job and family.

Anyways, I'll be putting together a few series on Oracle Enterprise Manager Grid Control 11g,and Oracle11gR2 DB including RAC, and Oracle Exadata.

We've just gotten our first Exadata X2-2 full rack HC machine. This is one of four to be delivered and it has been quite a ride. It has been over a year of decision making involving POC, RFP, finalizing a decision, and ordering. I'll try to document as best as I can keeping in mind NDA. Many thanks must go out to all the various bloggers and pioneers (some of whom I've had the good fortune to meet) of Exadata v1, v2, and X2-2. Without your postings and information, this would have been quite a larger ordeal.

The Importance of Security
A colleague of mine was so excited he forgot all about our campus security policy and took a picture of the delivery truck. I found this a bit much since he attended OpenWorld 2010 and saw the X2-2, X2-8, and Exalogic machines up close and took pictures, as well as the fact it was just the truck since there was no view of the actual machine, but I can't really blame him. In any case, his phone was quickly confiscated and the picture removed due to our strict no pictures on site policy! He was written up in the guards report though I'm sure no harm will come from it (or so we hope), and although I had not taken any pictures I was guilty by proximity and my phone was also confiscated and checked as a precaution. He is planning to request access to the data center for a second attempt, though this time authorized (and no pictures this time). I wish him luck!

Wednesday, December 08, 2010

Oracle11gR2 RAC: Removing Database Nodes from an Existing Cluster

Phase I - Remove the node from RAC database

Policy-Managed

1. Remove DB Console from the node by running the following from another node in the cluster (as ‘oracle’ user):

$> $RDBMS_HOME/bin/emca -deleteNode db

2. For Policy-Managed database a possible method is to decrease the size of the server pool, and relocate the node to the free pool (assuming all other maximums are met). For relocation to work there should be no active sessions on the associated instance. You can either use the ‘-f’ flag to force relocation, or shutdown the service on the instance prior to running the command:

$> $RDBMS_HOME/bin/srvctl stop instance -d -n
$> $RDBMS_HOME/bin/srvctl relocate server -n -g Free

Admin-Managed

1. For Admin-Managed databases, ensure the instance to be removed is not a PREFERRED or AVAILABLE instance for any Services (i.e. modify Services to exclude the instance to be removed).

2. Remove the instance using ‘$RDBMS_HOME/bin/dbca’, running the command from a node not being removed (as ‘oracle’ user):

$> $RDBMS_HOME/bin/dbca -silent -deleteInstance -nodeList -gdbName -instanceName -sysDBAUserName sys -sysDBAPassword

3. Disable and stop any listeners running on the node (as ‘oracle’ user on any node):

$> $RDBMS_HOME/bin/srvctl disable listener -l -n
$> $RDBMS_HOME/bin/srvctl stop listener -l -n

4. Update the inventory on the node to be removed (run from the node being removed as ‘oracle’ user):

$> $RDBMS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME= “CLUSTER_NODES={[oldnode]}” -local

5. Deinstall the Oracle home, from the node being removed (as ‘oracle’ user):

$> $RDBMS_HOME/deinstall/deinstall -local

6. From any of the existing nodes run the following to update the inventory with the list of the remaining nodes (as ‘oracle’ user):

$> $RDBMS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME= “CLUSTER_NODES={[node1,...nodeX]}"

Phase II - Remove node from Clusterware

7. Check if the node is active and unpinned (as ‘root’ user):

#> $GI_HOME/bin/olsnodes -s -t

Note: The node will only be pinned if using CTSS, or using with database version < style="text-align: justify;">8. Disable Clusterware applications and daemons on the node to be removed. Use the ‘-lastnode’ option when running on the last node in the cluster to be removed (as ‘root’ user):

#> $GI_HOME/crs/install/rootcrs.pl -deconfig -force [-lastnode]

9. From any node not being removed delete Clusterware from the node (as ‘root’ user):

#> $GI_HOME/bin/crsctl delete node -n

10. As ‘grid’ user from node being removed:

$GI_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME= "CLUSTER_NODES={[oldnode1,...oldnodeX]}" CRS=TRUE -local

11. Deinstall Clusterware software from the node:

$> $GI_HOME/deinstall/deinstall -local

ISSUE:

‘$GI_HOME/deinstall/deinstall -local’ results in questions about other nodes, and prompts for running scripts against other nodes. DO NOT RUN!! This process deconifigures CRS on all nodes!

ROOT/CAUSE:
Incorrect documentation as a command is missing.

RESOLUTION:
The correct order (as already given in this document) should be as follows:

$GI_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/oracle/app/11.2.0/grid “CLUSTER_NODES={[oldnode]}” CRS=TRUE -local
$GI_HOME/deinstall/deinstall -local

12. From any existing node to remain, update the Clusterware with the remaining nodes (as ‘grid’ user):

$> $GI_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME= “CLUSTER_NODES={[oldnode1,...oldnodeX]}” CRS=TRUE

13. Verify the node has been removed and the remaining nodes are valid:

$GI_HOME/bin/cluvfy stage -post nodedel -n -verbose

14. Remove OCM host/configuration from the MOS portal.

Oracle11gR2 RAC: Adding Database Nodes to an Existing Cluster

Environment:

Oracle RAC Database 11.2.0.1
Oracle Grid Infrastructure 11.2.0.1
Non-GNS
OEL 5.5 or SLES 11.1 (both x86_64)

Adding Nodes to a RAC Administrator-Managed Database

Cloning to Extend an Oracle RAC Database (~20 minutes or less depending)

Phase I - Extending Oracle Clusterware to a new cluster node

1. Make physical connections, and install the OS.

Warning: Follow article “11GR2 GRID INFRASTRUCTURE INSTALLATION FAILS WHEN RUNNING ROOT.SH ON NODE 2 OF RAC [ID 1059847.1]” to ensure successful completion of root.sh!! This is also pointed out in the internal Installation Guide.

2. Create Oracle accounts, and setup SSH among the new node and the existing cluster nodes.

Warning: Follow article “How To Configure SSH for a RAC Installation [ID 300548.1]" for the correct procedure for SSH setup! EACH NODE MUST BE VISITED TO ENSURE THEY ARE ADDED TO KNOWN_HOSTS FILE!! This is also mentioned in the internal Installation Guide.

3. Verify the requirements for cluster node addition using the Cluster Verification Utility (CVU). From an existing cluster node (as ‘grid’ user):

$> $GI_HOME/bin/cluvfy stage -post hwos -n -verbose

4. Compare an existing node (reference node) with the new node(s) to be added (as ‘grid’ user):

$> $GI_HOME/bin/cluvfy comp peer -refnode -n -orainv oinstall -osdba dba -verbose

5. Verify the integrity of the cluster and new node by running from an existing cluster node (as ‘grid’ user):

$GI_HOME/bin/cluvfy stage -pre nodeadd -n -fixup -verbose

6. Add the new node by running the following from an existing cluster node (as ‘grid’ user):

a. Not using GNS

$GI_HOME/oui/bin/addNode.sh -silent “CLUSTER_NEW_NODES={}” “CLUSTER_NEW_VIRTUAL_HOSTNAMES={}}”

b. Using GNS

$GI_HOME/oui/bin/addNode.sh -silent “CLUSTER_NEW_NODES={}}”

Run the root scripts when prompted.

POSSIBLE ERROR:
/oracle/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /oracle/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2010-08-11 16:12:19: Parsing the host name
2010-08-11 16:12:19: Checking for super user privileges
2010-08-11 16:12:19: User has super user privileges
Using configuration parameter file: /oracle/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
-ksh: line 1: /bin/env: not found
/oracle/app/11.2.0/grid/bin/cluutil -sourcefile /etc/oracle/ocr.loc -sourcenode ucstst12 -destfile /oracle/app/11.2.0/grid/srvm/admin/ocrloc.tmp -nodelist ucstst12 ... failed
Unable to copy OCR locations
validateOCR failed for +OCR_VOTE at /oracle/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 7979.

CAUSE:
SSH User Equivalency is not properly setup.

SOLUTION:
[1] Correctly setup SSH user equivalency.

[2] Deconfigure cluster on new node:

#> /oracle/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force

[3] Rerun root.sh

POSSIBLE ERROR:
PRCR-1013 : Failed to start resource ora.LISTENER.lsnr
PRCR-1064 : Failed to start resource ora.LISTENER.lsnr on node ucstst13
CRS-2662: Resource 'ora.LISTENER.lsnr' is disabled on server 'ucstst13'

start listener on node=ucstst13 ... failed
Configure Oracle Grid Infrastructure for a Cluster ... failed

CAUSE:

A node is being added that was previously a member of the cluster and Clusterware is aware that the node was a previous member and also that the last listener status was ‘disabled’.

SOLUTION:

Referencing “Bare Metal Restore Procedure for Compute Nodes on an Exadata Environment” (Doc ID 1084360.1), run the following from the node being added as the ‘root’ user to enable and start the local listener (this will complete the operation):

ucstst13:/oracle/app/11.2.0/grid/bin# /oracle/app/11.2.0/grid/bin/srvctl enable listener -l -n
ucstst13:/oracle/app/11.2.0/grid/bin # /oracle/app/11.2.0/grid/bin/srvctl start listener -l -n

7. Verify that the new node has been added to the cluster (as ‘grid’ user):

$GI_HOME/bin/cluvfy stage -post nodeadd -n -verbose

Phase II - Extending Oracle Database RAC to new cluster node

8. Using the ‘addNode.sh’ script, from an existing node in the cluster as the ‘oracle’ user:

$> $ORACLE_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={newnode1,…newnodeX}"

9. On the new node run the ‘root.sh’ script as the ‘root’ user as prompted.

10. Set ORACLE_HOME and ensure you are using the ‘oracle’ account user. Ensure permissions for Oracle executable are 6751, if not, then as root user:

cd $ORACLE_HOME/bin
chgrp asmadmin oracle
chmod 6751 oracle
ls -l oracle

OR

as 'grid' user: $GI_HOME/bin/setasmgidwrap o=$RDBMS_HOME/bin/oracle

11. On any existing node, run DBCA ($ORACLE_HOME/bin/dbca) to add/create the new instance (as ‘oracle’ user):

$ORACLE_HOME/bin/dbca -silent -addInstance -nodeList -gdbName -instanceName -sysDBAUserName sys -sysDBAPassword

NOTE: Ensure the command is run from an existing node with the same or less memory than the new nodes otherwise the command will fail due to insufficient memory to support the instance. Also ensure that the log file is checked for actual success since it can differ from what is displayed at the screen.

POSSIBLE ERROR:
DBCA logs with error below (screen indicates success):
Adding instance
DBCA_PROGRESS : 1%
DBCA_PROGRESS : 2%
DBCA_PROGRESS : 6%
DBCA_PROGRESS : 13%
DBCA_PROGRESS : 20%
DBCA_PROGRESS : 26%
DBCA_PROGRESS : 33%
DBCA_PROGRESS : 40%
DBCA_PROGRESS : 46%
DBCA_PROGRESS : 53%
DBCA_PROGRESS : 66%
Completing instance management.
DBCA_PROGRESS : 76%
PRCR-1013 : Failed to start resource ora.racdb.db
PRCR-1064 : Failed to start resource ora.racdb.db on node ucstst13
ORA-01031: insufficient privileges
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Process ID: 0
Session ID: 0 Serial number: 0

ORA-01031: insufficient privileges
ORA-01031: insufficient privileges
CRS-2674: Start of 'ora.racdb.db' on 'ucstst13' failed
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Process ID: 0
Session ID: 0 Serial number: 0

ORA-01031: insufficient privileges
ORA-01031: insufficient privileges

DBCA_PROGRESS : 100%

CAUSE:

The permissions and/or ownership on the ‘$RDBMS_HOME/bin/oracle’ binary are incorrect. References: “ORA-15183 Unable to Create Database on Server using 11.2 ASM and Grid Infrastructure [ID 1054033.1]”, “Incorrect Ownership and Permission after Relinking or Patching 11gR2 Grid Infrastructure [ID 1083982.1]”.

SOLUTION:
As root user:
cd $ORACLE_HOME/bin
chgrp asmadmin oracle
chmod 6751 oracle
ls -l oracle

Ensure the ownership and permission are now like:
-rwsr-s--x 1 oratest asmadmin

Warning: Whenever a patch is applied to the database ORACLE_HOME, please ensure the above ownership and permission are corrected after the patch.

12. Following the above, the EM DB Console may not be configured correctly for the new nodes. The GUI does not show the nodes, however the EM agent does report them to the OMS agent via the command line:

**************** Current Configuration ****************
INSTANCE NODE DBCONTROL_UPLOAD_HOST
---------- ---------- ---------------------

racdb racnode10 racnode10.mydomain.com
racdb racnode11 racnode10.mydomain.com
racdb racnode12 racnode10.mydomain.com
racdb racnode13 racnode10.mydomain.com
racdb racnode14 racnode10.mydomain.com

Also, the listener is not properly configured as starting from the GI_HOME so it will appear incorrectly as down in DB Console. To correctly configure the instance in DB Console:

a. Delete the instance from EM DB Console:

$RDBMS_HOME/bin/emca -deleteInst db

b. Create the EM Agent directories on the new node for all the nodes in the cluster (including the new node) as follows:

mkdir -p $RDBMS_HOME/racnode10_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode11_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode12_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode13_racdb/sysman/config
mkdir -p $RDBMS_HOME/racnode10_racdb/sysman/emd
mkdir -p $RDBMS_HOME/racnode11_racdb/sysman/emd
mkdir -p $RDBMS_HOME/racnode12_racdb/sysman/emd
mkdir -p $RDBMS_HOME/racnode13_racdb/sysman/emd

c. Re-add the instance:

$RDBMS_HOME/emca -addInst db

To address the listener incorrectly showing as down in DB Console:

a. On each node as the oracle user, edit the file ‘$RDBMS_HOME/_/sysman/config/targets.xml’.

b. Change the "ListenerOraDir" entry for the node to match the ‘$GI_HOME/network/admin’ location.

c. Save the file and restart the DB Console.

13. Verify the administrator privileges on the new node by running on existing node:

$ORACLE_HOME/bin/cluvfy comp admprv -o db_config -d -n -verbose

14. For an Admin-Managed Cluster, add the new instances to Services, or create additional Services. For a Policy-Managed Cluster, verify the instance has been added to an existing server pool.

15. Setup OCM in the cloned homes (both GI and RDBMS):

a. Delete all subdirectories to remove previously configured host:

$> rm -rf $ORACLE_HOME/ccr/hosts/*

b. Move (do not copy) from the OH the file as shown below:

$> mv $ORACLE_HOME/ccr/inventory/core.jar $ORACLE_HOME/ccr/inventory/pending/core.jar

c. Configure OCM for the cloned home on the new node:

$> $ORACLE_HOME/ccr/bin/configCCR -a

Adding Nodes to a RAC Policy-Managed Database

When adding a node in a cluster running as a Policy-Managed Database, Oracle Clusterware tries to start the new instance before the cloning procedure completes. The following steps should be used to add the node:

1. Run the ‘addNode.sh’ script as the ‘grid’ user for the Oracle Grid Infrastructure for a cluster to add the new node (similar to step 6 above). DO NOT run the root scripts when prompted; you will run them later.

2. Install the Oracle RAC database software using a software-only installation, clone or ‘addNode.sh’ script (same as step 8 above) method. Ensure Oracle is linked with the Oracle RAC option if using the software-only installation.

3. Complete the root script actions for the Database home, similar to steps 9 - 10 above.

4. Complete the root scripts action for the Oracle Clusterware home and then finish the installation. Similar to as mentioned in step 6 above.

5. Verify that EM DB Console is fully operational, similar to step 12 above.

6. Complete the configuration for OCM, similar to step 15 above.

Oracle11gR2 RAC: FAQ

Can I change a node’s hostname?

Yes, however, the node must be removed and added back to the cluster with the new name.

How do I define a service for a Policy-Managed Database?

When you define services for a policy-managed database, you define the service to a server pool where the database is running. You can define the service as either UNIFORM (running on all instances in the server pool) or SINGLETON (running on only one instance in the server pool). For SINGLETON services, Oracle RAC chooses on which instance in the server pool the service is active. If that instance fails, then the service fails over to another instance in the server pool. A service can only run in one server pool.

Services for administrator-managed databases continue to be defined by the PREFERRED and AVAILABLE definitions.

How do I convert from a Policy-Managed Database to Administrator-Managed Database?

You cannot directly convert a policy-managed database to an administrator-managed database. Instead, you can remove the policy-managed configuration using the 'srvctl remove database' and 'srvctl remove service' commands, and then create a new administrator-managed database with the 'srvctl add database' command.

What is Grid Plug and Play (GPnP)?

Grid Plug and Play (GPnP) eliminates per-node configuration data and the need for explicit add and delete node steps. This allows a system administrator to take a template system image and run it on a new node with no further configuration. This removes many manual operations, reduces the opportunity for errors, and encourages configurations that can be changed easily. Removal of the per-node configuration makes the nodes easier to replace, because they do not need to contain individually-managed state.

Grid Plug and Play reduces the cost of installing, configuring, and managing database nodes by making their per-node state disposable. It allows nodes to be easily replaced with regenerated state.

What is a Server Pool?

Server pools enable the cluster administrator to create a policy which defines how Oracle Clusterware allocates resources. An Oracle RAC policy-managed database runs in a server pool. Oracle Clusterware attempts to keep the required number of servers in the server pool and, therefore, the required number of instances of the Oracle RAC database. A server can be in only one server pool at any time. However, a database can run in multiple server pools. Cluster-managed services run in a server pool where they are defined as either UNIFORM (active on all instances in the server pool) or SINGLETON (active on only one instance in the server pool).

You should create redo log groups only if you are using administrator-managed databases. For policy-managed databases, increase the cardinality and when the instance starts, if you are using Oracle Managed Files and Oracle ASM, then Oracle automatically allocates the thread, redo, and undo.

If you remove an instance from your Oracle RAC database, then you should disable the instance’s thread of redo so that Oracle does not have to check the thread during database recovery.

For policy-managed databases, Oracle automatically allocates the undo tablespace when the instance starts if you have OMF enabled.

What is Run-Time Connection Load Balancing?

The run-time connection load balancing feature enables routing of work requests to an instance that offers the best performance, minimizing the need to relocate work. To enable and use run-time connection load balancing, the connection goal must be set to SHORT and either of the following service-level goals must be set:

· SERVICE_TIME—The Load Balancing Advisory attempts to direct work requests to instances according to their response time. Load Balancing Advisory data is based on the elapsed time for work done by connections using the service, as well as available bandwidth to the service. This goal is best suited for workloads that require varying lengths of time to complete, for example, an internet shopping system.

· THROUGHPUT—The Load Balancing Advisory measures the percentage of the total response time that the CPU consumes for the service. This measures the efficiency of an instance, rather than the response time. This goal is best suited for workloads where each work request completes in a similar amount of time, for example, a trading system.

Client-side load balancing balances the connection requests across the listeners by setting the parameter ‘LOAD_BALANCE=ON’ directive. When you set this parameter to ON, Oracle Database randomly selects an address in the address list, and connects to that node's listener. This balances client connections across the available SCAN listeners in the cluster. When clients connect using SCAN, Oracle Net automatically load balances client connection requests across the three IP addresses you defined for the SCAN, unless you are using EZConnect.

What are the different types of Server-Side Connection Load Balancing?

With server-side load balancing, the SCAN listener directs a connection request to the best instance currently providing the service by using the load balancing advisory. The two types of connection load balancing are:

· SHORT—Connections are distributed across instances based on the amount of time that the service is used. Use the SHORT connection load balancing goal for applications that have connections of brief duration. When using connection pools that are integrated with FAN, set the connection load balancing goal to SHORT. SHORT tells the listener to use CPU-based statistics.

· LONG—Connections are distributed across instances based on the number of sessions in each instance, for each instance that supports the service. Use the LONG connection load balancing goal for applications that have connections of long duration. This is typical for connection pools and SQL*Forms sessions. LONG is the default connection load balancing goal, and tells the listener to use session-based statistics.

How do I enable the Load Balancing Advisory (LBA)?

To enable the load balancing advisory, use the ‘-B’ option when creating or modifying the service using the ‘srvctl’ command.

How does the database register with the Listener?

When a listener starts after the Oracle instance starts, and the listener is listed for service registration, registration does not occur until the next time the Oracle Database process monitor (PMON) discovery routine starts. By default, PMON discovery occurs every 60 seconds.

To override the 60-second delay, use the SQL ‘ALTER SYSTEM REGISTER’ statement. This statement forces the PMON process to register the service immediately.

If you run this statement while the listener is up and the instance is already registered, or while the listener is down, then the statement has no effect.

Can I configure both failure notifications with Universal Connection Pool (UCP)?

Connection failure notification is redundant with Fast Connection Failover (FCF) as implemented by the UCP. You should not configure both within the same application.

Should I configure Transparent Application Failure (TAF) in my service definition if using Fast Connection Failure (FCF)?

Do not configure Transparent Application Failover (TAF) with Fast Connection Failover (FCF) for JDBC clients as TAF processing will interfere with FAN ONS processing.

Can I use Fast Connection Failover (FCF) and Transparent Application Failover (TAF) together?

No. Only one of them should be used at a time.

What is the status of Fast Connection Failover (FCF) with Universal Connection Pool (UCP)?

FCF is now deprecated along with the Implicit Connection Caching in favor of using the Universal Connection Pool (UCP) for JDBC.

Does Fast Connection Failover (FCF) support planned outages?

FCF does not support planned outages like service relocation (reference Doc ID: 1076130.1). It is designed to work for unplanned outages, where a RAC service is preferred on all the nodes in the cluster and one of the nodes goes down unexpectedly. When a planned outage like a service relocation is done from one node to the other, FCF does not work as expected and the result is unpredictable. There is no solution at present for this. Enhancement request 9495973 has been raised to address this limitation.

Should I user JDBC Thin driver or JDBC OCI driver?

Oracle thin JDBC driver is usually preferred by application developers because it is cross platform and has no external dependencies. However some applications require the high-performance, native C-language based Oracle Call Interface (OCI) driver. This driver is compatible with FCF and can alternatively use Transparent Application Failover (TAF) which operates at a lower level than FCF and can automatically resubmit SELECT queries in the event of a node failure. However for most applications, the ease of deployment of the thin driver with full FCF support will outweigh any benefits offered by the OCI driver.

How do I subscribe to HA Events?

If you are using a client that uses Oracle Streams Advanced Queuing, such as OCI and ODP.NET clients, to receive FAN events, you must enable the service used by that client to access the alert notification queue by using the ‘-q’ option via the ‘srvctl’ command.

FAN events are published using ONS and Oracle Streams Advanced Queuing. The service metrics received from the Oracle RAC load balancing advisory through FAN events for the service are automatically placed in the Oracle Streams AQ queue table, ALERT_QUEUE.

Use the following query against the internal queue table for load balancing advisory FAN events to monitor load balancing advisory events generated for an instance:

SET PAGES 60 COLSEP '|' LINES 132 NUM 8 VERIFY OFF FEEDBACK OFF

COLUMN user_data HEADING "AQ Service Metrics" FORMAT A60 WRAP

BREAK ON service_name SKIP 1

SELECT

TO_CHAR(enq_time, 'HH:MI:SS') Enq_time, user_data

FROM sys.sys$service_metrics_tab

ORDER BY 1 ;

What is Connection Affinity?

Connection affinity is a performance feature that allows a connection pool to select connections that are directed at a specific Oracle RAC instance. The pool uses run-time connection load balancing (if configured) to select an Oracle RAC instance to create the first connection and then subsequent connections are created with an affinity to the same instance.

What types of affinity does Universal Connection Pool (UCP) support?

UCP JDBC connection pools support two types of connection affinity: transaction-based affinity and Web session affinity.

What is Transaction-Based Affinity?

Transaction-based affinity is an affinity to an Oracle RAC instance that can be released by either the client application or a failure event. Applications typically use this type of affinity when long-lived affinity to an Oracle RAC instance is desired or when the cost (in terms of performance) of being redirected to a new Oracle RAC instance is high. Distributed transactions are a good example of transaction-based affinity. XA connections that are enlisted in a distributed transaction keep an affinity to the Oracle RAC instance for the duration of the transaction. In this case, an application would incur a significant performance cost if a connection is redirect to a different Oracle RAC instance during the distributed transaction.

Transaction-based affinity is strictly scoped between the application/middle-tier and UCP for JDBC; therefore, transaction-based affinity only requires that the setFastConnectionFailoverEnabled property be set to true and does not require complete FCF configuration. In addition, transaction-based affinity does not technically require run-time connection load balancing. However, it can help with performance and is usually enabled regardless. If run-time connection load balancing is not enabled, the connection pool randomly picks connections.

What is Web Session Affinity?

Web session affinity is an affinity to an Oracle RAC instance that can be released by either the instance, a client application, or a failure event. The Oracle RAC instance uses a hint to communicate to a connection pool whether affinity has been enabled or disabled on the instance. An Oracle RAC instance may disable affinity based on many factors, such as performance or load. If an Oracle RAC instance can no longer support affinity, the connections in the pool are refreshed to use a new instance and affinity is established once again.

Applications typically use this type of affinity when short-lived affinity to an Oracle RAC instance is expected or if the cost (in terms of performance) of being redirected to a new Oracle RAC instance is minimal. For example, a mail client session might use Web session affinity to an Oracle RAC instance to increase performance and is relatively unaffected if a connection is redirected to a different instance.

What is recommended for WebLogic Server?

Oracle recommends using WebLogic JDBC multi data sources to handle failover instead. While connect-time failover does not provide the ability to pre-create connections to alternate Oracle RAC nodes, multi data sources have multiple connections available at all times to handle failover.

Transparent Application Failover (TAF) is not supported for any WLS data source. TAF, as delivered via JDBC is currently not transparent. It is documented to affect some ongoing query results and PreparedStatements in unpredictable and unrecoverable ways. TAF JDBC requires specific recovery code at the application level and affects the integrity of statements that WebLogic might be caching.

Do I still need to backup my Oracle Cluster Registry (OCR) and Voting Disks?

You no longer have to back up the voting disk. The voting disk data is automatically backed up in OCR as part of any configuration change and is automatically restored to any voting disk added. If all voting disks are corrupted, however, you can restore.

Oracle Clusterware automatically creates OCR backups every four hours. At any one time, Oracle Database always retains the last three backup copies of OCR. The CRSD process that creates the backups also creates and retains an OCR backup for each full day and at the end of each week. You cannot customize the backup frequencies or the number of files that Oracle Database retains.

How is DBMS_JOB functionality affected by RAC?

DBMS jobs can be set to run either on database (i.e. any active instance), or a specific instance.

What is PARELLEL_FORCE_LOCAL?

By default, the parallel server processes selected to execute a SQL statement can operate on any or all Oracle RAC nodes in the cluster. By setting PARALLEL_FORCE_LOCAL to TRUE, the parallel server processes are restricted to just one node, the node where the query coordinator resides (the node on which the SQL statement was executed). However, in 11.2.0.1 when this parameter is set to TRUE the parallel degree calculations are not being adjusted correctly to only consider the CPU_COUNT for a single node. The parallel degree will be calculated based on the RAC-wide CPU_COUNT and not the single node CPU_COUNT. Due to this bug 9671271 it is not recommended that you set PARALLEL_FORCE_LOCAL to TRUE in 11.2.0.1, instead you should setup a RAC service to limit where parallel statements can execute.

What is the Service Management Policy?

When you use automatic services in an administrator-managed database, during planned database startup, services may start on the first instances to start rather than their preferred instances. Prior to Oracle RAC 11 g release 2 (11.2), all services worked as though they were defined with a manual management policy.

Why does my user appear across all nodes when querying GV$SESSION when my service does not span all nodes?

The problem is you are querying GV$SESSION as the ABC user and this results in the "strange" behaviour. If you select gv$session, 2 parallel servers are spawned to query the v$session on each node. This happens as the same user. Hence when you query gv$session as ABC you are seeing 3 (one real and 2 parallel slaves querying v$session on each instance). The reason you are seeing 1 on one node and 3 on the other is the order in which the parallel processes query the v$session. Take the sys (or any other) user to query the session of ABC and you will not see this problem.

How does Clustereare startup with OCR and Voting Disk in ASM?

The startup sequence has been changed/replaced, now being 2-phased, optimized approach:

Phase I

· OHASD will startup "local" resources first.

· CSSD uses GPnP profile which stores location of voting disk so no need to access ASM (voting disk is stored different within ASM than other files so location is known).

Simultaneously,

· ORAAGENT starts up and ASM instance is started (subset of information in OCR is stored in OLR, enough to startup local resources), and ORAROOTAGENT starts CRSD.

So the 1st phase of Clusterware startup is to essentially start up local resources.

Phase II

· At this point ASM and full OCR information is available and the node is "joined" to cluster.

What is the Oracle Database Quality of Service Management?

Oracle Database QoS Management is an automated, policy-based product that monitors the workload requests for an entire system. Oracle Database QoS Management manages the resources that are shared across applications and adjusts the system configuration to keep the applications running at the performance levels needed by your business. Oracle Database QoS Management responds gracefully to changes in system configuration and demand, thus avoiding additional oscillations in the performance levels of your applications. If you use Oracle Database Quality of Service Management (Oracle Database QoS Management), then you cannot have SINGLETON services in a server pool, unless the maximum size of that server pool is one.

Is a re-link required for the Clusterware home after an OS upgrade?

In 11.2, there are some executables in the GRID home that can and should be re-linked after an OS upgrade. The procedure to do this is:

#> cd GI_HOME/crs/install
#> perl rootcrs.pl -unlock

As the grid infrastructure for a cluster owner:

$> export ORACLE_HOME=Grid_home
$> $GI_HOME/bin/relink

As root again:

#> cd GI_HOME/crs/insta

How do I determine the “Master” node?

For the cluster synchronization service (CSS), the master can be found by searching $GI_HOME/log/cssd/ocssd.log. For master of an enqueue resource with Oracle RAC, you can select from v$ges_resource. There should be a master_node column.

What are the different types of failover mechanisms available?

· JDBC-THIN driver supports Fast Connection Failover (FCF)

· JDBC-OCI driver supports Transparent Application Failover (TAF)

· JDBC-THIN 11gR2 supports Single Client Access Name (SCAN)

What is recommendation on type of tablespaces?

You should use locally managed, auto-allocate tablespaces. With auto-allocate Oracle automatically grows the size of the extent depending on segment size, available free space in the tablespace and other factors. The extent size of a segment starts at 64 KB and grows to 1 MB when the segment grows past 1 MB, and 8 MB once the segment size exceeds 64 MB. So for a large table, the extent size will automatically grow to be large. The use of uniform extents is strongly discouraged for two reasons; space wastage and the impact that wasted space has on scan performance.

For large partitioned objects you should use multiple big file tablespaces to avoid file header block contention during parallel load operations. File header block contention appears as the ‘gc buffer busy’ enqueue wait event in an AWR report. Checking the buffer wait statistic will indicate if it is the file header block that is being contended for.

To evenly distribute a partitioned table among multiple big file tablespaces use the STORE IN clause.

What is the recommendation on column statistics?

Prior to loading any data it is advisable to run all queries against the empty tables to populate or seed the column usage statistics. Column usage statistics are used during optimizer statistics gathering to automatically determine which columns require histograms and the number of buckets that will be used. A column is a candidate for a histogram if it has been seen in a where clause predicate e.g. an equality, range, LIKE, etc. and if there is data skew in that column.

How do I size hash partitions?

Oracle uses a linear hashing algorithm to create sub-partitions. In order to ensure that the data gets evenly distributed among the hash partitions the number of hash partitions should be a power of 2 (i.e. 2 * # of CPU). However, each hash partition should be at least 16MB in size. Any smaller and they will not have efficient scan rates with parallel query. If the subpartitions are too small (from the 2 * # of CPU) considering using a smaller number of partitions (still an even number of partitions).

è What should be my block size?

8 KB is the default block size and is the block size used during all of Oracle's testing. Typically this is good enough for a data warehouse and transactional systems (good compromise or sweet spot). By doubling the default block size you can increase the chances of getting a good compression rate as Oracle applies data compression at the block level. The more rows in the block the greater the chance Oracle will find duplicate values within a block. (Reference: Oracle Sun Database Machine Application Best Practices for Data Warehousing, Doc ID 1094934.1)

What is the guideline on how to auto-extend data files?

When configuring a file to auto extend, the size of the extension should cover all disks in the ASM disk group to optimize balance. For example, with a 4 MB AU size and 128 disks, the size of the extension should be a multiple of 512MB (4*128).