Monday, February 21, 2011

Exadata Series - Performance Comparison

So how does Exadata compare?

A lot of people have asked this question. In truth, although your mileage will vary, Exadata is a definite screamer and should solve most performance challenges. But there is more to compare than just performance! I highly recommend weighing all the variables before making any decision, since Exadata is very expensive and ROI needs to be proven. Some criteria to include:
  • Solution Maturity
  • Pricing (CAPEX and OPEX)
  • Performance (OLTP, Warehousing, Analytics)
  • Scalability
  • Availability
  • Backup/Recovery Options
  • Migration Options
  • Management Options
  • Integration Requirements
  • Virtualization Options
  • Tiering Options
  • Provisioning Options
  • Training Requirement
  • Professional Services Available
  • Refresh/Replication Options
  • Cooling
  • Floor space
  • Networking
Now going back to the 'U' platform: it is very impressive and, when paired with the appropriate storage solution, is a strong value proposition. Storage features such as snap and clone can provide instant alternate environments or backup/recovery options. Such options are not available for Exadata. Some will tell you that Data Guard combined with Flashback does the same thing, but there are substantial differences (for example, an instant snap or clone vs. building and setting up a Data Guard environment). You must determine for yourself which solution will best address your business needs now and in the future.


What were the platforms?

Again, without getting into too much detail (for reasons previously mentioned), let's use the following labels:

  • Platform P5 w/CX type storage (baseline)
  • Platform P7 w/DS type storage
  • Platform 'U' blades w/CX type storage - Note that the storage was suboptimal as it was our sandbox storage environment with a known bottleneck in the number of connections; the hardware/software stack was also not optimized for Oracle
  • Platform HDS w/AMS type storage
  • Oracle Exadata v2 (1/4 rack w/High Performance disks)

Knowing we were looking at Exadata, the other vendors took the approach of matching Exadata in terms of price/performance, taking into consideration the cores and license costs. This was especially relevant to platform P7 since its core factor is 1.0, whereas Intel (used by Exadata and platform 'U') is 0.5. As expected, in the pure CPU processing area, platform P7 had the most efficient CPU. This of course resulted in the vendor (as they knew going in) being able to use fewer processors to match, or more accurately beat, the Intel processors, hence making the core factor a non-issue. For example, 6 x P7 cores bested 12 x Intel Nehalem cores. It will be argued of course that Exadata has 96 cores for the DB nodes + 168 cores for the Storage nodes (in a full rack), since processing will also be done by the Storage Servers. That is a valid argument except where the storage servers are not involved, which depends a lot on your workload. It must be noted that platform 'U' did quite well given its degraded setup, even besting Exadata in a few individual tests (a real testament to the platform).
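To make the core factor point concrete, here is a minimal sketch of the licensing arithmetic. The core factors (1.0 and 0.5) and the core counts (6 and 12) are the ones quoted above; the function name and the round-up step are my own illustration of how processor licenses are typically counted, so check your own contract before relying on it.

```python
import math

# Core factors as quoted in this post.
CORE_FACTOR = {"P7": 1.0, "Intel": 0.5}


def licenses_required(cores: int, core_factor: float) -> int:
    """Processor licenses = cores x core factor, with fractions rounded up."""
    return math.ceil(cores * core_factor)


# The example from the tests: 6 P7 cores vs. 12 Intel Nehalem cores.
p7_licenses = licenses_required(6, CORE_FACTOR["P7"])
intel_licenses = licenses_required(12, CORE_FACTOR["Intel"])

print(f"6 x P7 cores     -> {p7_licenses} licenses")
print(f"12 x Intel cores -> {intel_licenses} licenses")
```

Both configurations come out at the same license count, which is why a P7 setup that beats twice as many Intel cores neutralizes the core factor disadvantage.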

For testing we devised 5 test cases consisting of the same 12 unit tests (i.e. loads, stats collection, and queries):

T1: "As is", i.e. just run without any changes
T2: Hints dropped
T3: Hints & indexes dropped
T4: Same as T3 but using compression (HCC for Exadata & AC for other platforms)
T5: Same as T3 but with artificially induced load on the CPUs (at 100%)


Testing was done off-site by the respective vendors, except platform 'U', which was done on-site by myself. Oracle apparently has a policy against making performance data available, so I'd recommend discussing this upfront if you want access to the AWR and other such information for review. We were unaware of this policy going into the tests and were told the AWR was not captured. As we persisted, the explanation changed to it being "company confidential", and more recently to such information not generally being made available.

I also recommend ensuring the appropriate Oracle resources are made available. We were less than impressed with the Oracle team running the POC: given the collective resources of Oracle at their disposal, it took them until the next day to realize the Exadata machine was improperly configured and had a failed Flash card, and to work out how to use Data Pump (we had to help them here). Just getting our data (less than 4TB) inside the machine was taking over 4 hours (the operation was killed) until each of these issues was addressed. The load time was still unimpressive, though our contention was that the machine was still less than ideally configured:

Exadata: less than 3 hours
Platform 'U': 1 hour 45 minutes
Platform P7: 16 minutes
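For a rough sense of scale, the load times above can be expressed relative to the fastest platform. This is only a sketch: the Exadata figure was reported as "less than 3 hours", so 180 minutes is treated here as an upper bound and its ratio is an "at most" number. The variable names are mine.

```python
# Load times in minutes, taken from the figures above.
# Exadata was "less than 3 hours", so 180 is an upper bound (assumption).
load_minutes = {
    "Exadata": 180,
    "Platform 'U'": 105,   # 1 hour 45 minutes
    "Platform P7": 16,
}

fastest = min(load_minutes.values())

# How many times longer each load took than the fastest one.
relative = {name: minutes / fastest for name, minutes in load_minutes.items()}

for name, ratio in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.1f}x the fastest load time")
```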

To share some other performance numbers (ordered by overall best time improvement) see below. Note that I've combined the platform 'U' and 'HDS' results since HDS was the storage piece and U was the compute piece.

T1: P7 (9x), Exadata (7x), U/HDS (4x)
T2: Exadata (13x), P7 (10x), U/HDS (9x)
T3: P7 (21x), Exadata (11x), U/HDS (8x)
T4: P7 (16x), Exadata (12x), U/HDS (8x)
T5: U/HDS (17x), P7 (9x), Exadata (6x)
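If you want to slice these numbers yourself, a small sketch like the following works. The improvement factors are copied from the list above; the helper names and the choice of a geometric mean as a single summary figure are my own assumptions, not something we used during the evaluation.

```python
from math import prod

# Improvement factors over the baseline, per test case, as listed above.
results = {
    "T1": {"P7": 9, "Exadata": 7, "U/HDS": 4},
    "T2": {"P7": 10, "Exadata": 13, "U/HDS": 9},
    "T3": {"P7": 21, "Exadata": 11, "U/HDS": 8},
    "T4": {"P7": 16, "Exadata": 12, "U/HDS": 8},
    "T5": {"P7": 9, "Exadata": 6, "U/HDS": 17},
}


def winner(test: str) -> str:
    """Platform with the largest improvement factor for one test case."""
    return max(results[test], key=results[test].get)


def geometric_mean(platform: str) -> float:
    """Geometric mean of a platform's factors across all test cases."""
    factors = [per_test[platform] for per_test in results.values()]
    return prod(factors) ** (1 / len(factors))


for test in results:
    print(f"{test}: best was {winner(test)}")

for platform in ("P7", "Exadata", "U/HDS"):
    print(f"{platform}: geometric mean {geometric_mean(platform):.1f}x")
```

A geometric mean is used rather than a plain average because the values are ratios; a single outlier such as the U/HDS result in T5 then skews the summary less.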

Of note is that we had a particularly nasty load test with which all the platforms had trouble, so much so in fact that in the case of T1 none of the systems managed to complete the test in a time which bested the baseline (unit tests were stopped once past the baseline based on our discretion).

Curiously, the Exadata DB nodes were 100% CPU utilized by two concurrent IAS streams in T5, while for the other platforms a more artificial stress was required using a CPU heavy PL/SQL function (a session per CPU thread). We found this quite strange given the similar processing power between the Exadata DB nodes and the other Intel platforms, though as we were not given access to any data we were unable to get any answers.
