All posts by J

Oracle versus RedHat and VMware

As I wrote earlier, there are many things to consider when deciding what distribution is best for your environment when running Oracle under VMware. Over the past couple of years there’s been a quiet battle with Oracle on one side and RedHat with VMware on the other. As a customer, your dollars fund the game. Your choices will help determine who wins this battle.

First up, Oracle

In Oracle Database 11g, Oracle introduced a new feature called Database Smart Flash Cache. The feature leverages any flash storage (aka Solid State Drives (SSDs) or Enterprise Flash Drives (EFDs)) on the system to act as a second-level cache behind RAM, increasing the database buffer cache without adding RAM to the system. There was a recent post on this topic on Oracle’s Linux blog which links to an interesting white paper if you want more detail on the inner workings of this feature.

Sounds great, right? Here’s the thing: It’s only available on Oracle Solaris or Oracle Linux. Now Oracle Solaris on SPARC processors I could see – it’s a Big-Endian Unix whereas x86/x86-64 Linux is a Little-Endian OS, and Solaris on SPARC is a much more mature OS than Linux.

Oracle Linux is binary compatible with RedHat Enterprise Linux – so there is no technical reason why Database Smart Flash Cache shouldn’t work. I’m not the first to point this out. Dave Welch of House of Brick Technologies wrote about this in October 2009. I suggest you read his eloquent blog post on the subject. Note that this was all before the Oracle UEK Kernel was released and nothing in Oracle’s documentation states that UEK is required for Database Smart Flash Cache.

In September of 2010 at Oracle OpenWorld 2010, Oracle announced Oracle Unbreakable Enterprise Kernel (UEK). As you can read in this datasheet on the UEK, the big selling points are that it’s a modern 2.6.32 linux kernel that shows “tremendous performance improvements” compared to a standard Enterprise Linux 5 kernel (which is a 2.6.18 kernel). Amazing, right? Not really. RHEL 5 was released in March of 2007 and hits the end of Production 1 life cycle in Q4 of 2011. RHEL 6, released in November of 2010, is also a modern 2.6.32 linux kernel. Key benefits of UEK according to Oracle’s FAQ are that it’s fast, modern, reliable and optimized for Oracle. RHEL 6 is fast, modern and reliable too. Yes, the UEK does have some optimizations and bug fixes for Oracle, but they may or may not be relevant in your environment.

So why does Oracle beat up on RedHat? It’s about money. Oracle wants that OS support revenue.

How is Oracle beating up on VMware? It’s also about money. Virtualizing Oracle allows you to run more Oracle servers on the same hardware. That’s lost revenue to Oracle, unless you’re using their hardware (Exadata) or software (OracleVM). Oracle tries to discourage VMware use with Oracle’s infamous supported but not certified argument against VMware. There is also Oracle’s policy of having to license all the cores on a host regardless of how many cores your VM can see (unless you’re using Oracle’s virtualization solution of course). Here’s another one I came across recently: In the release notes for Oracle Linux 6 is this little nugget:

Unbreakable Enterprise Kernel doesn’t contain vmw_pvscsi driver (11697522)
As a workaround, when creating a new VM in VSphere, do not pick Red Hat Enterprise Linux 6 x86-64 as the OS type but use Oracle Linux 5 x86-64 (or Red Hat Enterprise Linux 5). ESX will then expose the LSI Logic SCSI controller in the VM and the 2.6.32-100.28.* kernel will see the devices properly.

I did a default install of RHEL 6 (which uses the RedHat Kernel instead of Oracle Kernel) and lo and behold, RHEL 6 automatically uses PVSCSI to get you better disk performance under VMware. Oracle’s Unbreakable Kernel is based off of RedHat’s Kernel. That means Oracle went out of its way to cripple performance by removing PVSCSI support when using their kernel under VMware. Oracle’s own virtualization product OracleVM doesn’t have any sort of PVSCSI enhancements, so what better way to erase the performance benefits you’d see virtualizing under VMware? As a customer, I find this behavior deplorable.
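If you want to check which side of this you are on, one rough way is to look for the vmw_pvscsi driver named in the release note among the loaded kernel modules. The sketch below is only an illustration: the helper names and the sample text are mine, and it assumes the driver is built as a loadable module (a driver compiled directly into a kernel would not show up in /proc/modules).

```python
from pathlib import Path

def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first column is the module name
    return names

def has_pvscsi(proc_modules_text):
    """True if the VMware paravirtual SCSI driver appears as a loaded module."""
    return "vmw_pvscsi" in loaded_modules(proc_modules_text)

# Hypothetical sample used when /proc/modules is not available (e.g. not on Linux).
SAMPLE = """\
vmw_pvscsi 15508 2 - Live 0xffffffffa0090000
vmxnet3 36394 0 - Live 0xffffffffa0070000
ext4 353979 2 - Live 0xffffffffa0010000
"""

if __name__ == "__main__":
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else SAMPLE
    print("vmw_pvscsi loaded:", has_pvscsi(text))
```

Run it inside the guest; a result of False on a VM whose disks sit on a paravirtual controller would be worth a closer look.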

Next Up, RedHat

With the release of RedHat Enterprise Linux 6, RedHat stopped providing the source of a vanilla kernel and all their patches in different packages and now provides this all in one large tarball. It’s totally legal by the requirements of the GPL, but it makes it much more difficult for downstream vendors (such as Oracle) to figure out what optimizations RedHat has made to the vanilla kernel and to then incorporate those changes into their own customized kernel. Now why would RedHat do this? Let’s see what Brian Stevens, CTO, VP Worldwide Engineering for RedHat had to say in this press release

“When we released RHEL 6 approximately four months ago, we changed the release of the kernel package to have all our patches pre-applied. Why did we make this change? To speak bluntly, the competitive landscape has changed. Our competitors in the Enterprise Linux market have changed their commercial approach from building and competing on their own customized Linux distributions, to one where they directly approach our customers offering to support RHEL.

Frankly, our response is to compete. Essential knowledge that our customers have relied on to support their RHEL environments will increasingly only be available under subscription. The itemization of kernel patches that correlate with articles in our knowledge base is no longer available to our competitors, but rather only to our customers who have recognized the value of RHEL and have thus indirectly funded Red Hat’s contributions to open source that will advance their business now and in the future.”

Who is the number one seller of RHEL support behind RedHat? Oracle.

Finally, VMware

VMware has done a few things I can think of to fight back against Oracle:
1) Quite simply, Oracle’s UEK (Unbreakable Enterprise Kernel) is NOT supported under VMware. Yes, I’m sure it requires resources to test yet another kernel, but I don’t believe that’s the case here. Could it be because of the “optimizations” Oracle has made such as the deadline scheduler (not typically good when used under a hypervisor) or stripping PVSCSI support out of the UEK? Maybe.

2) Up until the last few weeks, Hot Memory Add wasn’t listed as supported with Oracle Enterprise Linux 5.X under VMware according to VMware’s Guest OS Compatibility List, even though RedHat Enterprise Linux (on which OEL is based) was supported. Maybe it was an innocuous typo … or maybe not.

3) In November 2010, Oracle changed their support policy to explicitly INCLUDE Oracle RAC 11.2.0.2 when virtualized. Before that time, Oracle RAC was explicitly NOT supported by Oracle under VMware. Oracle’s yearly conference, Oracle OpenWorld, was held in September – before the change in the Oracle RAC on VMware support policy – and VMware had a booth there. I attended a number of sessions at that booth and at least two of them (one by Todd Muirhead of VMware, one by David Welch of House of Brick Technologies) talked about running Oracle RAC virtualized under VMware. Both presenters made it very clear that this wasn’t supported by Oracle, but one has to wonder why VMware would take the time to talk about a solution that wasn’t supported by Oracle? Maybe they knew Oracle was going to change their support policy and were just letting customers know it could be done “if only Oracle would support it”… or maybe VMware was preparing a more offensive role regarding Oracle RAC on VMware. I don’t know, but it’s interesting to think about.

So what’s a customer to do? To me the answer is simple: If you’re going to run Oracle virtualized, I’d highly recommend running it on RHEL on VMware and buying your RHEL support from RedHat. If you buy support for RHEL from Oracle, even assuming the support quality level is the same as RedHat’s, you’re helping Oracle to stifle true innovation that benefits everyone. If you decide to run OL, not only are you helping Oracle to stifle innovation that benefits everyone, you’re also telling Oracle you’re fine with them limiting features to customers (such as Smart Flash Cache) entirely for anti-competitive marketing reasons.

Don’t reward bad behavior.

What Linux distribution should you use for Oracle virtualized on VMware

Recently Tim Hall of Oracle-Base fame wrote an article “Which Linux do you pick for Oracle Installations?” which addresses Oracle on non-virtualized Linux. Tim’s article is excellent but doesn’t take VMware virtualization into account, so without further ado, which Linux distribution should you use for Oracle virtualized on VMware?

When virtualizing Oracle with VMware, most Oracle DBAs are going to run it on some flavor of Linux. Oracle generally supports three distributions of Linux for their enterprise products: Novell’s SuSE Linux Enterprise Server (SLES), RedHat Enterprise Linux (RHEL), and Oracle Linux (OL). Each Operating System has costs, features and support implications that make it unique. You need to determine which is best for your environment. In the United States, RHEL is most popular, whereas in Europe SLES is most prevalent. Almost all of my experience is with RHEL or on downstream distributions (such as CentOS or OL) of it, but my biases shouldn’t have an impact on this evaluation. The file system for VMware’s vSphere ESX and vMA (vSphere Management Assistant) and many of the VMware appliances from EMC and PHD Virtual are RedHat/CentOS based. This shouldn’t be a deciding factor when choosing the OS for your database system, but it does come in useful in the event of the occasional esoteric troubleshooting situation.

With some minor exceptions, RHEL and OL are the same to operate — the files are almost entirely in the same location, the commands are the same, etc.

For the purpose of this evaluation, I am limiting my comparison to the latest two versions of the 64-bit x86 platform for each distribution and how they differ when run on VMware’s latest released version of vSphere (4.1U1). Partially this is being done to save me time and effort, but also these are the platforms you would decide between if you were looking to maximize database system performance.

Note: At the time of this writing, RHEL 6 and OL 6 were NOT certified for most Oracle products. This is due to the fact these versions are relatively recently released and Oracle is still certifying their products on the new versions. Also note that the VMware / SLES promotion is limited to SLES 11.

Cost:
Your main consideration here is whether you just want access to patches and updates or if you want actual support with your issues. In my nine years of running Oracle on Linux, I’ve had to open a total of two tickets on Linux support – once with RedHat, once with Oracle.

o SLES – If you’re running vSphere 4.0U2 or later and are active on qualifying VMware vSphere Software and Services SKUs, you can run an unlimited number of virtual machines and get free subscriptions to patches and updates of SLES 11 SP1. Phone and online support has varying levels and costs. You can read more about VMware’s SuSE agreement.
o OL – Oracle Linux is free to download in compiled form. If you want a subscription to patches and updates only, the cost is $119 per year per server for an unlimited number of physical CPUs. Phone and online support has varying levels and costs. You can read more about Oracle Linux in the Oracle Linux FAQ. You can also check out the Oracle Linux support pricing guide.
o RHEL – RedHat Linux can only be downloaded in compiled form with a subscription. The cheapest subscription is a Self-Support subscription which comes with a subscription to patches and updates and no other support for $349 per year per server, where each server is limited to a 2 socket configuration with 1 virtual guest. Phone and online support has varying levels and costs. You can check out the various support options and their cost on the Redhat website.
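To put the patch-and-update prices above side by side, here is a small back-of-the-envelope sketch. It assumes one subscription per virtual machine, which is my simplification and may not match how your contract counts servers; the figures are simply the list prices quoted above, and the function name is made up for illustration.

```python
# Patch/update-only list prices quoted above, in USD per year.
# SLES 11 SP1 patches are free with a qualifying VMware vSphere subscription.
PATCH_SUBSCRIPTION_USD = {
    "SLES 11 SP1": 0,
    "OL": 119,
    "RHEL": 349,
}

def annual_patch_cost(distro, vm_count):
    """Yearly patch-subscription cost, ASSUMING one subscription per VM."""
    return PATCH_SUBSCRIPTION_USD[distro] * vm_count

if __name__ == "__main__":
    for distro in PATCH_SUBSCRIPTION_USD:
        print(distro, annual_patch_cost(distro, vm_count=10))
```

Phone and online support are priced separately by each vendor, so treat this as the floor rather than the total.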

Features offered by VMware:
Do you want to use VMware features such as Paravirtualized SCSI (PVSCSI), Hot Add Memory or Hot Plug vCPUs? Do you have a specific requirement for Enhanced VMXNET Networking?

Not all the distributions and versions support all these features. For example, if your database workloads are very I/O intensive, SLES is probably not a good choice.

o SLES 10 – Networking: e1000, Enhanced VMXNET and VMXNET3 are supported. A standard install will default to e1000.
– Storage: PVSCSI is NOT supported
– Hot Add: Hot Add Memory supported, Hot Plug vCPUs NOT supported
o SLES 11 – Networking: e1000, Enhanced VMXNET and VMXNET3 are supported. A standard install will default to e1000.
– Storage: PVSCSI is NOT supported
– Hot Add: Hot Add Memory supported, Hot Plug vCPUs supported
o RHEL 5.6 – Networking: e1000, Enhanced VMXNET and VMXNET3 are supported. A standard install will default to e1000.
– Storage: PVSCSI is supported, except on hard disk 1 of the virtual machine.
– Hot Add: Hot Add Memory supported, Hot Plug vCPUs NOT supported
o RHEL 6.0 – Networking: e1000 and VMXNET3 are supported. Enhanced VMXNET is NOT supported. A standard install will default to VMXNET3.
– Storage: PVSCSI is supported. A standard install will default to PVSCSI.
– Hot Add: Hot Add Memory supported, Hot Plug vCPUs supported
o OL 5.6 – Networking: e1000, Enhanced VMXNET and VMXNET3 are supported. A standard install will default to e1000.
– Storage: PVSCSI is supported, except on hard disk 1 of the virtual machine.
– Hot Add: Hot Add Memory supported, Hot Plug vCPUs NOT supported
o OL 6.0 – Networking: e1000 and VMXNET3 are supported. Enhanced VMXNET is NOT supported. A standard install will default to VMXNET3.
– Storage: PVSCSI is supported. A standard install will default to PVSCSI.
– Hot Add: Hot Add Memory supported, Hot Plug vCPUs supported
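
If you have hard requirements, it can help to treat the list above as data and filter it. The following is a minimal sketch that transcribes the list into a dictionary; the key names and the candidates() helper are my own invention, and the "not on hard disk 1" caveat for the 5.6 releases is not modeled.

```python
# Feature list above, transcribed by hand. Key names are illustrative only.
FEATURES = {
    "SLES 10":  {"pvscsi": False, "hot_add_memory": True, "hot_plug_vcpu": False, "default_nic": "e1000"},
    "SLES 11":  {"pvscsi": False, "hot_add_memory": True, "hot_plug_vcpu": True,  "default_nic": "e1000"},
    "RHEL 5.6": {"pvscsi": True,  "hot_add_memory": True, "hot_plug_vcpu": False, "default_nic": "e1000"},
    "RHEL 6.0": {"pvscsi": True,  "hot_add_memory": True, "hot_plug_vcpu": True,  "default_nic": "VMXNET3"},
    "OL 5.6":   {"pvscsi": True,  "hot_add_memory": True, "hot_plug_vcpu": False, "default_nic": "e1000"},
    "OL 6.0":   {"pvscsi": True,  "hot_add_memory": True, "hot_plug_vcpu": True,  "default_nic": "VMXNET3"},
}

def candidates(**required):
    """Return the releases whose features match every required key/value."""
    return [
        name
        for name, feats in FEATURES.items()
        if all(feats[key] == value for key, value in required.items())
    ]

if __name__ == "__main__":
    # Releases offering both PVSCSI and Hot Plug vCPUs
    print(candidates(pvscsi=True, hot_plug_vcpu=True))
```

For an I/O-intensive database that also needs Hot Plug vCPUs, the filter leaves only the two 6.0 releases, which at the time of this writing are not yet certified for most Oracle products.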

Note: Previously OL 5.6 was listed in VMware’s certified list as NOT supporting Hot Add memory, but this has been changed recently.

Note: On OL, the new Unbreakable Enterprise Kernel (UEK) is NOT supported under VMware. You will have issues installing the VMware Tools if you are running this kernel.

Many companies standardize on one or two operating systems for their organization to minimize support costs. When bringing virtualization into the mix, your organization should re-evaluate your operating system choices to get the performance and features you need.

Happy Birthday and thank you to John Troyer of VMware

Among the VMware communities, I’m a bit of an outsider – I come from a career background of Oracle and am one of the few Oracle DBAs. Sure, there are a number of people who might manage Oracle databases as part of their daily responsibilities, but first and foremost Oracle is always my primary job role. I’ve been involved in various Oracle-related communities since the mid-1990s and various VMware-related communities since the mid-2000s.

When I started participating in the VMware communities, I was *amazed* at the differences in the social buzz between VMware and Oracle – VMware has more discussion, people appear more open to sharing what they know, it’s about more than just the technology itself … it’s an actual community – not just a discussion list or a forum. It’s hard to describe just how special and unique it is to find a community centered around a large corporate software technology product.

So how can a smaller vendor like VMware develop and foster community so much better than a company with the resources of Oracle? The answer is many little intangible things, but one thing that stands out is a cult of personality. People who, at least for parts of the company, become the public face of a community.

For Oracle, when I think of people that are public faces of Oracle corporation, I think of Larry Ellison, Tom Kyte, and Steven Chan. Steven, especially with his outgoing and friendly attitude, does a great job of fostering a sense of community, but he’s really just trying to foster communication with his external customer base and, in the range of Oracle’s products, that is a very small group.

For VMware, when I think of the public faces, two come to mind: VMware’s CTO Steve Herrod and VMware’s social media strategist, John Troyer. John and his team manage to make VMware cool – from the Chewbacca cameos in VMware webex sessions with the CTO, to the lab team shirts at VMworld, to the trinkets and contests that various parts of VMware are always running. I’m sitting here writing this on a laptop with an I “heart” VMware sticker that glows from the apple logo on my MacBook Pro from Apple (another company that gets social media). I think I’ve got a wine bottle stopper from Oracle somewhere in a drawer.

Fostering these sorts of cults of personality is an art, and John Troyer is a master of that art. Many thanks to John Troyer for all he does for VMware and its communities to make them true communities. Community is an extremely hard thing to foster, and John’s skill and subtlety make it look effortless.

Today is John’s birthday, so I, along with many others active in the VMware community, wanted to give John some recognition for all he does. If John has made your career more fun and community focused, give @jtroyer a shout out on twitter with hashtag #vTHNX.

p.s. John, given my love of goofy costumes, my original plan was to rent a Wookie costume and have said Wookie wish you happy birthday from the Wookie home planet of Kashyyyk. Unfortunately circumstances conspired against me to prevent this from happening, but rest assured the force is strong with you 🙂

On Oracle’s commitment to Linux

Oracle prides itself on its strong support and commitment to Linux. On a webpage at Oracle’s site entitled “Oracle’s Technical Contributions to Linux”, Oracle waxes eloquent on its support, commitment and leadership for Linux. The paragraphs describe Oracle’s “long history of strong support and commitment to Linux, as evidenced by numerous, on-going technical contributions to the Linux community.”

The page then states that “Oracle continues to strengthen its involvement in the Linux community by providing enhancements that facilitate the development and deployment of enterprise Linux solutions. By developing enhanced capabilities and contributing code, Oracle’s Linux engineering teams continue to make the Linux experience better for all.”

Finally, the page lists a variety of projects and contributions where Oracle is involved. Very impressive stuff and I applaud Oracle for their contributions.

Ksplice is an extension of the Linux kernel which allows you to apply security patches to a running kernel without having to reboot the operating system. What’s the point in having a highly redundant clustered database that never needs to go down if it’s running on an operating system with a security hole that requires a reboot to patch? I highly recommend you read the wikipedia article on Ksplice to read up on how it works – it’s very cool.

From its founding in 2008 thru July 20th, 2011, Ksplice, Inc. the company that developed the Ksplice technology, provided prebuilt and tested updates for RedHat, CentOS, SuSE Enterprise Linux and other linux distributions – though not Oracle Linux.

On July 21st, 2011, support for RedHat Enterprise Linux and SuSE Enterprise Linux was dropped. It was announced that going forward this would be a feature only on Oracle Linux to customers who pay for premier support and only then when running Oracle Unbreakable Enterprise Kernel (UEK).

What happened on July 21st to cause Ksplice to drop the most popular enterprise linux distributions?

On July 21st, 2011, Oracle completed its acquisition of Ksplice, Inc.

[update] I also saw this tweet by ORCL_Linux, Oracle’s official twitter account for its Linux group:
“#Oracle Buys Ksplice…makes Oracle #Linux the ONLY OS with zero downtime patching bit.ly/qCZTBq ”

Not all Oracle databases require a license

Recently there was some discussion on Twitter about whether infrastructure databases such as RMAN and OEM databases required licenses. I always figured they had the same licensing requirements as any other Oracle database.

However, I was incorrect.

If you read Oracle® Database Licensing Information 11gR2 it has this section:

Infrastructure Repository Databases

A separate Oracle Database can be installed and used as a Recovery Manager (RMAN) repository without additional license requirements, provided that all the Oracle databases managed in this repository are correctly licensed. This repository database may also be used for the Oracle Enterprise Grid Control repository. It may not be used or deployed for other uses.

A separate Oracle Database can be installed and used as a Oracle Enterprise Manager Grid Control (OEM Grid Control) repository without additional license requirements, provided that all the targets (databases, applications, and so forth) managed in this repository are correctly licensed. This database may also be used for the RMAN repository. It may not be used or deployed for other uses.

Good to know. For my clients who use VMware clusters to limit what hosts Oracle VMs can run on for licensing purposes, you do not need to restrict your Oracle RMAN and OEM Grid Control DBs / VMs. This allows me to free up those CPU / RAM / Network resources on my Oracle-only hosts for other license-restricted databases.

Choosing the right input output scheduler for Oracle on Linux

As I’ve written before, having an understanding of the stack of software and hardware you are running workloads on is critical to getting the best performance out of your environment.

In all Linuxes (SuSE, RHEL, OL) for which Oracle supports running Oracle database, the kernel is responsible for scheduling the I/O in the system. There are multiple schedulers built into the kernel to allow you to choose the best scheduler for your disk I/O profile. For example, a database server is going to have a much different disk I/O profile than a webserver.

As you can read in this 2005 article from RedHat on RHEL 4 or this PDF, last revised in 2008, there are significant benefits to be gained for your OLTP database server by changing your I/O scheduler from the default cfq (Completely Fair Queuing) to “deadline”. The deadline scheduler attempts to bound the latency of any single I/O by giving each request an expiration time, and it favors reads, letting buffered writes wait longer before they go to disk.

It’s interesting to note that, according to the Oracle Linux 6 release notes, Oracle’s Unbreakable Enterprise Kernel (UEK) uses the deadline scheduler. Compare this with RedHat’s kernel, which uses cfq by default. I wonder how much of Oracle’s performance improvements supposedly from UEK are actually from using a more appropriate scheduler for typical database workloads? I also wonder if deadline would be the right choice for Exadata, which uses UEK, since it has SAN type storage built in.

So the answer here is always use deadline scheduler, right?

Not so fast. What if I’ve virtualized my database and I’m running it under VMware or on a Storage Area Network (SAN)? VMware has designed vSphere as an Operating System (OS) optimized to run other OSes. You can find a fascinating thread on quora about vSphere and how it compares to other OSes.

Is deadline still most likely the best choice?

Not necessarily. Even as of May 2010, this paper from VMware on Oracle Databases on vSphere 4 makes NO mention of what scheduler to use.

So, as always, you should be diligent and work with your system administrators to test out what works best in your environment. Having said that, based on my experience and that of others, I typically set the scheduler to noop (No Operations) on all my linux VMs, regardless of if they are running Oracle or not.
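
Before changing anything, it is worth seeing what each disk is using today. On Linux the kernel exposes this per block device in /sys/block/<device>/queue/scheduler, with the active scheduler shown in square brackets. The sketch below only reads that file; the function names are mine, and it is a starting point for your own testing rather than a recommendation.

```python
from pathlib import Path

def active_scheduler(scheduler_text):
    """Pick the bracketed entry, e.g. 'noop anticipatory deadline [cfq]' -> 'cfq'."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # nothing bracketed (empty or unexpected content)

def report(sys_block=Path("/sys/block")):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    if sys_block.is_dir():
        for sched_file in sorted(sys_block.glob("*/queue/scheduler")):
            device = sched_file.parent.parent.name  # /sys/block/<device>/queue/scheduler
            result[device] = active_scheduler(sched_file.read_text())
    return result

if __name__ == "__main__":
    for device, scheduler in report().items():
        print(device, scheduler)
```

Switching is done by writing the scheduler name into that same file as root, or by setting the elevator= kernel boot parameter so the choice survives a reboot. Either way, test under a realistic workload before and after.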

More support available when virtualizing Oracle under VMware

What if your management wants more assurances about support for Oracle under VMware?

I’ve talked with many consultants and a few companies over the last year who have been concerned about getting support for their Oracle environment once it’s virtualized under VMware. I’ve written about this multiple times (Oracle listened, customers win! RAC supported on VMware, Oracle support on VMware, and Number One question at VMware booth at Oracle Openworld)

Oracle database, including the latest version (11.2.0.2) of Real Application Clusters (RAC) IS supported under VMware. It’s not certified by Oracle, but neither is almost any other hardware not made by Oracle (i.e. Your Dell servers and Cisco switches aren’t certified by Oracle). What this means is that (according to My Oracle Support (MOS) note 249212.1 ), in the unlikely event Oracle Support determines your known problem’s solution doesn’t work when virtualized, or if the problem is determined not to be a known Oracle issue, Oracle Support may refer you to VMware Support and will continue to work the issue when the customer can demonstrate the issue occurs on native (non-virtualized) hardware.

This has caused some organizations to pause before virtualizing their Oracle environments under VMware. No organization wants to pay thousands of dollars in support only to find it isn’t there when they need it the most. To help reduce this anxiety over virtualizing Oracle products under VMware, VMware Global Support Services (GSS) provides support for VMware customers running Oracle 10g or 11g on VMware vSphere. You can read more about VMware’s Oracle Support policy on VMware’s dedicated Oracle Support page.

In the event you are running into an issue with Oracle 10g or 11g under VMware vSphere 4, you should not only open a ticket with Oracle Support, but also a separate ticket with VMware Global Support Services (GSS). VMware will then use their expertise and resources to troubleshoot your issue to determine if the virtualization layer is the cause of the issue. If VMware deems the issue is not related to virtualization, VMware will escalate the ticket back through TSANet to Oracle Support.

TSANet (thankfully not associated with that TSA) is a vendor-neutral infrastructure that allows members such as Oracle, RedHat, Microsoft, NetApp, EMC and VMware to collaborate behind the scenes when a possible multivendor problem exists to resolve the customer issue. Typically customers aren’t even aware TSANet is being used between the vendors for communication.

In addition to support from Oracle and VMware, your storage vendor also has expertise you can leverage when experiencing issues.

If you’re running NetApp storage, check out their best practices for Oracle on NetApp. I’ve also been in contact with numerous people at NetApp regarding support resources and every NetApp person I contacted was extremely quick and resourceful in helping me find information. In a matter of hours, I had responses from a Virtualization Solution Architect, the Director of Global Support Services and Solutions, and the Senior Vice President of Support. Wow. Anyhow, NetApp has dedicated Virtualization and Oracle teams and also has a Joint Escalation Team (JET) with Oracle, VMware, Cisco etc. Even if you’re running a NetApp v-series controller in front of an EMC array, NetApp will support you and help you out. One final note, Oracle Corporate runs their Global Single Instance (their EBS instance) on NetApp according to the last published documentation I can find.

If you’re running EMC storage, they also have a Virtual Escalation Team process for Oracle on VMware vSphere on EMC. You can read more about EMC’s support of Oracle under VMware vSphere at Chad Sakac’s blog post on Oracle, x86, VMware and update on support.

Odds are, whatever issue you’re running into or concerned about with virtualizing Oracle has been seen by someone else at VMware and your storage provider. With all the major vendors talking to each other under TSANet, you won’t be left to fend for yourself.

Don’t be scared to run your Oracle products under VMware vSphere. It’s supported by Oracle. It’s supported by VMware. Your storage vendor probably even has a specific team dedicated to Oracle on VMware.

Which would your CEO prefer

Would your CEO choose to give up a negligible amount of system performance at peak system load in exchange for a reduction of risk during system upgrades and an alternative to hours of downtime?

I suspect the answer is yes.

What’s the cost of such a technology? Believe it or not, it’s free. It’s the snapshot functionality that’s built into VMware’s vSphere Hypervisor (based on ESXi) that you can leverage once you virtualize your Oracle database or any other x86 systems. Is the performance impact truly negligible? Don’t take it from me; you can read it in Oracle’s own words.

Recently RedHat and Oracle have come out with updates to their mainstream distributions (RedHat Enterprise Linux 5.6 and Oracle Linux 5.6) and each has come out with an entirely new version (RedHat Enterprise Linux 6 and Oracle Linux 6). System and database administrators all around the world are updating their critical systems. Applying those updates to critical systems requires testing and downtime.

The best practice for operating system upgrades is to test the upgrade on the same exact hardware, software and data as in production. But are your test and production systems identical? Same motherboard? Same processors? Same amount and brand of RAM and same exact operating system packages installed? Pretty unlikely. With virtualization, all those configurations are identical.

With VMware virtualization, I can take a clone of the entire production server while it’s live and being used by users. Now I have a truly identical copy of production to test my upgrade. Note that cloning the entire production server while it’s live requires VMware vCenter, which isn’t free, but vSphere Essentials (which includes vCenter) starts at $1000 as of the time of this writing.

With snapshots (which are free and don’t require vCenter), you take a snapshot (3 mouse clicks and less than 10 seconds) of the virtual machine and then do the upgrade. If you run into issues, just revert / rollback the snapshot to the state it was in when you took the snapshot. The time to do that revert / rollback is only the time necessary to reboot your virtual machine – 5 minutes? If the upgrade and testing goes smoothly, you just merge the snapshot into the virtual machine while the system is up and available to users. Total time spent doing non-upgrade activities such as backups or restores? Essentially zero.

Compare this to how you would handle a critical system without virtualization: you’d take the system offline for the upgrade, take a full backup of the system, do the upgrade (hoping it works just like it did in the similar but most likely not identical test system), have the users test, and, if there’s an issue, possibly spend hours restoring from backup. You do trust your backups… right?

Enterprise DBAs and the companies that employ them tend to be risk averse when it comes to losing data or experiencing downtime. Virtualizing the hardware allows you to ensure your test and development systems are the exact same systems as production, thereby reducing the risk of unforeseen issues during the upgrade. Utilizing snapshots allows you to very quickly take a save point of your entire server and roll back to it very quickly in the event of an unforeseen issue, almost entirely eliminating (minutes vs. hours) the downtime associated with recovering from it.

Licensing Oracle on VMware vSphere

Honestly, I thought this issue was done and buried, but over the past few weeks I’ve seen this question come up multiple times, so let’s get this cleared up.

Let’s go right to the source – Oracle’s own documentation. If you read Oracle’s partitioning document you will see that this is Oracle’s stance as of January 24, 2011. It discusses soft partitioning and hard partitioning. Soft partitioning is leveraging the Operating System features to limit the number of CPUs an Oracle instance (or Oracle virtual machine) can run on. Hard partitioning physically partitions a large server into smaller self-contained servers. The document lists what Oracle considers valid examples of each type of partitioning. In the document, Oracle specifically defines VMware’s partitioning (and Oracle VM’s partitioning) as soft partitioning. Oracle also states that soft partitioning isn’t a “valid” means of restricting the number of software licenses you need and that you must license all the processors on a given server. Note that later in the document Oracle states that Oracle VM CAN be used for hard partitioning if you set it up as described in this document, which goes into detail on how to bind an OracleVM VM to physical processors / cores. There is no mention in the documents of whether binding a VMware VM to a physical processor/core would also count as hard partitioning. Oracle does state that their list of partitioning technologies isn’t comprehensive, so things are left open to interpretation.

Please note I highly recommend you go and read these documents yourself and draw your own conclusions, and of course you can and should talk with an Oracle-employed licensing expert. In these documents Oracle states I cannot reproduce the document in any manner without express written consent so I am only telling you my interpretation.

VMware has three different techniques for restricting a VM to a specific subset of processors / cores. They are VMware vCenter clusters, VMware DRS affinity rules, and vSphere CPU affinity (pinning). I advise my clients to use the VMware vCenter cluster technique, but your organization might have a different interpretation. To describe the different VMware techniques, I will use an example of a 10 host VMware vCenter datacenter, with each host having 2 physical sockets and 4 cores per socket. Therefore, this entire VMware vCenter datacenter has 80 physical x86 cores (4*2*10) of processing power.

VMware vCenter clusters are logical clusters inside of vCenter made up of one or more hosts. By assigning a VM to that cluster, you are forcing that VM to run ONLY on the host(s) that make up that cluster. For example, if you create a 2 host VMware vCenter cluster inside your 10 host VMware datacenter, your VM can run on any processors / cores inside that 2 host cluster. As a result, Oracle licensing requires you to license all 16 (4*2*2) cores in that cluster. Note that you are restricting other non-Oracle workloads from also running on these hosts, so your Oracle VMs will get the best possible performance available on those hosts, possibly to the detriment of your non-Oracle workloads running on other hosts.

In vSphere 4.1, a DRS rule called “Virtual Machines to hosts” became available. That rule allows you to limit the location of a VM to specific host(s) in the VMware vCenter cluster. For example, if you create a DRS affinity rule assigning a VM to a single host inside your VMware cluster, your VM can run on any processors / cores inside that host. As a result, Oracle licensing requires you to license all 8 (4*2*1) cores on that host. You can read more about the VM to hosts affinity rule in this post by Frank Denneman who is a co-author (along with Duncan Epping) of vSphere 4.1 HA and DRS technical deepdive. Note that you aren’t restricting other non-Oracle workloads from also running on this host and thus you could have less than optimal Oracle performance.

VMware vSphere itself allows you to pin a virtual machine to one or more physical cores on a server using vSphere’s CPU affinity settings. You can read about the details of this in the vSphere resource management guide version 4.1 starting on page 20. This is the technical equivalent of the Oracle VM technique of binding a VM to a specific subset of physical processors / cores. For example, if you pin your Oracle VM to two physical cores, your VM can only run on those two physical cores. As a result, Oracle licensing requires you to license those 2 cores. Note that you aren’t restricting other non-Oracle workloads from also running on these cores and thus you could have less than optimal Oracle performance.
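To make the arithmetic in the three techniques above concrete, here is a minimal Python sketch of the example datacenter (10 hosts, 2 sockets per host, 4 cores per socket). The `Host` class, the `cores_in_scope` helper and the scenario names are made up for illustration. The sketch only counts the physical cores a VM could run on under my interpretation above; it does not model anything else in Oracle's licensing terms and is not a statement of what Oracle will accept.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Host:
    """One vSphere host in the example: 2 sockets with 4 cores each."""
    sockets: int = 2
    cores_per_socket: int = 4

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket


def cores_in_scope(hosts: Iterable[Host]) -> int:
    """Total physical cores across the hosts a VM is allowed to run on."""
    return sum(host.cores for host in hosts)


datacenter = [Host() for _ in range(10)]   # the 10 host example datacenter
dedicated_cluster = datacenter[:2]         # 2 host vCenter cluster
drs_rule_hosts = datacenter[:1]            # DRS rule tying the VM to 1 host
pinned_cores = 2                           # CPU affinity to 2 physical cores

scenarios = {
    "whole datacenter": cores_in_scope(datacenter),
    "2-host vCenter cluster": cores_in_scope(dedicated_cluster),
    "DRS VM-to-host rule (1 host)": cores_in_scope(drs_rule_hosts),
    "CPU affinity (2 cores)": pinned_cores,
}

for name, cores in scenarios.items():
    print(f"{name}: {cores} cores")
```

The smaller the scope, the fewer cores fall under the license count, but also the less certain it is that Oracle would treat the restriction as valid partitioning.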

Does Oracle consider VMware’s CPU affinity settings an acceptable form of partitioning? What about VMware DRS VM to host affinity rules? I have seen no official documentation from Oracle either way. I advise my clients to always buy enough Oracle licenses to allow Oracle to run on at least two hosts. This way the customer doesn’t have to worry about Oracle’s licensing ambiguity (as you’re licensing the entire hosts Oracle can run on) and still gets the benefits of VMware such as vMotion, HA, DRS and FT to reduce and possibly eliminate downtime or less than optimal performance for their Oracle systems.

I have had a client who went from running Oracle on one physical server with 8 processors / cores to running it virtualized on hosts that also had 8 processors / cores each. The client wanted the benefits of vMotion, HA, DRS and FT but didn’t want to buy Oracle licenses for an additional 8 CPU host. According to the Oracle partitioning document I referenced earlier, Oracle does allow customers to license only the processors / cores that are turned on. For this customer I therefore recommended that they turn off half the processors / cores in each host. Please note this limited their VMs to a maximum of 4 cores each, the number of cores available on each host.
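The numbers behind that recommendation can be sketched in a few lines of Python. The `licensable_cores` helper and the variable names are hypothetical, and the sketch assumes my reading of the partitioning document is right, namely that only cores that are turned on count toward licensing.

```python
def licensable_cores(host_count: int, enabled_cores_per_host: int) -> int:
    """Cores to license, assuming only enabled cores on each host count."""
    return host_count * enabled_cores_per_host


# Before virtualizing: one physical server with 8 cores.
physical_before = licensable_cores(host_count=1, enabled_cores_per_host=8)

# Two 8-core vSphere hosts with every core turned on.
two_hosts_all_cores = licensable_cores(host_count=2, enabled_cores_per_host=8)

# Two 8-core vSphere hosts with half the cores turned off.
two_hosts_half_cores = licensable_cores(host_count=2, enabled_cores_per_host=4)

# A VM cannot use more cores than a single host exposes.
max_vcpus_per_vm = 4

print(physical_before, two_hosts_all_cores, two_hosts_half_cores)
```

Under that assumption the two-host setup with half the cores disabled needs the same 8 cores licensed as the original physical server, at the cost of capping each VM at 4 cores.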

Licensing Oracle on VMware vSphere is an area of much confusion and disagreement due to Oracle not presenting clear public guidelines on whether DRS Affinity rules or vSphere CPU affinity are valid methods of partitioning.  I hope that Oracle addresses this licensing confusion soon, but until then, separate VMware vCenter clusters are the least legally risky way to virtualize Oracle.  I would love for someone from Oracle to officially and on the record address the techniques I mentioned in this post.

Oracle listened, customers WIN! RAC supported on VMware

As I was flying home last night and downloading tweets before takeoff, I found out some amazing news. Ugh, not the time to have intermittent internet access! But eventually I got home, did the reading and confirmed the news.

Oracle RAC 11gR2 (11.2.0.2) is now supported by Oracle under VMware.

You can read the updated My Oracle Support (MOS) announcement yourself in note 249212.1 which now states:

NOTE:  Oracle has not certified any of its products on VMware.
For Oracle RAC, Oracle will only accept Service Requests as described in this note on
Oracle RAC 11.2.0.2 and later releases.

(Remember: Certified is different than Supported. Oracle doesn't certify hardware that isn't Oracle's own.)

This is simply fantastic news. I talked to a petroleum company in Houston earlier this year that wanted to virtualize their Oracle EBS system and move platforms from Sun Solaris to x86 architecture. Their big concern was that they were using 8 SPARC processors and they knew that 8 x86 CPUs is the limit for a virtual machine under VMware vSphere 4.1. We discussed various steps they could take to ensure their environment would thrive under this limitation, but now it's a non-issue. In the event they need more computing power, they can implement Oracle RAC under vSphere and start up another RAC instance as necessary.

I do need to point out that as of this moment, the 11.2.0.2 database is not certified or supported with Oracle Applications (Oracle EBS) 11i or R12. These certifications usually come out a few months after the initial database announcement (which was Sept 10th for 11.2.0.2). If you check out the blog of Steven Chan (a Senior Director in Oracle's Applications Technology Group, the group responsible for the Oracle E-Business Suite technology stack) and specifically these comments, you'll see that Steven wrote:

We haven't certified 11.2.0.2 with Oracle E-Business Suite Release
11i yet.  This project is underway now.  11.2.0.1 is the latest
certified database release for the E-Business Suite.
Oracle's Revenue Recognition rules prohibit us from discussing
certification and release dates, but you're welcome to monitor or
subscribe to this blog for updates, which I'll post as soon as soon as
they're available.



So 11.2.0.2 database certification with EBS 11i and R12 is coming. My main client doesn't use RAC (our business can survive the downtime associated with an HA event and we aren't near the 8 CPU limitation of VMware vSphere 4.1), but knowing it's an option can only give upper management even more confidence that virtualizing our entire Oracle environment under VMware vSphere was the right thing to do.


For those wanting more information on Oracle RAC under VMware vSphere, I'd suggest watching this Oracle virtualization webcast put on by Embarcadero and VMware a few weeks ago. I'd also highly recommend following VirtualTodd on Twitter. Todd Muirhead was at Oracle OpenWorld in the VMware booth and presented some very interesting performance data from running RAC under VMware. I can't find a link to the presentation, but you can follow Todd's postings and perhaps find his testing results at his blog on the VMware communities site.

Think of the possibilities of combining Oracle RAC and VMware vSphere:

o  Multiple RAC nodes on different vSphere hosts means no database downtime during a hardware failure.

o  Combining multiple RAC databases on the same vSphere host to consolidate workloads but still segregate environments

o  Much faster provisioning of new RAC nodes with vSphere virtual machine cloning and VMware VAAI (vStorage APIs for Array integration)

o ... and many more I still need to wrap my head around