Monthly Archives: April 2011

Not all Oracle databases require a license

Recently there was some discussion on Twitter about whether infrastructure databases such as RMAN and OEM repository databases require licenses. I always figured they had the same licensing requirements as any other Oracle database.

However, I was incorrect.

If you read Oracle® Database Licensing Information 11gR2 it has this section:

Infrastructure Repository Databases

A separate Oracle Database can be installed and used as a Recovery Manager (RMAN) repository without additional license requirements, provided that all the Oracle databases managed in this repository are correctly licensed. This repository database may also be used for the Oracle Enterprise Grid Control repository. It may not be used or deployed for other uses.

A separate Oracle Database can be installed and used as a Oracle Enterprise Manager Grid Control (OEM Grid Control) repository without additional license requirements, provided that all the targets (databases, applications, and so forth) managed in this repository are correctly licensed. This database may also be used for the RMAN repository. It may not be used or deployed for other uses.

Good to know. For my clients who use VMware clusters to limit which hosts Oracle VMs can run on for licensing purposes, you do not need to restrict your Oracle RMAN and OEM Grid Control DBs / VMs. This allows me to free up those CPU / RAM / Network resources on my Oracle-only hosts for other license-restricted databases.

Choosing the right I/O scheduler for Oracle on Linux

As I’ve written before, understanding the stack of software and hardware you are running workloads on is critical to getting the best performance out of your environment.

In all Linux distributions (SuSE, RHEL, OL) on which Oracle supports running Oracle Database, the kernel is responsible for scheduling the I/O in the system. There are multiple schedulers built into the kernel to allow you to choose the best scheduler for your disk I/O profile. For example, a database server is going to have a much different disk I/O profile than a webserver.

As you can read in this 2005 article from RedHat on RHEL 4 or this PDF last revised in 2008, there are significant benefits to be gained for your OLTP database server by changing your I/O scheduler from the default cfq (Completely Fair Queuing) to “deadline”. The deadline scheduler attempts to bound the latency of any single I/O by giving each request an expiration time and servicing it before that time passes, with reads given shorter deadlines than writes.

It’s interesting to note that, according to the Oracle Linux 6 release notes, Oracle’s Unbreakable Enterprise Kernel (UEK) uses the deadline scheduler. Compare this with RedHat’s kernel, which uses cfq by default. I wonder how much of Oracle’s performance improvement supposedly from UEK actually comes from using a more appropriate scheduler for typical database workloads? I also wonder if deadline would be the right choice for Exadata, which uses UEK, since it has SAN-type storage built in.

So the answer here is always use deadline scheduler, right?

Not so fast. What if I’ve virtualized my database and I’m running it under VMware or on a Storage Area Network (SAN)? VMware has designed vSphere as an Operating System (OS) optimized to run other OSes. You can find a fascinating thread on quora about vSphere and how it compares to other OSes.

Is deadline still most likely the best choice?

No. Yet even as of May 2010, this paper from VMware on Oracle Databases on vSphere 4 makes NO mention of which scheduler to use.

So, as always, you should be diligent and work with your system administrators to test what works best in your environment. Having said that, based on my experience and that of others, I typically set the scheduler to noop (No Operations) on all my Linux VMs, regardless of whether they are running Oracle or not.
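Before changing anything, it helps to see which scheduler a disk is using now. On Linux kernels of this era, the file /sys/block/<device>/queue/scheduler lists the available schedulers on one line, with the active one in square brackets. The following is a minimal sketch of reading that file. The device name sda and the function names are assumptions for illustration, not something taken from the RedHat or VMware papers.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    The kernel lists every available scheduler on one line and wraps
    the active one in square brackets, e.g. "noop deadline [cfq]".
    """
    active = None
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets around the active scheduler
            active = token
        available.append(token)
    return active, available


def read_scheduler(device="sda"):
    """Read the scheduler line for one block device.

    The default device name "sda" is an assumption; substitute your own.
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


if __name__ == "__main__":
    # Parse a sample line instead of touching the real system.
    print(parse_scheduler_line("noop anticipatory deadline [cfq]"))
```

Switching schedulers at runtime is done by writing the scheduler name (for example noop) into that same file as root, and the elevator= kernel boot parameter makes the choice persistent. Test either change with your system administrators before rolling it out.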

More support available when virtualizing Oracle under VMware

What if your management wants more assurances about support for Oracle under VMware?

I’ve talked with many consultants and a few companies over the last year who have been concerned about getting support for their Oracle environment once it’s virtualized under VMware. I’ve written about this multiple times (Oracle listened, customers win! RAC supported on VMware, Oracle support on VMware, and Number One question at VMware booth at Oracle Openworld).

Oracle Database, including the latest version (11.2.0.2) of Real Application Clusters (RAC), IS supported under VMware. It’s not certified by Oracle, but neither is almost any other hardware not made by Oracle (e.g., your Dell servers and Cisco switches aren’t certified by Oracle). What this means is that (according to My Oracle Support (MOS) note 249212.1), in the unlikely event Oracle Support determines your known problem’s solution doesn’t work when virtualized, or if the problem is determined not to be a known Oracle issue, Oracle Support may refer you to VMware Support and will continue to work the issue when the customer can demonstrate the issue occurs on native (non-virtualized) hardware.

This has given some organizations pause about virtualizing their Oracle environments under VMware. No organization wants to pay thousands of dollars in support only to find it isn’t there when they need it the most. To help reduce this anxiety over virtualizing Oracle products under VMware, VMware Global Support Services (GSS) provides support for VMware customers running Oracle 10g or 11g on VMware vSphere. You can read more about VMware’s Oracle Support policy on VMware’s dedicated Oracle Support page.

If you run into an issue with Oracle 10g or 11g under VMware vSphere 4, you should open not only a ticket with Oracle Support but also a separate ticket with VMware Global Support Services (GSS). VMware will then use their expertise and resources to troubleshoot your issue and determine if the virtualization layer is the cause. If VMware deems the issue is not related to virtualization, VMware will escalate the ticket back through TSANet to Oracle Support.

TSANet (thankfully not associated with that TSA) is a vendor-neutral infrastructure that allows members such as Oracle, RedHat, Microsoft, NetApp, EMC and VMware to collaborate behind the scenes when a possible multivendor problem exists to resolve the customer issue. Typically customers aren’t even aware TSANet is being used between the vendors for communication.

In addition to support from Oracle and VMware, your storage vendor also has expertise you can leverage when experiencing issues.

If you’re running NetApp storage, check out their best practices for Oracle on NetApp. I’ve also been in contact with numerous people at NetApp regarding support resources and every NetApp person I contacted was extremely quick and resourceful in helping me find information. In a matter of hours, I had responses from a Virtualization Solution Architect, the Director of Global Support Services and Solutions, and the Senior Vice President of Support. Wow. Anyhow, NetApp has dedicated Virtualization and Oracle teams and also has a Joint Escalation Team (JET) with Oracle, VMware, Cisco etc. Even if you’re running a NetApp v-series controller in front of an EMC array, NetApp will support you and help you out. One final note, Oracle Corporate runs their Global Single Instance (their EBS instance) on NetApp according to the last published documentation I can find.

If you’re running EMC storage, they also have a Virtual Escalation Team process for Oracle on VMware vSphere on EMC. You can read more about EMC’s support of Oracle under VMware vSphere at Chad Sakac’s blog post on Oracle, x86, VMware and update on support.

Odds are, whatever issue you’re running into or concerned about with virtualizing Oracle has been seen by someone else at VMware and your storage provider. With all the major vendors talking to each other under TSANet, you won’t be left to fend for yourself.

Don’t be scared to run your Oracle products under VMware vSphere. It’s supported by Oracle. It’s supported by VMware. Your storage vendor probably even has a specific team dedicated to Oracle on VMware.

Which would your CEO prefer?

Would your CEO choose to give up a negligible amount of system performance at peak system load in exchange for a reduction of risk during system upgrades and an alternative to hours of downtime?

I suspect the answer is yes.

What’s the cost of such a technology? Believe it or not, it’s free. It’s the snapshot functionality that’s built into VMware’s vSphere Hypervisor (based on ESXi) that you can leverage once you virtualize your Oracle database or any other x86 systems. Is the performance impact truly negligible? Don’t take it from me; you can read it in Oracle’s own words.

Recently RedHat and Oracle have come out with updates to their mainstream distributions (RedHat Enterprise Linux 5.6 and Oracle Linux 5.6) and each has come out with an entirely new version (RedHat Enterprise Linux 6 and Oracle Linux 6). System and database administrators all around the world are updating their critical systems. Applying those updates to critical systems requires testing and downtime.

The best practice for operating system upgrades is to test the upgrade on the same exact hardware, software and data as in production. But are your test and production systems identical? Same motherboard? Same processors? Same amount and brand of RAM and same exact operating system packages installed? Pretty unlikely. With virtualization, all those configurations are identical.

With VMware virtualization, I can take a clone of the entire production server while it’s live and being used by users. Now I have a truly identical copy of production to test my upgrade. Note that cloning the entire production server while it’s live requires VMware vCenter, which isn’t free, but vSphere Essentials (which includes vCenter) starts at $1000 as of the time of this writing.

With snapshots (which are free and don’t require vCenter), you take a snapshot (3 mouse clicks and less than 10 seconds) of the virtual machine and then do the upgrade. If you run into issues, just revert / rollback the snapshot to the state it was in when you took the snapshot. The time to do that revert / rollback is only the time necessary to reboot your virtual machine – 5 minutes? If the upgrade and testing goes smoothly, you just merge the snapshot into the virtual machine while the system is up and available to users. Total time spent doing non-upgrade activities such as backups or restores? Essentially zero.

Compare this to how you would handle a critical system without virtualization: you’d take the system offline for the upgrade, take a full backup of the system, do the upgrade (hoping it works just like it did in the similar but most likely not identical test system), have the users test, and, if there’s an issue, possibly spend hours restoring from backup. You do trust your backups… right?

Enterprise DBAs and the companies that employ them tend to be risk averse when it comes to losing data or experiencing downtime. Virtualizing the hardware allows you to ensure your test and development systems are the exact same systems as production, thereby reducing the risk of unforeseen issues during the upgrade. Utilizing snapshots allows you to very quickly take a save point of your entire server and roll back to it just as quickly in the event of an unforeseen issue, almost entirely eliminating (minutes vs. hours) the downtime associated with recovering from it.

Licensing Oracle on VMware vSphere

Honestly, I thought this issue was done and buried, but over the past few weeks I’ve seen this question come up multiple times, so let’s get this cleared up.

Let’s go right to the source: Oracle’s own documentation. Oracle’s partitioning document gives Oracle’s stance as of January 24, 2011, and discusses soft partitioning and hard partitioning. Soft partitioning leverages operating system features to limit the number of CPUs an Oracle instance (or Oracle virtual machine) can run on. Hard partitioning physically partitions a large server into smaller self-contained servers. The document lists what Oracle considers valid examples of each type of partitioning, and it specifically defines VMware’s partitioning (and Oracle VM’s partitioning) as soft partitioning. Oracle states that soft partitioning isn’t a “valid” means of restricting the number of software licenses required, so you must license all the processors on a given server.

Note that later in the document Oracle states that Oracle VM CAN be used for hard partitioning if you set it up as described in this document, which goes into detail on how to bind an Oracle VM virtual machine to physical processors / cores. The documents do not mention whether binding a VMware VM to a physical processor / core would also count as hard partitioning. Oracle does state that its list of partitioning technologies isn’t comprehensive, so things are left open to interpretation.

Please note I highly recommend you go and read these documents yourself and draw your own conclusions, and of course you can and should talk with an Oracle-employed licensing expert. In these documents Oracle states I cannot reproduce the document in any manner without express written consent so I am only telling you my interpretation.

VMware has three different techniques for restricting a VM to a specific subset of processors / cores. They are VMware vCenter clusters, VMware DRS affinity rules, and vSphere CPU affinity (pinning). I advise my clients to use the VMware vCenter cluster technique, but your organization might have a different interpretation. To describe the different VMware techniques, I will use an example of a 10 host VMware vCenter datacenter, with each host having 2 physical sockets and 4 cores per socket. Therefore, this entire VMware vCenter datacenter has 80 physical x86 cores (4*2*10) of processing power.

VMware vCenter clusters are logical clusters inside of vCenter made up of one or more hosts. By assigning a VM to that cluster, you are forcing that VM to run ONLY on the host(s) that make up that cluster. For example, if you create a 2 host VMware vCenter cluster inside your 10 host VMware datacenter, your VM can run on any processors / cores inside that 2 host cluster. As a result, Oracle licensing requires you to license all 16 (4*2*2) cores in that cluster. Note that you are restricting other non-Oracle workloads from also running on these hosts, so your Oracle VMs will get the best possible performance available on those hosts, possibly to the detriment of your non-Oracle workloads running on other hosts.

In vSphere 4.1, a DRS rule called “Virtual Machines to hosts” became available. That rule allows you to limit the location of a VM to specific host(s) in the VMware vCenter cluster. For example, if you create a DRS affinity rule assigning a VM to a single host inside your VMware cluster, your VM can run on any processors / cores inside that host. As a result, Oracle licensing requires you to license all 8 (4*2*1) cores on that host. You can read more about the VM to hosts affinity rule in this post by Frank Denneman who is a co-author (along with Duncan Epping) of vSphere 4.1 HA and DRS technical deepdive. Note that you aren’t restricting other non-Oracle workloads from also running on this host and thus you could have less than optimal Oracle performance.

VMware vSphere itself allows you to pin a virtual machine to one or more physical cores on a server using vSphere’s CPU affinity settings. You can read about the details of this in the vSphere resource management guide version 4.1 starting on page 20. This is the technical equivalent of the Oracle VM technique of binding a VM to a specific subset of physical processors / cores. For example, if you pin your Oracle VM to two physical cores, your VM can only run on those two physical cores. As a result, Oracle licensing requires you to license those 2 cores. Note that you aren’t restricting other non-Oracle workloads from also running on these cores and thus you could have less than optimal Oracle performance.
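The core counts for the three techniques can be checked with a few lines of arithmetic. This is only a sketch of my interpretation using the 10 host example datacenter above (2 sockets per host, 4 cores per socket). The function and technique names are made up for illustration, and it counts physical cores only, so it is not an official Oracle licensing calculation.

```python
# Example datacenter from the text: 10 hosts, 2 sockets each, 4 cores per socket.
CORES_PER_SOCKET = 4
SOCKETS_PER_HOST = 2
HOSTS_IN_DATACENTER = 10


def cores_on_hosts(hosts):
    """Physical cores across the given number of identical hosts."""
    return CORES_PER_SOCKET * SOCKETS_PER_HOST * hosts


def cores_to_license(technique, hosts=1, pinned_cores=0):
    """Cores an Oracle VM could run on under each restriction technique.

    Technique names are hypothetical labels for this sketch:
      "cluster"      - dedicated vCenter cluster of `hosts` hosts
      "drs_affinity" - DRS VM-to-host rule limiting the VM to `hosts` hosts
      "cpu_affinity" - VM pinned to `pinned_cores` physical cores
    """
    if technique in ("cluster", "drs_affinity"):
        return cores_on_hosts(hosts)
    if technique == "cpu_affinity":
        return pinned_cores
    raise ValueError(f"unknown technique: {technique}")


if __name__ == "__main__":
    print("whole datacenter:", cores_on_hosts(HOSTS_IN_DATACENTER))
    print("2 host cluster:", cores_to_license("cluster", hosts=2))
    print("DRS rule, 1 host:", cores_to_license("drs_affinity", hosts=1))
    print("pinned to 2 cores:", cores_to_license("cpu_affinity", pinned_cores=2))
```

The numbers match the examples above: 80 cores in the whole datacenter, 16 for the 2 host cluster, 8 for the single host DRS rule, and 2 for CPU pinning. Whether Oracle accepts the last two is exactly the open question.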

Does Oracle consider VMware’s CPU affinity settings an acceptable form of partitioning? What about VMware DRS VM to host affinity rules? I have seen no official documentation from Oracle either way. I advise my clients to always buy enough Oracle licenses to allow Oracle to run on at least two hosts. This allows the customer to not be concerned about Oracle’s licensing ambiguity (as you’re licensing the entire hosts Oracle can run on) and also allows the customer to get the benefits of VMware such as vMotion, HA, DRS and FT to reduce and possibly eliminate downtime or less than optimal performance for their Oracle systems.

I have had a client who went from running Oracle physical (with the one physical server having 8 processors / cores) to virtual (with each physical host also having 8 processors / cores). The client wanted the benefits of vMotion, HA, DRS and FT but without having to buy Oracle licenses for an additional 8 CPU host. According to the Oracle partitioning document I referenced earlier, Oracle does allow customers to license only the processors / cores that are turned on. For this customer I therefore recommended that they turn off half the processors / cores in each host. Please note this limited their VMs to a maximum of 4 cores each, the number of cores available on each host.
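The arithmetic behind that recommendation is simple enough to sketch. It assumes my reading of the partitioning document, that only enabled cores need licenses, and the function name is hypothetical.

```python
def licensable_cores(hosts, enabled_cores_per_host):
    """Cores to license when only enabled cores count (my interpretation)."""
    return hosts * enabled_cores_per_host


# The client's original physical server: one host, all 8 cores enabled.
physical = licensable_cores(hosts=1, enabled_cores_per_host=8)

# Two virtualization hosts with every core enabled would double the bill.
two_hosts_all_cores = licensable_cores(hosts=2, enabled_cores_per_host=8)

# Two hosts with half the cores turned off in each host.
two_hosts_half_cores = licensable_cores(hosts=2, enabled_cores_per_host=4)

if __name__ == "__main__":
    print(physical, two_hosts_all_cores, two_hosts_half_cores)
```

With half the cores disabled, the two host setup needs the same 8 core licenses the client already owned, at the price of capping each VM at 4 cores.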

Licensing Oracle on VMware vSphere is an area of much confusion and disagreement due to Oracle not presenting clear public guidelines on whether DRS affinity rules or vSphere CPU affinity are valid methods of partitioning. I hope that Oracle addresses this licensing confusion soon, but until then, separate VMware vCenter clusters are the least legally risky way to virtualize Oracle. I would love for someone from Oracle to officially and on the record address the techniques I mentioned in this post.