Cloud is the trending topic in the data management area. Before it, the topic was “Big Data”, but that term became infamous, mostly because decision makers inside companies (traditional or startups) didn’t know what to do with so many buzzword tools, success stories, Google-sized use cases, Gartner Magic Quadrants, vendor pressure and non-existent business use cases.
Things are clearer now in the “Big Data” space: there is a clear understanding that it will answer some business questions and future needs, and that it must be fully integrated with the “old” technologies. Integration and “playing together”, big and small data.
Cloud is another beast. It is mainly about agility, easy scaling and cost, and decision makers know exactly what this means for their companies. It is crystal clear to them.
Some traditional companies are moving parts of their workloads (the less critical parts) to the cloud, while still maintaining a big portion of the workloads in their on-premises data centers. Startups were born in the cloud, so there is no other clear choice for them.
Oracle Database still remains very present in traditional companies, handling critical OLTP and OLAP workloads, and is consequently present in all stages: development, test and production. Some of these stages and workloads will be forced to move to the cloud. From that point on, things get complex if you don’t pick Oracle Cloud and instead decide (for whatever reason) on the IaaS/PaaS leader: Amazon AWS.
First, it helps to check Oracle’s position on this, which is given in “Oracle Database Support for Amazon AWS EC2 (Doc ID 2174134.1)” on My Oracle Support (MOS). It states:
- Single instance (no RAC) is supported on Amazon AMIs on top of OEL 6.4 or later (EC2 only)
- No support for Oracle RAC at all
- No support for Oracle Multitenant in Oracle 12c
- No support for Oracle on Amazon RDS, even though it is single instance only.
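The support statements above can be turned into a small pre-flight check for a planned deployment. The Python sketch below is only an illustration: the `Deployment` class, its field names and the `support_issues` function are hypothetical, and they encode just the four bullets as summarized here, not the full MOS note.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical description of a planned Oracle deployment on AWS."""
    service: str          # "ec2" or "rds"
    rac: bool             # True if Oracle RAC is planned
    multitenant: bool     # True if Oracle Multitenant (12c) is planned
    oel_version: tuple    # e.g. (6, 4) for OEL 6.4


def support_issues(d: Deployment) -> list:
    """Return the reasons a deployment falls outside the support summary above."""
    issues = []
    if d.service.lower() != "ec2":
        issues.append("only EC2 is supported (RDS is not)")
    if d.rac:
        issues.append("Oracle RAC is not supported")
    if d.multitenant:
        issues.append("Oracle Multitenant is not supported")
    if d.oel_version < (6, 4):
        issues.append("OEL 6.4 or later is required")
    return issues


if __name__ == "__main__":
    plan = Deployment(service="ec2", rac=True, multitenant=False, oel_version=(6, 7))
    print(support_issues(plan))
```

An empty list means the plan matches the summary; anything else is a reason to go back to the MOS note before deploying.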
While single instance is supported inside EC2, RAC is not, and Oracle has a detailed document about “Third Party Clouds and Oracle RAC”: http://www.oracle.com/technetwork/database/options/clustering/overview/rac-cloud-support-2843861.pdf
In this document Oracle states that RAC is not supported for two reasons, the lack of shared storage and missing required network capabilities, and it justifies both.
- Lack of shared storage: Amazon AMI images make it possible to bypass the EBS limitation on shared storage (concurrent access) by using iSCSI and building a NAS to “emulate” shared storage. As Oracle states, there is of course a performance impact on I/O, since another layer is built to “emulate” shared storage, so Amazon recommends large EC2 instances, stating the following:
“In order to provide high I/O performance, we chose to build the NAS instances on EC2’s i2.8xlarge instance size, which has 6.25 terabytes of local ephemeral solid state disk (SSD). With the i2.8xlarge instances, we can present those SSD volumes via iSCSI to the RAC nodes”
Another side effect of these Amazon AMI images is that they rely on “ephemeral” storage, which means literally that: “It is persistent for the life of the instance.” You will lose data if your NAS instances fail. Amazon is aware of this and states the following in the same document:
“The SSD volumes that come with the i2.8xlarge instance are ephemeral disk. That means that upon stop, termination or failure of the instance, the data on those volumes will be lost. This is obviously unacceptable to most customers, so we recommend deploying two or three of these NAS instances, and mirroring the storage between them using ASM Normal or High redundancy”
Oracle, of course, doesn’t find this a very good solution (let’s face it, it is not), as the danger of data loss is real.
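To get a feel for what the mirroring recommendation costs in capacity, here is a rough back-of-the-envelope sketch in Python. It assumes that ASM Normal redundancy keeps two copies of each extent and High redundancy keeps three, treats each NAS instance as one failure group, and ignores metadata and the free space ASM needs to re-mirror after a failure. The function name is made up for illustration, and the 6.25 TB figure is taken from the Amazon quote above.

```python
# Copies of each extent kept by ASM (two-way and three-way mirroring).
COPIES = {"normal": 2, "high": 3}


def usable_tb(nas_instances: int, tb_per_instance: float, redundancy: str) -> float:
    """Rough usable capacity when each NAS instance acts as one failure group."""
    copies = COPIES[redundancy]
    if nas_instances < copies:
        raise ValueError(f"{redundancy} redundancy needs at least {copies} NAS instances")
    return nas_instances * tb_per_instance / copies


for n, level in [(2, "normal"), (3, "normal"), (3, "high")]:
    print(f"{n} x i2.8xlarge, {level} redundancy: ~{usable_tb(n, 6.25, level):.2f} TB usable")
```

Under these assumptions, two instances with Normal redundancy or three with High redundancy leave roughly one instance’s worth of usable space, and the mirroring only protects against losing some of the NAS instances, not all of them at once.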
- Required network capabilities: The network on EC2 does not support IP multicast, which RAC needs in order to broadcast packets during cluster configuration and reconfiguration.
Amazon provided a workaround for this issue: a point-to-point VPN among the RAC nodes using Ntop N2N (a discontinued product).
Of course, network performance will suffer on top of this solution, and we know how Oracle RAC reacts to bad network performance on the interconnect: with the popular “gc” wait events.
That said, Amazon is well aware of this, stating:
“Performance of network communications via N2N is significantly lower than non-VPN traffic over the same EC2 network. In addition, Ntop, the makers of N2N, are no longer developing the N2N product. These factors may preclude running production-class RAC workloads over N2N. Currently, we are developing new approaches using clusterware-managed dynamic creation of GRE tunnels to serve the cluster’s multicast needs. We will provide details on this approach and revise this tutorial accordingly before August 2016.”
Still no news at the date of this post (as far as we have investigated).
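Whether multicast traffic actually flows between two nodes, natively or through a tunnel like the ones discussed above, can be checked with a small probe. The following is a generic sketch using Python’s standard `socket` module, not something taken from Oracle or Amazon; the group address `239.1.1.1`, the port `5007` and the function names are arbitrary assumptions.

```python
import socket
import struct
import sys

GROUP = "239.1.1.1"   # arbitrary administratively scoped multicast group (assumption)
PORT = 5007           # arbitrary UDP port (assumption)


def membership_request(group: str) -> bytes:
    """Build the ip_mreq structure: group address plus INADDR_ANY as local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def listen(timeout: float = 10.0) -> bool:
    """Join the group and wait for one datagram; True if something arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(GROUP))
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(1024)
        except socket.timeout:
            return False
        print(f"received {data!r} from {sender[0]}")
        return True


def send(message: bytes = b"multicast probe") -> None:
    """Send one datagram to the group with a TTL of 1 (stay on the local subnet)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(message, (GROUP, PORT))


if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "listen":
        print("multicast OK" if listen() else "nothing received")
    elif sys.argv[1] == "send":
        send()
```

Run the script with `listen` on one node and with `send` on another within the timeout. On a network without multicast support, the listener would be expected to time out.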
The conclusion is that the two major workarounds provided by Amazon AWS should be carefully evaluated if you decide to deploy a RAC cluster inside AWS. Things will eventually improve on Amazon if the demand for deploying Oracle RAC is high, but for now please be very careful on this topic.
Article also published here: http://redglue.eu/oracle-database-on-amazon-aws/