This is the last part of the series called “Oracle RAC in the AWS Cloud”.
In my previous article I promised to put an Oracle RAC cluster under a crazy load.
It is time to fulfill that promise.
You can find my previous articles at the following two links:
Let’s go straight to the point where I stopped last time.
100 users 1 node stress test
In this test I simulated 100 concurrent users running a stress test against a single node.
Recall that I have only 4 CPU cores per node at my disposal.
The next two slides show the Oracle SQL Developer view of the first node:
The next slide shows ASH for node 1.
You can observe that session wait time is much longer than in the 10-user test. Again, the main reason is that I am short on disks (3 per node) and on their I/O rate.
Below you can find the AWR report for Node 1.
You can download the Swingbench report from the link below:
The conclusion we can draw from this test is that the cluster survived a stress test of 100 simultaneous users on the first node.
100 users 2 node stress test
This time I’ll extend the previous test by adding another 100 users on the second node, which can be observed in the following picture.
The next two slides show the SQL Developer view of both nodes:
The next slide shows the ASH report.
Clearly, session wait time is much longer than in the 10-user test. Again, the main reason is that I am short on disks (3 per node) and on their I/O rate.
Below you can find the AWR report for Node 1
and Node 2:
You can download the Swingbench reports for both nodes from here:
Again, the major bottleneck is the small number of disks and the available I/O.
The good news is that the Oracle RAC cluster survived the stress test of 200 concurrent users on two nodes with 4 CPU cores and 32 GB of RAM each.
The average response time for the 100-user single-node test was 231.63 ms, while in the two-node test it increased slightly (256.85 ms and 243.73 ms on the two nodes respectively).
The average number of transactions per second in the single-node test was 32.45, while in the two-node test it was 26.28 and 24.46 respectively.
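To put these throughput figures side by side, here is a minimal sketch of the arithmetic. It assumes the two figures from the two-node test are per-node averages that can simply be added together; the function name `scaling_summary` is hypothetical and used only for illustration.

```python
# Average transactions per second, as quoted in the text above.
single_node_tps = 32.45
two_node_tps = [26.28, 24.46]


def scaling_summary(single_tps, per_node_tps):
    """Return (combined TPS, speedup vs. one node, efficiency per node).

    Assumption: per-node figures are additive, i.e. the cluster's total
    throughput is the sum of what each node delivered.
    """
    combined = sum(per_node_tps)
    speedup = combined / single_tps
    efficiency = speedup / len(per_node_tps)
    return combined, speedup, efficiency


combined, speedup, efficiency = scaling_summary(single_node_tps, two_node_tps)
print(f"combined throughput: {combined:.2f} TPS")
print(f"speedup over one node: {speedup:.2f}x")
print(f"scaling efficiency: {efficiency:.0%}")
```

Under that assumption, the two nodes together delivered about 50.74 transactions per second, roughly 1.56 times the single-node figure, on a setup that was clearly limited by disk I/O.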
This test is not only about running very intensive stress tests with 200 users over a 10 to 15 minute period on undersized hardware.
Rather, it is about the robustness of a cluster in which all components (OS parameters, network, storage locality etc.) are set up properly, and which scales out almost linearly when you have enough CPU, memory and especially disk capacity.
500 users 1 node stress test
When I further increased the load to 500 concurrent users, Oracle RAC did not survive and the cluster finally fell apart.
The main reason is that the node ran out of memory (swap was not enabled, which is the main reason why the cluster didn’t fall apart earlier, under a much lower load).
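If you want to watch for this condition while a test is running, a minimal sketch along the following lines could help. It parses text in the format of Linux `/proc/meminfo` and reports how much memory is still available and whether swap is configured. The function names, the 10% threshold and the sample values are all assumptions made for illustration; they are not measurements from my cluster.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_pressure(info, threshold=0.10):
    """Return (available_fraction, swap_enabled, under_pressure).

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    total = info["MemTotal"]
    available = info.get("MemAvailable", info.get("MemFree", 0))
    fraction = available / total
    swap_enabled = info.get("SwapTotal", 0) > 0
    return fraction, swap_enabled, fraction < threshold


# Hypothetical sample for a node with roughly 32 GB of RAM and no swap.
SAMPLE = """MemTotal:       32780000 kB
MemFree:          410000 kB
MemAvailable:    1639000 kB
SwapTotal:             0 kB
SwapFree:              0 kB
"""

fraction, swap_enabled, under_pressure = memory_pressure(parse_meminfo(SAMPLE))
print(f"available: {fraction:.0%}, swap enabled: {swap_enabled}, "
      f"under pressure: {under_pressure}")
```

On a real node you would pass in the contents of `/proc/meminfo` instead of `SAMPLE`, for example by reading the file with `open("/proc/meminfo").read()`.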
In this series on running Oracle RAC in the Cloud, I have performed enough tests to be sure that the combination of AWS Cloud and Oracle RAC is stable enough to survive heavy load on undersized hardware.
Code as a template (provided by FlashGrid) is also an excellent way of delivering software, a superior distribution method that targets enterprises (unlike the SaaS delivery model).
For industries and smaller enterprises with no regulatory compliance restrictions, running Oracle RAC in the Cloud might be a viable option, especially as you get shared control of your solution (with the Cloud provider) and can put it under the Landing Zone umbrella.
However, with the rapid tightening of laws and penalties related to data protection, handling, sovereignty and security all over the globe, large enterprises will more likely embrace a Hybrid-Cloud solution, or even more likely a Private Cloud, for running Oracle RAC, to reduce the possibility of data leakage to an acceptable level. I expect this because transactional databases like Oracle RAC usually store a lot of PII and sensitive or government-issued data.