This is the third part of a series of articles about measuring the performance of the disk subsystem.
The first part covers how to measure disk performance with the dd command,
while the second part covers the same measurements with the hdparm utility.
In this post I’ll describe the Oracle Orion calibration tool.
Although the Orion tool is described in the Oracle Database documentation, as usual I’ll take a slightly different approach to avoid duplicating what’s already available on many sites.
Oracle Orion is a tool for testing disk speed limits, and it has several advantages over the alternatives.
Orion is part of the Oracle database software installation and is located in the $ORACLE_HOME/bin directory.
For that reason you don’t need to install and configure Orion, which is not the case with alternatives like hdparm/SLOB/Swingbench.
Orion is designed to simulate Oracle database I/O load by using the same OS calls that the Oracle database itself makes.
This is not the case with the generic Linux utilities described in the first two parts of this series (dd and hdparm), as they only test disk performance by reading from and writing to a file, which is very different from a database type of load.
Orion can even simulate the effect of ASM disk striping, for really big installations where performance matters in every moving part of the system.
Orion does not require an Oracle database to be up and running, and you don’t need to create any database objects (schema, tablespace…), which is not the case with the SLOB and Swingbench utilities.
Orion creates synthetic pressure on the I/O part only, which is a great advantage over SLOB/Swingbench/TOAD etc.
Tools like SLOB, Swingbench, TOAD and Oracle Real Application Testing all run their tests against a running database, meaning you are not only testing disk I/O, but also all other parts of the database (like memory setup, undo/redo, temp tablespace, buffer cache…) along with the operating system setup and the disks.
The first goal should always be to isolate the disk subsystem and run a performance stress test that targets the disks only, because when you run load tests against a running database, the measured disk performance might degrade due to poor configuration of other components.
Orion offers many options for running disk stress tests, but in most cases only a few of them are sufficient.
In this article a local file system directory will simulate an enterprise storage LUN.
The first step is to create a directory like this:
After that you need to create the configuration file by executing the next command:
For each LUN (a LUN is a logical unit number, which can be a device addressed over SCSI or on SAN enterprise storage, but it can also refer to a locally attached physical disk, a local disk partition or even a directory) you need to add the following line to the test.lun file (as I’m using a laptop with one disk, only one path to the LUN will be created):
Save and close the test.lun file.
The next step is to create the lun1 file of 10 MB (normally you’ll skip this step and use the path to the LUN exposed to the OS instead):
dd if=/dev/zero of=/luns/lun1 bs=1024k count=10
This is the content of the luns directory (one configuration file and one 10 MB file generated with the dd utility):
oracle@oel75db12r2:/ls -la
total 10244
-rw-r--r-- 1 oracle oinstall 10485760 Dec 9 23:08 lun1
-rw-r--r-- 1 oracle oinstall       14 Dec 9 23:19 test.lun
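The setup steps above can be sketched as one small script. This is only a sketch under my own assumptions: the helper name make_luns, its arguments and the temporary working directory are hypothetical and not part of Orion, and the files are zero-filled stand-ins for real LUNs, just like the lun1 file created with dd. The one thing it relies on from Orion is that the test reads a file named after the test name with a .lun extension that lists one path per line.

```shell
#!/usr/bin/env bash
# Sketch: build file-backed "LUNs" plus the matching <testname>.lun file.
# Usage: make_luns DIR TESTNAME COUNT SIZE_MB   (all names are illustrative)
set -euo pipefail

make_luns() {
    local dir=$1 testname=$2 count=$3 size_mb=$4
    local i
    mkdir -p "$dir"
    : > "$dir/$testname.lun"                       # start with an empty config file
    for i in $(seq 1 "$count"); do
        # Zero-filled file standing in for a real LUN (dd progress output hidden).
        dd if=/dev/zero of="$dir/lun$i" bs=1024k count="$size_mb" 2>/dev/null
        echo "$dir/lun$i" >> "$dir/$testname.lun"  # one path per line
    done
}

workdir=$(mktemp -d)             # hypothetical location, not the article's directory
make_luns "$workdir" test 1 10   # one 10 MB file, as used for the OLTP run
ls -l "$workdir"
cat "$workdir/test.lun"
```

Calling the helper with a count of 2 or 4 and a size of 1024 would produce one line per file in the .lun file, which is the shape needed for multi-LUN runs.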
Finally, it’s time to execute the OLTP test, as in the following example:
oracle@oel75db12r2:/luns>$ORACLE_HOME/bin/orion -run oltp -testname ob-test -hugenotneeded
ORION: ORacle IO Numbers -- Version RDBMS_220.127.116.11.0_LINUX.X64_180103.1
ob-test_20181209_2319
Calibration will take approximately 22 minutes.
Using a large value for -cache_size may take longer.
...
ORION VERSION RDBMS_18.104.22.168.0_LINUX.X64_180103.1
Command line:
-run oltp -testname test -hugenotneeded
These options enable these settings:
Test: test
Small IO size: 8 KB
Large IO size: 1024 KB
IO types: small random IOs, large random IOs
Sequential stream pattern: RAID-0 striping for all streams
Writes: 0%
Cache size: not specified
Duration for each data point: 60 seconds
Small Columns:, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
Large Columns:, 0
Total Data Points: 21
Name: /test/test/luns/lun1 Size: 10485760
1 files found.
Maximum Small IOPS=535 @ Small=8 and Large=0
Small Read Latency: avg=14942.111 us, min=35.911 us, max=260739.286 us, std dev=19374.189 us @ Small=8 and Large=0
Minimum Small Latency=2502.506 usecs @ Small=1 and Large=0
Small Read Latency: avg=2502.506 us, min=36.030 us, max=82293.343 us, std dev=4830.983 us @ Small=1 and Large=0
Small Read / Write Latency Histogram @ Small=8 and Large=0
Latency:                 # of IOs (read)    # of IOs (write)
     0 -        32 us:       0 (  0.00%)        0 (  0.00%)
    32 -        64 us:     130 (  0.55%)        0 (  0.00%)
    64 -       128 us:     223 (  1.48%)        0 (  0.00%)
   128 -       256 us:     898 (  5.26%)        0 (  0.00%)
   256 -       512 us:    8375 ( 40.45%)        0 (  0.00%)
   512 -      1024 us:    7944 ( 73.82%)        0 (  0.00%)
  1024 -      2048 us:    1583 ( 80.47%)        0 (  0.00%)
  2048 -      4096 us:     681 ( 83.34%)        0 (  0.00%)
  4096 -      8192 us:    1418 ( 89.29%)        0 (  0.00%)
  8192 -     16384 us:    1770 ( 96.73%)        0 (  0.00%)
 16384 -     32768 us:     769 ( 99.96%)        0 (  0.00%)
 32768 -     65536 us:       7 ( 99.99%)        0 (  0.00%)
 65536 -    131072 us:       2 (100.00%)        0 (  0.00%)
131072 - 268435456 us:       0 (100.00%)        0 (  0.00%)
As you can observe, the main concerns with an OLTP type of load are low latency and IOPS (I/O operations per second) rather than MB/sec, as OLTP transactions are usually very small (reading just a few database blocks, or writing one row or column).
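Latency and IOPS in the output above are tied together, and a quick sanity check makes that visible. The sketch below applies Little's Law (throughput is roughly the number of outstanding IOs divided by the average latency), treating the Small value as the number of outstanding small IOs. The helper name iops_estimate is my own and is not part of Orion; the input numbers are taken from the OLTP run.

```shell
#!/usr/bin/env bash
# Little's Law sanity check: IOPS ~= outstanding IOs / average latency.
# Usage: iops_estimate OUTSTANDING_IOS AVG_LATENCY_US   (illustrative helper)
iops_estimate() {
    # Convert microseconds to seconds, then divide.
    awk -v n="$1" -v lat_us="$2" 'BEGIN { printf "%.1f\n", n / (lat_us / 1000000) }'
}

iops_estimate 8 14942.111   # 8 outstanding IOs at the avg latency reported for Small=8
iops_estimate 1 2502.506    # 1 outstanding IO at the minimum-latency data point
```

The first call prints 535.4, which matches the reported Maximum Small IOPS=535. The second prints 399.6, so on this disk going from 1 to 8 outstanding IOs bought only about a third more IOPS while average latency grew about six times.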
The second test is suitable for a DSS (Decision Support System) type of load, which includes data warehouses.
For the DSS test a 1 GB file is more appropriate, and I’ll create it by executing the following command:
dd if=/dev/zero of=/u01/jp/test/lun1 bs=1G count=1
The configuration file remains the same as in the first OLTP test (only lun1 is present, although with a different file size).
At this stage everything is ready to start the test by executing the following command:
$ORACLE_HOME/bin/orion -run dss -testname test -hugenotneeded
oracle@oel75db12r2:/u01/jp/test>$ORACLE_HOME/bin/orion -run dss -testname test -hugenotneeded
ORION: ORacle IO Numbers -- Version 22.214.171.124.0
test_20181210_1029
Calibration will take approximately 65 minutes.
Using a large value for -cache_size may take longer.
ORION VERSION RDBMS_126.96.36.199.0_LINUX.X64_180103.1
Command line:
-run dss -testname test -hugenotneeded
These options enable these settings:
Test: test
Small IO size: 8 KB
Large IO size: 1024 KB
IO types: small random IOs, large random IOs
Sequential stream pattern: RAID-0 striping for all streams
Writes: 0%
Cache size: not specified
Duration for each data point: 240 seconds
Small Columns:, 0
Large Columns:, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Total Data Points: 16
Name: /test/test/luns/lun1 Size: 1073741824
1 files found.
Maximum Large MBPS=51.96 @ Small=0 and Large=10
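The calibration time that Orion prints can be cross-checked against the "Total Data Points" and "Duration for each data point" lines of the same output. The arithmetic below is my own reading of those numbers, not a documented formula, and the helper name run_minutes is hypothetical.

```shell
#!/usr/bin/env bash
# Rough run-time estimate: data points x seconds per data point, in whole minutes.
# Usage: run_minutes DATA_POINTS SECONDS_PER_POINT   (illustrative helper)
run_minutes() {
    echo $(( $1 * $2 / 60 ))
}

run_minutes 21 60    # OLTP run: 21 data points of 60 seconds each
run_minutes 16 240   # DSS run: 16 data points of 240 seconds each
```

This gives 21 and 64 minutes, one minute below the 22 and 65 minutes Orion announced for the two runs, so the estimate appears to be dominated by the measurement time itself.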
For the same disk, read speed was 115 MB/sec with the dd command and 109.15 MB/sec with the hdparm tool.
In the DSS test with the Orion tool, read speed was 51.96 MB/sec, which is roughly half of what the other two tools reported.
To check what’s going on, I repeated the same run (DSS type of load, Orion with the same options as in the previous DSS run) with only one difference: the number of simulated LUNs is now 2 (each 1 GB in size), while in the first run there was only 1.
[oracle@oel75db12r2 luns]$ $ORACLE_HOME/bin/orion -run dss -testname test -hugenotneeded
ORION: ORacle IO Numbers -- Version RDBMS_188.8.131.52.0_LINUX.X64_180103.1
test_20181219_2140
Calibration will take approximately 69 minutes.
Using a large value for -cache_size may take longer.
Setting ftype=0
Maximum Large MBPS=119.93 @ Small=0 and Large=4
This time disk speed was 119.93 MB/sec, which is very close to the results from dd (115 MB/sec) and hdparm (109 MB/sec).
In the final test we’ll use 4 LUNs on the same external HDD.
[oracle@oel75db12r2 luns]$ $ORACLE_HOME/bin/orion -run dss -testname test -hugenotneeded
ORION: ORacle IO Numbers -- Version RDBMS_184.108.40.206.0_LINUX.X64_180103.1
test_20181220_0657
Calibration will take approximately 77 minutes.
Using a large value for -cache_size may take longer.
Setting ftype=0
Maximum Large MBPS=115.50 @ Small=0 and Large=56
The result is a bit lower than with 2 LUNs, but almost equal to the dd and hdparm results, which shows that it is possible to get similar results with all of these utilities.
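To put the three DSS runs side by side, the "Maximum Large MBPS" lines can be pulled out of the Orion output and expressed relative to a baseline. The following is a minimal sketch: the function name mbps_vs_baseline is hypothetical, the baseline of 115 MB/sec is the dd result quoted earlier, and the input lines are copied from the runs with 1, 2 and 4 LUNs.

```shell
#!/usr/bin/env bash
# Extract "Maximum Large MBPS" from Orion output on stdin and compare to a baseline.
# Usage: mbps_vs_baseline BASELINE_MBPS < orion_output   (illustrative helper)
mbps_vs_baseline() {
    # Split on "=" and spaces so the fourth field is the MBPS value.
    awk -v base="$1" -F'[= ]' '/^Maximum Large MBPS=/ {
        printf "%s MB/sec = %.0f%% of baseline\n", $4, 100 * $4 / base
    }'
}

mbps_vs_baseline 115 <<'EOF'
Maximum Large MBPS=51.96 @ Small=0 and Large=10
Maximum Large MBPS=119.93 @ Small=0 and Large=4
Maximum Large MBPS=115.50 @ Small=0 and Large=56
EOF
```

Against the dd baseline this reports 45%, 104% and 100% for 1, 2 and 4 LUNs respectively, which is the same picture as described in the text.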
This concludes the series of posts on how to quickly find the limits of disks/LUNs.
The main goal of this disk speed series is to serve as a starting point for getting familiar with the environment you use, as whole books have been written explaining the different concepts of storage subsystem architecture and test measurements.
I’m aware that many interesting topics are left uncovered (maybe sometime in the future I can cover some of them), along with topics like how to estimate the impact of the environment on shared enterprise storage.
In one of the future articles I’ll describe a methodology for gradually improving the performance of an Oracle database by fixing setup issues one by one.