Last time I’ve suggested two ways of how to perform string concatenation.
Unfortunately, both approaches have serious drawbacks that you might experience.
In this post I’ll explain one approach which is not Oracle version dependent, which corrects both shortcomings of previously described methods:
- size of concatenated string
- speed of data processing
Part I you can find on the following link:
Method 3 – cast/collect
I’ll again concatenate CACHE column from my TEST table.
As CACHE column is of type Number, first I need to create appropriate TYPE:
create or replace type name_coll_type as table of number;
Next I need to create a function that will accept collection (previously created table of number) as input parameter and return CLOB.
There s nothing special with that function, that would require further explanation.
create or replace function name_coll_type_to_clob ( p_name_coll in name_coll_type , p_delimiter in varchar2 default null ) return clob is v_name_string clob := ' '; begin for idx in p_name_coll.first..p_name_coll.last loop if idx = p_name_coll.first then dbms_lob.open(v_name_string, DBMS_LOB.LOB_READWRITE); dbms_lob.write(v_name_string, length(p_name_coll(idx)), 1, p_name_coll(idx)); else DBMS_LOB.WRITEAPPEND(v_name_string, length(p_name_coll(idx) || p_delimiter), p_delimiter || p_name_coll(idx)); end if; end loop; DBMS_LOB.CLOSE(v_name_string); return v_name_string; end name_coll_type_to_clob;
Finally I need to execute the query:
select name_coll_type_to_clob( cast( collect( cache order by timestamp ) as name_coll_type ) , ',' ) as some_list from test t where rownum < 10000000; Elapsed: 00:02:08.11 Statistics ---------------------------------------------------------- 302 recursive calls 89615909 db block gets 20002237 consistent gets 105463 physical reads 0 redo size 23606 bytes sent via SQL*Net to client 14131 bytes received via SQL*Net from client 66 SQL*Net roundtrips to/from client 0 sorts (memory) 1 sorts (disk) 1 rows processed
Oracle COLLECT function enables aggregating data into a collection retaining multiple records within a single row – similar to the nested tables.
CAST function converts input collection (result of COLLECT function) into another collection, where I need to specify exactly which one.
That’s why I have to use “as name_coll_type”.
More about SQL functions you can find in Oracle Database SQL Language Reference book, which is available at the following link:
As you can observe, SQL completes in 2 minutes and 8 seconds which is a very good result, when compared with two previous attempt, and taking into account extensive context switching – 10 million times between SQL and PL/SQL engine.
Can we do better?
Method 4 – apex_string – join_clob function
In this method I’m going to use Oracel APEX join_clob function.
For this to work, you need to first install APEX.
You can download APEX from the following link:
You can choose the latest version of APEX (at the time of writing it is 19.2) no matter which database version you are running (e.g. you can install APEX 19.2 on Oracle database 12c).
If you only need one language (english), you can download smaller version.
After downloading and unpacking zip file, you need to connect as SYSDBA to create tablespace and execute the installation script:
connect / as sysdba create tablespace apex datafile '/u01/app/oracle/oradata/ORCL/apex01.dbf' size 10m autoextend on next 10m maxsize unlimited extent management local autoallocate; --execute as SYSDBA @apexins.sql apex apex temp /i/ Position 1: Name of tablespace for Application Express application user – apex Position 2: Name of tablespace for Application Express files user - apex Position 3: Name of temporary tablespace or tablespace group - temp Position 4: Virtual directory for APEX images - i
You don’t need to create web listener if you just want to install APEX functions and you are not planning to run APEX apps.
Finally I can execute the query:
select apex_string.join_clob( cast( collect( rpad(t.cache, 2) order by t.timestamp ) as apex_t_varchar2 ) , '' , 12 /* dbms_lob.call */ ) as some_list from test t where rownum < 10000000; Elapsed: 00:00:24.33 Statistics ---------------------------------------------------------- 283 recursive calls 55942 db block gets 92872 consistent gets 29510 physical reads 0 redo size 23610 bytes sent via SQL*Net to client 14130 bytes received via SQL*Net from client 66 SQL*Net roundtrips to/from client 0 sorts (memory) 1 sorts (disk) 1 rows processed
Again I’m using cast/collect combination as in previous method (name_coll_type type/name_coll_type_to_clob function), but this time I added apex_string.join_clob function.
But this approach has brought the best result so far (from 2 minutes 8 sec → 24 sec).
Method 5 – json_arrayagg / json_value
This is the last method I’ll explain.
JSON_ARRAYAGG function takes as input column of SQL expression, converts it to JSON value and returns JSON array where each element is JSON value.
JSON_VALUE converts JSON value into the SQL value.
Here is the SQL code:
select json_value( replace( json_arrayagg( rpad(t.cache, 2) order by t.timestamp returning clob ) , '","' , '' ) , '$' returning clob ) as some_list from test t where rownum < 10000000; Elapsed: 00:00:14.92 Statistics ---------------------------------------------------------- 406 recursive calls 670738 db block gets 175185 consistent gets 29510 physical reads 0 redo size 23606 bytes sent via SQL*Net to client 14130 bytes received via SQL*Net from client 66 SQL*Net roundtrips to/from client 0 sorts (memory) 1 sorts (disk) 1 rows processed
The JSON method is even better than previous one, as elapsed time dropped from 24 sec → 15 sec.
There are many way to achieve desired functionality.
In this and previous posts I’ve listed five different methods.
While there are many ways to achieve the same goal, only one or two ways are usually scalable and fast.
As always, the key is to understand how, under the hood, some functionality is implemented.