IBM Db2 - The Ultimate Database for Cloud, Analytics & Mobile
I have an ETL stored procedure that takes the following input parameters and performs the following steps:
IN SOURCE_SCHEMA VARCHAR(128)
, IN SOURCE_TABLE VARCHAR(128)
, IN TARGET_SCHEMA VARCHAR(128)
, IN TARGET_TABLE VARCHAR(128)
, IN EXCEPTION_SCHEMA VARCHAR(128)
, IN EXCEPTION_TABLE VARCHAR(128))
1. Check for delta rows between the source table and the target table (where a certain flag
column = 'N' in the source table).
2. Delete any existing rows from the exception table (so that only the latest exceptions are stored).
3. Update the audit tables.
4. Based on certain conditions, set this delta flag column to 'P', 'U' or
'S' in the source table (where the delta flag column = 'N').
5. Load the rows where the delta flag column = 'P' into the target table.
6. Write all these steps (along with any exceptions) to a log file, then e-mail
that log file to the respective recipients.
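Put together, the flow above might be skeletoned roughly like this (the procedure name is hypothetical and the body lines are placeholders, not the actual implementation):

```sql
-- Hypothetical SQL PL skeleton of the procedure described above
CREATE OR REPLACE PROCEDURE ETL_DELTA_LOAD (
    IN SOURCE_SCHEMA    VARCHAR(128)
  , IN SOURCE_TABLE     VARCHAR(128)
  , IN TARGET_SCHEMA    VARCHAR(128)
  , IN TARGET_TABLE     VARCHAR(128)
  , IN EXCEPTION_SCHEMA VARCHAR(128)
  , IN EXCEPTION_TABLE  VARCHAR(128) )
BEGIN
  -- 1. identify delta rows (flag column = 'N' in the source)
  -- 2. purge the exception table so only the latest exceptions remain
  -- 3. update the audit tables
  -- 4. set the delta flag to 'P', 'U' or 'S' where it is currently 'N'
  -- 5. load the rows flagged 'P' into the target table
  -- 6. write the log file and e-mail it to the recipients
END
```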
There are a couple of other steps between step 5 and step 6, but they are not related to the step I need help with. Currently, the load statement is built dynamically:
SET STATEMENT = '';
SET STATEMENT =
    'LOAD FROM (SELECT ' || SELECT_COLS
 || ', CURRENT_TIMESTAMP AS REC_CRTN_TS, ''' || ETL_JOB_ID || ''' AS JOB_ID'
 || ' FROM ' || SOURCE_SCHEMA || '.' || SOURCE_TABLE
 || ' WHERE PROC_RAW_FG = ''P'' ' || ORDER_BY_CLAUSE
 || ' FOR FETCH ONLY) OF CURSOR MESSAGES ON SERVER'
 || ' INSERT INTO ' || TARGET_SCHEMA || '.' || TARGET_TABLE
 || ' FOR EXCEPTION ' || EXCEPTION_SCHEMA || '.' || EXCEPTION_TABLE;
CALL DBMS_OUTPUT.PUT_LINE(STATEMENT); -- for logging purposes into a file
CALL UTL_FILE.PUT_LINE(OUTFILE,UTL_TCP.CRLF||'LOAD STATEMENT :'||UTL_TCP.CRLF||STATEMENT); -- for logging purposes into a file
SELECT_COLS is a variable that holds the comma-separated source fields, with logic to TRIM char/varchar fields, for the select clause applied to the target table. It looks something like this (the column list is huge, so only a few are included): TRIM(TSO_SPLIT_NO) AS TSO_SPLIT_NO, QTY_SPLIT, TRIM(HIGH_PRTY_INDC) AS HIGH_PRTY_INDC, TRIM(QTE_MATCH_CNTL_CD) AS QTE_MATCH_CNTL_CD
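For reference, a list of that shape could also be generated from the catalog rather than stored; a sketch assuming Db2 LUW's SYSCAT.COLUMNS view and the LISTAGG function (the schema and table names are placeholders):

```sql
-- Build a SELECT_COLS-style list, wrapping character columns in TRIM()
SELECT LISTAGG(
         CASE WHEN TYPENAME IN ('CHARACTER', 'VARCHAR')
              THEN 'TRIM(' || COLNAME || ') AS ' || COLNAME
              ELSE COLNAME
         END, ',')
         WITHIN GROUP (ORDER BY COLNO) AS SELECT_COLS
FROM   SYSCAT.COLUMNS
WHERE  TABSCHEMA = 'MYSCHEMA'   -- placeholder source schema
AND    TABNAME   = 'MYTABLE';   -- placeholder source table
```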
This column list also depends on which source and target tables are passed as inputs; the following logic extracts the column list:
SET STATEMENT = '';
SET STATEMENT =
    'SET (?,?,?,?,?) = (SELECT SELECT_COLUMNS, ORDER_BY_CLAUSE, PRIMARY_KEY_COLUMNS, SUSP_REC_WHERE_CLAUSE, SOURCE_COLS'
 || ' FROM schemaName.tableName'
 || ' WHERE SOURCE_SCHEMA = ''' || SOURCE_SCHEMA || ''''
 || ' AND SOURCE_TABLE = '''  || SOURCE_TABLE  || ''''
 || ' AND TARGET_SCHEMA = ''' || TARGET_SCHEMA || ''''
 || ' AND TARGET_TABLE = '''  || TARGET_TABLE  || ''')';
PREPARE C_STATEMENT FROM STATEMENT;
EXECUTE C_STATEMENT INTO SELECT_COLS, ORDER_BY_CLAUSE, PRIMARY_COLS, SUSP_REC_WHERE_CLAUSE, SOURCE_COLS;
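A hypothetical shape for the mapping table queried above, inferred from the column names in that statement (the real schemaName.tableName and its exact data types are not shown in the post):

```sql
-- Hypothetical control/mapping table; names and sizes are assumptions
CREATE TABLE ETL_CTRL.TABLE_MAPPING (
    SOURCE_SCHEMA         VARCHAR(128)  NOT NULL
  , SOURCE_TABLE          VARCHAR(128)  NOT NULL
  , TARGET_SCHEMA         VARCHAR(128)  NOT NULL
  , TARGET_TABLE          VARCHAR(128)  NOT NULL
  , SELECT_COLUMNS        CLOB(1M)              -- TRIM()-wrapped column list
  , ORDER_BY_CLAUSE       VARCHAR(1000)
  , PRIMARY_KEY_COLUMNS   VARCHAR(1000)
  , SUSP_REC_WHERE_CLAUSE VARCHAR(4000)
  , SOURCE_COLS           CLOB(1M)
  , PRIMARY KEY (SOURCE_SCHEMA, SOURCE_TABLE, TARGET_SCHEMA, TARGET_TABLE)
);
```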
ETL_JOB_ID is a variable that holds a dynamically generated value based on the following:
SET MAIN_STEP_CD = THIS_PROC||'.'||TARGET_TABLE;
SET ETL_JOB_ID = MAIN_STEP_CD ||'_'|| TO_CHAR(BEGIN_TS,'YYYYMMDDHH24MISS');
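As an illustration only (the procedure and table names here are made up), those two assignments would resolve like this:

```sql
-- With THIS_PROC = 'SP_ETL_LOAD', TARGET_TABLE = 'ORDERS'
-- and BEGIN_TS = TIMESTAMP '2024-03-15 09:30:45':
--   MAIN_STEP_CD => 'SP_ETL_LOAD.ORDERS'
--   ETL_JOB_ID   => 'SP_ETL_LOAD.ORDERS_20240315093045'
```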
Now, since we are using the Db2 LOAD utility, this step often causes blocking: LOAD locks the entire target table, not just the rows it is working with. Because of that, I was instructed to use a cursor to insert the records into the target table instead, of course without hard-coding any of the names, as that would make this stored procedure non-reusable.
I do have something in mind, but I am having difficulty constructing the actual cursor. More specifically: how do I fetch dynamically determined column values and then dynamically insert them into the target table? Also, I couldn't think of a way to catch exceptions (like the FOR EXCEPTION clause of the LOAD statement does). I was thinking it would look something like this:
DECLARE EOFR INT DEFAULT 0;
DECLARE CR_SELECT_COLS ANCHOR DATA TYPE TO schemaName.tableName.SELECT_COLUMNS;
DECLARE CR_REC_CRTN_TS TIMESTAMP;
DECLARE CR_ETL_JOB_ID ANCHOR DATA TYPE TO schemaName.tableName.ETL_JOB_ID;
DECLARE STMT VARCHAR(4000);
DECLARE STMT2 VARCHAR(4000);
DECLARE C1 CURSOR FOR S1;
DECLARE CONTINUE HANDLER FOR NOT FOUND
    SET EOFR = 1;
SET STMT = 'SELECT ' || SELECT_COLS || ', CURRENT_TIMESTAMP AS REC_CRTN_TS, ''' || ETL_JOB_ID || ''' AS JOB_ID FROM ' || SOURCE_SCHEMA || '.' || SOURCE_TABLE || ' WHERE PROC_RAW_FG = ''P'' ' || ORDER_BY_CLAUSE || ' FOR FETCH ONLY';
PREPARE S1 FROM STMT;
SET STMT2 = 'INSERT INTO ' || TARGET_SCHEMA || '.' || TARGET_TABLE || ' VALUES (?,?,?)';
PREPARE S2 FROM STMT2;
OPEN C1;
FETCH FROM C1 INTO CR_SELECT_COLS, CR_REC_CRTN_TS, CR_ETL_JOB_ID;
WHILE EOFR = 0 DO
    EXECUTE S2 USING CR_SELECT_COLS, CR_REC_CRTN_TS, CR_ETL_JOB_ID;
    FETCH FROM C1 INTO CR_SELECT_COLS, CR_REC_CRTN_TS, CR_ETL_JOB_ID;
END WHILE;
CLOSE C1;
But I think this is not going to work, as CR_SELECT_COLS will not be able to hold multiple column values. That said, based on my Db2 knowledge (which is limited at the moment, since the Db2 LUW platform is not my expertise yet), this is the only way I could think of to implement a cursor. If you think there is a better way, please feel free to share. Thanks in advance!
Does LUW have a concurrent LOAD that actually does an INSERT under the covers? That could be an easy solution if this were on z/OS.
Can you explain what types of exceptions are expected? Will an entire run fail, or will individual rows have problems? What is the current technique that determines the delta between source and target? That could be harder to do than the load of the selected rows.
What is the maximum number of rows to be inserted?
It is going to be easier to construct an INSERT FROM SELECT that inserts multiple rows at a time: either all of them, or batches of n rows using FETCH FIRST n ROWS ONLY in a loop, or even the case n = 1. INSERT VALUES after a FETCH from a cursor is probably too hard to code.
Is there something to be updated in the source table to indicate that the delta 'P' rows have already been copied? If not, it is tricky to SELECT only the rows that have not already been copied. Perhaps INSERT from SELECT cols FROM source EXCEPT SELECT cols FROM target FETCH FIRST n ROWS ONLY.
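The INSERT-FROM-SELECT idea could be sketched with the same variables the poster already builds. Note that, unlike LOAD ... FOR EXCEPTION, an error here aborts the whole INSERT rather than diverting the offending rows to an exception table; capturing individual bad rows would require row-at-a-time processing. A rough sketch under those assumptions:

```sql
-- Set-based alternative to the LOAD statement (variables as in the
-- poster's procedure); the handler only flags failure, it does not
-- reproduce LOAD's per-row FOR EXCEPTION behaviour.
DECLARE LOAD_FAILED INT DEFAULT 0;
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET LOAD_FAILED = 1;

SET STATEMENT = 'INSERT INTO ' || TARGET_SCHEMA || '.' || TARGET_TABLE
             || ' SELECT ' || SELECT_COLS
             || ', CURRENT_TIMESTAMP, ''' || ETL_JOB_ID || ''''
             || ' FROM ' || SOURCE_SCHEMA || '.' || SOURCE_TABLE
             || ' WHERE PROC_RAW_FG = ''P''';
EXECUTE IMMEDIATE STATEMENT;
-- If LOAD_FAILED = 1 here, log the failure and the statement text,
-- e.g. via the existing UTL_FILE.PUT_LINE logging.
```

Batching would mean appending FETCH FIRST n ROWS ONLY (with the EXCEPT-against-target guard suggested above) and re-executing until no rows qualify.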
Your question probably has too many complex parts. Simplifying it might help.