New features are being continuously added to the OMEGAMON for Db2 Performance Expert Enhanced 3270 User Interface.  A continued investment in the 3270 UI demonstrates a commitment to synergy and integration with the entire OMEGAMON family of products. This blog describes the latest feature introduced through the version 5.4.0 service stream.

Introduction

Deadlocks and timeouts are a fairly routine occurrence in Db2 as applications vie for potentially the same resource.  A deadlock occurs when two or more resources acquire an exclusive lock on the same object – typically to write to a table.  A timeout is a built-in safety mechanism in Db2 (defined by ZPARM IRLMRWT) that cancels the threads in a deadlock situation.   Any Db2 thread (OLTP, batch) can have contention for resources that cause deadlocks and timeouts to occur.  It is important to find the cause of these contentions and reduce their occurrence to keep Db2 applications running efficiently.  OMEGAMON Db2 Performance Expert on z/OS has had the ability to record and display Deadlocks and Timeouts on the Performance Expert Client for some time.  Now that same data will be on the Enhanced 3270 UI.

Requirements

The PTF UI57769 for APAR PI94904 must be installed on the 5.4.0 release of OMEGAMON for Db2. Parmgen may need to be run to configure collection of Events Deadlock and Timeout.  See Parmgen parameters KD2_OMPE_DB2_EVENT, KD2_OMPE_DEADLOCK and KD2_OMPE_TIMEOUT.

Accessing Deadlock and Timeout Events

The remaining sections of this blog explain the panels that are especially useful to newer users of the OMEGAMON family of products. As deadlocks and timeouts are considered “events” in Db2, the enhanced 3270 UI Events are accessed by a new selection code ‘E’ from the Enterprise Summary for one single Db2 subsystem or selection code ‘E’ from the DB2 Main Screen for Data Sharing groups to display events from an entire DSG. Selecting ‘E’ will take you to the filter workspaces, where filter values can be entered to focus and limit the events you are looking for to find resource contention problems.

Filter Workspaces

The first filter workspace is time ranges:

9524603495?profile=original

Using this filter workspace, you can narrow down a time range by minutes, hours or a specific date and time range, so you can find contentions.

From any of the filter workspaces, click the ‘OK’ button when you have completed entering your filters.  The filters you entered will remain set for the entire logon session. Clicking the ‘Clear’ button on any of the filter workspaces will restore filters back to their default values.

Clicking on the Thread ID tab will bring you to the Thread Identification filter screen:

9524603877?profile=original

On this screen, you can enter any identification fields to limit the contentions you are looking for.

The asterisk (*) can be used for wild carding for multiple characters at the end of the field. 

Clicking on the End User Tab will bring to the End User Filter screen.

9524604081?profile=original

Here, an asterisk (*) can be used at the end of the field for wildcarding.

Clicking OK will navigate to the Events Summary.

Event Summary

This is what the Events Summary looks like for a single Db2:

9524604653?profile=original

For a Data Sharing Group, you will see something like this:

9524604694?profile=original

Selection from the Summaries navigates to either Deadlock Detail or Timeout Detail depending on which type of Event the ‘S’ selection is entered on. The detail screens are the same from both Summaries.

Deadlock Detail

The deadlock detail shows the resource type and a list of waiters and blockers.

9524603275?profile=original

Using the ‘S’ select code shows more detail about the deadlock:

9524603296?profile=original

The Waiter and Blocker SQL Statement ID are zoom columns. By Clicking on one of these field names you will navigate to the Dynamic or Static SQL cache to display statement information including statement text. This can only be displayed if the SQL statement is still present in the SQL cache.

Deadlock SQL Cache

This screen displays if the statement is in the SQL cache:

9524605852?profile=original

You can then navigate to more details about the SQL statement in the cache including the SQL text.

9524606073?profile=original9524606688?profile=original

What do I do about deadlocks?*

Deadlocks can often be prevented by changing applications to modify resources in the same sequence.  Row level locking can sometimes help prevent deadlocks by reducing the amount of data being locked and preventing two applications which are updating the same page but not the same row from conflicting with each other.

A deadlock is detected by Db2 in milliseconds and Db2 will usually resolve the situation very quickly. But if a deadlock occurs at the end of long running application unit of work with a lot of updates without frequent commits, there will be an expensive ROLLBACK processing before normal processing can continue.

*From Db2 manual Managing Performance.   Deadlock detection scenarios https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/perf/src/tpc/db2z_managingdb2perf.html

Timeout Details

If you use ‘S’ to select from one of the summaries on a timeout type of events, you will navigate to the Timeout Detail screen.

9524606287?profile=original

This shows the identity of the thread that timed out, the resource causing the contention, and information about the blocker that was holding the resource.

You can then select the blocker to get more detail.

9524607462?profile=original

Clicking on the Statement ID field name will navigate you to the SQL cache to display more information about the SQL statement that caused the contention. Same as with the Deadlock detail.

What do I do about timeouts?**

For timeouts, there are several things to look for.

  1. Check the LOCKRULE on the tablespace. If it has LOCKSIZE TABLE or TABLESPACE and the application does not need to lock the entire table or tablespace, the LOCKSIZE could be changed to PAGE or ANY. Applications should not lock more resources than are nessassary.
  2. Check the number of LOCKS PER TABLESPACE) NUMLKTS system parameter, which was set in DSNZPARM. If many page or row locks are acquired and held, a small value for LOCKS PER TABLE(SPACE) might cause lock escalation. You can resolve the timeout by changing the value on the NUMLKTS DSNTIPK update panel or by specifying the RELEASE(COMMIT) bind options and committing changes more frequently.
  3. Check EXPLAIN output for the failing SQL statement and examine the value of the PLAN_TABLE.TSLOCKMODE column for every table or table space. If the competing applications are attempting to obtain incompatible locks, you might be able to resolve the timeout by running the applications sequentially rather than concurrently. You might also resolve the problem by changing lock size and the following bind options: RELEASE and ISOLATION.
  4. Check the RELEASE bind option for the application. Binding with RELEASE(DEALLOCATE) causes partition, table, table space, and DBD locks to be held longer than binding with RELEASE(COMMIT), possibly causing a timeout.
  5. Issue the DISPLAY DATABASE command and specify the LOCKS keyword during program execution. By doing so, you can verify that only expected locks are held when the timeout occurs. If unexpected locks are held, you might also resolve the problem by changing lock size and the following bind options: RELEASE and ISOLATION.
  6. Check the ISOLATION bind option of the application. The isolation level affects whether locks are acquired and how long they are held.
  7. You may want to increase the wait time if you know that your applications must keep resources longer than the specified time. You can increase the limit by specifying a new IRLM locked resource wait time limit. You can take one of the following actions to make the change:
    • Use the parameter in the DSN6SPRM assembler macro in the DSNTIJUZ job stream Update the 'wait-time' field of the Db2 installation IMS Resource Lock Manager panel, IRLMRWT. You must also specify all other parameters contained in this version of the CSECT. Next, take one of the following actions:
    • Resubmit installation job DSNTIJUZ with the link-edit SYSIN file 'NAME' parameter that matches the -START DB2 'PARM=' parameter.
    • Reassemble DSN6SPRM and relink-edit DSNZPARM by resubmitting installation job DSNTIJUZ. The link-edit SYSIN file 'NAME' parameter must match the -START DB2 'PARM=' parameter.

**From Db2 manual Managing Performance.  Investigating and resolving timeout situations https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/perf/src/tpc/db2z_managingdb2perf.html

Conclusion

Deadlocks and Timeouts are important events to monitor for and resolve to ensure optimal application throughput. This new Enhanced 3270 UI feature will enable users to quickly locate threads that were involved in contentions and diagnose and resolve their problems. 

E-mail me when people leave their comments –

You need to be a member of WorldofDb2 to add comments!

Join WorldofDb2