Sowmya Kameswaran's Posts (4)


By Vassil Dimov, Sowmya Kameswaran and Eirini Kalogeiton

IBM Db2 for z/OS Data Gate 2.1 introduces a major new feature: Query Acceleration.

Why is this big news and what does it mean? For those familiar with IBM Db2 Analytics Accelerator for z/OS (IDAA), the ability to accelerate analytical queries that target Db2 for z/OS data by routing them to IDAA has been a game changer. With support for accelerating analytical queries by routing them to Db2 Data Gate, cloud-native applications that depend on business-critical Db2 for z/OS data can send queries from Db2 for z/OS through Data Gate to Db2 Warehouse in Cloud Pak for Data, in the private or public cloud.

The target database, IBM Db2 Warehouse, is optimized for analytics. Applications always connect to Db2 for z/OS to execute queries, and all result sets are returned through Db2 for z/OS, which makes query acceleration completely transparent to the client application. As a result, application performance can improve with minimal or no changes to the application itself, which is a key differentiator.

Not familiar with Db2 Data Gate? Learn more about it and how it is cloud enabling Db2 for z/OS data here.

Note: Query acceleration has been available as a Beta feature within IBM Db2 for z/OS Data Gate 2.0 since 2021.

Technical details

Query acceleration is supported only when Db2 Warehouse has been configured as the target database during provisioning of Db2 Data Gate. If the feature is enabled, the character encoding of the target database is kept the same as the encoding on Db2 for z/OS (EBCDIC). This ensures that a query routed to Db2 Warehouse for acceleration delivers the same result set as when it is executed by Db2 for z/OS.


Setting up Data Gate for Query Routing

1. Provision Data Gate with Query Routing enabled

To use the query routing feature with Db2 Data Gate, the first step is to select a Db2 Warehouse instance that is pre-provisioned on the Cloud Pak for Data platform and then enable the “Use this data gate instance for query acceleration” checkbox.

Enable query acceleration when creating an instance

2. Set up Db2 for z/OS for query acceleration

The following ZPARM has to be set so that query acceleration is enabled when the Db2 for z/OS subsystem starts:

ACCEL=AUTO

Additionally, you can define which queries can be routed by setting QUERY_ACCEL_OPTIONS. An example setting is:

QUERY_ACCEL_OPTIONS=(1,2,3,5,6)

3. Connect Db2 Data Gate to Db2 for z/OS source database


4. Manage query acceleration for Db2 Data Gate


5. Add tables from Db2 for z/OS to Db2 Data Gate


6. Load data from Db2 for z/OS to Db2 Data Gate and enable acceleration 

When adding the tables to Data Gate, users have the option to also load data and enable data synchronization. If this option is not exercised, then data load can be initiated from the table list on the Tables tab. Once the data load is complete, query acceleration can be enabled from the same location for the selected tables. 

7. Get information about queries being accelerated

Once query acceleration is enabled at both the Db2 Data Gate instance level and the table level, queries targeting Db2 for z/OS can be routed to Db2 Data Gate and executed against the data in the backend Db2 Warehouse. As with the IBM Db2 Analytics Accelerator for z/OS, this is enabled by issuing the SET CURRENT QUERY ACCELERATION ALL statement before running the query.

After the query is executed, corresponding monitoring information will be displayed in the Queries section of the Data Gate User interface.
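The sequence described above (set the special register, then run the query over the same Db2 for z/OS connection) could be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, `cursor` stands for any Python DB-API cursor connected to Db2 for z/OS, and no particular driver is implied. The mode list reflects the values Db2 for z/OS accepts for CURRENT QUERY ACCELERATION.

```python
# Values accepted by the CURRENT QUERY ACCELERATION special register.
VALID_MODES = ("NONE", "ENABLE", "ENABLE WITH FAILBACK", "ELIGIBLE", "ALL")


def set_acceleration_sql(mode="ALL"):
    """Return the SET statement for the requested acceleration mode."""
    # Normalize case and whitespace so "enable  with failback" is accepted.
    normalized = " ".join(mode.upper().split())
    if normalized not in VALID_MODES:
        raise ValueError(f"unsupported acceleration mode: {mode!r}")
    return f"SET CURRENT QUERY ACCELERATION {normalized}"


def run_with_acceleration(cursor, query, mode="ALL"):
    """Set the special register, run the query, and return all rows.

    `cursor` is assumed to be a DB-API cursor on a Db2 for z/OS connection
    (hypothetical; the driver is not specified in this post).
    """
    cursor.execute(set_acceleration_sql(mode))
    cursor.execute(query)
    return cursor.fetchall()
```

Because the application still talks only to Db2 for z/OS, nothing else in the client code needs to know whether the query was routed to Db2 Data Gate.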


Why static images? Let’s see some live action!

1*FkPEOyO1A5E-6Kd9BIfxWg.gif

See blog for information on all key Data Gate 2.1 functional highlights.

For information on installation and configuration, see the IBM Documentation for Data Gate.

In this blog we discuss the Capture Query Environment and Performance Data functionality of IBM Db2 Analytics Accelerator for z/OS (IDAA), which gathers query environment information to help understand and address query performance issues. With query acceleration being the primary business value proposition for IDAA, it is important to understand the environment in which a query executes in order to achieve optimal query performance. The need for a robust means of gathering this information is amplified when complex, resource-intensive analytical queries are executed on IDAA.
 
The capability to capture query environment and performance data has been supported since versions 7.5.1 and 7.5.3 of the IDAA server. As you may be aware, IDAA uses a Db2 Warehouse columnar engine as its datastore. When a query is eligible for acceleration, Db2 for z/OS sends it to IDAA for execution. In case of performance issues, collection of query environment and performance data can be enabled. When it is enabled, the queries are re-executed under the covers to gather the necessary information.
 
Data capture can be initiated by invoking the ACCEL_CONTROL_ACCELERATOR stored procedure with the appropriate XML command, or by using the existing Data Studio interface for administering IDAA.
 
Since the query is executed under the covers and may be long-running, the timeout value allows users to control how much longer the query is allowed to execute in order to capture environment data. This execution of the query with performance tracing enabled uses Db2 Warehouse resources and may impact the performance of other IDAA operations, depending on the nature of the query.
 
 
Once the capture has been initiated, save the trace with the option to also gather the manually collected trace information. This combines the query and performance trace data in the downloaded traces, which users can analyze themselves and share with the support teams for further analysis.
 
This doc along with the default IDAA trace can help provide actionable steps to resolve the query performance problem. Some examples include, but are not limited to the following:
 
This can also be automated in a batch process using the sample stored procedure ACCEL.GET_QUERY_HISTORY2 (see “How to store Db2 query history in a Db2 for zOS table”). With GET_QUERY_HISTORY2 running at a set interval, e.g. every 5 minutes, a batch application can set a threshold for total elapsed time. For queries that exceed this threshold, the application calls ACCEL_CONTROL_ACCELERATOR automatically to collect the query environment data and performance documents.
 
<?xml version="1.0" encoding="UTF-8" ?>
<aqt:controlCommand xmlns:aqt="http://www.ibm.com/xmlns/prod/dwa/2011"
 version="1.0">
    <getAdditionalSupport planID="0" traceCommand="CDE_PERFORMANCE_TRACE"
     timeoutInMinutes="0"/>
</aqt:controlCommand>


Then at another interval (e.g. every 24 hours), call ACCEL_CONTROL_ACCELERATOR to save the trace.
 
<?xml version="1.0" encoding="UTF-8" ?>
<aqt:controlCommand xmlns:aqt="http://www.ibm.com/xmlns/prod/dwa/2011" 
 version="1.0">
    <getTraceData keepConfiguration="true">
        <content>ACCELERATOR</content>
        <content>ADDITIONALSUPPORTINFO</content>
    </getTraceData>
</aqt:controlCommand>
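
The threshold step of such a batch job could be sketched in Python as follows. This is only an illustration, not IBM sample code: the record layout (`plan_id`, `elapsed_seconds`) and the function names are hypothetical, and the actual call to ACCEL_CONTROL_ACCELERATOR is left out. Only the XML shape is taken from the capture command shown above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the control commands shown above.
AQT_NS = "http://www.ibm.com/xmlns/prod/dwa/2011"
ET.register_namespace("aqt", AQT_NS)


def select_slow_queries(history, threshold_seconds):
    """Return the plan IDs of queries whose elapsed time exceeds the threshold.

    `history` is a hypothetical list of dicts, for example rows read from the
    table that GET_QUERY_HISTORY2 fills; the key names are assumptions.
    """
    return [row["plan_id"] for row in history
            if row["elapsed_seconds"] > threshold_seconds]


def build_capture_command(plan_id, timeout_minutes=0):
    """Build the getAdditionalSupport control command for one plan ID."""
    root = ET.Element(f"{{{AQT_NS}}}controlCommand", {"version": "1.0"})
    ET.SubElement(root, "getAdditionalSupport", {
        "planID": str(plan_id),
        "traceCommand": "CDE_PERFORMANCE_TRACE",
        "timeoutInMinutes": str(timeout_minutes),
    })
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample_history = [
        {"plan_id": 11, "elapsed_seconds": 4.0},
        {"plan_id": 12, "elapsed_seconds": 95.5},
    ]
    for plan_id in select_slow_queries(sample_history, threshold_seconds=60.0):
        # A real job would pass this string to ACCEL_CONTROL_ACCELERATOR.
        print(build_capture_command(plan_id, timeout_minutes=30))
```

Keeping the selection and the command building separate makes it straightforward to test the threshold logic without an accelerator attached.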
 
Note: Documents saved from capture query environment and performance data are automatically deleted on the accelerator after 7 days.

By Vassil Dimov, Mateo Tošić, Sowmya Kameswaran and Eirini Kalogeiton

Introduction

In the previous blog post we introduced Db2 for z/OS Data Gate and described it in detail. In this blog we discuss the integration between Db2 Data Gate and Watson Knowledge Catalog and highlight its business value.

AI Ladder

In the current business world, modernization of data and the use of AI are key to success. The guiding principles of the AI Ladder defined by IBM help organizations with business transformation based on the four key areas below:

  1. Collect — Make data simple and accessible (All data sources contribute to this pillar)
  2. Organize — Create a business-ready analytics foundation (Data governance services like Watson Knowledge Catalog)
  3. Analyze — Build and scale AI with trust and transparency (Watson Studio)
  4. Infuse — Operationalize AI throughout the business (this is what customers do with the data in their own products)

In this blog we will discuss how Db2 Data Gate and Watson Knowledge Catalog, representing the first two pillars of the AI Ladder, can help organizations to unlock the huge value of their Z data in the cloud.

About Db2 Data Gate

Db2 Data Gate enables modern high-volume, high-frequency hybrid cloud applications that need read-only access to valuable enterprise data from Db2 for z/OS. It plays a key role in the Collect pillar by enabling movement of data from Db2 for z/OS into the Cloud Pak for Data platform. With data synchronization between the source Db2 for z/OS and the target IBM Db2 or IBM Db2 Warehouse, applications get access to current data. To learn more about IBM Db2 for z/OS Data Gate, please read “What is Db2 Data Gate? Db2 Data Gate Blog Series Part 1”.

About Watson Knowledge Catalog

Watson Knowledge Catalog (WKC) is an enterprise data catalog management platform that forms the core of the Organize pillar of the Cloud Pak for Data platform. A catalog connects people to the data and knowledge that they need. It is the key enabler to building the enterprise data catalog on Cloud Pak for Data that enables platform users to find, prepare, understand, and use the data as needed. The data governance framework ensures that data access and data quality are compliant with your business rules and standards.

WKC unites all information assets into a single metadata-rich catalog, based on Watson’s understanding of relationships between assets and how they’re being used and socialized among users in existing projects. It is integrated with an enterprise data governance platform that merges the analytics capabilities of Watson Studio. The data catalog assists data scientists in easily finding, preparing, understanding and using the data as needed.

Data protection has gained importance in recent years. That is why it is so important that WKC protects data from misuse and enables sharing of assets with automated, dynamic masking of sensitive data elements. This avoids violating various data protection regulations. For instance, when handling healthcare data in the USA, companies need to be aware of HIPAA (Health Insurance Portability and Accountability Act), a set of rules on how personally identifiable information maintained by the healthcare and healthcare insurance industries should be protected from fraud and theft. Moreover, any company based in the EU or offering services to people in the EU must comply with GDPR (General Data Protection Regulation), which has a much broader scope and governs the use of all personal data.

Db2 Data Gate and Data Fabric

Db2 for z/OS data is core to many operational applications but also foundational to business insight. Db2 for z/OS data is some of the most indispensable within an organization for a data fabric implementation. The end-to-end Db2 Data Gate solution makes data available and synchronized for end user access within a data fabric. As compared to alternative approaches, Db2 Data Gate can be simpler, less expensive, and timelier. With Db2 Data Gate organizations can go from transaction to business action in moments.

Db2 Data Gate 2.1 — Watson Knowledge Catalog integration highlights

Watson Knowledge Catalog provides fine-grained control of data from various sources to users who need access to them. While administrators have the most permissions, data scientists and developers can only access data that is published to catalogs. Business analysts can, in addition to that, view data quality and access information asset views, while data engineers and data stewards can discover assets, import metadata, and access governance artefacts. The benefits are numerous for different user personas.

With the combination of Data Gate and WKC, data scientists and software engineers can explore the most important enterprise data coming from the mainframe and use all the tools they are familiar with in the cloud for analysis, modelling, and prototyping. They can benefit from tools such as schema structure discovery to further accelerate the development of models and applications. They do not even need to look for connection metadata, since all assets are cataloged and accessible in just a few clicks.

Data stewards, on the other hand, can easily work on data quality using governance artifacts such as business terms, business glossaries, classifications, and automatic data profiling in WKC. They can define which columns from Db2 for z/OS are visible to whom in the cloud. More importantly, they can take care of the regulations mentioned above (GDPR, HIPAA, etc.). This is especially significant for data coming from Db2 for z/OS, a data store that contains the most sensitive customer data. On top of that, they can use rules such as automatic data deletion, triggered once data on Z is deleted (e.g., customer-related analysis that needs to be deleted once the customer leaves the company).

In addition to the above, another key benefit is the ability to understand and track data lineage, that is, the journey made by the data from its source through any transformations all the way to its usage. Data lineage is very important for making sure that the data comes from the right source, is handled by the right people, undergoes the right transformations and lands in the right target. When Db2 for z/OS data is brought into the platform by Db2 Data Gate and then discovered and imported into the catalog, the data lineage can easily be maintained, allowing data custodians to keep track of data all the way from the source. Last but not least, Db2 Data Gate makes it possible to discover schema changes, which can be maintained in the data lineage of the data assets in WKC.

Steps to connect Data Gate with WKC

One of the major highlights of Data Gate 2.1 is the direct integration with WKC. Publishing data assets and their metadata to the catalog is now as simple as the click of a button on the Data Gate dashboard.

Users can publish to the catalog whenever new objects are added from the Db2 for z/OS source to Data Gate. Without any manual setup or configuration, the Publish action does all the work behind the scenes, from creating the catalog to publishing the assets and making them available for access.

Create Data Profile

(Benefit 1 — for Data Steward)

Data Profiles include generated metadata and statistics about the content of a data asset. An asset profile helps data stewards understand what actions to take to improve the data quality.


Use the data assets in Watson Studio

Create a project and add data assets

(Benefit 2 — Data Scientist / Software Engineer)

If you go to Watson Studio and create a new project (or use an existing one), you can add data assets from this catalog to it.


Create a notebook and load an asset into a data frame

Then you can use that asset for data analysis and modeling. In a Python/R notebook you will get an automatically generated block of code. Watson Studio will ask WKC for data and WKC will use the connection metadata to retrieve the data from the database. You can use the loaded data as a data frame.


Conclusion

We have briefly described Db2 Data Gate and introduced Watson Knowledge Catalog, and pointed out the benefits of their integration. The step-by-step videos showed how to set up the integration yourself. To make the benefits more concrete, we went through a couple of example scenarios that are likely to resemble your own usage flow.

For further reading, please check Daniel's blog post “Use Current Db2 for z/OS Data on Cloud, Without Direct Mainframe Access and Without Loosing Control Over Your Data”.


By Sowmya Kameswaran and Jens Müller

 

As we all know, Db2 for z/OS has one of the largest footprints in the enterprise database world. Your organization may have all or most of its business-critical data on Db2 for z/OS (you are not alone, and we wouldn't recommend that you change a thing about that!). We realize, though, that many organizations are experimenting with hybrid cloud, or rearchitecting and extending their infrastructure to take advantage of it.
With the changing environment and the growing importance of hybrid cloud, modern cloud-based applications need easy, secure access to this data. IBM Db2 for z/OS Data Gate makes data from Db2 for z/OS readily accessible on the IBM Cloud Pak for Data platform for business users and application developers.
 
As-is scenario
We have embarked on an exciting era of compelling modern application development. There is a surge in both reporting applications requiring read-only access to transactional data as well as data-intensive analytics applications requiring access to historical data. Organizations are developing new compelling applications for differentiating services delivered to their customers. Since much of this data originates in Db2 for z/OS, many organizations have built custom ETL (extract, transform, load) jobs to extract and load this data into other databases to support their application needs. While this approach may work in the short term, some of the problems with this approach are:
 
  1. Expensive to create and maintain over the course of time (due to complexity, and the costs of synchronizing source and target databases, and ensuring transactional consistency if necessary)
  2. Data security concerns once data is moved from where it originates
  3. Increased operational processes and cost on IBM Z
 
Why Db2 Data Gate
  • It is an integrated solution to securely access data from Db2 for z/OS on the cloud without the need for direct access to Db2 for z/OS.
  • Avoids significant investment in building and maintaining custom ETL solutions to move Db2 for z/OS data.
  • Provides better data currency via the Integrated Synchronization feature that replicates data from Db2 for z/OS to IBM Cloud Pak for Data.
  • Significantly reduces the operational cost of data replication on the mainframe, since 96% of the underlying data synchronization technology is zIIP eligible.
  • Enables modernization and transformation in your enterprise's Journey to Cloud.
  • With Integrated Synchronization, the data availability for applications accessing data from the source is not affected (source tables are fully online for reading and writing) while data is replicated to the target.
  • HA/DR is built directly into IBM Cloud Pak for Data 

 

Architecture highlights


 

Db2 Data Gate is based on Db2 (row store) or Db2 Warehouse (column store) as the target database within IBM Cloud Pak for Data. This model makes it suitable for applications that require row-level access as well as analytical applications that benefit from a column-based data store. Only one Db2 for z/OS database can be used as the data source. The key aspect of the architecture is the Integrated Synchronization feature, which is optimized to replicate data from Db2 for z/OS to Db2 running under IBM Cloud Pak for Data.
 
With Db2 Data Gate and the target database running on the IBM Cloud Pak for Data platform, the solution works wherever the platform is able to run – private, public or hybrid cloud implementations, thus making relevant data readily available to application developers and business users where they need it.
 
Lab performance benchmarks
Db2 Data Gate boasts unrivaled performance when compared to any other data synchronization tool synchronizing data from Db2 for z/OS to Db2 (Warehouse).
 
With IBM Cloud Pak for Data and Db2 Data Gate installed on Linux on IBM Z, using Db2 Warehouse as the target database using hostPath data storage:
 
  • Peak load performance (for making initial copy): 2.1 TB/h
  • Peak synchronization performance: 200k rows/sec at 1.2 secs peak latency
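
To get a rough sense of scale, the two peak figures above can be turned into simple back-of-the-envelope estimates. The sketch below is illustrative only: the function names and the 10.5 TB example are hypothetical, and because the inputs are lab peak rates, the results are best-case bounds rather than predictions for any particular environment.

```python
# Lab peak figures quoted above.
PEAK_LOAD_TB_PER_HOUR = 2.1
PEAK_SYNC_ROWS_PER_SEC = 200_000


def min_initial_load_hours(data_tb):
    """Best-case hours for the initial copy if the peak load rate were sustained."""
    if data_tb < 0:
        raise ValueError("data volume must not be negative")
    return data_tb / PEAK_LOAD_TB_PER_HOUR


def max_rows_per_day():
    """Upper bound on rows synchronized per day at the peak rate."""
    seconds_per_day = 24 * 60 * 60
    return PEAK_SYNC_ROWS_PER_SEC * seconds_per_day


if __name__ == "__main__":
    # Hypothetical 10.5 TB source, roughly 5 hours at the peak rate.
    print(f"{min_initial_load_hours(10.5):.1f} h")
    print(f"{max_rows_per_day():,} rows/day")
```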
 
Db2 Data Gate does not:
  • Replace traditional transactional systems since it only provides read-only data access to source data
  • Serve as a replacement for other data replication technologies because it supports only one source and one target and also does not support bidirectional replication or data transformation
  • Support data versioning and hence is not an operational data store
  • Guarantee data currency for high-volume transactional workloads. It is not a carbon copy of the source data, which means that applications requiring absolute currency should access data at its point of origin.
 
User Interface snapshots
 
Db2 Data Gate provisioning
Select the target database type and deployment, resource allocation and network routing to proceed with creation of the Db2 Data Gate instance.
 
Setting up source
Once the instance is created, the first step is to point the Db2 Data Gate instance to the Db2 for z/OS subsystem to be used as data source.
 
 
 
Select and add tables from source to target
The next step is to select the tables to synchronize data from source to target.
 
 
Db2 Data Gate dashboard
Overview of status and activities associated with the provisioned Db2 Data Gate instance.
 
 
Db2 Data Gate in action
The video below demonstrates the Db2 Data Gate end user experience. One of the key aspects to note is that when the source tables are added, loaded and set up for synchronization with Db2 Data Gate, there is no impact on concurrent workloads executing on the source tables. The source tables are fully online for reading and writing while Db2 Data Gate makes the copy and starts synchronizing.
 
 
Our next blog ...
Next time we will look at Db2 for z/OS Data Gate and Watson Knowledge Catalog integration.