Is there a future for Analytics on IBM Z? - Machine Learning on z/OS

This post is part of a multi-part series published by Aymeric Affouard, Guillaume Arnould, Khadija Souissi and Leif Pedersen. Here are the links to the previous entries:

Part 1: Is there a future for Analytics on IBM Z?

Part 2: IBM Db2 Analytics Accelerator

What’s Driving Digital Transformation?

Customers are not as loyal as they used to be. As a customer, if you don’t get the experience you are looking for, you look for alternatives. Your customers will rate your digital transformation based on that experience, and the customers you retain help drive your business.

How quickly can you deliver something new to the market? You might have all the data, but you need to make that data available. You might have collected and consolidated data from different places (customer data, credit card information) in order to get a 360° view of the customer. As you add more and more data to this kind of repository, challenges arise, such as how fast you can make the data available in a format from which you can draw conclusions and shape the customer experience.

Last but not least, you want to simplify and streamline the whole workflow around data provisioning, not only for your customer base but also for your partners (business partners or others).

Machine learning helps drive improved customer service through personalization, increases employee productivity by helping people make the best decision at the right time, and serves as a foundation for innovation by uncovering customer behaviors and product uses that you may not be aware of.

According to a Forrester Research study: “Insight-driven businesses bring insight, not just data, into every decision, and they know exactly how to use them for the greatest advantage across the entire customer life cycle. For these firms, digital insights and what they do with them are their secret weapons --- to disrupt your market and steal your customers”.

That statement really encapsulates what Machine Learning can bring to your organization.

But What Is Machine Learning, Really?

Let’s start with the basics. It feels like you can hardly listen to a podcast or read a blog or technical publication without coming across Machine Learning. It powers everyday services from Watson, Siri, Facebook, and Google. But what is it? In short, it’s a way for computers to learn without being explicitly programmed. Or, said another way, it’s software that can write software.

It is not exactly a new idea: Santiago Ramón y Cajal described biological neural networks around 1900, and AI followed in the 1950s. But it finally works! Or at least, it works in specialized forms: language translation, language understanding, speech transcription, object detection, face recognition, and so on.

There are different types of Machine Learning, such as:

Classification, where labeled data points are used to predict a category. It can be two-class or multi-class. Typical examples are fraud detection (fraud vs. non-fraud) and spam email detection (spam vs. non-spam).

Regression, where a numeric value is predicted, for instance stock price prediction.

Clustering, where data points are not labeled. The goal is to group the data into clusters in order to organize it better.

To perform these calculations, Machine Learning uses a wide variety of mathematical algorithms to process the input data. These algorithms are classified into three groups:

Supervised Learning, which is task driven and uses regression and classification techniques.

Unsupervised Learning, which is data driven and uses clustering techniques.

Reinforcement Learning, in which algorithms learn how to react to an environment.
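To make the distinction concrete, here is a minimal sketch in plain Python of the first two groups. It is only an illustration under simplifying assumptions: the data is one-dimensional and invented, and the three helper functions (`fit_centroids`, `fit_line`, `two_means`) are toy stand-ins for real classification, regression and clustering algorithms. Reinforcement Learning is not covered here.

```python
from statistics import mean

# --- Supervised: classification with labeled data points (nearest centroid) ---
def fit_centroids(points, labels):
    """Return {label: mean value} computed from labeled 1-D training points."""
    grouped = {}
    for x, y in zip(points, labels):
        grouped.setdefault(y, []).append(x)
    return {label: mean(values) for label, values in grouped.items()}

def classify(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

# --- Supervised: regression predicts a numeric value (least-squares line) ---
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

# --- Unsupervised: clustering groups unlabeled points (1-D k-means, k=2) ---
def two_means(points, iterations=10):
    """Split unlabeled 1-D points into two clusters around moving centers."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        clusters = [[], []]
        for x in points:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

# Toy transaction amounts; the labels are invented for illustration.
amounts = [12.0, 15.0, 9.0, 950.0, 1020.0, 880.0]
labels = ["non-fraud", "non-fraud", "non-fraud", "fraud", "fraud", "fraud"]

centroids = fit_centroids(amounts, labels)
prediction = classify(centroids, 900.0)                           # classification
slope, intercept = fit_line([1, 2, 3, 4], [2.0, 4.0, 6.0, 8.0])   # regression
clusters = two_means(amounts)                                     # clustering, labels not used
```

Note that the clustering step never sees the labels: it recovers the two groups from the amounts alone, which is what "data driven" means in practice.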

IBM Machine Learning for z/OS

Now let’s focus specifically on IBM Machine Learning for z/OS.

Data science efforts represent a significant investment in skills, time, hardware and software. Data scientists and the business teams that depend on them want to get the most out of the models that they develop.

To help data scientists build, deploy and monitor behavioral models, IBM introduced Machine Learning for z/OS.  Machine Learning for z/OS is an enterprise grade, collaborative and extensible machine learning offering. It runs on IBM Z and benefits from all of the advantages mentioned before.

As it provides faster model development, deployment and monitoring, Machine Learning for z/OS offers a quick return on investment. 

Through a hybrid cloud approach to model life-cycle management and collaboration, it gives data science teams the flexibility to train and evaluate their models on their platform of choice. Organizations that develop models on other platforms, using Spark or Python, can easily deploy these models on IBM Z, where the majority of their transactions occur and most of their enterprise data originates.

Scoring can be easily integrated with transactional applications, without significant overhead, enabling real time insight at the point of interaction.

Machine Learning for z/OS is built on open source components but delivers far more than open source does. IBM's unique patented features enable data science teams to collaborate productively.

  • For example, data engineers and application developers work in SQL, Java and sometimes COBOL, while data scientists work in Scala, Python, R, SPSS and SAS. Collaboration between these team members is essential. With Machine Learning for z/OS they can share assets, simplify deployment via RESTful APIs, and facilitate the management and monitoring of hundreds to thousands of enterprise models.
  • Machine Learning for z/OS can evaluate the performance of models as they are exposed to new data. Data scientists can set a threshold below which they are notified that model performance has deteriorated. They can also schedule regular model evaluations, and feedback data can be stored for retraining to help continuously improve model performance.
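The monitoring idea in the second bullet can be sketched in a few lines of Python. This is not the product's API: the `ModelMonitor` class, its `evaluate` method and the toy `model` function are hypothetical names chosen for illustration, and accuracy is assumed as the performance metric.

```python
from dataclasses import dataclass, field

@dataclass
class ModelMonitor:
    """Tracks accuracy on newly labeled data and flags deterioration."""
    threshold: float                               # notify when accuracy falls below this
    feedback: list = field(default_factory=list)   # kept for later retraining
    history: list = field(default_factory=list)    # accuracy of each evaluation

    def evaluate(self, model, records, actuals):
        """Score new records, store feedback, return True if a notification is due."""
        predictions = [model(r) for r in records]
        correct = sum(p == a for p, a in zip(predictions, actuals))
        accuracy = correct / len(records)
        self.history.append(accuracy)
        self.feedback.extend(zip(records, actuals))
        return accuracy < self.threshold

# Hypothetical model: flags any amount above 500 as fraud.
def model(amount):
    return "fraud" if amount > 500 else "non-fraud"

monitor = ModelMonitor(threshold=0.8)

# The first batch matches the model's assumptions; the second batch has drifted.
alert_1 = monitor.evaluate(
    model, [10, 900, 20, 700], ["non-fraud", "fraud", "non-fraud", "fraud"]
)
alert_2 = monitor.evaluate(
    model, [300, 450, 900, 20], ["fraud", "fraud", "fraud", "non-fraud"]
)
```

In this sketch only the second evaluation falls below the threshold, and the records collected in `monitor.feedback` are what a retraining job would consume.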

A data science team could certainly do a great deal of this themselves, but it would demand a significant investment in time and money.  They would be wasting time and effort building software instead of building models that help you transform and innovate.


Model Development Tools for Both Coders and Non-Coders

Machine Learning for z/OS provides IDEs for both coders and non-coders, because data scientists come from a variety of backgrounds.

Some data scientists, especially those with a computer science or mathematical background, need more powerful tools and usually have their favorite libraries for data processing, visualization and more. They prefer to use a programming IDE.

For coders, Machine Learning for z/OS provides Jupyter Notebook. Jupyter Notebook is an open source tool, probably the most popular open source IDE for data scientists. It supports interactive programming, which is critical because the work of data scientists is iterative. It supports multiple languages, such as Scala and Python, as well as tables, charts and graphs for visualization.

Last but not least, it supports import/export so people can share their work.

Others like to use a wizard or canvas to create models by drag and drop, because they don’t have to learn a programming language this way and can quickly create a prototype.

For non-coders, in addition to the Notebook, we also provide the Visual Model Builder. They can simply follow a wizard to process data and create a Spark ML model without any coding. Data scientists tell the tool where the data is, what the features and label are, and which algorithm they want to use, and a model is created. Very simple.

Utilities to Accelerate Every Stage of Machine Learning

Data scientists can also use the libraries packaged in Machine Learning for z/OS to accelerate the most time-consuming work.

For instance, data preparation is painful for most data scientists. They have to spend time fixing data quality problems: they need to fill in any missing values in the attributes that matter, and they need to figure out how to encode or index string data as numeric data, because the algorithms take only numeric input. Machine Learning for z/OS provides a utility, auto data preparation (ADP), that automates this work and saves data scientists’ time.
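To show the kind of manual work this refers to, here is a small plain-Python sketch of the two steps mentioned above: filling missing values and indexing strings. It does not use or reproduce ADP; the function names `fill_missing` and `index_strings`, the mean-imputation strategy and the sample columns are assumptions made for illustration.

```python
from statistics import mean

def fill_missing(values):
    """Replace None in a numeric column with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def index_strings(values):
    """Map each distinct string to an integer index, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values], mapping

# Invented sample columns: one with a missing value, one with string categories.
ages = [34, None, 52, 40]
channels = ["web", "branch", "web", "mobile"]

clean_ages = fill_missing(ages)
channel_ids, channel_map = index_strings(channels)
```

Even in this tiny case, each column needs its own decision (which fill value, which encoding), which is why automating the step saves time across many attributes.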

Machine Learning for z/OS also provides an automated modeling tool, Cognitive Assist for Data Scientists (CADS). CADS helps data scientists find the best-performing model among dozens or hundreds of candidates. It does not simply evaluate every model and pick the best one. Instead, it takes a smarter approach: it evaluates the performance of a model on a small dataset to predict its performance on a larger dataset. In this way, CADS saves much of the time and resources that evaluating hundreds of models in full would require. The same methodology can be applied to hyperparameter tuning; the utility that automates hyperparameter tuning is called HPO. Both come from IBM Research and can significantly help data scientists pick the best model with less time and fewer resources.
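The general idea of using a small dataset to avoid evaluating every candidate in full can be sketched as follows. This is a deliberately simplified stand-in, not the CADS or HPO algorithm: it screens all candidates on a random sample and fully evaluates only the top few. The threshold "models", the synthetic data and the names `select_model`, `sample_size` and `keep` are all assumptions for illustration.

```python
import random

def accuracy(model, rows):
    """Fraction of (amount, label) rows the model classifies correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def make_threshold_model(cutoff):
    """Candidate model: predict 1 (fraud) when the amount exceeds cutoff."""
    return lambda amount: 1 if amount > cutoff else 0

def select_model(candidates, rows, sample_size, keep, seed=0):
    """Screen all candidates on a small sample, then fully evaluate the best few."""
    sample = random.Random(seed).sample(rows, sample_size)
    ranked = sorted(
        candidates,
        key=lambda name: accuracy(candidates[name], sample),
        reverse=True,
    )
    finalists = ranked[:keep]
    # Only the finalists pay the cost of a full-data evaluation.
    scores = {name: accuracy(candidates[name], rows) for name in finalists}
    return max(scores, key=scores.get), scores

# Synthetic data: amounts above 600 are labeled fraud (1).
rng = random.Random(42)
rows = [(a, 1 if a > 600 else 0) for a in (rng.uniform(0, 1000) for _ in range(2000))]

# Nine candidate models with cutoffs 100, 200, ..., 900.
candidates = {f"cutoff_{c}": make_threshold_model(c) for c in range(100, 1000, 100)}

best, scores = select_model(candidates, rows, sample_size=100, keep=3)
```

Here nine candidates are scored on 100 rows each and only three on all 2,000 rows, instead of all nine on the full data. The same screening pattern applies when the candidates are hyperparameter settings of a single algorithm.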

Highly Available Online Scoring Services

An online scoring service makes predictions instantly and is usually called from within a transaction. The performance, availability and scalability of the online scoring service are therefore very important for operationalizing machine learning.

Machine Learning for z/OS supports PMML, Spark ML and scikit-learn models. PMML is an industry standard supported by many vendors and open source tools. Spark ML is the format generated by the Spark machine learning library. Scikit-learn is one of the most popular machine learning libraries in the Python community. Together, these give good coverage of the mainstream machine learning frameworks.

Machine Learning for z/OS has a scoring engine for each model type. PMML and Spark ML models use JVM-based scoring engines deployed in Liberty. Scikit-learn models use a Python-based scoring engine deployed with Flask, a Python web framework, and uWSGI, a Python application server.

Machine Learning for z/OS also supports a JVM-based scoring service deployed in a CICS region. This is a new feature of CICS TS 5.3 and 5.4. It can deliver a significant performance benefit, and it also simplifies the work of COBOL developers making scoring calls.

As with all services on Z, high availability is a basic requirement. The Machine Learning for z/OS online scoring service takes advantage of Liberty’s HA architecture.
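From the calling application's point of view, an online scoring call over REST might look roughly like the sketch below. Everything specific in it is hypothetical: the host, the `/deployments/.../score` path, the payload shape and the bearer-token header are placeholders, not the documented Machine Learning for z/OS API. The request is built but not sent.

```python
import json
import urllib.request

def build_scoring_request(base_url, deployment_id, token, record):
    """Build (but do not send) an HTTP POST asking a scoring service for a prediction.

    The URL layout, payload shape and bearer-token header are hypothetical.
    """
    url = f"{base_url}/deployments/{deployment_id}/score"      # hypothetical path
    body = json.dumps({"records": [record]}).encode("utf-8")   # hypothetical payload
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )

request = build_scoring_request(
    "https://scoring.example.com",   # placeholder host
    "fraud-model-v1",                # placeholder deployment name
    "example-token",                 # placeholder credential
    {"amount": 912.50, "channel": "web"},
)

# Sending it would be: urllib.request.urlopen(request, timeout=2)
# A short timeout matters because the call sits inside a transaction.
```

Because the call happens at the point of interaction, the caller would normally also decide what to do when the service does not answer in time, for example falling back to a default decision.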

In Summary

To recap the benefits of Machine Learning for z/OS: it moves Machine Learning capability to the platform where the most valuable data resides, it integrates real-time predictive analytics with transactions, and it leverages the superior reliability, availability and security of z/OS.

This blog continues here:

Part 4: Operational Decision Manager

Part 5: Modernizing Applications by using APIs
