There are several use cases where data extracted from live data streams such as Twitter may need to be persisted into external databases. In this example, you will learn how to filter incoming live Twitter data and write relevant subsets of it into the IBM DB2 database. The sample program works against all flavors of IBM databases, i.e. DB2 for z/OS, DB2 distributed, dashDB and SQLDB.

We will use Spark Streaming to receive live data streams from Twitter and filter the tweets by a keyword. We will then extract the Twitter user names associated with the matching tweets and insert them into DB2. These user names can have many applications, such as a more comprehensive analysis of whether these Twitter users are account holders of the bank, performed by joining with other tables such as a customer table.

1) For a background on Spark Streaming, refer to

2) We will use the TwitterUtils class provided by Spark Streaming. TwitterUtils uses Twitter4J under the covers, which is a Java library for the Twitter API.

3) Create a table in DB2 called TWITTERUSERS using -


4) Create a new Scala class in Eclipse with contents available at this link. Change database and Twitter credentials to yours (as shown in Step 7).

5) Make sure the Project Build Path contains the jars db2jcc.jar (DB2 JDBC driver), spark-assembly-1.3.1_IBM_1-hadoop2.6.0.jar and spark-examples-1.3.1_IBM_1-hadoop2.6.0.jar, as shown below -


6) Lines 12 to 15 load the DB2 driver class, establish a connection to the database and prepare an INSERT statement that is used to insert Twitter user names into DB2.

7) Lines 17 to 24 set the system properties for consumerKey, consumerSecret, accessToken and accessTokenSecret that will be used by the Twitter4J library to generate OAuth credentials. You do this by configuring a consumer key/secret pair and an access token/secret pair in your account at this link. Detailed instructions on how to generate the two pairs are contained at

8) Lines 26 and 27 create a local StreamingContext with 16 threads and a batch interval of 2 seconds. StreamingContext is the entry point for all streaming functionality.

9) Using the StreamingContext created above, Line 30 creates a DStream object called stream. DStream is the basic abstraction in Spark Streaming and is a continuous stream of RDDs, here containing objects of type twitter4j.Status. A filter is also specified (“Paris”), which selects only those tweets that contain the keyword “Paris”.

10) In Line 31, the map operation on stream maps each status object to its user name to create a new DStream called users.

11) Line 32 returns a new DStream called recentUsers where user names are aggregated over 60 seconds.

12) Lines 34 to 41 iterate over each RDD in the DStream recentUsers to report the number of users every 60 seconds and insert those users into the database table TWITTERUSERS through JDBC.

13) Line 44 starts the real processing and awaits termination.

14) The following screenshot shows a snippet of console output when the program is run. Of course, you can change the filter to any keyword in line 29.


15) You can also run SELECT * from TWITTERUSERS on your database to confirm that the Twitter users get inserted.
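
The filter, map and window steps described in Steps 9 to 12 can be illustrated without Spark. What follows is a plain JavaScript sketch over an in-memory array, not the Scala program the steps refer to. The tweet shape ({ text, user, timestampMs }), the sample data and the function name recentUsers are all hypothetical and chosen only for illustration.

```javascript
// Plain JavaScript illustration of the filter -> map -> window idea.
// This is NOT Spark code; the tweet shape and names are hypothetical.
function recentUsers(tweets, keyword, nowMs, windowMs) {
  return tweets
    // keep only tweets that fall inside the time window ending at nowMs
    .filter(function (t) {
      var age = nowMs - t.timestampMs;
      return age >= 0 && age < windowMs;
    })
    // keep only tweets whose text contains the keyword
    .filter(function (t) {
      return t.text.indexOf(keyword) !== -1;
    })
    // map each remaining tweet to its user name
    .map(function (t) {
      return t.user;
    });
}

var sample = [
  { text: 'Hello from Paris', user: 'alice', timestampMs: 1000 },
  { text: 'Rainy day in London', user: 'bob', timestampMs: 30000 },
  { text: 'Paris in the spring', user: 'carol', timestampMs: 59000 },
  { text: 'Old Paris tweet', user: 'dave', timestampMs: -5000 }
];

var users = recentUsers(sample, 'Paris', 60000, 60000);
console.log(users.length + ' users in the last 60 seconds: ' + users.join(', '));
```

In the streaming program, Spark performs the same three operations continuously on each batch, and the resulting user names are what get inserted into TWITTERUSERS.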

The simple Twitter program above can be extended to more complicated use cases that use Spark Streaming to analyze social media data more effectively, persist subsets of social media data into databases and join social media data with relational data to derive additional business insights.

You can reach us for questions (Pallavi or Param


We are seeing a trend of DB2 data being accessed by modern distributed applications written in new APIs and frameworks. JavaScript has become extremely popular for Web application development. JavaScript adoption was revolutionized by Node.js, which makes it possible to run JavaScript on the server side. There is increasing interest among developers in writing analytics applications in Node.js that need to access DB2 data (both z/OS and distributed). Modern DB2 provides a Node.js driver that makes Node.js connectivity straightforward. Below are step-by-step instructions for a basic end-to-end Node.js application on Windows that accesses data from DB2 for z/OS and DB2 distributed:

1) Install Node and its companion NPM. NPM is a tool to manage Node modules. Download the installer from

2) Note that the DB2 Node.js driver does not yet support Node 4 on Windows. Node 4 support is already available for Mac and Linux, and Node 4 support for Windows will be out very soon.

3) Install a 64-bit version of Node, since the DB2 Node.js driver does not support 32-bit.

4) Run the installer (in my case node-v0.12.7-x64.msi). You should see a screen like Screenshot 1.

Screenshot 1

5) Follow the instructions on license and folder choice until you reach the screen for the features you want installed. The default selection is recommended. Click Next to start the install (Screenshot 2).

Screenshot 2

6) Verify that the installation is complete by opening the command prompt and executing node -v and npm -v as shown in Screenshot 3.

Screenshot 3

7) You can write a simple JavaScript program to test the installation. Create a file called Confirmation.js with contents console.log('You have successfully installed Node and NPM.');

8) Navigate to the folder in which you created the file and run the application using the command node Confirmation.js. The output looks like Screenshot 4.

Screenshot 4

9) Now install the DB2 Node.js driver using the following command from the Windows command line: npm install ibm_db. For Node.js 4+, the installation command is different, as follows:

npm install git+

10) Under the covers, the npm command downloads the node-ibm_db package from GitHub and includes the DB2 ODBC CLI driver to provide connectivity to the DB2 backend. You should see the following output (Screenshot 5).

Screenshot 5

11) Copy the following simple DB2 access program into a file called DB2Test.js and change the database credentials to yours:

var ibmdb = require('ibm_db');

// Open a connection; replace the placeholders with your own values.
ibmdb.open("DRIVER={DB2};DATABASE=<dbname>;HOSTNAME=<myhost>;UID=db2user;PWD=password;PORT=<dbport>;PROTOCOL=TCPIP", function (err, conn) {
  if (err) return console.log(err);

  // Run a trivial query to confirm connectivity.
  conn.query('select 1 from sysibm.sysdummy1', function (err, data) {
    if (err) console.log(err);
    else console.log(data);

    // Close the connection once the result has been printed.
    conn.close(function () {
    });
  });
});

12) Run the following command from Windows command line to execute the program: node DB2Test.js. You should see Screenshot 6, containing the output of SQL SELECT 1 from SYSIBM.SYSDUMMY1. Your simple Node application can now access DB2.

Screenshot 6

13) For connecting to DB2 for z/OS, modify the Connection URL, DB name, port, user name and password to DB2 for z/OS credentials.

14) DB2 for z/OS access needs DB2 Connect license entitlement. In most production DB2 for z/OS systems with DB2 Connect Unlimited Edition licensing, server side license activation would have already been done, in which case you don't need to do anything about licensing. If you get any license error on executing the program, server side activation may not have been done. In that case, copy the DB2 Connect ODBC client side license file into ibm_db/installer/clidriver/license folder.

15) Also make sure that the DB2 for z/OS server you are testing against has CLI packages already bound (this would have been already done as part of DB2 Connect setup on the DB2 z/OS server).

16) Run the program with DB2 for z/OS credentials and you will observe output similar to Step 12.

17) Attached is a Node.js program (NodeDb2zosSelect.js) that fetches rows from the DB2 for z/OS Employee table in the sample database (DSN8A10.EMP). To run the same program with DB2 distributed, make sure to change not only the database credentials but also the table name in the SELECT SQL to EMPLOYEE. In both DB2 for z/OS and DB2 distributed, you should see output as shown in Screenshot 7.

Screenshot 7
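
Since Steps 13 and 17 involve switching the connection details and the table name between DB2 for z/OS and DB2 distributed, it can help to keep both in one place. Below is a minimal sketch of that idea. The helper names (buildConnectionString, buildEmployeeQuery), the sample host and port values, and the use of SELECT * are assumptions made for illustration. They are not taken from the attached NodeDb2zosSelect.js program.

```javascript
// Builds a connection string in the same format used in DB2Test.js.
// The helper name and the sample settings below are hypothetical.
function buildConnectionString(settings) {
  return [
    'DRIVER={DB2}',
    'DATABASE=' + settings.database,
    'HOSTNAME=' + settings.hostname,
    'UID=' + settings.uid,
    'PWD=' + settings.pwd,
    'PORT=' + settings.port,
    'PROTOCOL=TCPIP'
  ].join(';');
}

// Table holding the sample employee rows on each platform (from Step 17).
var employeeTable = {
  zos: 'DSN8A10.EMP',
  distributed: 'EMPLOYEE'
};

// Returns the SELECT statement for the given platform.
// SELECT * is an assumption; the attached program may select specific columns.
function buildEmployeeQuery(platform) {
  var table = employeeTable[platform];
  if (!table) throw new Error('Unknown platform: ' + platform);
  return 'SELECT * FROM ' + table;
}

var connStr = buildConnectionString({
  database: 'SAMPLE',
  hostname: 'db.example.com',
  uid: 'db2user',
  pwd: 'password',
  port: 50000
});

console.log(connStr);
console.log(buildEmployeeQuery('zos'));
console.log(buildEmployeeQuery('distributed'));
```

The resulting string and query can then be passed to ibmdb.open and conn.query in the same way as in DB2Test.js.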

Continue enjoying your Node.js test drive with DB2!

With the advancements in DB2 server clustering technologies (both Sysplex and pureScale) and the DB2 Connect client drivers, it is now possible to achieve near continuous application availability. Availability is no longer only about database server availability; there is more focus on the availability of applications that access the data stored on the server. Having the server up and running is not very meaningful if the applications connected to the server suffer connection failures and downtime. And customers have rightfully come to expect that their local and distributed applications originating from around the world stay up and running 24x7.

DB2 Connect client driver availability technology has grown by leaps and bounds over the last few years. This ensures that applications continue to function during unplanned server outages as well as planned outages such as rolling upgrades, when members of a DB2 cluster are migrated one by one. Client driver availability algorithms complement server availability technologies to ensure the highest availability in a distributed environment. A combination of features around workload balancing and failover has been built into DB2 Connect client drivers. While workload balancing achieves higher throughput by distributing connections and transactions to members with maximum capacity at any point in time, failover reconnects applications to the next best available member in case of a connection failure.

In recent releases of DB2 Connect, default values of properties that determine high availability behavior have been changed to values that work well out of the box for the majority of customers. Given the number of products in the DB2 stack (application server, client driver, DB2) and the myriad high availability knobs at each layer, customers found it difficult to achieve an optimal end-to-end configuration, leading to outages that could have been avoided in the first place. Default value changes also control resource consumption and allow faster recovery from failover. For example, the default value of maxTransportObjects was changed from -1 (unlimited connections) to a more reasonable value of 1000, which prevents runaway applications from creating too many unnecessary connections and depleting system resources.

A number of features were added for easier problem determination in complex high availability setups. For example, new statistics were introduced to better manage socket connections to clustered DB2. More details were added to the errors that are returned to an application going against a cluster, giving the application more information on what kind of failover happened and whether it needs to take further action for a successful transaction. For example, reason codes and detailed message texts were added to the -30108 error, which notifies the application that the transaction did not qualify for a seamless reroute to another DB2 member and that the application should resubmit the transaction for the reroute to occur.
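
The resubmit-on--30108 behavior described above could look roughly like the following in application code. This is only a sketch under stated assumptions: it assumes the driver surfaces the SQL code as a numeric sqlcode property on the error object, and the names runWithResubmit and fakeTransaction are hypothetical. The stand-in transaction simulates one reroute error so that the example runs without a database.

```javascript
// Resubmits a unit of work when the driver reports SQL code -30108.
// Assumption: the error object exposes the SQL code as a numeric `sqlcode`
// property; adapt the check to however your driver reports it.
var REROUTE_SQLCODE = -30108;

function runWithResubmit(transaction, maxAttempts, callback) {
  var attempt = 0;

  function tryOnce() {
    attempt += 1;
    transaction(function (err, result) {
      if (err && err.sqlcode === REROUTE_SQLCODE && attempt < maxAttempts) {
        // The transaction was not rerouted seamlessly; resubmit it.
        return tryOnce();
      }
      // Success, a different error, or attempts exhausted: report back.
      callback(err, result, attempt);
    });
  }

  tryOnce();
}

// Stand-in transaction for illustration only:
// fails once with -30108, then succeeds on the second call.
var calls = 0;
function fakeTransaction(done) {
  calls += 1;
  if (calls === 1) return done({ sqlcode: -30108, message: 'rerouted' });
  done(null, 'committed');
}

var outcome;
runWithResubmit(fakeTransaction, 3, function (err, result, attempts) {
  outcome = { err: err, result: result, attempts: attempts };
  console.log(result + ' after ' + attempts + ' attempts');
});
```

Capping the number of attempts keeps an application from looping indefinitely if every member of the cluster is unavailable. Any error other than -30108 is passed straight to the caller.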

High availability technologies built into the DB2 server and DB2 Connect client drivers complement each other and together have come a long way to satisfy customers' needs for continuous availability of business-critical data and applications. To get the most availability for distributed applications, we recommend that you upgrade DB2 Connect client drivers to the latest available levels.

The application paradigm has evolved rapidly, and new programming and scripting languages geared towards Web and Mobile spring up regularly. DB2 has embraced new developments in the application space and continues to be the database of choice for application developers. It has kept up with new programming trends: whenever a new programming language or framework gets adopted by the developer community, DB2 steps up and adds or enhances support. The selection of a programming language by developers depends upon several factors such as skills, performance, usage scenario and whether libraries are available for the desired functionality.

Primary APIs for DB2 include C and C++, Visual Basic and Visual C# (for .NET applications), and Java (JDBC and SQLJ). DB2's CLI/ODBC and JDBC drivers serve as the base for several open source wrappers provided by DB2, such as Perl, PHP, Python, Ruby and Node.js. The advantage of using the JDBC and ODBC/CLI drivers as the foundation is that they not only implement a standard API, but also provide advanced features such as workload balancing, failover, security, connection management and monitoring that the wrappers can take advantage of to build robust enterprise applications. DB2 also contributes actively to the open source communities to keep them up to date. We are also seeing adoption of frameworks and Object-Relational Mapping tools such as Hibernate, JPA, iBatis and Spring for enterprise applications that take advantage of accelerators provided by DB2 for ease of use and improved performance.

Expect to see the diagram below grow fast over time as application developers experiment with new APIs and DB2 continues down the path of supporting those application developers, further strengthening its position as an ideal database server for Cloud, Analytics and Mobile.

