This article walks you through training and deploying a Machine Learning (ML) model with IBM Watson.
Prerequisites
First, you will need an account on Kaggle, one of the most popular websites for Data Science and Machine Learning enthusiasts, where a large variety of datasets is available to experiment with. Kaggle accounts are free. From Kaggle, download the following files in .csv format (they belong to Kaggle’s Titanic competition):
- train.csv: the data on which Watson ML is trained to create the ML model
- test.csv: the data used to assess the performance of the Watson ML model
- gender_submission.csv: a sample submission with a survival value for each passenger in the test data, used here as the reference for measuring Watson’s success
Once the files are saved on your computer, you will need a database. In this walkthrough, we use a Compose PostgreSQL database. After creating the database, sign in to the deployment with your PostgreSQL credentials. In the psql shell, create two tables: one for the gender_submission.csv file and one for the test.csv file.
Name one table “list”; it will hold the passenger entries from the test.csv file, plus an additional column named “rescued”. Name the other table “saved_test”; it will hold the entries from the gender_submission.csv file. Once both tables are populated, we can move on to Watson.
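The table layout described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a local stand-in for the Compose PostgreSQL deployment, the column names (PassengerId, Pclass, Sex, Age, Survived) are assumed file headers, only a few columns are shown, and the sample rows are illustrative rather than taken from the article.

```python
import csv
import io
import sqlite3

# Illustrative sample rows standing in for the downloaded files (assumed headers).
TEST_CSV = "PassengerId,Pclass,Sex,Age\n892,3,male,34.5\n893,3,female,47\n"
SUBMISSION_CSV = "PassengerId,Survived\n892,0\n893,1\n"


def create_tables(conn):
    """Create the "list" and "saved_test" tables."""
    conn.execute(
        "CREATE TABLE list ("
        " passenger_id INTEGER PRIMARY KEY,"
        " pclass INTEGER, sex TEXT, age REAL,"
        " rescued INTEGER)"  # extra column, stays NULL until a prediction is stored
    )
    conn.execute(
        "CREATE TABLE saved_test ("
        " passenger_id INTEGER PRIMARY KEY,"
        " survived INTEGER)"
    )


def load_tables(conn, test_file, submission_file):
    """Copy rows from the two CSV file objects into the two tables."""
    for row in csv.DictReader(test_file):
        age = float(row["Age"]) if row["Age"] else None  # tolerate a blank age
        conn.execute(
            "INSERT INTO list (passenger_id, pclass, sex, age) VALUES (?, ?, ?, ?)",
            (int(row["PassengerId"]), int(row["Pclass"]), row["Sex"], age),
        )
    for row in csv.DictReader(submission_file):
        conn.execute(
            "INSERT INTO saved_test (passenger_id, survived) VALUES (?, ?)",
            (int(row["PassengerId"]), int(row["Survived"])),
        )
    conn.commit()


# In-memory database for the demo; a real setup would connect to PostgreSQL instead.
conn = sqlite3.connect(":memory:")
create_tables(conn)
load_tables(conn, io.StringIO(TEST_CSV), io.StringIO(SUBMISSION_CSV))
print(conn.execute("SELECT passenger_id, rescued FROM list").fetchall())
```

With real files, the io.StringIO objects would be replaced by open("test.csv", newline="") and open("gender_submission.csv", newline="").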
Working with Watson Machine Learning
To work with Watson ML, you need an IBM Bluemix account, which you can create by registering for IBM’s Data Science Experience (DSX). DSX helps you develop ML models with IBM Watson. Unlike Kaggle, the service is not free, but you get a one-month free trial.
After registering, sign in to your account. You will see a web page that lists all the ML work done on DSX. Here, click “Create Project” to start a new project, or use the “Create New” button in the navigation bar at the top of the page.
Next, give your project a title. Since we are dealing with passengers on a ship, let’s name it “Victory” (after the famous British ship).
Next, you will need an instance of Apache Spark. Apache Spark and Watson ML are often used in tandem, as they integrate easily and give software developers flexibility. You will also need an Object Storage instance to save the training data from the files mentioned above to the DSX project. You might wonder why we go through the hassle of working with .csv files. The reason is that Watson ML currently only works properly with .csv files.
Click the button labeled “Create a new IBM Analytics for Apache Spark Instance”. You can then choose either a free plan or a paid one. For now, select the free plan and click “Buy Apache Spark”. You will be asked to name the instance. For this project, we will name it “Victory Spark”.
Next, click on the default instance of “Object Storage (Swift API)”, then click the “create a new instance” option, which takes you to a separate web page. Again, you can choose either a free plan or a paid one. Click “Buy IBM Object Storage” and name the instance “Victory object storage”.
You will then see a web page with details of Notebooks, Data Assets and Bookmarks. This is where you add Watson ML’s files. Under “Models”, there is an “add models” option. Click it to create a Watson ML model, and give the model a name, such as “Victory Model”. Then click “Associate a Machine Learning service instance”, click “Buy IBM Watson Machine Learning”, and name the service “Victory Machine Learning”.
On the web page for ML models, you have two options: pick the ML algorithm yourself, or have one assigned to you automatically. Picking manually lets you experiment with the available algorithms and assess how each performs with Watson.
For this example, click “Manual” and then select “Create”. You will land on another web page with a “select data asset” option. Upload the .csv file here and continue. You will then be asked how the data should be processed, including which columns to use.
Next, you will be shown the available classification types. Choose “Binary Classification”, since the model predicts one of two outcomes for each passenger: survived or not. The predictions can then be compared with the data in the “saved_test” table of our database. You can now add ML algorithms such as Logistic Regression or Random Forest through the “Add Estimators” option. By testing each of these algorithms, you can see which one gives the best result and save it.
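Comparing predictions against the “saved_test” data comes down to a simple accuracy calculation. The sketch below is a minimal illustration in plain Python, not something Watson provides: the function names, the estimator names, and all prediction values are hypothetical and chosen only to show the arithmetic.

```python
def accuracy(predicted, reference):
    """Fraction of passengers whose predicted label matches the reference label.

    Both arguments map a passenger id to a 0/1 survival label.
    Only passengers present in both mappings are compared.
    """
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("no passengers in common")
    hits = sum(1 for pid in shared if predicted[pid] == reference[pid])
    return hits / len(shared)


def best_estimator(predictions_by_estimator, reference):
    """Return the name of the highest-scoring estimator and all scores."""
    scores = {
        name: accuracy(preds, reference)
        for name, preds in predictions_by_estimator.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical reference labels, as they might be read from "saved_test".
reference = {892: 0, 893: 1, 894: 0, 895: 0}

# Hypothetical predictions from two estimators (made-up values).
candidates = {
    "logistic_regression": {892: 0, 893: 1, 894: 0, 895: 1},
    "random_forest": {892: 0, 893: 0, 894: 1, 895: 0},
}

winner, scores = best_estimator(candidates, reference)
print(winner, scores)
```

Accuracy is only one possible measure; the “Evaluation” view in Watson may report other metrics.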
You will now be on the “Overview” page, where you can check the “Summary” and “Input Schema”. At the top, click “Evaluation” to see evaluation data for the model that Watson ML trained. There is also a “Test” option, which lets you try the model on your own data before deployment.
There is also a “Deployments” option. To deploy, click “Deployments” at the top, select “Add Deployment”, and name it “Victory Deployment”. You will then be presented with your deployment’s Application Programming Interface (API). Check “View API specification”, which gives you the details you need for integration.
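Calling the deployment from an application usually means sending an HTTP POST with the passenger data as JSON. The sketch below only builds such a request without sending it. Everything specific in it is an assumption to be checked against “View API specification”: the URL is a placeholder, the bearer-token header is a guess at the authentication scheme, and the "fields"/"values" payload shape and the column names are hypothetical.

```python
import json
import urllib.request


def build_scoring_payload(fields, rows):
    """Pair column names with rows of values (assumed payload shape)."""
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("each row needs one value per field")
    return {"fields": list(fields), "values": [list(row) for row in rows]}


def build_scoring_request(scoring_url, token, payload):
    """Build (but do not send) a JSON POST request for the deployment."""
    return urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # assumed auth scheme
        },
        method="POST",
    )


# Placeholder values; the real URL and token come from your own deployment.
SCORING_URL = "https://example.com/hypothetical/victory-deployment/score"
payload = build_scoring_payload(
    ["Pclass", "Sex", "Age"],          # hypothetical input columns
    [[3, "male", 34.5], [3, "female", 47.0]],
)
request = build_scoring_request(SCORING_URL, "YOUR_TOKEN", payload)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to urllib.request.urlopen once the real endpoint and credentials are in place.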
Final Thoughts
With IBM Watson, ML enthusiasts have a powerful tool whose insights can drive better decisions. Watson is well worth trying if you want to integrate Machine Learning into your application.