Getting started with the advanced ML (and Azure ML Studio) | Derry, NI, Ireland

Getting started with the advanced ML (and Azure ML Studio)

Technologies are constantly evolving and giving us more options. As a computer science expert, you might be skeptical about trends and brands, but not about efficiency. Efficiency is the one thing you should look for in new tools. Nobody would claim that you can’t develop the front end of your website in Notepad. But would it be fast? Would it be convenient? Guess the answer.

The front-end example can be extrapolated to every subfield of computer science, and Machine Learning (ML) is no exception. Depending on the language you use for scripting, you may not want to leave the IDE you feel comfortable with. But let me explain why the benefits lie outside your comfort zone.

First, answer a simple question: what are the basic requirements for ML? Loosely speaking, it comes down to data and computing power. What is the trickiest thing about large data sets? Data consists of numbers and figures, while the human brain processes visual objects best. The solution is visualization.

Microsoft Azure is known first and foremost for cloud storage. However, with the growing popularity of Data Science and ML, the situation has changed. So now you have one more significant reason to get an Azure subscription: Azure Machine Learning Studio.

What can I do with Azure ML Studio?

  1. get and use any virtual machines you want on Pay-As-You-Go terms
  2. create and manipulate numerous SQL databases
  3. leverage data visualization and get by without coding (if you don’t want to create your own ML algorithms)
  4. create your own ML algorithms and sell them on the Azure Marketplace
  5. take advantage of the numerous sample projects and datasets

Now let’s look at Azure ML Studio in action.

 

Advancing into ML with Azure ML Studio

Let’s start by creating an Azure subscription. Don’t worry, it’s free. Later, you may want more powerful virtual machines than the ones included in the basic subscription, and then you will have to pay for what you use (Pay-As-You-Go). But don’t worry about that for now.

Go to Azure, sign up with your Outlook account (yes, you need a Microsoft account to work with Microsoft services like Azure), and create a new subscription for free.

Once the Azure subscription is created, you can go directly to Azure ML Studio and sign in with your Azure account. It’s as easy as that.

Open the studio, and you will see your future workplace. On the left-hand side, you will see Projects, Experiments, Web Services, Notebooks, Datasets, Trained Models, and Settings. We will start by creating a new Project.

Name it whatever you want and write a few words as a description. Now we need to attach an experiment to the project we have created. The Experiments tab is where all your data manipulations are going to happen. So press “+ NEW” in the bottom left-hand corner.

You will see an empty tree-like diagram that contains only placeholders for your future operations. This diagram tells you a lot about how ML Studio works: by dragging items from the left panel to the right, you add new operations. Take a look at the left side. You can browse the available items using the search box. However, everything starts with adding data. Drag “Import Data” to your experiment.

For teaching purposes, we will import open data from the UCI Machine Learning Repository:

http://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data

This file contains information about different autos. If you take a look at the data, you will notice that each line begins with a number from -3 to 3. These numbers represent the risk level of each auto on the list. In this experiment, we will try to predict the risk level by creating, training, and evaluating a model from the features we have. If you click on the “Import Data” block, you will see its properties on the right-hand side. Copy the URL of the source and paste it into the “Data source URL” field.
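For readers who prefer to peek at the file outside the Studio, here is a minimal Python sketch of the same first look. It assumes the file is plain comma-separated text with no header row and 26 fields per line. The three sample rows are inlined in that layout only so the sketch runs offline, and their values should be treated as illustrative. The helper names parse_rows and symboling_counts are made up for this example.

```python
import csv
import io
from collections import Counter

# Illustrative rows in the layout of imports-85.data: no header, 26
# comma-separated fields, "?" marks a missing value. Demonstration only.
SAMPLE = """\
3,?,alfa-romero,gas,std,two,convertible,rwd,front,88.60,168.80,64.10,48.80,2548,dohc,four,130,mpfi,3.47,2.68,9.00,111,5000,21,27,13495
1,?,alfa-romero,gas,std,two,hatchback,rwd,front,94.50,171.20,65.50,52.40,2823,ohcv,six,152,mpfi,2.68,3.47,9.00,154,5000,19,26,16500
2,164,audi,gas,std,four,sedan,fwd,front,99.80,176.60,66.20,54.30,2337,ohc,four,109,mpfi,3.19,3.40,10.00,102,5500,24,30,13950
"""


def parse_rows(text):
    """Split raw file text into lists of string fields, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def symboling_counts(rows):
    """Count how often each risk level (the first field, -3..3) occurs."""
    return Counter(int(row[0]) for row in rows)


rows = parse_rows(SAMPLE)
print(len(rows[0]))            # number of fields per row
print(symboling_counts(rows))  # risk level -> number of autos
```

To try it on the real file, download the text from the URL above (for example with urllib.request.urlopen from the standard library) and pass the decoded string to parse_rows.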

Right-click the “Import Data” block and choose Results dataset => Visualize. This is how you will visualize your data sets all along the way.

When your data set is loaded, you can see not just a data table but also statistics for each column. Now let’s get back to the experiment. Next, we need to label some of our columns. Go to the left side panel and search for “Edit Metadata”. Then drag it into the experiment and place it below the “Import Data” block. To connect the blocks, press the left mouse button on the dot of the “Import Data” block and drag it to the “Edit Metadata” block.

Go to the properties of the “Edit Metadata” block. To label the columns, press “Launch column selector” and pick the following columns: Col1,Col3,Col8,Col12,Col21,Col22,Col26.

Now move down to the “New column names” field and enter the following: symboling,make,drive-wheels,width,compression ratio,horsepower,price.

To apply the labels, right-click “Edit Metadata” and choose Run Selected. The current block and the previous one should become highlighted. When the run is over (green check marks should appear in the upper right corners of the blocks), visualize the data. You will see the columns with their new labels.

Let’s select the columns that we need. Search for “Select Columns in Dataset” and drag it into the experiment. Then, create a connection from the “Edit Metadata” block. Click on the “Select Columns in Dataset” block, launch the column selector, and choose the labeled columns.
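Taken together, the “Edit Metadata” and “Select Columns in Dataset” steps boil down to a mapping from column positions to names followed by a projection. Here is a minimal Python sketch of that idea, assuming each row is a list of 26 string fields as in the source file. COLUMN_NAMES and select_named_columns are names invented for this illustration, and the sample row is illustrative as well.

```python
# Position (1-based, as in Col1..Col26) -> name entered in "New column names".
COLUMN_NAMES = {
    1: "symboling",
    3: "make",
    8: "drive-wheels",
    12: "width",
    21: "compression ratio",
    22: "horsepower",
    26: "price",
}


def select_named_columns(row):
    """Keep only the labeled columns of one row, keyed by their new names."""
    return {name: row[pos - 1] for pos, name in COLUMN_NAMES.items()}


# One illustrative row in the 26-field layout of the source file.
row = (
    "3,?,alfa-romero,gas,std,two,convertible,rwd,front,88.60,168.80,64.10,"
    "48.80,2548,dohc,four,130,mpfi,3.47,2.68,9.00,111,5000,21,27,13495"
).split(",")

record = select_named_columns(row)
print(record)
```

The positions are converted from the Studio's 1-based ColN labels to Python's 0-based indexes inside the helper, so the mapping can be read side by side with the column selector.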

We need to clean missing data in order to prevent false results. Again, search for the “Clean Missing Data” item and drag it to the experiment. Connect it to the previous block and edit the properties. Here we need to select columns by type “Numeric”. The minimum missing value ratio should be 0 and the maximum missing value ratio should be 1. Set “Cleaning mode” to “Remove entire row”.

Run it and visualize the data. There should be no missing values left in the selected columns.
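In code, the “Remove entire row” mode is a simple filter. The sketch below is a minimal Python version under two assumptions that are not stated in the walkthrough: missing values appear as "?" (as they do in the raw UCI file) or as empty strings, and rows are dictionaries keyed by the column names chosen earlier. The records and the helper name drop_incomplete_rows are illustrative.

```python
MISSING_MARKERS = {"?", ""}  # assumption: how a missing value is encoded


def drop_incomplete_rows(records, columns):
    """Return only the records that have a value in every listed column."""
    return [
        rec for rec in records
        if all(rec[col].strip() not in MISSING_MARKERS for col in columns)
    ]


# Illustrative records; the second one has no price.
records = [
    {"symboling": "3", "horsepower": "111", "price": "13495"},
    {"symboling": "0", "horsepower": "160", "price": "?"},
    {"symboling": "2", "horsepower": "102", "price": "13950"},
]

clean = drop_incomplete_rows(records, ["horsepower", "price"])
print(len(records), "->", len(clean))  # prints "3 -> 2"
```

Dropping whole rows is the bluntest option: it keeps the remaining data untouched, at the cost of throwing away every other value in the affected rows.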

As you may have noticed, every time you press “Run selected”, not only the new block but also all the previous ones are executed. However, there’s something you can do about it. Click on the “Clean Missing Data” block. On the bottom panel, you will see the “Save as” option.

Press it, and you will get a new saved dataset containing the data you’ve altered. You will be able to access it anytime, not only from this project.

On the left, open “My Datasets” and find the one that has just been created.

Drag it to your experiment. You can visualize it and make sure it’s the same data you got after labeling and cleaning.

Now that we have performed all the necessary data preparation steps, let me remind you of our mission. We are going to build a model that will help us predict the risk level of the autos that we may want to test later. For this, you will need to add 4 more blocks: Multiclass Decision Forest, Train Model, Score Model, and Evaluate Model. This is what it should look like:

In the “Train Model” block, launch the column selector and pick “symboling”.

Make sure that your data set is connected to both the “Train Model” and the “Score Model” blocks. Click on the “Evaluate Model” block and choose Run selected. Then, visualize the output. You will get the Metrics and the Confusion Matrix. Now you can see how accurate our model is.
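To make the “Evaluate Model” output less of a black box, here is a minimal Python sketch of how overall accuracy and a confusion matrix are derived from actual and predicted labels. The two label lists are invented for illustration and are not results from this experiment; the function names are hypothetical too.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs; the diagonal holds the hits."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Share of rows where the predicted class equals the actual class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


# Made-up symboling values, for demonstration only.
actual = [3, 1, 2, 0, 0, -1, 2, 1]
predicted = [3, 1, 2, 0, 1, -1, 2, 0]

matrix = confusion_matrix(actual, predicted)
labels = sorted(set(actual) | set(predicted))

print("accuracy:", accuracy(actual, predicted))
for a in labels:  # rows: actual class, columns: predicted class
    print(a, [matrix[(a, p)] for p in labels])
```

Each off-diagonal cell shows which risk levels the model confuses with each other, which is often more informative than the single accuracy figure.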

 

You can retrieve this data with an R or Python script (first you will need to add an “Execute R Script” or “Execute Python Script” item from the left side panel).

Together we have just created and trained a model with high predictive accuracy without any scripting. I recommend taking a look at the sample projects to learn ML best practices.

Conclusion

Microsoft CEO Satya Nadella says the following about smart Azure services: “We are building out our infrastructure to … empower every developer to be able to infuse intelligence into everything that they are doing.”

But however useful Azure ML Studio is, it won’t make you a proficient data scientist on its own. You need to understand which algorithm to use in which case, and why. However, if you do want to put down roots in the ML field, Azure services can help you get from the beginner to the advanced level significantly faster than you could on your own.

Dee-Technology Specialist
