Types of Machine Learning Algorithms You Should Know

First, you might ask: What is Machine Learning?

Machine learning is the study of computer algorithms that can improve automatically through experience and by the use of data. It is a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so.

Machine Learning Process

According to Arthur Samuel,
“Machine Learning is a field of study that gives computers the ability to learn without being explicitly programmed.”

According to Tom Mitchell,
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Types of Machine Learning Algorithms

There are so many types of Machine Learning systems that it is useful to classify them into broad categories based on:

  • Whether they are trained with human supervision. Machine Learning systems can be classified according to the amount and type of supervision they get during training. There are four major categories:
      1. Supervised Learning
      2. Unsupervised Learning
      3. Semi-supervised Learning
      4. Reinforcement Learning
  • Whether they can learn incrementally on the fly:
      1. Online Learning
      2. Batch Learning
  • Whether they work by simply comparing new data points to known data points, or instead detect patterns in the training data and build a predictive model, much like scientists do:
      1. Instance-based Learning
      2. Model-based Learning

Let’s go through them one by one with simple explanations:

Supervised Learning

Supervised learning is essentially function approximation. Suppose you are given a dataset where X holds the features and y holds the labels; in simple terms, X is the input and y is the output for that input. You train a model on X and y, selecting the function that maps X to y with the least error. With that function, you can then predict the output for new inputs.

Here we humans act as the teacher: we feed the computer training data containing the inputs, show it the correct answers (outputs), and the computer should learn the patterns from that data.

Regression

A regression model predicts a continuous-valued output. It is used to estimate the relationship between a dependent variable (y, or ‘labels’) and one or more independent variables (x, or ‘features’).

Regression Model
Credits: Wikipedia
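
To make regression concrete, here is a minimal sketch that fits a straight line by ordinary least squares in plain Python. The data is made up, and the names fit_line and predict are chosen only for this illustration; a real project would normally use a library instead.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned function to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: features (x) and continuous labels (y), here y = 2x + 1
features = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]

model = fit_line(features, labels)
print(predict(model, 5.0))  # prediction for an input the model has not seen
```

The learned function is just two numbers, the slope and the intercept, and prediction means plugging a new x into it.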

Classification

A classification model predicts a discrete-valued output: it gives a class as output. For example, in spam classification there are two possible outcomes, spam or non-spam, and the model assigns each input to one of them.

Spam or Ham: Spam Classification
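
As a rough sketch of classification, the code below trains a tiny logistic regression model (one of the algorithms listed in this article) with stochastic gradient descent. The two features, the training data, and the learning-rate and epoch values are all assumptions made up for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights w and bias b so that sigmoid(w.x + b) approximates P(spam)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to the score
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def classify(model, x):
    """Return the predicted class for one feature vector."""
    w, b = model
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return "spam" if p >= 0.5 else "non-spam"


# Hypothetical features per email: [count of promotional words, count of work words]
X = [[3, 0], [2, 0], [4, 1], [0, 2], [0, 3], [1, 4]]
y = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = non-spam

model = train_logistic(X, y)
print(classify(model, [5, 0]))
print(classify(model, [0, 5]))
```

Unlike regression, the output here is one of a fixed set of classes; the probability is only an intermediate step.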

Common Algorithms

  • Support-vector machines (SVM)
  • Linear regression
  • Logistic regression
  • Naive Bayes
  • Linear discriminant analysis
  • Decision trees
  • K-nearest neighbors algorithm
  • Neural networks (Multilayer perceptron)

Unsupervised Learning

An unsupervised model identifies patterns in data. Here the dataset doesn’t have a label (y), so we simply have to find some pattern or structure in the data. For example, the model may divide the data into groups and then, given new data, check which group it falls into.

Here there’s no teacher at all; in fact, the computer might teach you new things after it learns patterns in the data. These algorithms are useful when the human expert doesn’t know what to look for in the data.

Clustering and Association Rule Learning Algorithms

Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups.

Association rule learning checks how one data item depends on another and maps those dependencies so that they can be exploited, for example to make sales more profitable. It tries to find interesting relations or associations among the variables of the dataset.

Clustering: Given many items find cohesive subsets of items.
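
Here is a minimal sketch of K-means clustering in plain Python, to show what "finding cohesive subsets" can look like in practice. The points are made up, and using the first k points as the initial centroids is a simplification assumed for this example; real implementations choose starting centroids more carefully.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, clusters)."""
    centroids = [points[i] for i in range(k)]  # naive initialization (assumption)
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
    return centroids, clusters


# Made-up 2D points forming two visible groups; note that there are no labels
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```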

Association Rule Learning: Given many baskets find which items inside a basket predict another item in the basket.
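
The sketch below computes support and confidence, the two measures that association rule algorithms such as Apriori rely on. It is not an implementation of Apriori itself, and the baskets are made up for illustration.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


# Hypothetical shopping baskets
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

print(support(baskets, {"bread", "butter"}))       # bread and butter together
print(confidence(baskets, {"bread"}, {"butter"}))  # rule: bread -> butter
```

A rule such as "bread predicts butter" is interesting when both its support and its confidence are high enough for the application.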

Common Algorithms

  • K-means clustering
  • Hierarchical clustering
  • Anomaly detection
  • Neural Networks
  • Principal Component Analysis
  • Independent Component Analysis
  • Apriori algorithm
  • Singular value decomposition

Semi-Supervised Learning

In Supervised Learning, we use a labeled dataset, and in Unsupervised Learning, we use an unlabeled dataset. Semi-Supervised Learning lies between these two. Some algorithms can deal with partially labeled training data, usually a lot of unlabeled data and a bit of labeled data.

Here are some real-world applications of Semi-Supervised Learning:

  • Speech Analysis
  • Photo-Hosting Service (Like Google Photos)
  • Web Content Classification
  • Text Document Classification

For example, Deep Belief Networks (DBNs) are based on unsupervised components called Restricted Boltzmann Machines (RBMs) stacked on top of one another. RBMs are trained sequentially in an unsupervised manner, and then the entire system is fine-tuned using supervised learning techniques.
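
A DBN is too large to show here, so the sketch below illustrates the general idea with a much simpler technique, self-training, which is swapped in for illustration and is not what DBNs do. A tiny model (one centroid per class) is built from a few labeled points, then repeatedly labels the unlabeled point it is most confident about and retrains. The data and function names are made up.

```python
def class_centroids(labeled):
    """The 'model': the mean x value of each class."""
    groups = {}
    for x, label in labeled:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def self_train(labeled, unlabeled):
    """Grow the labeled set by pseudo-labeling one unlabeled point at a time."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        centroids = class_centroids(labeled)
        # Most confident guess: the unlabeled point closest to any class centroid
        x, label = min(
            ((x, label) for x in remaining for label in centroids),
            key=lambda pair: abs(pair[0] - centroids[pair[1]]),
        )
        labeled.append((x, label))
        remaining.remove(x)
    return labeled


# A bit of labeled data and more unlabeled data (all values are made up)
labeled = [(1.0, "A"), (10.0, "B")]
unlabeled = [2.0, 3.0, 8.0, 9.0, 4.0]

result = self_train(labeled, unlabeled)
print(dict(result))
```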

Reinforcement Learning

This is a very cool branch of Machine Learning, and of AI more broadly. The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties, which are negative rewards). It must then learn by itself what the best strategy is, called a policy, to get the most reward. A policy defines what action the agent should choose in a given situation.

Scenario of Reinforcement Learning

Consider a robot to understand Reinforcement Learning. The robot is the agent, acting in an environment with two states: the Fire Side and the Water Side. If the robot goes to the Fire Side, it is punished; if it goes to the Water Side, it is rewarded. Once the robot has gone to the Fire Side, it learns the policy that the Fire Side is bad and won’t go there again.

Robot implementation with Reinforcement Learning
  1. Observe
  2. Select action policy
  3. Perform Action
  4. Get Reward or Penalty
  5. Update Policy (Learning)
  6. Iterate until an optimal policy is found
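
The loop above can be sketched for the fire/water robot with a stripped-down form of Q-learning. This is a simplification under stated assumptions: there is a single starting situation, each episode ends after one move (so there is no future-reward term), and the reward values, learning rate alpha, and exploration rate epsilon are made up.

```python
import random

# Assumed rewards: going to the Fire Side is punished, the Water Side is rewarded
REWARDS = {"fire": -1.0, "water": 1.0}


def train(episodes=200, alpha=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {action: 0.0 for action in REWARDS}  # estimated value of each action
    for _ in range(episodes):
        # Steps 1-2: observe (only one situation here) and select an action.
        # Mostly follow the current policy, sometimes explore at random.
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))
        else:
            action = max(q, key=q.get)
        # Steps 3-4: perform the action and get a reward or penalty
        reward = REWARDS[action]
        # Step 5: update the estimate toward the observed reward
        q[action] += alpha * (reward - q[action])
    return q


q = train()
policy = max(q, key=q.get)  # the learned policy: pick the highest-valued action
print(q, policy)
```

After training, the value estimate for "fire" is never positive and the one for "water" is positive, so the learned policy avoids the Fire Side.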

Google DeepMind built AlphaGo, a program that plays the game of Go, using Reinforcement Learning. It is one of the best examples of the power of Reinforcement Learning: in May 2017 it beat Ke Jie, the world’s top-ranked Go player at the time. It learned its winning policy by analyzing millions of games and then playing many games against itself. After that, it was just applying the policy it had learned.

Common Algorithms

  • Q-Learning
  • Markov Decision Process
  • Temporal Difference

Online Learning

In online learning, you train the system incrementally by feeding it data instances sequentially, either individually or by small groups called mini-batches. Each learning step is fast and cheap, so the system can learn about new data on the fly as it arrives. Here, data becomes available in a sequential order and is used to update the best predictor for future data at each step.

Online Learning

Online learning algorithms can also train systems on huge datasets that cannot fit in one machine’s main memory; this is called out-of-core learning. The algorithm loads part of the data, runs a training step on that data, and repeats the process until it has run on all the data.

Online Learning with huge dataset
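
A minimal sketch of this incremental style of training: a linear model that is updated one mini-batch at a time and never sees the whole dataset at once. The data stream is simulated with a generator, and the names stream_batches, OnlineLinearModel, and partial_fit are hypothetical, chosen only for this example.

```python
def stream_batches(n_batches, batch_size=4):
    """Simulated data stream: yields mini-batches of (x, y) pairs with y = 3x + 2."""
    x = 0.0
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            batch.append((x, 3.0 * x + 2.0))
            x = (x + 0.37) % 2.0  # keep x inside [0, 2)
        yield batch


class OnlineLinearModel:
    def __init__(self, lr=0.1):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return self.w * x + self.b

    def partial_fit(self, batch):
        """One gradient step on the squared error of this mini-batch only."""
        errors = [(self.predict(x) - y, x) for x, y in batch]
        grad_w = sum(err * x for err, x in errors) / len(batch)
        grad_b = sum(err for err, _ in errors) / len(batch)
        self.w -= self.lr * grad_w
        self.b -= self.lr * grad_b


model = OnlineLinearModel()
for batch in stream_batches(n_batches=2000):
    model.partial_fit(batch)  # the batch can be discarded after this step

print(model.predict(1.0))  # should be close to 3 * 1 + 2 = 5
```

Because each step only touches one small batch, memory use stays constant no matter how much data flows through the stream.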

Batch Learning

In Batch Learning, the system is incapable of learning incrementally. It must be trained using all the data. This will take a lot of time and computing resources, so it is typically done offline. First, the system is trained, and then it is launched into production and runs without learning anymore; it just applies what it has learned. This is called Offline learning.

If you want a batch learning system to know about new data (such as a new type of spam), you need to train a new version of the system from scratch on the full dataset (not just the new data, but also the old data), then stop the old system and replace it with the new one.

Also, training on the full set of data requires a lot of computing resources (CPU, memory space, disk space, disk I/O, network I/O, etc.). If you have a lot of data and you automate your system to train from scratch every day, it will end up costing you a lot of money. If the amount of data is huge, it may even be impossible to use a batch learning algorithm. In that case, the better option is to use an algorithm that can learn incrementally.

Instance-Based Learning

Instance-Based Learning (also called Memory-Based Learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Because computation is postponed until a new instance is observed, these algorithms are sometimes referred to as lazy.

For example, instead of just flagging emails that are identical to known spam emails, your spam filter could be programmed to also flag emails that are very similar to known spam emails. This requires a measure of similarity between two emails. A very basic similarity measure could be the number of words the two emails have in common. The system would flag an email as spam if it has many words in common with a known spam email.
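
The word-count similarity described above can be sketched as follows. The stored spam emails and the threshold of three shared words are assumptions made up for this example.

```python
# The "training" step is just storing known spam emails in memory
KNOWN_SPAM = [
    "win a free prize now",
    "cheap loans apply now",
]


def shared_words(email_a, email_b):
    """Similarity measure: number of distinct words the two emails have in common."""
    return len(set(email_a.lower().split()) & set(email_b.lower().split()))


def is_spam(email, known_spam=KNOWN_SPAM, threshold=3):
    """Flag the email if it is similar enough to any stored spam email."""
    return any(shared_words(email, spam) >= threshold for spam in known_spam)


print(is_spam("Claim your free prize now"))    # shares "free", "prize", "now"
print(is_spam("Lunch meeting moved to noon"))  # shares no words with known spam
```

All the work happens at prediction time, when the new email is compared with the stored ones, which is why such methods are called lazy.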

Instance-Based Learning

The system learns the examples by heart, then generalizes to new cases by comparing them to the learned examples using a similarity measure. For example, a new instance would be classified as a triangle if the majority of its most similar instances belong to that class.

Model-Based Learning

Another way to generalize from a set of examples is to build a model of these examples, then use that model to make predictions. This is called Model-Based Learning.

Model-Based Learning

Thanks for reading it!

Follow for more on Medium, I’ll share more Machine Learning stuff soon. Here’s my Twitter, follow and connect with me there and feel free to DM.

Machine Learning Enthusiastic | Computer Science Student

Kishan Modasiya
