The Most Popular Machine Learning Algorithms Explained | by Thomas Vato | Feb, 2022 – DataDrivenInvestor


We’ve already talked about artificial intelligence and how much it’s been evolving.

Behind all those developments, there’s a lot of programming and data work. To some that may sound boring, but the fun part is that the most popular algorithms are quite simple to understand.

For concept building…

Complex systems usually have basic rules beneath them to support them.

These are the most popular machine learning algorithms:

Image source

So we will grab four from the top, not only because they are popular but also because they are related to one another: all four belong to supervised machine learning.

One — Linear Regression.

This machine learning algorithm is the easiest one to understand. Also used in statistics, simple linear regression models the relationship between two variables: one is treated as the independent variable and the other as the dependent variable.

Before you do the modeling, the important thing to check is whether there is any relationship between the variables, that is, whether they show some correlation.

Correlation does not imply causation
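To put a number on "some correlation", one common choice is the Pearson correlation coefficient (the article does not name a specific measure, so treat this as one option). Here is a minimal sketch in plain Python. The calorie and weight numbers are made up purely for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (the 1/n factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data: daily calories and weekly weight change in kg.
calories = [1800, 2000, 2200, 2400, 2600, 2800]
weight_change = [-0.5, -0.1, 0.1, 0.3, 0.7, 0.9]

r = pearson_r(calories, weight_change)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship, and a value near 0 suggests little linear relationship. Either way, it says nothing about which variable causes which.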

Patterns and graphs time. For example, you want to understand the weight you’ll gain depending on your intake of calories and the food you eat. You’re given the following data:

Now suppose you want to know how much weight (y-axis) you’ll gain if your intake is 2500 calories (x-axis). First, you’ll draw a scatter plot of the data to visualize it. Then, you’ll fit a straight line that passes as close as possible to all the data points in the graph, like this:

Simple:

Less simple

(With a completely different dataset):

Image source
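The "straight line close to every data point" idea can be sketched with ordinary least squares, the usual way to fit such a line. This is a minimal sketch, not a definitive implementation. The dataset is made up (the article's original data is not available here), and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Read the fitted line at a new x value."""
    return slope * x + intercept

# Made-up illustrative data: daily calories (x) and weekly weight change in kg (y).
calories = [1800, 2000, 2200, 2400, 2600, 2800]
weight_change = [-0.5, -0.1, 0.1, 0.3, 0.7, 0.9]

slope, intercept = fit_line(calories, weight_change)
predicted = predict(2500, slope, intercept)
print(f"Predicted weekly change at 2500 calories: {predicted:.2f} kg")
```

The prediction for 2500 calories is simply the height of the fitted line at x = 2500.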

What can you do with Linear regression?

  • To do predictive analytics
  • To do sales forecasting
  • To discover trends
  • To see how office temperature impacts overall productivity
  • To see how business availability impacts sales
  • To see how food consumption impacts health
  • To see how investment in marketing pays off
  • Etc…

A good thing to keep in mind is that linear regression predicts continuous values and is not designed for classification. That’s why we go to machine learning algorithm number 2.

Two — Logistic regression

This machine learning algorithm is about

Classification, ordering, rating, categorization, grouping.

This is a classification algorithm used for predicting a categorical dependent variable from a given set of independent variables. Depending on how many values the dependent variable can take, there are two types to look at:

Binomial: the dependent variable can take only two possible values. For example: spam or not spam, or a tumor that is malignant or non-malignant.

Good visual representation of binomial classification

Image source

Here tumors are classified as non-dangerous or dangerous depending on a continuous measurement, their size. Or…

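The smaller the tumor, the less likely it is to be dangerous. And the larger the tumor, the more likely it is to be dangerous. Logistic regression turns this common-sense relationship into a probability and then into one of two classes, which is binomial classification as used in machine learning.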
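The tumor example can be sketched as a tiny binomial logistic regression trained with plain gradient descent. This is a minimal sketch under stated assumptions: the sizes and labels are made up, the learning rate and number of epochs are arbitrary choices, and `fit_logistic` and `classify` are hypothetical helper names.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.2, epochs=20000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

def classify(size_cm, w, b, threshold=0.5):
    """Return a label and the underlying probability for one tumor size."""
    p = sigmoid(w * size_cm + b)
    label = "dangerous" if p >= threshold else "non-dangerous"
    return label, p

# Made-up illustrative data: tumor size in cm, label 1 = dangerous, 0 = non-dangerous.
sizes = [1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(sizes, labels)
for size in (1.0, 5.0):
    label, p = classify(size, w, b)
    print(f"size {size} cm -> {label} (p = {p:.2f})")
```

The model itself outputs a probability. The two classes only appear once you pick a threshold, here 0.5.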

Multinomial: the dependent variable can take more than two possible values. For example: small, medium, or large.

A good visual representation would be

Image source

It shows how binomial and multinomial classification differ.
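In the multinomial case, a common approach is to compute one score per class and turn the scores into probabilities with the softmax function. Here is a minimal sketch of that last step only. The scores are hypothetical numbers standing in for what a trained model would produce.

```python
import math

def softmax(scores):
    """Convert a list of raw class scores into probabilities that sum to 1."""
    max_score = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class_names = ["small", "medium", "large"]
scores = [0.5, 2.0, 1.0]  # hypothetical scores for one example

probs = softmax(scores)
predicted = class_names[probs.index(max(probs))]

for name, p in zip(class_names, probs):
    print(f"{name}: {p:.2f}")
print("predicted class:", predicted)
```

With only two classes, this reduces to the single probability of the binomial case.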

What can you do with logistic regression?

  • To solve classification problems
  • To teach systems to detect whether an email is spam or not
  • To group the occurrence of certain words
  • To measure success rate by win/loss ratio
  • To perform sentiment analysis by words
  • To classify people by income, marital status, or postcode…
  • To structure your time
  • Etc.

Three — Decision Tree

This machine learning algorithm is about

Classification, ordering, rating, categorization, grouping.

+

decision making by learning and applying simple decision rules.

The tree can be explained by two entities, namely decision nodes and leaves. A decision node is a step where a question is asked and a decision has to be made. The leaves are the final nodes, which hold the outcomes.

Confused?

A good visual representation:

Image source

On this tree, you will notice that deciding whether to accept a job offer means running through the decision nodes one by one until you reach a leaf. It is pretty much a flow chart for decision-making.
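The flow-chart idea can be sketched as a small hand-written tree. The questions and thresholds below (salary, commute, free coffee) are hypothetical, chosen only to illustrate decision nodes and leaves. A real decision tree algorithm would learn such rules from data instead.

```python
def decide(node, offer):
    """Walk from the root to a leaf, answering one question per decision node."""
    while isinstance(node, dict):  # dicts are decision nodes, strings are leaves
        branch = "yes" if node["test"](offer) else "no"
        node = node[branch]
    return node

# Hypothetical job-offer tree: each decision node has a test and two branches.
job_offer_tree = {
    "test": lambda o: o["salary"] >= 50_000,
    "yes": {
        "test": lambda o: o["commute_minutes"] <= 60,
        "yes": {
            "test": lambda o: o["free_coffee"],
            "yes": "accept offer",
            "no": "decline offer",
        },
        "no": "decline offer",
    },
    "no": "decline offer",
}

offer = {"salary": 62_000, "commute_minutes": 35, "free_coffee": True}
print(decide(job_offer_tree, offer))
```

Each call follows exactly one path through the tree, so the result is easy to trace back to the answers that produced it.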

Decision trees are popular in data analytics and machine learning, with practical applications across sectors from health to finance and technology.

What can you do with decision trees?

  • Act on insights generated by linear or logistic regressions
  • Make strategic decisions
  • Make operational decisions
  • Decide whether to buy or not
  • Decide whether to party or not
  • Decide whether to do or not
  • Decide whether to work or not
  • Etc.

Four — Random forest

This machine learning algorithm is about

Putting one decision tree close to another to make a forest.

+

Taking many conclusions from decision trees to extract the final one.

A good visual representation to understand a random forest. Portrait of this machine learning algorithm:

Image source

It is constructed from a group of decision trees and is used to solve both regression and classification problems.

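The “forest” is trained through bagging, short for bootstrap aggregating, an ensemble meta-algorithm that can improve the accuracy and stability of machine learning models. Each tree is trained on its own random sample of the training data, drawn with replacement.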
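The "bootstrap" half of bagging can be sketched in a few lines: each tree gets its own random sample of the training rows, drawn with replacement. This is a minimal sketch. The row IDs are placeholders for real training examples, and the seed and number of trees are arbitrary.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, so some repeat and some are left out."""
    return [rng.choice(rows) for _ in rows]

rng = random.Random(42)          # fixed seed so the sketch is repeatable
training_rows = list(range(10))  # placeholder row IDs standing in for real data
n_trees = 3                      # arbitrary number of trees for illustration

samples = [bootstrap_sample(training_rows, rng) for _ in range(n_trees)]
for i, sample in enumerate(samples, start=1):
    print(f"tree {i} trains on rows: {sorted(sample)}")
```

Because every tree sees a slightly different dataset, the trees make different mistakes, which is what makes combining them worthwhile.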

This is how it works:

Image source

First you get data to train it. Then each decision tree makes its own mini-decision. After that, the forest makes a final decision as an ‘average’ of all the mini-decisions coming from the decision trees.

The algorithm establishes the outcome from the predictions of the individual decision trees: for regression it takes the average (mean) output of the trees, and for classification it takes the majority vote.
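The final "aggregating" step can be sketched as follows. For regression the tree outputs are averaged, and for classification the usual counterpart is a majority vote. The per-tree predictions are made-up numbers, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean

def aggregate_regression(tree_predictions):
    """Regression forest: average the numeric outputs of the trees."""
    return mean(tree_predictions)

def aggregate_classification(tree_predictions):
    """Classification forest: return the label predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Made-up mini-decisions from individual trees, e.g. for wine quality.
tree_scores = [6.0, 7.0, 6.5, 7.5]
tree_votes = ["good", "bad", "good", "good", "bad"]

final_score = aggregate_regression(tree_scores)
final_label = aggregate_classification(tree_votes)
print("forest score:", final_score)
print("forest label:", final_label)
```

An odd number of trees avoids ties in a two-class vote.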

The more trees, the more stable the outcome generally becomes, although the gains level off and training takes longer.

Compared to a random forest, a single decision tree is easier to build and to interpret, because it relies on one set of decision rules, whereas a random forest combines many decision trees.

Decision trees are faster to build and run easily even on a large dataset. However, if you want a more stable and reliable prediction, a random forest is usually the better option.

The random forest algorithm has been applied across several industries, such as finance, healthcare, and e-commerce.

What can you do with Random Forests?

  • To do predictive analytics
  • To recommend what movies to watch
  • To advise what stock to invest in
  • To suggest what smartphone to buy
  • To predict the quality of wine
  • To forecast the weather
  • To measure potential market response to the launch of a new product
  • Etc.

Thank you for reading till the end. If you liked this piece — please consider sharing or giving some claps so others can find this article too.
