1  Introduction to Machine Learning

1.1 What is the difference between traditional programming and machine learning?

Suppose we have data on a set of houses, where one column holds the surface area of each house. In traditional programming, we would write a formula or program by hand and execute it on that data to compute the price of each house.

Machine learning goes about it in a different way. We still input the data, so the computer still has access to it. However, this time we don’t know any formula; the only other thing we know is the price of each house. From the data and the known prices, the computer works out the program that maps the input to the output.

Traditional Programming vs Machine Learning.
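
The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rate `PRICE_PER_M2`, the areas and the prices are made-up numbers, and a least-squares straight line stands in for "the program" that machine learning recovers.

```python
# Traditional programming: a person writes the rule by hand.
PRICE_PER_M2 = 3000.0  # hypothetical rate, chosen only for illustration


def price_from_rule(area_m2):
    return PRICE_PER_M2 * area_m2


# Machine learning: the rule is estimated from (area, price) examples.
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


areas = [50.0, 80.0, 100.0, 120.0]                 # made-up surface areas (m^2)
prices = [150000.0, 240000.0, 300000.0, 360000.0]  # made-up known prices

slope, intercept = fit_line(areas, prices)


def price_from_model(area_m2):
    """The learned 'program': same job as price_from_rule, but fitted from data."""
    return slope * area_m2 + intercept


print(price_from_rule(90.0), price_from_model(90.0))
```

Because the made-up prices follow the hand-written rule exactly, the fitted line should agree with it; with real data the learned rule would only approximate the relationship.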

1.2 What is machine learning?

Machine learning is a field devoted to building methods/algorithms that learn from data and return a program/model that automates some task and improves its performance on that task.

Machine Learning can be broadly categorised into three types based on the nature of the learning and the data available:

Supervised Learning:

  • This is the most commonly used type of machine learning in practice.
  • In supervised learning, you have an input variable (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output.
  • The goal is to approximate the mapping function so well that when we have new input data (X), we can predict the output variables (Y) for that data.
  • Examples include regression and classification.
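
As a minimal sketch of learning a mapping from X to Y, the snippet below fits a one-threshold rule to labelled pairs and then applies it to new inputs. The data, the function names `fit_threshold` and `predict`, and the choice of a single cut-off as the model are all assumptions made for illustration; real supervised learners are usually far richer.

```python
def fit_threshold(xs, ys):
    """Choose the cut-off t for the rule 'predict 1 when x > t'
    that makes the fewest mistakes on the labelled examples."""
    points = sorted(set(xs))
    # Candidate cut-offs: midpoints between neighbouring distinct inputs.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)


# Made-up labelled training data: input X with known output Y.
X_train = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
Y_train = [0, 0, 0, 1, 1, 1]

threshold = fit_threshold(X_train, Y_train)


def predict(x):
    """Apply the learned mapping to a new, unseen input."""
    return int(x > threshold)


print(threshold, predict(2.5), predict(9.0))
```

The key point is that `predict` is never told the rule; it only uses the threshold recovered from the (X, Y) pairs.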

Unsupervised Learning:

  • Unlike supervised learning, in unsupervised learning, you only have input data (X) and no corresponding output variables.
  • The goal for unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data.
  • Examples include clustering and association rule learning.
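
Clustering can be illustrated with a small two-cluster k-means on plain numbers. This is a sketch under assumptions: the values are made up, the function name `kmeans_1d` is hypothetical, and the centroids are started at the minimum and maximum simply to keep the run deterministic.

```python
def kmeans_1d(values, iterations=10):
    """Two-cluster k-means on plain numbers; no output labels are given."""
    centroids = [min(values), max(values)]  # simple deterministic start
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min((0, 1), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


X = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # made-up input data, no Y column
labels, centroids = kmeans_1d(X)
print(labels, centroids)
```

The algorithm only ever sees X; the two groups it returns are structure it found in the data, not categories that were supplied in advance.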

Reinforcement Learning:

  • Distinct from supervised and unsupervised learning.
  • In reinforcement learning, an agent makes observations and takes actions within an environment, and in return, it receives rewards.
  • The objective is to find a strategy, known as a policy, that will result in the maximum cumulative reward for the agent over time.
  • Instead of having clear input-output pairs like supervised learning, it works on a feedback loop where the agent continuously adjusts its actions based on the rewards it receives.
  • Examples include game playing, robot navigation, and real-time decisions in various fields.
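
The agent, environment, reward and policy described above can be sketched with tabular Q-learning on a toy corridor. Everything here is an assumption for illustration: the five-cell corridor, the reward of 1 at the right-hand end, the hyperparameter values, and the choice of Q-learning itself as one of several possible reinforcement learning methods.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # illustrative hyperparameters


def step(state, action):
    """Environment: move, stay inside the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes (and on ties), otherwise exploit.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move towards reward + discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy for every non-goal cell, read off the learned values.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

No cell is ever labelled with the "correct" action; the policy emerges purely from the feedback loop of acting, receiving rewards, and adjusting the value estimates.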

The different types of machine learning.

1.3 Classification and Regression

The main difference between classification and regression lies in the output being predicted: classification predicts a category (a categorical output), whereas regression predicts a number (a numerical output).

For regression, we are looking to predict an output value that is a real number. For example, if predicting what the temperature will be tomorrow, we could use regression as the output value is numerical.

For classification, we are looking to predict an output value that belongs to a category. In the same temperature example, instead of predicting the exact temperature, we would predict whether it will be hot or cold tomorrow.
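
The temperature example can be sketched with one method used in both modes. The snippet assumes a made-up history of temperatures and uses k-nearest neighbours purely as an illustrative choice: averaging the neighbours gives a regression output, while a majority vote over their labels gives a classification output.

```python
from collections import Counter

# Made-up history: today's temperature (°C) and what followed the next day.
today = [5.0, 8.0, 12.0, 24.0, 27.0, 30.0]
tomorrow_temp = [6.0, 7.0, 13.0, 25.0, 26.0, 31.0]              # numerical target
tomorrow_label = ["cold", "cold", "cold", "hot", "hot", "hot"]  # categorical target


def nearest_indices(x, xs, k=3):
    """Indices of the k past days whose temperature is closest to x."""
    return sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]


def knn_regress(x, xs, ys, k=3):
    """Regression: average the neighbours' numeric outputs."""
    idx = nearest_indices(x, xs, k)
    return sum(ys[i] for i in idx) / k


def knn_classify(x, xs, labels, k=3):
    """Classification: majority vote over the neighbours' categories."""
    idx = nearest_indices(x, xs, k)
    return Counter(labels[i] for i in idx).most_common(1)[0][0]


print(knn_regress(26.0, today, tomorrow_temp))    # a real number
print(knn_classify(26.0, today, tomorrow_label))  # "hot" or "cold"
```

The inputs are identical in both calls; only the type of the target, and therefore the type of the prediction, changes.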

Classification vs Regression

1.4 Machine Learning model design workflow in the Data Analytics lifecycle

Model Design Workflow.