Machine Learning Cookbook

Author

Georgina Dangerfield

Welcome!

This ebook contains all the material covered in the Machine Learning module: regression, classification, and time series. Its primary aim is to keep all the code in one place so that you can find what you are looking for quickly. It also gives extra guidance on the process behind each machine learning model and answers some questions you might have.

Hints and tips

Moving from the training data to the actual data you use at work might be challenging, but don’t panic. If you encounter an error, try to work out what it is telling you and spend some time trying to fix it. Google is your friend - someone out there is bound to have had the same issue as you! Use the steps below to help you solve any errors you might get.

  1. Start by carefully reading the error message. Python error messages are designed to be helpful and often point you directly to the issue, including the line number and a description of the error.

  2. It’s easy to overlook the simple things. Ensure that:

  • Your data is correctly loaded and formatted.
  • You haven’t missed any essential imports.
  • All variables and functions are named correctly.

  3. Insert print statements before the error line to check the values of variables and make sure they’re what you expect.

  4. The official documentation for libraries like pandas, scikit-learn, or matplotlib is incredibly detailed and often has examples similar to what you’re trying to achieve. Stack Overflow, GitHub, and other coding forums are invaluable for specific issues.

  5. If the error isn’t obvious, try to isolate it. Comment out sections of your code and run it piece by piece. This method can help you pinpoint exactly where things are going awry.

  6. In machine learning, many errors stem from the data itself:

  • Missing or infinite values
  • Incorrect data types
  • Mismatched array shapes

Different machine learning models have their own specific requirements and quirks. Ensure that your data meets these requirements, like the shape of the input array, categorical data encoding, or feature scaling.

  7. If you’ve been at it for a while and the solution isn’t coming to you, take a short break. A little distance can sometimes help you see the problem in a new light.

  8. Don’t be afraid to ask for help from colleagues or your trainer/mentor. Explain what you’ve tried so far and share the specific error message or issue.
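
The data checks mentioned in the steps above (missing or infinite values, incorrect data types, mismatched array shapes) can be sketched in a few lines of pandas and NumPy. This is only a minimal illustration, not part of the module's code: the helper name `check_data`, the column names, and the example values are all made up, and your own data will probably need different checks.

```python
import numpy as np
import pandas as pd


def check_data(X, y):
    """Summarise common data problems before fitting a model.

    X: pandas DataFrame of features; y: target values (Series, list or array).
    Hypothetical helper for illustration only.
    """
    numeric = X.select_dtypes(include="number")
    return {
        # NaN/None cells anywhere in the feature table
        "missing": int(X.isna().sum().sum()),
        # +inf/-inf can only occur in numeric columns
        "infinite": int(np.isinf(numeric.to_numpy(dtype=float)).sum()),
        # columns that usually need encoding before a model will accept them
        "non_numeric_columns": [c for c in X.columns if c not in numeric.columns],
        # features and target must have the same number of rows
        "rows_match": len(X) == len(y),
    }


# Made-up example data containing one problem of each kind.
X = pd.DataFrame({
    "age": [25.0, np.nan, 40.0],
    "income": [30000.0, np.inf, 52000.0],
    "city": ["Leeds", "York", "Bath"],
})
y = pd.Series([0, 1])  # deliberately one row short

# Print statements placed before the failing line show what the model would receive.
print(X.shape, y.shape)
print(X.dtypes)

report = check_data(X, y)
for name, value in report.items():
    print(f"{name}: {value}")
```

Running a check like this just before the line that raises the error often shows which of the three data problems you are dealing with, which makes the error message much easier to interpret.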

Disclaimer

This book was put together quickly so that it would be available to you as soon as possible, so I may have missed some typos or made a mistake somewhere. If you spot any errors, think of more questions that should be in the FAQ to benefit other apprentices, or have general feedback about how this ebook could be improved, please send me an email.