Anomaly Detection Using Isolation Forest in Python

In this article we'll cover:

You can run the code for this tutorial for free on the ML Showcase.

This is a companion discussion topic for the original entry at

Hi! Interesting article, it opened my mind. But I am still wondering: if Isolation Forest is an unsupervised learning method, why do we need to train it?

Hi, unsupervised algorithms also require training. It's just that they don't have a target label; training means fitting the model to the structure of the input data itself.
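To make that concrete, here is a minimal sketch using scikit-learn's `IsolationForest`. The salary values, the `contamination` setting, and the variable names are made up for illustration and are not taken from the article. The point is only that `fit()` gets the features and no labels.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical salary data: mostly typical values plus two extremes at the end.
rng = np.random.default_rng(0)
salaries = np.concatenate([rng.normal(50_000, 5_000, 200), [5_000, 250_000]])
X = salaries.reshape(-1, 1)  # scikit-learn expects a 2-D feature array

# "Training" here is fit(X) only: there is no y / target label argument.
# contamination=0.01 is an assumed value, not a recommendation.
model = IsolationForest(contamination=0.01, random_state=0)
model.fit(X)

# predict() returns 1 for inliers and -1 for points the model flags as anomalies.
labels = model.predict(X)
print(salaries[labels == -1])
```

So the model is trained in the sense that it builds its trees from `X`, but nobody ever tells it which rows are anomalies.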

Thank you for the very simple and straightforward explanation. I am new to Python and data science (I mostly work on embedded devices and now want to connect them to ML, ANN and DL, so I am learning Python and these concepts).
I have two questions, if you could please elaborate on them:

  1. df['anomaly'] currently gives the index locations where anomalies were detected. How can I pull the corresponding faulty values from the Salary column?

  2. How can I plot the original salaries along with the anomalies (discrepancies in salaries) on the same graph for a quick visual interpretation?

Thanks a lot for your time
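
For the two questions above, here is a minimal sketch rather than the article's exact code. It assumes a DataFrame with a column named `salary`, and that `df['anomaly']` holds the output of `predict()` (1 for normal rows, -1 for anomalies). The data, the `contamination` value, and the helper `plot_salaries` are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical data; the column name "salary" is an assumption.
df = pd.DataFrame({"salary": [48_000, 52_000, 50_500, 49_000, 51_000,
                              47_500, 53_000, 50_000, 5_000, 250_000]})

model = IsolationForest(contamination=0.2, random_state=0)
# Assumes 'anomaly' stores predict() output: 1 = normal, -1 = anomaly.
df["anomaly"] = model.fit_predict(df[["salary"]])

# Question 1: a boolean mask selects the flagged rows and returns their salaries.
anomalies = df.loc[df["anomaly"] == -1, "salary"]
print(anomalies)


def plot_salaries(frame):
    """Question 2: plot every salary and overlay the flagged ones in red."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    flagged = frame[frame["anomaly"] == -1]
    fig, ax = plt.subplots()
    ax.plot(frame.index, frame["salary"], marker="o", label="salary")
    ax.scatter(flagged.index, flagged["salary"], color="red", zorder=3,
               label="anomaly")
    ax.set_xlabel("row index")
    ax.set_ylabel("salary")
    ax.legend()
    return fig
```

Calling `plot_salaries(df).savefig("salaries.png")` writes the chart to a file; in a notebook the returned figure should display inline.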