PART 1: AN INTRODUCTION IN LAYMAN’S TERMS

Anomaly Detection

THE CONCEPTS INVOLVED IN THIS ARTICLE ARE:

  • About the Problem Statement

Let us begin with a story:

  • Let us assume there is a company named Splash, and you have just joined it as a Machine Learning Engineer with a CTC of 30 LPA. The following describes the company and the task at hand.

1. About the Problem Statement:

  • Since the company mainly helps retailers with Business Intelligence, its main focus is to examine the client’s data, check whether there is an anomaly in a given period, and quantify that anomaly with the help of an anomaly score.

So, we can now frame the question as: predict whether an anomaly is present at a given time and, if it is, quantify it.

  • Let me clarify the word “quantify”: it means, compared to all the previous anomalies, how sure you are (on a scale of 0–100) that the given value is an anomaly.
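To make the idea of a 0–100 score a little more concrete, here is a minimal sketch of one possible way to compute it. It is not the method used in this series; it simply assumes that we measure how far a value lies from the historical mean (in standard deviations) and then rank that deviation against the deviations of previously seen anomalies. The names `deviation`, `anomaly_score`, `history` and `past_deviations` are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def deviation(value, history):
    """Distance of `value` from the mean of `history`, in standard deviations."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # All historical values are identical: any different value is infinitely far away.
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def anomaly_score(value, history, past_deviations):
    """Score from 0 to 100: the percentage of previous anomaly deviations
    that the new value's deviation matches or exceeds (an assumed definition)."""
    if not past_deviations:
        return 0.0
    d = deviation(value, history)
    matched = sum(1 for p in past_deviations if p <= d)
    return 100.0 * matched / len(past_deviations)


# Hypothetical numbers, only for illustration.
history = [10, 12, 11, 9, 10, 11, 10, 9]
past_deviations = [1.5, 2.0, 3.0, 4.0]

for candidate in (10, 13, 30):
    print(candidate, anomaly_score(candidate, history, past_deviations))
```

With this definition, a value that is further from normal than every earlier anomaly gets a score of 100, while a value that looks like ordinary data gets a score near 0.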

DATASET:

You would be given a comma-separated values (CSV) file, which would contain the following columns:

  • timestamp (Data Type: String; as you know, it is the particular point in time that a record refers to)
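Since the timestamp arrives as a string, it usually has to be parsed before any time-based analysis. The sketch below shows one way to read such a file with Python's standard library. The timestamp format, the `value` column and the `load_rows` helper are assumptions made only for this example; the actual file may look different.

```python
import csv
import io
from datetime import datetime


def load_rows(csv_file, timestamp_format="%Y-%m-%d %H:%M:%S"):
    """Read CSV rows and turn the string `timestamp` column into datetime objects.

    The default `timestamp_format` is an assumption; adjust it to the real data.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        row["timestamp"] = datetime.strptime(row["timestamp"], timestamp_format)
        rows.append(row)
    # Sort chronologically so later steps can treat the rows as a time series.
    rows.sort(key=lambda r: r["timestamp"])
    return rows


# In-memory sample standing in for the real file; the `value` column is hypothetical.
sample = io.StringIO(
    "timestamp,value\n"
    "2021-01-01 01:00:00,12\n"
    "2021-01-01 00:00:00,10\n"
)
rows = load_rows(sample)
for row in rows:
    print(row["timestamp"], row["value"])
```

For a real file, `load_rows(open("data.csv", newline=""))` would work the same way, where `data.csv` is a placeholder name.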

SO, ARE THE PROBLEM STATEMENT AND THE DATASET CLEAR? I HOPE SO…


2. Why is the problem statement important?

First of all, think about it from your own perspective; then we will look at it from the world’s perspective. Maybe something new will come up, won’t it?

OUR PERSPECTIVE:

  • If the two of us were running a business, why would anomalies matter to us? Say I am running a sales business: why would an anomaly matter? Because…

WORLD’S PERSPECTIVE:

  • Eric Ogren, the senior security analyst at 451 Research, describes anomaly detection as “security analytics”.

Quoting Ogren again: “Two years from now, analytics will drive most organizations’ security strategies as operations teams use insights gleaned from analytics to apply preventive measures. It will be analytics first, and then more pinpoint, siloed-type approaches based on what the analytics tell you.”

3. PRACTICAL USE CASES:

The question arises: why gather information about anomalies, what is their significance in real life, and how does the corporate world deal with them?

  • MEDICAL DOMAIN: Anomalies in the medical domain? Yes. Here, anomaly detection is used to detect cells that are anomalous in nature (for example, detecting a tumor among brain cells).

THERE ARE MANY OTHER USE CASES; HOWEVER, WE JUST NEED TO GO OVER THEM BRIEFLY AND LEARN HOW TO MAKE AN ANOMALY DETECTION PROGRAM.

4. VARIOUS CONCEPTS INVOLVED IN THE UPCOMING ARTICLES OF THE SERIES

So, in the upcoming articles, we will look at various algorithms used for classification, and use some statistical methods to find the anomaly score for a given anomaly.

I hope you found this article useful. If you have any suggestions or criticism, they are most welcome.

Thanks…..

I am a Data Science enthusiast, currently exploring some fancy terms such as Machine learning, deep learning, etc.