DBSCAN Clustering Algorithm ~ Coding Interview Questions With Solutions

Thursday, 20 July 2017

DBSCAN Clustering Algorithm

Krishna Chaurasia machine learning

Hi friends,

In this post, we will discuss the DBSCAN (Density-based Spatial Clustering of Applications with Noise) clustering algorithm. DBSCAN is one of the most common clustering algorithms.

Parameters:

The algorithm takes two important parameters:

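Epsilon - also called the neighborhood radius, is the distance threshold that decides which points count as neighbors. Two points are neighbors if they are at most epsilon distance apart. Being neighbors alone does not put two points in the same cluster; they must also be connected through dense regions (see the core points below)
minpts - the minimum number of points that must lie within the epsilon neighborhood of a point for that region to be considered dense enough to form a cluster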

Important points:

In DBSCAN, a single object is represented as a numerical point in some space.
The neighborhood of a point is the set of all points that are at most epsilon distance away from it.
A point in DBSCAN can be one of three types:

core point - one which has at least minpts points in its neighborhood
border point - one which is not a core point itself but has a core point in its neighborhood
noise point - one which is neither a core nor a border point and is considered an outlier in the dataset
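To make the three point types concrete, here is a minimal sketch in plain Python (3.8+ for math.dist). The function names neighborhood and classify_points and the sample data are made up for illustration. It assumes Euclidean distance and counts a point as a member of its own neighborhood, which is a common convention but still an assumption here.

```python
from math import dist

def neighborhood(points, i, eps):
    """Indices of all points within eps of points[i] (the point itself included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def classify_points(points, eps, minpts):
    """Label every point as 'core', 'border', or 'noise'."""
    neighbors = [neighborhood(points, i, eps) for i in range(len(points))]
    # A core point has at least minpts points in its neighborhood.
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= minpts}
    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")   # not core, but next to a core point
        else:
            labels.append("noise")    # neither core nor border
    return labels

# Hypothetical sample data: a small square, one point hanging off it, one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (8, 8)]
print(classify_points(points, eps=1.0, minpts=3))
# ['core', 'core', 'core', 'core', 'border', 'noise']
```

Here (2, 1) has only itself and (1, 1) within distance 1, so it is not a core point, but (1, 1) is, which makes it a border point. The point (8, 8) has no other point nearby and is noise.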

Algorithm:

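DBSCAN starts with an arbitrary unvisited point and finds its neighbors. If the point has at least minpts points in its neighborhood, it is a core point and a new cluster is started from it; otherwise it is provisionally marked as noise.

It then repeats the neighborhood search for each of the identified neighbors, in the manner of a graph traversal such as a DFS (depth first search), expanding further only through core points, until no more points can be added to the set. The set of points reached this way forms one cluster in the DBSCAN algorithm.

DBSCAN repeats the two steps above from a new unvisited point until all the points in the dataset are explored.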
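The steps above can be sketched in plain Python as follows. This is a simplified illustration, not a reference implementation: the names dbscan, region_query and NOISE are made up, Euclidean distance is assumed, a point counts as part of its own neighborhood, and the neighborhood search is a brute-force scan. It uses a queue instead of recursion; the visiting order does not change which core points end up in the same cluster, although a border point lying between two clusters goes to whichever cluster reaches it first.

```python
from collections import deque
from math import dist

NOISE = -1  # label for points that end up in no cluster

def dbscan(points, eps, minpts):
    """Return one cluster label per point; NOISE marks outliers."""
    def region_query(i):
        # Brute-force epsilon neighborhood of points[i], itself included.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)    # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(i)
        if len(neighbors) < minpts:
            labels[i] = NOISE        # may be relabelled as a border point later
            continue
        cluster += 1                 # i is a core point: start a new cluster
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # earlier "noise" turns out to be a border point
            if labels[j] is not None:
                continue             # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(j)
            if len(j_neighbors) >= minpts:
                queue.extend(j_neighbors)  # expand only through core points
    return labels

# Hypothetical sample data: two groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1),
          (8, 8), (8, 9), (9, 8), (20, 20)]
print(dbscan(points, eps=1.0, minpts=3))
# [0, 0, 0, 0, 0, 1, 1, 1, -1]
```

Each call to region_query scans every point, so this sketch takes quadratic time in the number of points. That is fine for a toy dataset, but larger ones usually call for a spatial index.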

Advantages of DBSCAN:

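It does not need the number of clusters to be specified in advance; it discovers however many clusters the data contains
Clusters of arbitrary shapes and sizes can be obtained using the DBSCAN algorithm
It can detect outliers and label them as noise instead of forcing them into a cluster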

Disadvantages of DBSCAN:

The results are very sensitive to the epsilon value:

too small a value can result in sparse clusters being eliminated as outliers
too large a value can merge separate dense clusters together, giving incorrect clusters
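The effect of epsilon can be seen on a toy dataset. The sketch below uses a made-up helper named summarize (not part of any library) that counts clusters and noise points for a given epsilon. It relies on the fact that clusters correspond to connected groups of core points, and it assumes Euclidean distance with a point included in its own neighborhood. The data is hypothetical: two dense groups and one sparser group on a line.

```python
from math import dist

def summarize(points, eps, minpts):
    """Return (number of clusters, number of noise points) for one eps value."""
    n = len(points)
    near = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
            for i in range(n)]
    core = [i for i in range(n) if len(near[i]) >= minpts]
    core_set = set(core)
    # Count connected groups of core points; each group is one cluster.
    seen, clusters = set(), 0
    for c in core:
        if c in seen:
            continue
        clusters += 1
        seen.add(c)
        stack = [c]
        while stack:
            i = stack.pop()
            for j in near[i]:
                if j in core_set and j not in seen:
                    seen.add(j)
                    stack.append(j)
    # Noise: not a core point and no core point in its neighborhood.
    noise = sum(1 for i in range(n)
                if i not in core_set and not any(j in core_set for j in near[i]))
    return clusters, noise

# Two dense groups (spacing 0.5) and one sparse group (spacing 2).
points = [(0, 0), (0.5, 0), (1, 0),
          (4, 0), (4.5, 0), (5, 0),
          (10, 0), (12, 0), (14, 0)]
for eps in (0.4, 0.6, 2.5, 3.5):
    print(eps, summarize(points, eps, minpts=3))
# 0.4 (0, 9)  every point is noise
# 0.6 (2, 3)  the sparse group is thrown away as outliers
# 2.5 (3, 0)  all three groups are found
# 3.5 (2, 0)  the two dense groups are merged into one
```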

Refer to this link for a video tutorial on the same topic.
