Thursday, 20 July 2017

DBSCAN Clustering Algorithm

Hi friends,

In this post, we will discuss the DBSCAN (Density-based Spatial Clustering of Applications with Noise) clustering algorithm. DBSCAN is one of the most common clustering algorithms.

Parameters:

The algorithm takes two important parameters:
  1. Epsilon - also called the neighborhood radius, this is the maximum distance at which two points are considered neighbors. Two points that are at most epsilon apart lie in each other's neighborhood
  2. minpts - the minimum number of points that the neighborhood of a point must contain for that point to start or extend a cluster
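
To make the two parameters concrete, here is a minimal sketch of a neighborhood lookup. The helper name `region_query`, the sample points, and the use of Euclidean distance are assumptions made for illustration. The sketch also assumes that a point counts as a member of its own neighborhood.

```python
import math

def region_query(points, index, eps):
    """Return the indices of all points within eps of points[index].

    Assumption: Euclidean distance, and the point itself is included.
    """
    return [j for j, q in enumerate(points) if math.dist(points[index], q) <= eps]

# Hypothetical sample data: three nearby points and one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
eps, min_pts = 1.0, 3

neighbors = region_query(points, 0, eps)
has_enough_neighbors = len(neighbors) >= min_pts
print(neighbors, has_enough_neighbors)
```

With these values, the first point has itself and two other points within epsilon, so it meets the minpts threshold of 3.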

Important points:

  1. In DBSCAN, a single object is represented as a numerical point in some space.
  2. A neighborhood of a point includes the set of all points that are at most epsilon distance apart from it
  3. A point in DBSCAN can be one of three types:
    1. core point - one which has at least minpts points in its neighborhood
    2. border point - one which is not a core point itself but has a core point in its neighborhood
    3. noise point - one which is neither a core nor a border point and is considered an outlier in the dataset
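
The three point types can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `classify_points` and the sample data are made up, distance is Euclidean, and a point is counted in its own neighborhood.

```python
import math

def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    # Neighborhood of each point (assumption: includes the point itself).
    neighborhoods = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    # Core points have at least min_pts points in their neighborhood.
    core = {i for i, nbrs in enumerate(neighborhoods) if len(nbrs) >= min_pts}

    labels = []
    for i, nbrs in enumerate(neighborhoods):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs):
            labels.append("border")  # not core, but next to a core point
        else:
            labels.append("noise")   # neither core nor border
    return labels

# Hypothetical sample data.
points = [(0, 0), (0, 1), (1, 0), (2, 0), (8, 8)]
labels = classify_points(points, eps=1.0, min_pts=3)
print(labels)
# → ['core', 'border', 'core', 'border', 'noise']
```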


Algorithm:

  1. DBSCAN starts with a random unexplored point and identifies its neighbors. If the point is a core point, it starts a new cluster and performs a DFS (depth first search) from that point
  2. It recursively applies the DFS to each of the identified neighbors that is itself a core point, until no more points can be added to the set. The resulting tree structure represents a cluster in the DBSCAN algorithm. Border points are added to the cluster but are not expanded further
  3. DBSCAN repeats steps 1 and 2 until all the points in the dataset are explored. Points that end up in no cluster are the noise points
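
The steps above can be sketched as follows. This is a simplified illustration, not a reference implementation: the names `dbscan` and `NOISE` are assumptions, distance is Euclidean, a point counts toward its own neighborhood, and the points are scanned in index order instead of picking a random start so that the output is reproducible. The recursion is replaced by an explicit stack, which gives the same depth-first order.

```python
import math

NOISE = -1  # assumed label for outliers

def dbscan(points, eps, min_pts):
    """Return one cluster id per point; NOISE marks outliers."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not explored yet"
    cluster_id = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        seeds = neighbors(start)
        if len(seeds) < min_pts:
            labels[start] = NOISE  # may be relabeled later as a border point
            continue
        # start is a core point: begin a new cluster and explore from it.
        labels[start] = cluster_id
        stack = [j for j in seeds if j != start]
        while stack:
            j = stack.pop()  # last-in, first-out gives a depth-first search
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # only core points extend the cluster
                stack.extend(nbrs)
        cluster_id += 1
    return labels

# Hypothetical sample data: two groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (5, 5)]
result = dbscan(points, eps=1.5, min_pts=3)
print(result)
# → [0, 0, 0, 0, 1, 1, 1, -1]
```

The neighbor search here compares every pair of points, which is fine for a small demo but slow for large datasets.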

Advantages of DBSCAN:


  1. It can discover any number of clusters, and the number does not have to be specified in advance
  2. Clusters of varying shapes and sizes can be obtained using the DBSCAN algorithm
  3. It can detect and ignore outliers

Disadvantages of DBSCAN:

  1. The results are very sensitive to the epsilon value
    1. too small a value can result in elimination of sparse clusters as outliers
    2. too large a value would merge dense clusters together, giving incorrect clusters
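
The effect of epsilon can be seen on a tiny one-dimensional example. The sketch below is a compact DBSCAN variant written only for this demo. The function name `cluster_labels`, the sample values, and the three epsilon values are all assumptions. A point counts toward its own neighborhood, and -1 marks noise.

```python
def cluster_labels(values, eps, min_pts):
    """Compact DBSCAN for 1-D values; -1 marks noise."""
    n = len(values)
    nbrs = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
            for i in range(n)]
    labels = [None] * n
    cluster_id = 0
    for start in range(n):
        if labels[start] is not None or len(nbrs[start]) < min_pts:
            continue  # already assigned, or not a core point
        stack = [start]
        while stack:
            i = stack.pop()
            if labels[i] is not None:
                continue
            labels[i] = cluster_id
            if len(nbrs[i]) >= min_pts:  # only core points extend the cluster
                stack.extend(nbrs[i])
        cluster_id += 1
    # Points never reached from a core point are noise.
    return [-1 if label is None else label for label in labels]

# Two dense groups (spacing 0.1) and one sparse group (spacing 0.5).
values = [0.0, 0.1, 0.2, 0.3,
          1.0, 1.1, 1.2, 1.3,
          5.0, 5.5, 6.0, 6.5]

too_small = cluster_labels(values, eps=0.15, min_pts=3)
balanced = cluster_labels(values, eps=0.6, min_pts=3)
too_large = cluster_labels(values, eps=0.75, min_pts=3)

print(too_small)  # sparse group becomes noise
print(balanced)   # three clusters
print(too_large)  # the two dense groups merge
```

With epsilon at 0.15 the sparse group is thrown away as outliers, and with epsilon at 0.75 the two dense groups are merged into one cluster, which matches the two cases listed above.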

Refer to this link to view the video tutorial on the same.