Lesson 10 of 15

DBSCAN Core Operations

DBSCAN: Density-Based Clustering

DBSCAN (Density-Based Spatial Application of Noise) clusters points that are densely packed together while labelling sparse outliers as noise. Unlike k-means, it does not require choosing kk in advance.

Key Concepts

Epsilon neighbourhood (ε\varepsilon-neighbourhood): the set of points within distance ε\varepsilon of a point pp:

Nε(p)={qX:d(p,q)ε}N_\varepsilon(p) = \{q \in X : d(p, q) \leq \varepsilon\}

Core point: a point with at least min_samples neighbours within ε\varepsilon (excluding itself):

Nε(p)min_samples|N_\varepsilon(p)| \geq \text{min\_samples}

Border point: within ε\varepsilon of a core point but not itself a core point.

Noise point: neither core nor border.

Algorithm Sketch

for each unvisited point p:
    if p is a core point:
        expand cluster from p (BFS/DFS through neighbours)
    else:
        mark p as noise (may later become a border point)

Your Task

Implement the building blocks:

  • neighbors(X, idx, eps) → indices of points within ε\varepsilon (excluding idx itself)
  • is_core_point(X, idx, eps, min_samples) → True if Nε|N_\varepsilon| \geq min_samples
  • range_query(X, idx, eps) → same as neighbors (alias used during cluster expansion)
Python runtime loading...
Loading...
Click "Run" to execute your code.