Struct cogset::Optics [-] [+] [src]

pub struct Optics<P: Points> where P: Hash + Eq + Clone {
    // some fields omitted

Clustering via the OPTICS algorithm[1].

[OPTICS] is an algorithm for finding density-based clusters in spatial data. [...] Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN's major weaknesses: the problem of detecting meaningful clusters in data of varying density.wikipedia

An instance of Optics represents the dendrogram that OPTICS computes for a data set. Once computed, this dendrogram can then be queried for clustering structure, for example, a clustering similar to the one that would be computed by DBSCAN can be retrieved with dbscan_clustering.

This uses the P::Point yielded by the iterators provided by ListPoints and RegionQuery as a unique identifier for each point. The algorithm will behave strangely if the identifier is not unique or not stable within a given execution of OPTICS. The identifier is cloned several times in the course of execution, so it should be cheap to duplicate (e.g. a usize index, or a &T reference).

[1]: Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander (1999). OPTICS: Ordering Points To Identify the Clustering Structure. ACM SIGMOD international conference on Management of data. ACM Press. pp. 49–60.


A basic example:

use cogset::{Optics, BruteScan, Euclid};

let points = [Euclid([0.1]), Euclid([0.2]), Euclid([1.0])];

let scanner = BruteScan::new(&points);
let optics = Optics::new(scanner, 0.2, 2);

// use the same epsilon that OPTICS used
let mut clustering = optics.dbscan_clustering(0.2);

// get the clusters themselves
let mut clusters = clustering.by_ref().collect::<Vec<_>>();
// the first two points are the only cluster
assert_eq!(clusters, &[&[0, 1]]);
// now the noise, which is just the last point
assert_eq!(clustering.noise_points(), &[2]);

// cluster again, with a much smaller epsilon
let mut clustering = optics.dbscan_clustering(0.05);

// get the clusters themselves
let mut clusters = clustering.by_ref().collect::<Vec<_>>();
// no clusters (its less than the smallest distance between points)
// everything is noise
assert_eq!(clustering.noise_points(), &[0, 1, 2]);


impl<P: RegionQuery + ListPoints> Optics<P> where P: Hash + Eq + Clone

fn new(points: P, eps: f64, min_pts: usize) -> Optics<P>

Run the OPTICS algorithm on the index points, with the eps and min_pts parameters.

The return value can be queried for the actual clustering structure using, for example, dbscan_clustering. The parameter eps is used as a performance enhancement, and should be made as small as possible for the use-case.

NB. this computes the clustering dendrogram immediately, unlike Dbscan's laziness.

fn dbscan_clustering<'a>(&'a self, eps: f64) -> OpticsDbscanClustering<'a, P>

Extract a clustering like one that DBSCAN would give.

The returned type is similar to the Dbscan type: an iterator over the clusters (as vectors of points), along with the noise_points method to retrieve unclustered points.


eps must be less than the eps passed to new.