Struct cogset::Optics
[-] [+]
[src]
pub struct Optics<P: Points> where P: Hash + Eq + Clone {
// some fields omitted
}
Clustering via the OPTICS algorithm[1].
[OPTICS] is an algorithm for finding density-based clusters in spatial data. [...] Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN's major weaknesses: the problem of detecting meaningful clusters in data of varying density.wikipedia
An instance of Optics
represents the dendrogram that OPTICS
computes for a data set. Once computed, this dendrogram can then
be queried for clustering structure, for example, a clustering
similar to the one that would be computed by DBSCAN can be
retrieved with dbscan_clustering
.
This uses the P::Point
yielded by the iterators provided by
ListPoints
and RegionQuery
as a unique identifier for each
point. The algorithm will behave strangely if the identifier is
not unique or not stable within a given execution of OPTICS. The
identifier is cloned several times in the course of execution, so
it should be cheap to duplicate (e.g. a usize
index, or a &T
reference).
[1]: Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander (1999). OPTICS: Ordering Points To Identify the Clustering Structure. ACM SIGMOD international conference on Management of data. ACM Press. pp. 49–60.
Examples
A basic example:
use cogset::{Optics, BruteScan, Euclid}; let points = [Euclid([0.1]), Euclid([0.2]), Euclid([1.0])]; let scanner = BruteScan::new(&points); let optics = Optics::new(scanner, 0.2, 2); // use the same epsilon that OPTICS used let mut clustering = optics.dbscan_clustering(0.2); // get the clusters themselves let mut clusters = clustering.by_ref().collect::<Vec<_>>(); // the first two points are the only cluster assert_eq!(clusters, &[&[0, 1]]); // now the noise, which is just the last point assert_eq!(clustering.noise_points(), &[2]); // cluster again, with a much smaller epsilon let mut clustering = optics.dbscan_clustering(0.05); // get the clusters themselves let mut clusters = clustering.by_ref().collect::<Vec<_>>(); // no clusters (its less than the smallest distance between points) assert!(clusters.is_empty()); // everything is noise assert_eq!(clustering.noise_points(), &[0, 1, 2]);
Methods
impl<P: RegionQuery + ListPoints> Optics<P> where P: Hash + Eq + Clone
fn new(points: P, eps: f64, min_pts: usize) -> Optics<P>
Run the OPTICS algorithm on the index points
, with the eps
and min_pts
parameters.
The return value can be queried for the actual clustering
structure using, for example, dbscan_clustering
. The
parameter eps
is used as a performance enhancement, and
should be made as small as possible for the use-case.
NB. this computes the clustering dendrogram immediately,
unlike Dbscan
's laziness.
fn dbscan_clustering<'a>(&'a self, eps: f64) -> OpticsDbscanClustering<'a, P>
Extract a clustering like one that DBSCAN would give.
The returned type is similar to the Dbscan
type: an iterator
over the clusters (as vectors of points), along with the
noise_points
method to retrieve unclustered points.
Panics
eps
must be less than the eps
passed to new
.