tropea_clustering.onion_multi

tropea_clustering.onion_multi(X, ndims=2, bins='auto', number_of_sigmas=2.0)

Performs onion clustering on the data array ‘X’.

Returns an array of integer labels, one for each signal sequence. Unclassified sequences are labelled “-1”.

Parameters:
  • X (ndarray of shape (n_particles * n_seq, delta_t * n_features)) – The data to cluster. Each signal sequence is considered as a single data point.

  • ndims (int, default = 2) – The number of features (dimensions) of the dataset. It can be either 2 or 3.

  • bins (int or str, default="auto") – The number of bins used to build the histograms. Either an integer or “auto”. If “auto”, the “auto” option of numpy.histogram_bin_edges is used (see https://numpy.org/doc/stable/reference/generated/numpy.histogram_bin_edges.html#numpy.histogram_bin_edges).

  • number_of_sigmas (float, default=2.0) – Sets the threshold for classifying a signal sequence into a state: a sequence is assigned to the state only if every point of the sequence lies within number_of_sigmas * state.sigmas of state.mean.
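The number_of_sigmas criterion can be illustrated with a small NumPy sketch. This is not the library's implementation: the helper is_inside_state is hypothetical, and it assumes that a state is described by a per-feature mean and sigmas and that "inside" means inside an axis-aligned ellipse scaled by number_of_sigmas.

```python
import numpy as np


def is_inside_state(sequence, mean, sigmas, number_of_sigmas=2.0):
    """Hypothetical helper (not part of tropea_clustering).

    sequence has shape (delta_t, n_features); mean and sigmas have
    shape (n_features,). ASSUMPTION: the state region is an
    axis-aligned ellipse with semi-axes number_of_sigmas * sigmas.
    """
    sequence = np.asarray(sequence, dtype=float)
    scaled = (sequence - mean) / (number_of_sigmas * np.asarray(sigmas))
    # The sequence belongs to the state only if every frame is inside.
    return bool(np.all(np.sum(scaled**2, axis=1) <= 1.0))


mean = np.array([0.0, 0.0])
sigmas = np.array([1.0, 0.5])

# Both frames lie well within two sigmas of the mean.
inside = is_inside_state([[0.5, 0.2], [-1.0, 0.1]], mean, sigmas)
# The second frame is three sigmas away along the first feature.
outside = is_inside_state([[0.5, 0.2], [3.0, 0.0]], mean, sigmas)
```

Under this reading, a larger number_of_sigmas widens the region around each state, so more sequences are classified and fewer receive the label “-1”.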

Returns:

  • states_list (List[StateMulti]) – The list of the identified states. Refer to the documentation of StateMulti for how to access the information on each state.

  • labels (ndarray of shape (n_particles * n_seq,)) – Cluster labels for each signal sequence. Unclassified sequences are given the label “-1”.

Return type:

tuple[list[StateMulti], ndarray[tuple[int, …], dtype[int64]]]
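Since labels is a plain integer array with “-1” marking unclassified sequences, it can be summarized with NumPy alone. The sketch below uses a synthetic labels array as a stand-in for the second return value; the variable names are illustrative only.

```python
import numpy as np

# Synthetic labels standing in for the array returned by onion_multi.
labels = np.array([0, 0, 1, -1, 1, 0, -1, 2])

# Sequences labelled -1 were not assigned to any state.
classified = labels != -1
unclassified_fraction = 1.0 - classified.mean()

# Number of sequences assigned to each identified state.
state_ids, counts = np.unique(labels[classified], return_counts=True)
population = dict(zip(state_ids.tolist(), counts.tolist()))
```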

Example

import numpy as np
from tropea_clustering import onion_multi, helpers

# Select time resolution
delta_t = 2

# Create random input data
np.random.seed(1234)
n_features = 2
n_particles = 5
n_steps = 1000

input_data = np.random.rand(n_features, n_particles, n_steps)

# Create input array with the correct shape
reshaped_input_data = helpers.reshape_from_dnt(input_data, delta_t)

# Run Onion Clustering
state_list, labels = onion_multi(reshaped_input_data)
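As a sanity check on the shapes documented above, the following sketch computes what the example should produce. It makes two assumptions that the documentation does not state: that n_seq equals n_steps // delta_t, and that sequences are ordered particle by particle, so that labels can be reshaped to (n_particles, n_seq). A placeholder array is used in place of real labels.

```python
import numpy as np

n_features, n_particles, n_steps, delta_t = 2, 5, 1000, 2

# ASSUMPTION: each particle contributes n_steps // delta_t sequences.
n_seq = n_steps // delta_t

# Shapes as given in the parameter and return descriptions.
expected_x_shape = (n_particles * n_seq, delta_t * n_features)
expected_labels_shape = (n_particles * n_seq,)

# Placeholder labels; in practice this is the array from onion_multi.
placeholder_labels = np.full(expected_labels_shape, -1)

# ASSUMPTION: sequences are stored particle by particle.
labels_per_particle = placeholder_labels.reshape(n_particles, n_seq)
```

If the ordering assumption does not hold for helpers.reshape_from_dnt, the reshape still succeeds but the rows no longer correspond to particles, so it is worth verifying on a small input first.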