UniqueValuesMapping
- class ase2sprkkr.common.unique_values.UniqueValuesMapping(mapping, value_to_class_id=None)[source]
A class that maps a collection of (possibly non-unique) values to a set of unique identifiers. In effect, it builds equivalence classes over the indexes of the input array.
Instances of the class can be merged to distinguish values that are the same according to one criterion but differ according to another.
>>> UniqueValuesMapping.from_values([1,4,1]).mapping
array([1, 2, 1], dtype=int32)
>>> UniqueValuesMapping.from_values([int, int, str]).mapping
array([1, 1, 2], dtype=int32)
>>> UniqueValuesMapping.from_values([1,4,1]).value_to_class_id
{1: 1, 4: 2}
>>> UniqueValuesMapping.from_values([1,4,1,1]).merge([1,1,2,1]).mapping
array([1, 2, 3, 1], dtype=int32)
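The idea behind merging two mappings can be illustrated with a minimal pure-Python sketch. The helper name merge_labels is hypothetical and is not part of ase2sprkkr; the sketch only assumes that two items end up in the same merged class when they share a class in both input labelings, and it is not the library's actual implementation.

```python
def merge_labels(first, second, start_from=1):
    """Hypothetical helper: combine two labelings of the same items.

    Two items share a merged class only if they share a class
    in BOTH input labelings.
    """
    pair_to_id = {}
    merged = []
    for pair in zip(first, second):
        # A new (label, label) pair gets the next free integer id.
        if pair not in pair_to_id:
            pair_to_id[pair] = len(pair_to_id) + start_from
        merged.append(pair_to_id[pair])
    return merged


# Labels [1, 2, 1, 1] correspond to the values [1, 4, 1, 1] from the example above.
print(merge_labels([1, 2, 1, 1], [1, 1, 2, 1]))
```

Here the third item is split off from the first and the fourth because the second labeling puts it in a different class.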
Constructor
- Parameters
mapping (List) –
value_to_class_id (Dict) –
- __init__(mapping, value_to_class_id=None)[source]
- Parameters
mapping (Union[np.ndarray, list]) – Array that assigns an equivalence class id to each member: members[id] = <eq class id>
value_to_class_id (dict) – Mapping { value: <eq class id> }
- mapping
Map from <object index> to <object equivalence class id>.
- value_to_class_id
Map from <object> to <object equivalence class id>. If two mappings are merged, this attribute is not available.
- indexes(start_from=0)[source]
Returns a dictionary that maps each equivalence class id to the list of indexes of its members.
- Parameters
start_from (int) – The indexes are zero-based by default; however, they can start from the given number (typically 1).
>>> UniqueValuesMapping([1,4,1]).indexes()
{1: [0, 2], 4: [1]}
>>> UniqueValuesMapping([1,4,1]).indexes(start_from = 1)
{1: [1, 3], 4: [2]}
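As a rough illustration of what indexes() computes, the grouping can be sketched in plain Python. The function name group_indexes is hypothetical; the sketch assumes only the behavior shown in the doctest above and is not taken from the library source.

```python
def group_indexes(mapping, start_from=0):
    """Hypothetical helper: group member indexes by equivalence class id."""
    groups = {}
    # enumerate(..., start=...) shifts the reported indexes, e.g. to be 1-based.
    for index, class_id in enumerate(mapping, start=start_from):
        groups.setdefault(class_id, []).append(index)
    return groups


print(group_indexes([1, 4, 1]))
print(group_indexes([1, 4, 1], start_from=1))
```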
- unique_indexes()[source]
Returns a list with the index of one representative member (the first one, in the example below) for each equivalence class.
>>> UniqueValuesMapping([1,1,4]).unique_indexes()
[0, 2]
- static from_values(values, length=None)[source]
Create an equivalence-class mapping. Unlike the constructor, this method tags the values with integers and also computes the reverse (value to equivalence class) mapping.
- values: iterable
Values for which to find the equivalence classes
- length: int
Length of values; provide it if len(values) is not available
- Parameters
length (Optional[int]) –
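The tagging described for from_values can be sketched without NumPy as follows. The name tag_values is hypothetical, and the sketch assumes that the values are hashable and that class ids are assigned in order of first appearance, as the doctests on this page suggest; the real method returns a UniqueValuesMapping with a NumPy array rather than plain Python objects.

```python
def tag_values(values, start_from=1):
    """Hypothetical helper: tag each value with an integer class id.

    Returns (mapping, value_to_class_id), where mapping[i] is the
    class id of the i-th value.
    """
    value_to_class_id = {}
    mapping = []
    for value in values:
        # First occurrence of a value defines a new equivalence class.
        if value not in value_to_class_id:
            value_to_class_id[value] = len(value_to_class_id) + start_from
        mapping.append(value_to_class_id[value])
    return mapping, value_to_class_id


mapping, value_to_class_id = tag_values([1, 4, 1])
print(mapping)
print(value_to_class_id)
```

Because the sketch only iterates over values, it also accepts generators, which is the situation the length argument of the real method is meant to cover.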
- static _create_mapping(values, length=None, start_from=1, dtype=<class 'numpy.int32'>)[source]
- Returns
mapping (np.ndarray) – maps the value indexes to equivalence class id
reverse (dict) – maps the values to equivalence class ids
>>> UniqueValuesMapping._create_mapping([1.,4.,1.])
(array([1, 2, 1], dtype=int32), {1.0: 1, 4.0: 2})
- is_equivalent_to(mapping)[source]
Return whether this mapping is equal to another given mapping, regardless of the actual “names” of the equivalence classes.
- Parameters
mapping (Union[UniqueValuesMapping, Iterable]) – The other mapping can be given either as an instance of this class or as any iterable that yields the equivalence class names of the items.
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([0,1,0])
True
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([0,0,0])
False
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([0,1,1])
False
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([5,3,5])
True
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to(UniqueValuesMapping.from_values([2,5,2]))
True
- Return type
bool
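One way to test equivalence regardless of class names is to check that the labels of the two mappings correspond one-to-one. The following is a minimal sketch of that check; the name same_partition is hypothetical, the labels are assumed to be hashable, and the library may implement the comparison differently.

```python
def same_partition(a, b):
    """Hypothetical helper: do two labelings define the same classes?"""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    forward, backward = {}, {}
    for x, y in zip(a, b):
        # Each label of `a` must always meet the same label of `b`,
        # and vice versa; otherwise a class was split or joined.
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True


print(same_partition([1, 4, 1], [5, 3, 5]))
print(same_partition([1, 4, 1], [0, 0, 0]))
```

Checking both directions matters: the forward dictionary alone would accept [0, 0, 0], which joins two distinct classes into one.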
- static are_equivalent(a, b)[source]
Return whether the two mappings are equal, regardless of the actual “names” of the equivalence classes.
See is_equivalent_to
- Parameters
a (Union[UniqueValuesMapping, Iterable]) –
b (Union[UniqueValuesMapping, Iterable]) –
- Return type
bool
- normalized(start_from=1, strict=True, dtype=None)[source]
Map the class ids to integers
- Parameters
strict (bool) – If True, the resulting integer names will be from range (start_from)..(n+start_from-1), where n is the number of equivalence classes. If False and the names are already integers in a numpy array, do nothing.
start_from – The number from which to start numbering the equivalence classes.
- Returns
mapping (np.ndarray) – Array of integers, starting from start_from, that denote the equivalence classes of the values. It holds that mapping[index] == equivalence_class
reverse (dict) – Dict { equivalence_class : value }
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalized()
(array([1, 2, 1], dtype=int32), {1: 1, 2: 2})
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalized(start_from=0)
(array([0, 1, 0], dtype=int32), {1: 0, 2: 1})
- normalize(start_from=1, strict=False, dtype=None)[source]
Replace the names of the equivalence classes with integers.
- Parameters
strict (bool) – If True, the resulting integer names will be from range (start_from)..(n+start_from-1), where n is the number of equivalence classes. If False and the names are already integers in a numpy array, do nothing.
start_from – The number from which to start numbering the equivalence classes.
dtype – dtype of the normalized values. None means numpy.int32; however, if not strict, any integer type is sufficient.
- Returns
unique_values_mapping – Return self.
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalize().mapping
array([1, 2, 1], dtype=int32)
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalize().value_to_class_id[(0,3)]
2
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalize(start_from=0).mapping
array([0, 1, 0], dtype=int32)