molecular_simulations.analysis.utils module

class molecular_simulations.analysis.utils.EmbedData(pdb, embedding_dict, out=None)

Bases: object

Embeds data into the beta-factor column of a PDB file. Unless an output path is explicitly provided, the result is written to the same path as the input PDB and the original file is backed up. Embedding data should be provided as a dictionary whose keys are MDAnalysis selection strings and whose values are NumPy arrays of shape (n_frames, n_residues, n_datapoints) or (n_residues, n_datapoints).

Parameters:
  • pdb (Path) – Path to the PDB file to load. Also used as the output path if out is not provided.

  • embedding_dict (dict[str, np.ndarray]) – A dictionary containing MDAnalysis selections as keys and data as the values.

  • out (OptPath) – Path to the output PDB. Defaults to None, in which case the output is written to the input path.
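As a rough illustration of the expected input, the sketch below builds an embedding dictionary with one array of each accepted shape. The selection strings, array sizes, and file names are hypothetical and chosen only for the example; the commented-out call assumes the package is installed and the PDB file exists.

```python
import numpy as np

n_frames, n_residues, n_datapoints = 10, 5, 2
rng = np.random.default_rng(0)

# Keys are MDAnalysis selection strings (illustrative values only).
embedding_dict = {
    # Per-frame data: (n_frames, n_residues, n_datapoints)
    "segid A": rng.normal(size=(n_frames, n_residues, n_datapoints)),
    # Data without a frame axis: (n_residues, n_datapoints)
    "resid 10:14": rng.normal(size=(n_residues, n_datapoints)),
}

# Hypothetical usage, assuming "complex.pdb" exists on disk:
# from pathlib import Path
# from molecular_simulations.analysis.utils import EmbedData
# EmbedData(Path("complex.pdb"), embedding_dict, out=Path("embedded.pdb")).embed()
```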

embed()

Unpacks the embedding dictionary, embeds the data, and writes out the new PDB.

Return type:

None

Returns:

None

embed_selection(selection, data)

Embeds data into the beta column for each residue in the given selection.

Parameters:
  • selection (str) – MDAnalysis selection string.

  • data (np.ndarray) – Array of data to place in beta column. Shape should be (n_residues_in_selection, 1).

Return type:

None

Returns:

None
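Because the beta column is stored per atom while the data are given per residue, each residue's value has to be repeated for every atom in that residue. The NumPy sketch below shows one way this mapping could work. It is an assumption about the general technique, not the library's implementation, and the atom-to-residue index array is made up for the example.

```python
import numpy as np

# Residue index of each atom in a hypothetical 3-residue selection
# (2, 3, and 1 atoms per residue respectively).
atom_resindices = np.array([0, 0, 1, 1, 1, 2])

# One value per residue, shape (n_residues_in_selection, 1).
data = np.array([[0.5], [1.25], [3.0]])

# Integer-array indexing repeats each residue's value for all of its atoms.
per_atom_beta = data[atom_resindices, 0]
```

With MDAnalysis, an array like `per_atom_beta` is the kind of per-atom quantity that would be assigned to the selected atoms' temperature factors before writing the PDB.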

write_new_pdb()

Writes out the PDB file. If no output path was designated, the original PDB is first backed up with the extension .orig.pdb. If that backup already exists, no new backup is made: an existing backup suggests this has been run before, and overwriting it would lose the actual original PDB.

Return type:

None

Returns:

None
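The backup rule described for write_new_pdb() can be sketched with the standard library alone. The helper name `backup_target` is hypothetical and the real method may be organized differently. The point is only that an existing .orig.pdb file is never overwritten.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup_target(pdb: Path, out: Optional[Path] = None) -> Path:
    """Return the path to write to, backing up `pdb` first if it would be overwritten."""
    if out is not None:
        # Explicit output path: the input file is left untouched.
        return out
    backup = pdb.with_name(pdb.stem + ".orig.pdb")
    if not backup.exists():
        # First run: preserve the true original before it is overwritten.
        shutil.copy2(pdb, backup)
    return pdb
```

Running this twice on the same input leaves the first backup in place, so the second run cannot replace the original structure with an already-embedded one.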

class molecular_simulations.analysis.utils.EmbedEnergyData(pdb, embedding_dict, out=None)

Bases: EmbedData

Specialization of EmbedData for the case where the data stored in embedding_dict is non-bonded energy data with both LJ and Coulombic terms. Here the total energy is obtained by summing these terms and is then rescaled, since many programs do not handle negative beta factors.

Parameters:
  • pdb (Path) – Path to the PDB file to load. Also used as the output path if out is not provided.

  • embedding_dict (dict[str, np.ndarray]) – A dictionary containing MDAnalysis selections as keys and data as the values.

  • out (OptPath) – Path to the output PDB. Defaults to None, in which case the output is written to the input path.

preprocess()

Processes the embedding data so that it can be passed to the parent methods. This requires that each value in the embedding dictionary be reduced to a one-dimensional array, and that the data be rescaled so that there are no negative values while the distances between values are preserved.

Returns:

Dictionary of processed data arrays.

Return type:

(dict[str, np.ndarray])
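One rescaling that removes negative values while preserving the distances between values is a constant shift by the minimum. The sketch below applies a single global shift across all selections. Whether the library shifts globally or per selection is not stated here, so treat that choice, the function name `shift_non_negative`, and the sample energies as assumptions.

```python
import numpy as np


def shift_non_negative(energies: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    """Shift every array by the global minimum so the smallest value becomes 0."""
    global_min = min(arr.min() for arr in energies.values())
    # Subtracting a constant leaves all pairwise differences unchanged.
    return {sel: arr - global_min for sel, arr in energies.items()}


# Made-up per-residue total energies for two selections.
energies = {
    "segid A": np.array([-12.0, -3.5, 1.0]),
    "segid B": np.array([-20.0, 4.0]),
}
shifted = shift_non_negative(energies)
```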

static sanitize_data(data)

Takes data of shape (n_frames, n_residues, n_terms) and returns a one-dimensional array of shape (n_residues,). The data are first averaged over the first axis (frames) and then summed over the last remaining axis (terms), which was originally the third axis.

Parameters:

data (np.ndarray) – Unprocessed input data.

Returns:

One-dimensional processed data.

Return type:

(np.ndarray)
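The reduction described for sanitize_data() maps directly onto two NumPy calls. The following is a minimal sketch of that reduction under the stated shape, not the library's source. The function name `sanitize` and the input values are illustrative.

```python
import numpy as np


def sanitize(data: np.ndarray) -> np.ndarray:
    """Average over frames (axis 0), then sum the energy terms (last axis)."""
    return data.mean(axis=0).sum(axis=-1)


# Shape (2 frames, 2 residues, 2 terms); the terms could be LJ and Coulombic energies.
data = np.array([
    [[-1.0, -2.0], [0.5, 0.5]],
    [[-3.0, -4.0], [1.5, 2.5]],
])
result = sanitize(data)  # shape (2,): one total energy per residue
```

Here the frame average is [[-2.0, -3.0], [1.0, 1.5]], and summing the two terms gives one value per residue.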