dreams.algorithms.murcko_hist package
Submodules
dreams.algorithms.murcko_hist.murcko_hist module
- dreams.algorithms.murcko_hist.murcko_hist.are_sub_hists(h1: Dict[str, int], h2: Dict[str, int], k: int = 3, d: int = 4) → bool
Determines if two Murcko histograms are considered sub-histograms of each other.
This function checks if the histograms are equal when their sums are small, or if their distance is within a specified threshold for larger histograms.
- Parameters:
h1 (Dict[str, int]) – The first Murcko histogram dictionary.
h2 (Dict[str, int]) – The second Murcko histogram dictionary.
k (int, optional) – The threshold on histogram sums that separates small histograms (compared for equality) from larger ones (compared by distance). Defaults to 3.
d (int, optional) – The maximum allowed distance for larger histograms. Defaults to 4.
- Returns:
True if the histograms are considered sub-histograms, False otherwise.
- Return type:
bool
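The rule described above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the names `hist_dist_sketch` and `are_sub_hists_sketch` are hypothetical, the histogram keys are placeholder labels, and two details are assumptions, namely that "small" means the smaller of the two histogram sums is at most `k`, and that "within a specified threshold" means a distance of at most `d`.

```python
from typing import Dict


def hist_dist_sketch(h1: Dict[str, int], h2: Dict[str, int]) -> int:
    # Sum of absolute differences over the union of keys;
    # a key missing from one histogram counts as 0 there.
    return sum(abs(h1.get(key, 0) - h2.get(key, 0)) for key in set(h1) | set(h2))


def are_sub_hists_sketch(h1: Dict[str, int], h2: Dict[str, int], k: int = 3, d: int = 4) -> bool:
    # ASSUMPTION: a pair is "small" when the smaller total count is <= k.
    if min(sum(h1.values()), sum(h2.values())) <= k:
        return h1 == h2
    # ASSUMPTION: larger histograms match when their distance is <= d.
    return hist_dist_sketch(h1, h2) <= d


# Placeholder histograms; the key strings are illustrative only.
small = {"0_1": 2}
a = {"1_1": 3, "2_0": 2}
b = {"1_1": 4, "2_0": 2, "0_1": 1}
c = {"3_0": 6}
```

Under these assumptions, `small` only matches an identical histogram, `a` and `b` match because they differ by 2, and `a` and `c` do not match because they differ by 11.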
- dreams.algorithms.murcko_hist.murcko_hist.break_rings(mol: Mol, rings_size: int = 3) → Mol
Breaks all rings of a specified size in a molecule by removing a bond with minimal degree with respect to rings.
This function is intended to be used prior to computing Murcko scaffolds. NOTE: It is not extensively tested and may not be useful for rings_size != 3.
- Parameters:
mol (Mol) – The input RDKit molecule.
rings_size (int, optional) – The size of rings to break. Defaults to 3.
- Returns:
The modified RDKit molecule with specified rings broken.
- Return type:
Mol
- dreams.algorithms.murcko_hist.murcko_hist.multirings(mol: Mol) → List[Set[int]]
Returns a list of sets of atom indices, one set per “multiring” in the molecule.
A “multiring” is a generalized ring in which ordinary fused rings are merged into a single set of atoms (hence the name).
- Parameters:
mol (Mol) – The input RDKit molecule.
- Returns:
A list of sets, where each set contains atom indices belonging to a multiring.
- Return type:
List[Set[int]]
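The merging idea can be illustrated without RDKit by starting from per-ring atom indices (for example, the tuples that RDKit's `mol.GetRingInfo().AtomRings()` returns). The sketch below is not the package's implementation: `merge_fused_rings` is a hypothetical helper, and it assumes that two rings are "fused" when they share at least two atoms (that is, a bond), so spiro rings sharing a single atom stay separate.

```python
from typing import Dict, Iterable, List, Set


def merge_fused_rings(rings: Iterable[Iterable[int]], min_shared: int = 2) -> List[Set[int]]:
    ring_sets = [set(ring) for ring in rings]
    parent = list(range(len(ring_sets)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # ASSUMPTION: rings sharing >= min_shared atoms belong to one multiring.
    for i in range(len(ring_sets)):
        for j in range(i + 1, len(ring_sets)):
            if len(ring_sets[i] & ring_sets[j]) >= min_shared:
                parent[find(i)] = find(j)

    # Union the atom indices of all rings in each connected group.
    groups: Dict[int, Set[int]] = {}
    for i, ring in enumerate(ring_sets):
        groups.setdefault(find(i), set()).update(ring)
    return list(groups.values())


# Two fused six-membered rings (sharing atoms 4 and 5) plus one isolated ring.
fused_example = merge_fused_rings([(0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8, 9), (10, 11, 12, 13, 14, 15)])
# Two rings joined at a single spiro atom (atom 4).
spiro_example = merge_fused_rings([(0, 1, 2, 3, 4), (4, 5, 6, 7, 8)])
```

Here `fused_example` contains two sets (atoms 0 to 9 and atoms 10 to 15), while `spiro_example` keeps both rings separate under the two-shared-atoms assumption.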
- dreams.algorithms.murcko_hist.murcko_hist.murcko_hist(mol: Mol, as_dict: bool = True, show_mol_scaffold: bool = False, no_residue_atom_as_linker: bool = True, break_three_membered_rings: bool = True) → Dict[str, int] | Tuple[ndarray, ndarray]
Computes the Murcko scaffold histogram for a given molecule.
This function calculates a histogram of rings in the Murcko scaffold of the input molecule, with respect to the number of adjacent rings and linkers.
- Parameters:
mol (Mol) – The input RDKit molecule.
as_dict (bool, optional) – If True, return the histogram as a dictionary. Otherwise, return as numpy arrays. Defaults to True.
show_mol_scaffold (bool, optional) – If True, display the original molecule and its Murcko scaffold. Defaults to False.
no_residue_atom_as_linker (bool, optional) – If True, do not consider residue atoms as linkers. Defaults to True.
break_three_membered_rings (bool, optional) – If True, break all three-membered rings before processing. Defaults to True.
- Returns:
If as_dict is True, returns a dictionary where keys are string representations of (adjacent rings, adjacent linkers) and values are counts. If as_dict is False, returns a tuple of two numpy arrays: unique (adjacent rings, adjacent linkers) pairs and their counts.
- Return type:
Union[Dict[str, int], Tuple[np.ndarray, np.ndarray]]
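To make the shape of the dictionary form concrete, the following sketch counts (adjacent rings, adjacent linkers) pairs that are assumed to have been computed already, one pair per ring. It is an illustration only: `hist_from_pairs` is a hypothetical helper, the input pairs are made up, and the `"<rings>_<linkers>"` key format is an assumption, since the documentation only says that keys are string representations of the pair.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def hist_from_pairs(pairs: Iterable[Tuple[int, int]]) -> Dict[str, int]:
    # Count how many rings have each (adjacent rings, adjacent linkers) pair.
    counts = Counter(pairs)
    # ASSUMPTION: the "<rings>_<linkers>" key format is illustrative only.
    return {f"{rings}_{linkers}": n for (rings, linkers), n in sorted(counts.items())}


# Made-up input: two rings with one adjacent ring and no linkers,
# and one ring with no adjacent rings and one linker.
example_hist = hist_from_pairs([(1, 0), (1, 0), (0, 1)])
```

With this input, `example_hist` has two keys: one pair seen once and one pair seen twice.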
- dreams.algorithms.murcko_hist.murcko_hist.murcko_hists_dist(h1: Dict[str, int], h2: Dict[str, int]) → int
Computes the distance between two Murcko histogram dictionaries.
The distance is calculated as the sum of absolute differences between corresponding histogram values, including keys present in only one histogram.
- Parameters:
h1 (Dict[str, int]) – The first Murcko histogram dictionary.
h2 (Dict[str, int]) – The second Murcko histogram dictionary.
- Returns:
The distance between the two histograms.
- Return type:
int
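The distance described above is a sum of absolute differences over the union of keys, with a key missing from one histogram treated as a count of 0. A small worked sketch follows; `hist_dist_terms` is a hypothetical helper written for illustration, and the key strings are placeholders rather than the package's actual key format.

```python
from typing import Dict


def hist_dist_terms(h1: Dict[str, int], h2: Dict[str, int]) -> Dict[str, int]:
    # Per-key contribution |h1[key] - h2[key]|, with missing keys counted as 0.
    return {key: abs(h1.get(key, 0) - h2.get(key, 0)) for key in sorted(set(h1) | set(h2))}


h1 = {"0_1": 2, "1_1": 1}
h2 = {"1_1": 3, "2_0": 1}

terms = hist_dist_terms(h1, h2)   # one term per key in either histogram
distance = sum(terms.values())    # 2 + 2 + 1 = 5
```

The key present only in `h1` contributes 2, the shared key contributes |1 - 3| = 2, and the key present only in `h2` contributes 1, giving a distance of 5.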
Module contents
- dreams.algorithms.murcko_hist.are_sub_hists(h1: Dict[str, int], h2: Dict[str, int], k: int = 3, d: int = 4) → bool
Determines if two Murcko histograms are considered sub-histograms of each other.
This function checks if the histograms are equal when their sums are small, or if their distance is within a specified threshold for larger histograms.
- Parameters:
h1 (Dict[str, int]) – The first Murcko histogram dictionary.
h2 (Dict[str, int]) – The second Murcko histogram dictionary.
k (int, optional) – The threshold on histogram sums that separates small histograms (compared for equality) from larger ones (compared by distance). Defaults to 3.
d (int, optional) – The maximum allowed distance for larger histograms. Defaults to 4.
- Returns:
True if the histograms are considered sub-histograms, False otherwise.
- Return type:
bool
- dreams.algorithms.murcko_hist.murcko_hist(mol: Mol, as_dict: bool = True, show_mol_scaffold: bool = False, no_residue_atom_as_linker: bool = True, break_three_membered_rings: bool = True) → Dict[str, int] | Tuple[ndarray, ndarray]
Computes the Murcko scaffold histogram for a given molecule.
This function calculates a histogram of rings in the Murcko scaffold of the input molecule, with respect to the number of adjacent rings and linkers.
- Parameters:
mol (Mol) – The input RDKit molecule.
as_dict (bool, optional) – If True, return the histogram as a dictionary. Otherwise, return as numpy arrays. Defaults to True.
show_mol_scaffold (bool, optional) – If True, display the original molecule and its Murcko scaffold. Defaults to False.
no_residue_atom_as_linker (bool, optional) – If True, do not consider residue atoms as linkers. Defaults to True.
break_three_membered_rings (bool, optional) – If True, break all three-membered rings before processing. Defaults to True.
- Returns:
If as_dict is True, returns a dictionary where keys are string representations of (adjacent rings, adjacent linkers) and values are counts. If as_dict is False, returns a tuple of two numpy arrays: unique (adjacent rings, adjacent linkers) pairs and their counts.
- Return type:
Union[Dict[str, int], Tuple[np.ndarray, np.ndarray]]
- dreams.algorithms.murcko_hist.murcko_hists_dist(h1: Dict[str, int], h2: Dict[str, int]) → int
Computes the distance between two Murcko histogram dictionaries.
The distance is calculated as the sum of absolute differences between corresponding histogram values, including keys present in only one histogram.
- Parameters:
h1 (Dict[str, int]) – The first Murcko histogram dictionary.
h2 (Dict[str, int]) – The second Murcko histogram dictionary.
- Returns:
The distance between the two histograms.
- Return type:
int