Module deeporigin.src.utilities.alignments
Functions
def calculate_bounding_box(structure_coord, padding=0.0)
def calculate_bounding_box(structure_coord, padding=0.0):
    """
    Calculate the bounding box for a set of 3D coordinates.

    This function computes the minimum and maximum coordinates, dimensions, and center
    of a bounding box that encloses all points in the given coordinate set.

    Args:
        structure_coord (numpy.ndarray): Array of coordinates with shape (N, 3)
            where N is the number of points.
        padding (float, optional): Additional padding to add around the bounding box.
            Defaults to 0.0.

    Returns:
        tuple: A tuple containing:
            - min_coords (numpy.ndarray): Minimum coordinates of the bounding box (x,y,z)
            - max_coords (numpy.ndarray): Maximum coordinates of the bounding box (x,y,z)
            - dimensions (numpy.ndarray): Dimensions of the bounding box (width,height,depth)
            - center (numpy.ndarray): Center coordinates of the bounding box (x,y,z)
    """
    min_coords = np.min(structure_coord, axis=0) - padding
    max_coords = np.max(structure_coord, axis=0) + padding
    dimensions = max_coords - min_coords
    center = 0.5 * (max_coords + min_coords)
    return min_coords, max_coords, dimensions, center
Calculate the bounding box for a set of 3D coordinates.
This function computes the minimum and maximum coordinates, dimensions, and center of a bounding box that encloses all points in the given coordinate set.
Args
structure_coord
:numpy.ndarray
- Array of coordinates with shape (N, 3) where N is the number of points.
padding
:float, optional
- Additional padding to add around the bounding box. Defaults to 0.0.
Returns
tuple
- A tuple containing:
  - min_coords (numpy.ndarray): Minimum coordinates of the bounding box (x,y,z)
  - max_coords (numpy.ndarray): Maximum coordinates of the bounding box (x,y,z)
  - dimensions (numpy.ndarray): Dimensions of the bounding box (width,height,depth)
  - center (numpy.ndarray): Center coordinates of the bounding box (x,y,z)
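A small worked example may make the four return values concrete. The sketch below repeats the arithmetic of the function on four made-up points; the coordinate values and the padding of 1.0 are illustrative assumptions, not data from the library.

```python
import numpy as np

# Four hypothetical atom positions, shape (N, 3)
coords = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 1.0, 0.0],
    [1.0, 3.0, 4.0],
    [-1.0, 2.0, 1.0],
])
padding = 1.0

# Per-axis extremes, pushed outwards by the padding on both sides
min_coords = coords.min(axis=0) - padding   # [-2., -1., -1.]
max_coords = coords.max(axis=0) + padding   # [ 3.,  4.,  5.]
dimensions = max_coords - min_coords        # [ 5.,  5.,  6.]
center = 0.5 * (max_coords + min_coords)    # [0.5, 1.5, 2. ]
```

Because the padding is applied to both the minimum and the maximum corner, each dimension grows by twice the padding. The center is the midpoint of the box, which in general differs from the centroid of the points.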
def calculate_fixed_bounding_box(structure_coord, box_size=24.0)
def calculate_fixed_bounding_box(structure_coord, box_size=24.0):
    """
    Calculate a fixed-size bounding box centered on a molecular structure.

    Args:
        structure_coord (numpy.ndarray): Array of shape (N, 3) containing 3D coordinates
            of the structure points
        box_size (float, optional): Length of each side of the cubic bounding box.
            Defaults to 24.0.

    Returns:
        tuple: A tuple containing:
            - min_coords (numpy.ndarray): Array of shape (3,) containing minimum x, y, z
              coordinates of the box
            - max_coords (numpy.ndarray): Array of shape (3,) containing maximum x, y, z
              coordinates of the box
            - dimensions (numpy.ndarray): Array of shape (3,) containing the dimensions
              of the box (equal in all directions)
            - center (numpy.ndarray): Array of shape (3,) containing the coordinates
              of the box center (structure centroid)
    """
    # Calculate the centroid of the structure
    center = np.mean(structure_coord, axis=0)

    # Calculate half the box size
    half_size = box_size / 2.0

    # Define min and max coordinates based on the center and half_size
    min_coords = center - half_size
    max_coords = center + half_size

    # Dimensions of the box
    dimensions = max_coords - min_coords

    return min_coords, max_coords, dimensions, center
Calculate a fixed-size bounding box centered on a molecular structure.
Args
structure_coord
:numpy.ndarray
- Array of shape (N, 3) containing 3D coordinates of the structure points
box_size
:float, optional
- Length of each side of the cubic bounding box. Defaults to 24.0.
Returns
tuple
- A tuple containing:
  - min_coords (numpy.ndarray): Array of shape (3,) containing minimum x, y, z coordinates of the box
  - max_coords (numpy.ndarray): Array of shape (3,) containing maximum x, y, z coordinates of the box
  - dimensions (numpy.ndarray): Array of shape (3,) containing the dimensions of the box (equal in all directions)
  - center (numpy.ndarray): Array of shape (3,) containing the coordinates of the box center (structure centroid)
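The fixed-size box depends only on the centroid and the side length, so it is not guaranteed to enclose the structure. The sketch below illustrates this with a hypothetical helper pair (`fixed_box` and `encloses` are names made up for this example) and four made-up points.

```python
import numpy as np

def fixed_box(coords, box_size):
    """Cube of side box_size centered on the centroid of coords."""
    center = coords.mean(axis=0)
    half_size = box_size / 2.0
    return center - half_size, center + half_size, center

def encloses(coords, min_coords, max_coords):
    """True if every point lies inside the box (faces included)."""
    return bool(np.all(coords >= min_coords) and np.all(coords <= max_coords))

coords = np.array([
    [0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0],
    [4.0, 6.0, 0.0],
    [0.0, 6.0, 8.0],
])

# Default-sized cube: centroid is [2, 3, 2], so the box spans 12 units each way
min_coords, max_coords, center = fixed_box(coords, 24.0)
dimensions = max_coords - min_coords
encloses_default = encloses(coords, min_coords, max_coords)

# A cube of side 4 around the same centroid leaves some points outside
small_min, small_max, _ = fixed_box(coords, 4.0)
encloses_small = encloses(coords, small_min, small_max)
```

If the box must contain every atom, a check like `encloses` (or the padded box from `calculate_bounding_box`) is one way to confirm it before using the box downstream.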
def create_bounding_box(ligand, padding=0.0, output_file=None, around_ligand=False, box_size=20.0)
def create_bounding_box(ligand, padding=0.0, output_file=None, around_ligand=False, box_size=20.0):
    """
    Creates a bounding box around a molecular structure with specified parameters.

    Args:
        ligand (Structure): A molecular structure object containing 3D coordinates
        padding (float, optional): Additional space to add around the ligand's dimensions.
            Defaults to 0.0.
        output_file (str, optional): Path to save the bounding box as a structure file.
            Defaults to None.
        around_ligand (bool, optional): If True, creates a box that fits around ligand
            with padding. If False, creates a fixed size box. Defaults to False.
        box_size (float, optional): Size of the fixed bounding box when around_ligand
            is False. Defaults to 20.0.

    Returns:
        dict: A dictionary containing:
            - min_coords (numpy.ndarray): Minimum x,y,z coordinates
            - max_coords (numpy.ndarray): Maximum x,y,z coordinates
            - dimensions (numpy.ndarray): Box dimensions
            - center (numpy.ndarray): Box center coordinates
            - atom_array (AtomArray): Structure array of box vertices
              (only if output_file is specified)
    """
    structure_coord = ligand.coordinates

    if around_ligand:
        min_coords, max_coords, dimensions, center = calculate_bounding_box(structure_coord, padding)
    else:
        min_coords, max_coords, dimensions, center = calculate_fixed_bounding_box(structure_coord, box_size)

    result = {
        'min_coords': min_coords,
        'max_coords': max_coords,
        'dimensions': dimensions,
        'center': center
    }

    if output_file:
        atom_array = create_bounding_box_atoms(min_coords, max_coords, dimensions)
        struc.io.save_structure(output_file, atom_array)
        result['atom_array'] = atom_array
        print(f"Bounding box atoms saved to {output_file}")

    return result
Creates a bounding box around a molecular structure with specified parameters.
Args
ligand
:Structure
- A molecular structure object containing 3D coordinates
padding
:float, optional
- Additional space to add around the ligand's dimensions. Defaults to 0.0.
output_file
:str, optional
- Path to save the bounding box as a structure file. Defaults to None.
around_ligand
:bool, optional
- If True, creates a box that fits around ligand with padding. If False, creates a fixed size box. Defaults to False.
box_size
:float, optional
- Size of the fixed bounding box when around_ligand is False. Defaults to 20.0.
Returns
dict
- A dictionary containing:
  - min_coords (numpy.ndarray): Minimum x,y,z coordinates
  - max_coords (numpy.ndarray): Maximum x,y,z coordinates
  - dimensions (numpy.ndarray): Box dimensions
  - center (numpy.ndarray): Box center coordinates
  - atom_array (AtomArray): Structure array of box vertices (only if output_file is specified)
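The two modes can be compared side by side with a stand-in object. The sketch below is a hypothetical restatement of the mode selection only: `sketch_bounding_box` is not part of the module, file output is left out, and the ligand is mimicked by a `SimpleNamespace` with a `coordinates` attribute, which is the one attribute the function reads.

```python
from types import SimpleNamespace

import numpy as np

def sketch_bounding_box(ligand, padding=0.0, around_ligand=False, box_size=20.0):
    """Hypothetical restatement of the mode selection, without file output."""
    coords = np.asarray(ligand.coordinates, dtype=float)
    if around_ligand:
        # Tight box: per-axis extremes of the coordinates, grown by padding
        min_coords = coords.min(axis=0) - padding
        max_coords = coords.max(axis=0) + padding
    else:
        # Fixed cube centered on the centroid; padding is not used in this mode
        centroid = coords.mean(axis=0)
        min_coords = centroid - box_size / 2.0
        max_coords = centroid + box_size / 2.0
    return {
        "min_coords": min_coords,
        "max_coords": max_coords,
        "dimensions": max_coords - min_coords,
        "center": 0.5 * (max_coords + min_coords),
    }

# Stand-in for a ligand: any object with a `coordinates` attribute
ligand = SimpleNamespace(coordinates=[[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]])

tight = sketch_bounding_box(ligand, padding=2.0, around_ligand=True)
fixed = sketch_bounding_box(ligand, padding=2.0)  # default mode: fixed cube
```

Two points follow from the signature and source: the default mode is the fixed cube (`around_ligand=False`), and `padding` only has an effect when `around_ligand=True`.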
def create_bounding_box_atoms(min_coords, max_coords, dimensions)
def create_bounding_box_atoms(min_coords, max_coords, dimensions):
    """
    Create a bounding box represented by atoms at its corners.

    This function creates a box by placing atoms (XE elements) at strategic corner positions.
    The box is defined by its minimum and maximum coordinates along with its dimensions.

    Args:
        min_coords: List or array of 3 coordinates [x,y,z] representing the minimum corner of the box
        max_coords: List or array of 3 coordinates [x,y,z] representing the maximum corner of the box
        dimensions: List or array of 3 values [dx,dy,dz] representing the dimensions of the box

    Returns:
        struc.AtomArray: An array of 8 atoms representing the corners of the bounding box.
            Each atom has the following properties:
            - chain_id: "A"
            - res_id: 1
            - res_name: "BOX"
            - atom_name: "XE1" through "XE8"
            - element: "XE"

    Note:
        The atoms are placed at the eight corners of the box, with XE1 at min_coords
        and XE5 at max_coords. The remaining atoms are placed at intermediate corners
        formed by adding or subtracting the dimensions from these points.
    """
    atoms = []
    atom_index = 1

    # Create an atom at the minimum corner
    atom = struc.Atom(min_coords, chain_id="A", res_id=1, res_name="BOX", atom_name=f"XE{atom_index}", element="XE")
    atoms.append(atom)
    atom_index += 1

    # Create atoms at the corners formed by adding dimensions to min_coords
    for i in range(3):
        coord = min_coords.copy()
        coord[i] += dimensions[i]
        atom = struc.Atom(coord, chain_id="A", res_id=1, res_name="BOX", atom_name=f"XE{atom_index}", element="XE")
        atoms.append(atom)
        atom_index += 1

    # Create an atom at the maximum corner
    atom = struc.Atom(max_coords, chain_id="A", res_id=1, res_name="BOX", atom_name=f"XE{atom_index}", element="XE")
    atoms.append(atom)
    atom_index += 1

    # Create atoms at the corners formed by subtracting dimensions from max_coords
    for i in range(3):
        coord = max_coords.copy()
        coord[i] -= dimensions[i]
        atom = struc.Atom(coord, chain_id="A", res_id=1, res_name="BOX", atom_name=f"XE{atom_index}", element="XE")
        atoms.append(atom)
        atom_index += 1

    # Convert the list of atoms to an AtomArray
    atom_array = struc.array(atoms)
    return atom_array
Create a bounding box represented by atoms at its corners.
This function creates a box by placing atoms (XE elements) at strategic corner positions. The box is defined by its minimum and maximum coordinates along with its dimensions.
Args
min_coords
- List or array of 3 coordinates [x,y,z] representing the minimum corner of the box
max_coords
- List or array of 3 coordinates [x,y,z] representing the maximum corner of the box
dimensions
- List or array of 3 values [dx,dy,dz] representing the dimensions of the box
Returns
struc.AtomArray
- An array of 8 atoms representing the corners of the bounding box. Each atom has the following properties:
  - chain_id: "A"
  - res_id: 1
  - res_name: "BOX"
  - atom_name: "XE1" through "XE8"
  - element: "XE"
Note
The atoms are placed at the eight corners of the box, with XE1 at min_coords and XE5 at max_coords. The remaining atoms are placed at intermediate corners formed by adding or subtracting the dimensions from these points.
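The corner ordering described in the note can be reproduced with plain NumPy. The sketch below leaves out the atom objects and only computes the eight positions; `box_corners` is a hypothetical helper name, and the example box is made up.

```python
import numpy as np

def box_corners(min_coords, max_coords, dimensions):
    """Return the 8 corner positions in the XE1..XE8 order."""
    min_coords = np.asarray(min_coords, dtype=float)
    max_coords = np.asarray(max_coords, dtype=float)
    corners = [min_coords.copy()]          # XE1: minimum corner
    for i in range(3):                     # XE2..XE4: step along one axis from min
        corner = min_coords.copy()
        corner[i] += dimensions[i]
        corners.append(corner)
    corners.append(max_coords.copy())      # XE5: maximum corner
    for i in range(3):                     # XE6..XE8: step back along one axis from max
        corner = max_coords.copy()
        corner[i] -= dimensions[i]
        corners.append(corner)
    return np.array(corners)

corners = box_corners([0, 0, 0], [1, 2, 3], [1, 2, 3])
n_unique = len(np.unique(corners, axis=0))
```

The eight positions coincide with the true box corners only when `dimensions` equals `max_coords - min_coords`; passing inconsistent values would place atoms off the box.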
def save_bounding_box(center, box_size, output_file)
def save_bounding_box(center, box_size, output_file):
    """
    Save a bounding box structure to a file.

    Args:
        center (array-like): The (x, y, z) coordinates of the box center.
        box_size (array-like): The dimensions (width, height, depth) of the box.
        output_file (str): The path where the structure file will be saved.

    Returns:
        None: The function saves the structure to a file but does not return any value.

    Notes:
        The function uses the `calculate_box_min_max` to determine box boundaries and
        `create_bounding_box_atoms` to generate the atomic structure representation.
        The output is saved using the biotite structure module.
    """
    min_coords, max_coords = calculate_box_min_max(center, box_size)
    atom_array = create_bounding_box_atoms(min_coords, max_coords, box_size)
    struc.io.save_structure(output_file, atom_array)
Save a bounding box structure to a file.
Args
center
:array-like
- The (x, y, z) coordinates of the box center.
box_size
:array-like
- The dimensions (width, height, depth) of the box.
output_file
:str
- The path where the structure file will be saved.
Returns
None
- The function saves the structure to a file but does not return any value.
Notes
The function uses calculate_box_min_max to determine the box boundaries and create_bounding_box_atoms() to generate the atomic structure representation. The output is saved using the biotite structure module.
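The source of `calculate_box_min_max` is not shown on this page. A plausible reading, given that it receives a center and per-axis sizes, is that it offsets the center by half the size in each direction. The sketch below is written under that assumption; the function name with the `_sketch` suffix and the example values are hypothetical.

```python
import numpy as np

def calculate_box_min_max_sketch(center, box_size):
    """Assumed behavior only: the real helper is not shown in this module."""
    center = np.asarray(center, dtype=float)
    half = np.asarray(box_size, dtype=float) / 2.0
    return center - half, center + half

box_size = np.array([10.0, 20.0, 30.0])
min_coords, max_coords = calculate_box_min_max_sketch([1.0, 2.0, 3.0], box_size)
```

Under this assumption `max_coords - min_coords` equals `box_size`, which is what makes it valid to pass `box_size` as the `dimensions` argument of `create_bounding_box_atoms`.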
Classes
class StructureAligner
class StructureAligner:
    """A class for aligning and transforming structural coordinates using Principal Component Analysis (PCA).

    This class provides functionality to:
    1. Calculate PCA components from input coordinates
    2. Align structures using calculated PCA components
    3. Restore structures from PCA-transformed coordinates
    4. Track PCA fitting state

    The aligner ensures right-handed coordinate systems and provides error handling for common edge cases.

    pca (Optional[PCA]): Principal Component Analysis object from scikit-learn.
    _components_fixed (bool): Flag indicating whether PCA components are fixed.

    >>> aligner = StructureAligner()
    >>> aligner.calculate_pca(initial_coords)
    >>> aligned_coords = aligner.align_structure(target_coords)
    >>> restored_coords = aligner.restore_structure(aligned_coords)

    - The PCA model must be fitted using calculate_pca() before alignment operations
    - All coordinate arrays should be compatible with scikit-learn's PCA implementation
    """

    def __init__(self):
        """Initialize the class with PCA components.

        Attributes:
            pca (Optional[PCA]): Principal Component Analysis object from scikit-learn,
                initially set to None.
            _components_fixed (bool): Flag indicating whether PCA components are fixed,
                initially set to False.
        """
        self.pca: Optional[PCA] = None
        self._components_fixed: bool = False

    def calculate_pca(self, coords: np.ndarray) -> None:
        """
        Calculates the Principal Component Analysis (PCA) for given coordinates.

        This method performs PCA on the input coordinates to find the principal axes
        of variation. It ensures right-handed coordinate system by checking and
        potentially flipping the third component's direction.

        Args:
            coords (np.ndarray): Input coordinates array for PCA calculation.

        Raises:
            ValueError: If coords is None.
            Exception: If PCA calculation fails for any other reason.

        Notes:
            - Sets self.pca with the calculated PCA object
            - Stores the components ensuring a right-handed coordinate system
            - Sets self._components_fixed flag to True upon successful calculation

        Example:
            >>> instance.calculate_pca(coordinates_array)
        """
        try:
            if coords is None:
                raise ValueError("Coordinates are None.")

            self.pca = PCA(n_components=3)
            self.pca.fit(coords)

            components = self.pca.components_
            if np.dot(np.cross(components[0], components[1]), components[2]) < 0:
                self.pca.components_ *= np.array([1, 1, -1])

            self._components_fixed = True
            DEFAULT_LOGGER.log_info("PCA components calculated and stored.")

        except Exception as e:
            DEFAULT_LOGGER.log_error(f"PCA calculation failed: {str(e)}")
            raise

    @property
    def is_fitted(self) -> bool:
        """
        Check if the PCA transformer has been fitted with data.

        Returns:
            bool: True if both PCA transformation is initialized and components are fixed,
                False otherwise.
        """
        return self.pca is not None and self._components_fixed

    def align_structure(self, coords: np.ndarray) -> np.ndarray:
        """
        Aligns structural coordinates using pre-calculated PCA components.

        Args:
            coords (np.ndarray): Input coordinates to be aligned. Should have the same
                number of features as the data used to fit the PCA model.

        Returns:
            np.ndarray: The transformed coordinates in the principal component space.

        Raises:
            ValueError: If PCA components haven't been calculated (is_fitted=False)
                or if input coordinates are None
            Exception: If alignment process fails for any other reason

        Notes:
            The PCA model must be fitted by calling calculate_pca() before using this method.
        """
        if not self.is_fitted:
            raise ValueError("PCA components haven't been calculated. Call calculate_pca first.")

        try:
            if coords is None:
                raise ValueError("Coordinates are None.")

            # Transform coordinates
            aligned_coords = self.pca.transform(coords)
            DEFAULT_LOGGER.log_info("Coordinates aligned using PCA.")
            return aligned_coords

        except Exception as e:
            DEFAULT_LOGGER.log_error(f"Alignment failed: {str(e)}")
            raise

    def restore_structure(self, coords: np.ndarray) -> np.ndarray:
        """
        Restores the original structural coordinates from PCA-transformed coordinates.

        This method performs the inverse PCA transformation to convert coordinates from
        the reduced dimensionality PCA space back to their original structural representation.

        Args:
            coords (np.ndarray): The coordinates in PCA space to be restored to
                original structure space.

        Returns:
            np.ndarray: The restored coordinates in the original structural space.

        Raises:
            ValueError: If PCA components haven't been calculated (is_fitted=False)
                or if coords is None.
            Exception: If restoration process fails for any other reason.

        Note:
            The method requires that calculate_pca() has been called previously
            to fit the PCA model.
        """
        if not self.is_fitted:
            raise ValueError("PCA components haven't been calculated. Call calculate_pca first.")

        try:
            if coords is None:
                raise ValueError("Coordinates are None.")

            # Inverse transform coordinates
            restored_coords = self.pca.inverse_transform(coords)
            DEFAULT_LOGGER.log_info("Coordinates restored from PCA space.")
            return restored_coords

        except Exception as e:
            DEFAULT_LOGGER.log_error(f"Restoration failed: {str(e)}")
            raise
A class for aligning and transforming structural coordinates using Principal Component Analysis (PCA).
This class provides functionality to:
1. Calculate PCA components from input coordinates
2. Align structures using calculated PCA components
3. Restore structures from PCA-transformed coordinates
4. Track PCA fitting state
The aligner ensures right-handed coordinate systems and provides error handling for common edge cases.
pca (Optional[PCA]): Principal Component Analysis object from scikit-learn.
_components_fixed (bool): Flag indicating whether PCA components are fixed.
>>> aligner = StructureAligner()
>>> aligner.calculate_pca(initial_coords)
>>> aligned_coords = aligner.align_structure(target_coords)
>>> restored_coords = aligner.restore_structure(aligned_coords)
- The PCA model must be fitted using calculate_pca() before alignment operations
- All coordinate arrays should be compatible with scikit-learn's PCA implementation
Initialize the class with PCA components.
Attributes
pca
:Optional[PCA]
- Principal Component Analysis object from scikit-learn, initially set to None.
_components_fixed
:bool
- Flag indicating whether PCA components are fixed, initially set to False.
Instance variables
prop is_fitted : bool
Check if the PCA transformer has been fitted with data.
Returns
bool
- True if the PCA object has been initialized and its components are fixed, False otherwise.
Methods
def align_structure(self, coords: numpy.ndarray) ‑> numpy.ndarray
Aligns structural coordinates using pre-calculated PCA components.
Args
coords
:np.ndarray
- Input coordinates to be aligned. Should have the same number of features as the data used to fit the PCA model.
Returns
np.ndarray
- The transformed coordinates in the principal component space.
Raises
ValueError
- If PCA components haven't been calculated (is_fitted=False) or if input coordinates are None
Exception
- If alignment process fails for any other reason
Notes
The PCA model must be fitted by calling calculate_pca() before using this method.
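For intuition about what the alignment does numerically, the sketch below projects centered coordinates onto principal axes obtained with NumPy's SVD. This is a stand-in for scikit-learn's `PCA.transform`, not the class's own code, and the random point cloud is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical elongated point cloud: spread 5, 2 and 0.5 along x, y, z
coords = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])

# "Fit": the mean and the principal axes of the centered data
mean = coords.mean(axis=0)
# Rows of vt are the principal axes, ordered by decreasing singular value
_, _, vt = np.linalg.svd(coords - mean, full_matrices=False)

# "Align": express the centered points in the principal-axis frame
aligned = (coords - mean) @ vt.T
variances = aligned.var(axis=0)
```

After the projection the cloud is centered on the origin and its spread is largest along the first axis and smallest along the third. In the class, the mean and axes come from the coordinates passed to `calculate_pca()`, and `align_structure()` applies them to whatever coordinates it receives.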
def calculate_pca(self, coords: numpy.ndarray) ‑> None
Calculates the Principal Component Analysis (PCA) for given coordinates.
This method performs PCA on the input coordinates to find the principal axes of variation. It ensures right-handed coordinate system by checking and potentially flipping the third component's direction.
Args
coords
:np.ndarray
- Input coordinates array for PCA calculation.
Raises
ValueError
- If coords is None.
Exception
- If PCA calculation fails for any other reason.
Notes
- Sets self.pca with the calculated PCA object
- Stores the components ensuring a right-handed coordinate system
- Sets self._components_fixed flag to True upon successful calculation
Example
>>> instance.calculate_pca(coordinates_array)
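The right-handedness test is a scalar triple product: for row vectors a, b, c, the frame is right-handed when (a x b) . c is positive. The sketch below shows the check and one way to repair a left-handed frame, by reversing the third axis. `make_right_handed` is a hypothetical helper written for this example.

```python
import numpy as np

def make_right_handed(components):
    """Return a copy of a (3, 3) array of row vectors forming a right-handed frame."""
    components = np.array(components, dtype=float)
    triple = np.dot(np.cross(components[0], components[1]), components[2])
    if triple < 0:
        components[2] *= -1.0   # reverse the direction of the third axis only
    return components

left_handed = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, -1.0],
])
fixed = make_right_handed(left_handed)
determinant = np.linalg.det(fixed)
```

One detail worth keeping in mind when reading code of this kind: multiplying a (3, 3) array of row vectors by `np.array([1, 1, -1])` broadcasts across columns, so it negates the last entry of every row, whereas `components[2] *= -1` negates only the third row. Both change the sign of the determinant, but only the row-wise form leaves the first two axes untouched.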
def restore_structure(self, coords: numpy.ndarray) ‑> numpy.ndarray
Restores the original structural coordinates from PCA-transformed coordinates.
This method applies the inverse PCA transformation, converting coordinates from PCA space back to their original structural representation. Because calculate_pca() fits three components, no dimensions are dropped for 3D input.
Args
coords
:np.ndarray
- The coordinates in PCA space to be restored to original structure space.
Returns
np.ndarray
- The restored coordinates in the original structural space.
Raises
ValueError
- If PCA components haven't been calculated (is_fitted=False) or if coords is None.
Exception
- If restoration process fails for any other reason.
Note
The method requires that calculate_pca() has been called previously to fit the PCA model.
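The round trip between the two frames can be checked with a few lines of NumPy. As before, SVD is used here as a stand-in for scikit-learn's PCA, and the point cloud is made up; the sketch only illustrates the forward and inverse formulas, not the class's internals.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical off-center point cloud
coords = rng.normal(size=(50, 3)) * np.array([4.0, 2.0, 1.0]) + np.array([10.0, -5.0, 3.0])

mean = coords.mean(axis=0)
_, _, vt = np.linalg.svd(coords - mean, full_matrices=False)

aligned = (coords - mean) @ vt.T   # forward: into the principal-axis frame
restored = aligned @ vt + mean     # inverse: back to the original frame
max_error = np.abs(restored - coords).max()
```

With all three components kept, the axes form an orthogonal matrix, so the inverse recovers the input up to floating-point error. The same inverse can also be applied to coordinates that were edited while in the aligned frame, which maps the edits back into the original frame.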