
scipion-chem
scipion-chem is the core plugin for Virtual Drug Screening (VDS) in the Scipion platform, alongside the rest of the scipion-chem-* plugins. It is designed to manage all the satellite plugins (Autodock, fpocket, …) and make them interoperable. It also includes several tools for:
Managing small molecules, protein structures or molecular dynamics simulations.
Extracting the most relevant results from protein pocket search and docking through consensus tools.
Visualizing the results of each of the VDS steps.
Filtering and operating on the different sets obtained at each step of the workflow.
To do so, scipion-chem automatically installs several third-party utility programs for managing and visualizing the tasks in a typical bioinformatics and VDS workflow. These include:
OpenBabel and RDKit: the main small molecule handlers and converters.
MGLTools: additional utils for small molecules, docking, … (includes AutoDockTools).
JChemPaint: Java program to manually draw small molecules.
PyMOL: main viewer of scipion-chem for small molecules and structures.
VMD: secondary viewer of scipion-chem for structures and molecular dynamics.
AliView: main viewer for sequences.
These programs are managed through conda environments, which also include several utility Python modules.
Installation
A) Requirements
You will need to use Scipion3 to install and run this plugin.
B) Installation steps
Install the stable version
Through the plugin manager GUI by launching Scipion and following Others >> Plugin Manager, or
scipion3 installp -p scipion-chem
Developer’s version
Download repository:
git clone https://github.com/scipion-chem/scipion-chem.git
Install:
scipion3 installp -p /path/to/scipion-chem --devel
Warning
Installation of the stable version through the Plugin Manager or with the command provided above is not available yet.
Please install the developer version until a stable version is released, or install the latest stable version with the following steps:
cd /path/to/scipion-chem && git checkout master && scipion3 installp -p . --devel
C) Packages & environments
Packages installed by this plugin can be located in /path/to/scipion/software/em/.
The following packages will be created:
rdkit-version
shape-it-version
mgltools-version
jchempaint-version
pymol-version
aliview-version
vmd-version
mdtraj-version
Where version is the current version of that specific package.
Warning
These environments are used by the code in the different plugins, and we strongly advise against modifying them manually so that they keep working.
TODO: COMPLETE THIS PART
The code inside this plugin also includes the Python objects, viewers, wizards and other utils for the rest of the scipion-chem plugins. We give these Python objects special importance, since the interoperability of scipion-chem relies on them.
Protocols
scipion-chem includes around 40 different protocols, divided into 4 groups according to their function:
Note
The user will notice that many protocols have a wand icon next to some of the parameters.
These buttons are called wizards, and they are designed to help the user run the protocol.
One of the most common types of wizard helps the user fill a parameter with the proper string.
We strongly recommend using the wizards to fill these parameters (for some protocols, it is even compulsory), since inappropriate parameter values might cause the protocol to fail.
A) General
It includes protocols for managing the objects or files generated by Scipion.
Convert structure: Converts the format of the files stored for a set of Small Molecules, an Atom Structure or a Molecular dynamics system.
Operate set: Includes several functionalities to modify any Scipion Set inside the project.
Add attribute: Allows the user to add an attribute to an item or set object inside Scipion.
Export CSV: Allows the user to export the SQLite table of a set as a CSV file, with one column per attribute and one row per item.
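As an illustration of the kind of conversion Export CSV performs, the following minimal sketch dumps a SQLite table to CSV using only the Python standard library. It is not the plugin's implementation: the function name table_to_csv, the table name molecules and its columns are hypothetical, and the real schema of a Scipion set is not assumed here.

```python
import csv
import io
import sqlite3


def table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header line first."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out)
    # One column per attribute: column names come from the cursor description.
    writer.writerow([col[0] for col in cursor.description])
    # One row per item.
    writer.writerows(cursor)


# Toy in-memory table standing in for a set's SQLite file (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecules (id INTEGER, name TEXT, score REAL)")
conn.executemany(
    "INSERT INTO molecules VALUES (?, ?, ?)",
    [(1, "ligA", -7.2), (2, "ligB", -6.5)],
)

buffer = io.StringIO()
table_to_csv(conn, "molecules", buffer)
csv_text = buffer.getvalue()
```

Writing to any file-like object keeps the sketch easy to test; passing an open file instead of the StringIO buffer would produce the CSV on disk.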
B) Database
It includes protocols related to the main databases for protein sequences, structures or small molecules.
Import database IDs: Imports a set of database IDs from a file and stores them as a Scipion object.
Identify ligands: Tries to identify a set of Small Molecules based on the SMILES string for each of them.
UniProt CrossRef: Searches in the UniProt cross reference database for related entries of a set of UniProt IDs for specified databases.
ZINC filter: Filters a SetOfSmallMolecules by the presence/absence of each of the molecules in the specified ZINC subset(s).
Fetch ligands: Extracts the ligands related to a SetOfDatabaseIDs.
C) Sequence
It incorporates protocols for managing biological sequences, including tools for defining sequence regions of interest.
Import set of sequences: Imports a set of sequences from one or several FASTA files or from a database like UniProt using a SetOfDatabaseIDs as input.
Pairwise Alignment: Performs a pairwise alignment of two input sequences using Clustal Omega.
Multiple Sequence Alignment: Performs a multiple sequence alignment (MSA) over a set of input sequences.
Define set of sequences: Allows the user to manually build a set of sequences from individual elements.
Import variants: Imports a set of sequence variants.
Generate variant sequences: Generates a set of sequences from a list of specified variants from a SequenceVariants object.
Import Sequence ROIs: Imports a SetOfSequenceROIs, meaning a set of Regions Of Interest (ROI) in a sequence.
Define Sequence ROIs: Defines a SetOfSequenceROIs from a Sequence or SequenceVariants object.
Operate Sequence ROIs: Allows the user to operate on sets of sequence ROIs, similarly to the Operate set protocol.
Extract Sequence ROIs: Defines a SetOfSequenceROIs from an input set of sequences based on the conservation of each position in the alignment.
Map Sequence ROIs: Maps a set of sequence ROIs to an atomic structure where the sequence can be mapped.
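To make the idea behind Extract Sequence ROIs more concrete, here is a small sketch of how regions can be derived from per-position conservation in an alignment. The conservation measure the protocol actually uses is not stated here, so this example assumes the simplest one (the frequency of the most common residue in each column); the function names, the threshold and the minimum length are all hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue at each column."""
    scores = []
    for column in zip(*alignment):
        residue, count = Counter(column).most_common(1)[0]
        # A column dominated by gaps is not counted as conserved.
        scores.append(0.0 if residue == "-" else count / len(column))
    return scores


def conserved_regions(scores, threshold=0.8, min_length=3):
    """Return 0-based, end-exclusive (start, end) runs with score >= threshold."""
    regions, start = [], None
    # The trailing sentinel closes a run that reaches the end of the alignment.
    for index, score in enumerate(scores + [-1.0]):
        if score >= threshold and start is None:
            start = index
        elif score < threshold and start is not None:
            if index - start >= min_length:
                regions.append((start, index))
            start = None
    return regions


# Toy alignment of three equally long sequences (made-up data).
alignment = [
    "MKTAYIAK",
    "MKTGYLAK",
    "MKTCYVAK",
]
rois = conserved_regions(column_conservation(alignment), threshold=1.0, min_length=2)
```

With these toy sequences, the fully conserved stretches at the start and end of the alignment are reported, while the isolated conserved column between them is discarded for being shorter than min_length.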
D) Virtual Drug Screening
Main group of protocols that incorporates most of the functionalities related to the VDS workflow.
Import Small Molecules: Imports a set of small molecules from one or several files or from default database libraries like ECBL or ZINC.
Extract Small Molecules: Extracts the small molecules present in an AtomStruct object.
Draw Small Molecules: Runs JChemPaint java program and allows the user to draw their own molecules.
OpenBabel Prepare Small Molecules: Prepares a SetOfSmallMolecules using OpenBabel.
RDKit Prepare Small Molecules: Prepares a SetOfSmallMolecules using RDKit.
Prepare Receptor: Provides a simple AtomStruct preparation with BioPython where the user can choose different cleaning options such as removing waters or heteroatoms, keeping only specific chains…
ADME Small Molecules filter: Uses RDKit to filter a SetOfSmallMolecules by applying the ADME (Absorption, Distribution, Metabolism, Excretion) filter to each of the small molecules stored.
PAINS Small Molecules filter: Uses RDKit to filter a SetOfSmallMolecules by applying the PAINS (Pan-assay interference compounds) filter to each of the small molecules stored.
Shape Small Molecules filter: Uses RDKit to filter a SetOfSmallMolecules by applying shape filters to each of the small molecules stored.
FingerPrint Small Molecules filter: Uses RDKit to filter a SetOfSmallMolecules by applying fingerprint filters to each of the small molecules stored.
Pharmacophore generation: Generates a Pharmacophore object that can be parsed by RDKit from a SetOfSmallMolecules.
Pharmacophore modification: Modifies the properties of the features inside a Pharmacophore object.
Pharmacophore filtering: Uses RDKit for filtering a SetOfSmallMolecules by matching them with a Pharmacophore.
Define Structural ROIs: Allows the user to manually define a SetOfStructROIs from AtomStruct objects.
Consensus Structural ROIs: Performs a consensus operation over several SetOfStructROIs, studying which of them are shared among all or a subset of the input sets.
Score docking positions: Allows the user to rescore a SetOfSmallMolecules docked to a receptor using several ODDT scoring functions.
RMSD docking: Allows the user to calculate the RMSD between a SetOfSmallMolecules and a reference molecule docked to the same receptor.
Consensus docking: Performs a consensus operation over several docked SetOfSmallMolecules, studying which positions are shared among all or a subset of the input sets.
SASA calculation: Uses BioPython to calculate the SASA (Solvent-Accessible Surface Area) for each residue in an AtomStruct.
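As a reference for what the RMSD docking protocol measures, the sketch below computes the root-mean-square deviation between two poses docked to the same receptor. It is a simplified illustration, not the plugin's code: it assumes both poses list the same atoms in the same order, performs no superposition and ignores molecular symmetry. The function name and the coordinates are made up for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) atom coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both poses need the same, non-zero number of atoms")
    # Sum of squared distances between matching atoms (math.dist needs Python 3.8+).
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy poses: the second one is the reference shifted 2 units along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
pose = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (1.5, 1.5, 2.0)]
value = rmsd(reference, pose)
```

No alignment step is applied because poses docked to the same receptor already share a coordinate frame, so the raw deviation reflects how far a pose sits from the reference within the binding site.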