Balto: AI-Powered Computational Drug Discovery & Molecular Modeling Platform
1. Overview
Balto is an AI-powered computational assistant developed by Deep Origin to accelerate drug discovery and democratize access to in silico molecular modeling tools. It provides a robust suite of functionalities, including:
- Molecular Docking: Simulate the interaction of ligands (small molecules) with protein targets
- Molecular Property Prediction: Predict key properties like lipophilicity, solubility, and other ADMET properties
- Protein-Ligand Interaction Analysis: Analyze the results of docking simulations, identifying key protein-ligand interactions across
- Database Integration: Access and query multiple public databases (PDB, ChEMBL, UniProt, PubChem, BindingDB, AlphaFold) and search scientific literature and patents
- Data Processing: Handle and analyze molecular datasets, including CSV files
- Visualization: Generate interactive 2D and 3D visualizations of molecules, proteins, and docking results
For updates and upcoming features, visit the Balto website.
2. Key Features
Balto offers a wide range of features to support drug discovery.
2.1 Molecular Data Retrieval & Informatics
Balto integrates with several key databases, allowing you to retrieve comprehensive molecular and protein data:
- Protein Data Bank (PDB): https://www.rcsb.org/ – Retrieve protein structural data
- ChEMBL: https://www.ebi.ac.uk/chembl/ – Access information on bioactive molecules, drug targets, and assay data
- UniProt: https://www.uniprot.org/ – Search protein sequences and annotations
- PubChem: https://pubchem.ncbi.nlm.nih.gov/ – Query information on small molecules and bioassay data
- BindingDB: https://bindingdb.org/rwd/bind/index.jsp – Retrieve protein-ligand affinity data
- Patent Check – Identify patented molecules by inputting a SMILES string or a list of SMILES strings to check for existing patents
- Scientific Literature Search – Automate retrieval of research publications
- Analyze PDF Documents – Extract valuable insights from scientific publications, including summarizing key findings, identifying important data points, and retrieving relevant sections
- Convert Between Molecular Identifiers – Facilitate conversion between different molecular representations such as SMILES strings, CAS numbers, and IUPAC names to ensure consistency across chemical databases
2.2 Molecular Property & ADMET Profiling
Balto provides in silico predictions for key molecular properties and toxicity assessments, helping you evaluate the drug-likeness of compounds:
- Aggregated Theoretical Descriptors:
- Synthetic Accessibility Score (SAS): Assess molecular complexity
- Quantitative Estimate of Drug-likeness (QED): Evaluate overall drug-likeness
- Physicochemical Properties:
- LogP: Estimate lipophilicity
- LogD: Calculate the pH-dependent distribution coefficient
- LogS: Predict aqueous solubility
- Toxicity Predictions:
- hERG Inhibition Prediction: Assess the risk of cardiotoxicity
- CYP450 Interaction Prediction: Analyze metabolic stability
- Ames Mutagenicity Estimation: Model the risk of carcinogenicity
2.3 Protein Structure Analysis & Ligand Preparation
Balto offers tools for manipulating protein structures and preparing ligands for docking:
- Protein Processing & Structural Manipulations:
- Binding Pocket Identification: Find potential binding sites on a protein
- Protein Structure Alignment: Align multiple protein structures
- Ligand Preparation:
- Protonation State Prediction: Determine the likely protonation state of a ligand at a given pH
- Functional Group Identification: Identify functional groups within a molecule
- Molecular Similarity Search: Find molecules similar to a query molecule
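To illustrate what a protonation state at a given pH means, here is a minimal sketch based on the textbook Henderson-Hasselbalch relation. It is not Balto's prediction method, which this documentation does not describe. The function name and the pKa values are assumptions chosen only for illustration.

```python
def fraction_protonated(pka: float, ph: float = 7.4) -> float:
    """Fraction of an ionizable group that is protonated at the given pH.

    Henderson-Hasselbalch: protonated / total = 1 / (1 + 10**(pH - pKa)),
    where pKa refers to the protonated (acid) form of the group.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa values, used only for illustration
examples = [("carboxylic acid", 4.2), ("aliphatic amine", 10.0)]
for name, pka in examples:
    frac = fraction_protonated(pka)
    print(f"{name}: pKa {pka}, {frac:.1%} protonated at pH 7.4")
```

With these assumed values, the acid is almost entirely deprotonated (negatively charged) at pH 7.4, while the amine is almost entirely protonated (positively charged).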
2.4 Molecular Docking
Balto supports ligand docking and subsequent analysis:
- Standard Docking: Perform molecular docking simulations
- Docking Analysis: Analyze the results of docking simulations, including binding energies and key protein-ligand interactions
- Docking Box Visualization: Visualize the defined docking box in 3D
- Ligand and Protein Preparation: Tools for preparing ligands and proteins for the docking process
2.5 Data Processing & Analysis
Balto automates data handling and analysis:
- Molecular Dataset Handling:
- CSV Data Extraction, Processing, and Querying:
- Data Extraction: Read and extract molecular data from CSV files containing SMILES strings, molecular properties, or assay results
- Data Processing: Perform cleaning, handle missing values, and transform the dataset into a suitable format for analysis
- Querying: Filter and query CSV data using SQL-like commands to extract specific rows, columns, and aggregated insights
- Batch Data Processing & Molecular Analysis Pipelines:
- Batch Processing: Handle multiple molecules simultaneously instead of one-by-one, improving efficiency for large datasets
- Molecular Analysis Pipelines: Automate workflows for molecular screening, such as batch docking, property prediction, and dataset filtering
- Batch Docking: Perform docking on multiple molecules at once, reducing computational time and increasing throughput for drug discovery applications
- Computational Workflow Automation:
- File Management and Processing: List available files in the session’s directory to identify accessible data, and retrieve and stream file content for analysis without separate downloads.
- Basic Retrosynthetic Engine: Predict basic synthetic routes for complex molecules
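The CSV extraction, cleaning, and SQL-like querying steps listed above can be pictured with the following sketch, which uses only Python's standard library. It does not show how Balto processes files internally. The table name, column names, and pIC50 values are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical dataset: column names and pIC50 values are made up for illustration
CSV_TEXT = """SMILES,name,logP,pIC50
CCO,ethanol,-0.31,
c1ccccc1,benzene,2.13,4.2
CC(=O)Oc1ccccc1C(=O)O,aspirin,1.19,5.1
"""


def load_csv_into_sqlite(csv_text: str) -> sqlite3.Connection:
    """Load CSV rows into an in-memory SQLite table named 'molecules'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecules (SMILES TEXT, name TEXT, logP REAL, pIC50 REAL)"
    )
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append((
            row["SMILES"],
            row["name"],
            float(row["logP"]),
            float(row["pIC50"]) if row["pIC50"] else None,  # missing value -> NULL
        ))
    conn.executemany("INSERT INTO molecules VALUES (?, ?, ?, ?)", rows)
    return conn


conn = load_csv_into_sqlite(CSV_TEXT)

# Filter: molecules with a measured pIC50 and logP below 2
hits = conn.execute(
    "SELECT name, SMILES FROM molecules "
    "WHERE pIC50 IS NOT NULL AND logP < 2 ORDER BY name"
).fetchall()

# Aggregate: number of rows and mean logP
n_rows, mean_logp = conn.execute(
    "SELECT COUNT(*), AVG(logP) FROM molecules"
).fetchone()

print(hits, n_rows, round(mean_logp, 2))
```

In Balto itself you would phrase such a request in natural language; the SQL here only makes the filter and aggregation explicit.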
2.6 Structural & Visualization Tools
Balto provides powerful visualization capabilities:
- Molecular Rendering & Visualization:
- SMILES Structure Drawing: Generate 2D representations of molecules from SMILES strings
- Protein Visualization: Create interactive 3D visualizations of protein structures
- Binding Pocket Visualization: Highlight and visualize identified binding pockets
- Docking Results Visualization: Display docking results interactively, showing ligand poses and interactions
- Structural Analysis:
- Calculate Structural RMSD: Calculate the Root Mean Square Deviation (RMSD) between structures
- Mutagenesis Analysis: Compare two protein structures to identify mutations and structural gaps
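For reference, RMSD between two structures with N matched atoms is the square root of the mean squared distance between corresponding atoms. The sketch below computes it for coordinates that are already paired and superposed. Whether Balto aligns structures before computing RMSD is not stated here, so the absence of a superposition step is an assumption of this example, and the coordinates are made up.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD between two equally sized, pre-matched coordinate sets.

    No superposition is performed: atoms are compared in place.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in angstroms: pose B is pose A shifted by 1 along z
pose_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(pose_a, pose_b))
```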
3. Getting Started
- Account Creation/Login: Sign up for an account here or log in to your existing account.
- Initial Interface: Upon logging in, you’ll be presented with the main Balto interface, which is organized into distinct panels (described below).
- Access Balto: If the interface above is not what you see, navigate to Balto within the Deep Origin platform from the left panel. You will find it listed among the available products.
4. User Interface (UI)
The Balto interface is designed for intuitive navigation and efficient workflow execution. It is divided into four main panels (see detailed descriptions in the following sections):
- Products and Account Settings panel
- Chat Space panel
- Balto Workspace panel
- Chat Configuration panel
4.1 Products and Account Settings Panel
The left panel provides access to various Deep Origin products and account settings. It is divided into three segments:
- Top Segment: Displays a list of all Deep Origin products that can be accessed through your account (e.g., Balto)
- Middle Segment: Shows your remaining credits for paid actions within Balto
- Bottom Segment: Provides links to:
- Account: Manage your account information (email, name, title, company, password). Clicking “Account” takes you to the account settings page
- Settings: Access subscription and billing details, manage credits, add team members, set payment methods, and more. (See Settings Menu for details.)
- Documentation: Direct access to this documentation
- Support: Go to the Deep Origin support portal to request features or report issues
- Balto Getting Started: Access introductory guides and tutorials
- Logout: Log out of your Balto account
You can collapse or expand this panel by clicking the double arrow (<<) next to your name.
4.2 Chat Space Panel
The middle panel is the primary interaction area where you communicate with Balto using natural language. It displays your current conversation with Balto. When you first log in, this area shows typical tasks that Balto can execute.
It consists of:
- Top Segment: The conversational area displaying your prompts and Balto’s responses.
- Middle Segment: Suggested actions that Balto can perform (e.g., Dock, Find, Model, Calculate, Visualize, Ask). Clicking an action provides example prompts.
The default Balto tasks will disappear once you start interacting with Balto.
- Bottom Segment: The input area where you type your requests. You can also upload files using the paperclip icon (📎).
4.3 Balto Workspace Panel
The right panel lets you access data produced by Balto. It includes intermediate files, visualizations, jobs, and more. It has four tabs:
- Models: Displays 3D visualizations generated by Balto (e.g., protein structures, ligand poses) that you can interact with (rotate, zoom, etc.).
- Files: Lists all files uploaded or generated during your session.
- Jobs: Shows the status of any background tasks (e.g., docking simulations) with progress updates and result links.
- Molecules: Lists all of the molecules gathered or uploaded during the session, displaying 2D structures, SMILES strings, and other properties.
You can extract a tab into a separate window by clicking the extract icon.
4.4 Chat Configuration Panel
The top panel provides options for managing your current Balto chat:
- Menu Button: Shows or hides the Balto chat menu
- Share Button: Generates a public link to share your current chat session with colleagues
- Settings Button: Opens a menu to configure auto-approval thresholds for paid Balto actions. Here you can:
- Enable/Disable Auto-Approval: Toggle automatic approval for Balto actions consuming credits
- Set Auto-Approved Credit Limit: Define the maximum credits that can be automatically used before manual approval is required
5. Pricing and Subscription
Balto operates on a subscription-based model. Under the Balto Basic Tier, subscribers pay a monthly fee of $32 and receive 32 credits, which equates to $32 worth of usage per month. In addition to the base subscription, simulations are billed on demand through a progressive pricing system, in which the cost per action decreases as the number of actions submitted together increases.
At present, Balto offers a single tier: the Basic Tier. For information on current and upcoming subscription options, please visit the Balto Home Page.
5.1 Progressive Pricing
While most Balto actions are available for free, certain simulations incur a cost. The following sections outline the pricing for specific actions. Balto’s progressive pricing model reduces the cost per action as you perform more actions simultaneously. For instance:
- Low Volume Usage: If you submit 5 molecules for docking, the cost is calculated as: 5 molecules × 1.0 credit = 5.0 credits.
- High Volume Usage: For 100 molecules submitted for docking, the pricing is tiered: 10 molecules at 1.0 credit each plus 90 molecules at 0.1 credit each, totaling 19.0 credits.
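The two examples above can be reproduced with a short calculation. This is a sketch that assumes the tiers apply within a single submitted job, as in the examples. The function names are hypothetical and are not part of any Balto API.

```python
def tiered_cost(n_actions: int, first_tier_size: int,
                first_rate: float, later_rate: float) -> float:
    """Credits for one job: the first tier at first_rate, the rest at later_rate."""
    if n_actions < 0:
        raise ValueError("n_actions must be non-negative")
    in_first_tier = min(n_actions, first_tier_size)
    in_later_tier = n_actions - in_first_tier
    # Round to hundredths of a credit to avoid floating-point noise
    return round(in_first_tier * first_rate + in_later_tier * later_rate, 2)


def docking_cost(n_molecules: int) -> float:
    """Docking rates from this section: 1.0 credit for the first 10, then 0.1."""
    return tiered_cost(n_molecules, first_tier_size=10,
                       first_rate=1.0, later_rate=0.1)


print(docking_cost(5))    # low-volume example
print(docking_cost(100))  # high-volume example
```

How credits are counted across several separate jobs is not covered by this sketch.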
Docking and Molecular Property Predictions
Simulation Type | Job Size | Cost per Action |
---|---|---|
Docking | 1–10 dockings/predictions | 1.00 credit each |
Docking | After the 10th docking/prediction | 0.10 credits each |
Molecular Properties* | 1–10 predictions | 0.50 credits each |
Molecular Properties* | After the 10th prediction | 0.05 credits each |
*Note: Molecular properties include predictions for aqueous solubility (LogS), lipophilicity (LogP or LogD), hERG blocker probability, Ames mutagenicity, and CYP interactions.
Pocket Finder
Each prediction of novel pockets costs 5 credits ($5).
PDF Data Extraction
Extraction Type | Document Size | Cost per Extraction |
---|---|---|
PDF Data Extraction | 1–5 pages | 1.00 credit each |
PDF Data Extraction | After the 5th page | 0.15 credits each |
5.2 Free Trial
New users can take advantage of a free trial that lasts for 1 month and provides 50 credits. This trial is designed to let you explore Balto’s capabilities. With the trial credits, you can perform actions such as:
- Docking up to 410 molecules
- Predicting up to 910 molecular properties
- Identifying novel pockets for up to 10 proteins
- Extracting data from up to 305 pages of PDF documents
In addition, users have access to numerous free actions, including pose overlays, protein visualizations, and integrated public database queries.
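The trial limits listed above follow from the pricing tables if the whole 50-credit balance is spent on a single job of one action type. The sketch below checks that arithmetic under this assumption. The function and dictionary names are hypothetical, and `Decimal` is used only to keep the credit arithmetic exact.

```python
from decimal import Decimal


def max_actions(budget: Decimal, first_tier_size: int,
                first_rate: Decimal, later_rate: Decimal) -> int:
    """Largest single-job size whose tiered cost fits within the budget."""
    first_tier_total = first_tier_size * first_rate
    if budget < first_tier_total:
        return int(budget // first_rate)
    return first_tier_size + int((budget - first_tier_total) // later_rate)


TRIAL_CREDITS = Decimal("50")

# (first-tier size, first-tier rate, later rate), copied from the pricing tables
PRICING = {
    "docking": (10, Decimal("1.00"), Decimal("0.10")),
    "molecular properties": (10, Decimal("0.50"), Decimal("0.05")),
    "PDF pages": (5, Decimal("1.00"), Decimal("0.15")),
}

limits = {name: max_actions(TRIAL_CREDITS, *tiers) for name, tiers in PRICING.items()}
pocket_runs = int(TRIAL_CREDITS // Decimal("5"))  # flat 5 credits per pocket prediction

print(limits, pocket_runs)
```

Splitting the same work across several smaller jobs would reach the lower rate less often, so the attainable counts would be smaller under this assumption.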
5.3 Subscription Management
Automatic Renewal
Subscriptions are set to renew automatically using the registered payment method. If no payment method is on file, you will be prompted to add one prior to the renewal date.
Changing Your Subscription
To modify your subscription, navigate to the Subscriptions & Plans section in the Settings panel and select the plan that best suits your needs.
Cancellation Process
Cancelling your subscription is straightforward:
- Navigate to the Subscriptions & Plans section in Settings.
- Click the Unsubscribe button.
If you experience any issues or the cancellation option is not visible, please contact our customer support for assistance.
Restoring Your Subscription
If you cancel your subscription, you can reactivate your account at any time to resume your work. Please note that if an account remains inactive for 90 days, its data may be subject to deletion.
5.4 Credits
Purchasing Additional Credits
If you require additional credits:
- Upgrade your plan from the Free Trial to Balto Basic by selecting the subscription plan and adding a payment method.
- Open the Credits tab in Settings.
- Select the Buy Credits button.
Purchase Orders (PO)
If you prefer to use a purchase order (PO) instead of a credit card, please contact our sales team for further assistance.
Credits: Paid vs. Discount
- Paid Credits: These are purchased directly or automatically allocated at subscription renewal.
- Discount Credits: These are provided during promotional periods or as part of the free trial.
6. Settings Menu
Clicking “Settings” in the left navigation panel takes you to your Deep Origin organization settings, where you can manage account details, subscriptions, team members, credits, and billing information.
The settings menu has the following tabs:
- Overview: The main page upon entering settings.
- Subscriptions: View your active subscriptions (e.g., “Balto - Enterprise plan”) with details such as seat count and renewal date, plus an option to upgrade
- Payment Methods: Lists your saved payment methods (e.g., credit cards), and allows you to add or update them
- Manage Resources: Direct links to manage credits
- Subscriptions & Plans: Detailed view of available plans and your current subscriptions
- Plans for Balto: Lists available plans such as “Professional,” “Enterprise plan,” and “Balto Basic Plan” with features and pricing
- Subscription History: A table displaying past and current subscription details
- Members: Manage members of your Deep Origin organization
- Invite Member: Button to invite new members
- Member List: Shows current members and roles (e.g., “Owner”) with options for management
- Credits: Overview of credit usage and balance.
- Buy Credits: Purchase additional credits
- Currently Available Credits: Snapshot of your credit balance
- Credit Breakdown: Detailed view of purchased credits, plan/trial credits, and pending usage.
- Monthly Usage: A chart and table showing monthly credit usage by service (e.g., “Docking - Tier 1”)
- Manage Resources/Billing/Support: Direct links to corresponding pages.
- Invoices: View your invoice history.
- Invoice History: A table with invoice dates, statuses (e.g., “Paid”), and totals. Click an invoice for details.
- Variables and Secrets: Manage environment variables and secrets securely for your workflows.
7. Support
For additional guidance, contact us through our Support Center for assistance.
8. Example 1: Dock a ligand to a specific protein structure
In this example we will dock a known small-molecule ligand to its target protein.
The target protein in this example is human soluble guanylate cyclase (SGC). SGC is one of the gasoreceptors responsible for vasodilation. The active form of the enzyme is a heterodimer consisting of alpha-1 and beta-1 subunits (UniProt entry names GCYA1_HUMAN and GCYB1_HUMAN, respectively).
We ask Balto to dock the ligand of interest, identified by its brand name, to a specific protein structure, identified by its PDB ID.
Balto will download the protein structure and associated information and ask you how to proceed with binding site determination. Choose one of the following options:
- A crystal ligand site (most popular)
- Find all possible binding sites with Deep Origin’s proprietary PocketFinder (most comprehensive)
- Enter amino-acid residue number(s) (e.g., 92, 145), the geometric center of which will be used for docking grid box placement
- Enter x, y, z coordinates (e.g., 1.8, 23.4, 36.2) for the docking grid box center
Note: PocketFinder costs 5 credits. Before choosing the PocketFinder option (the second option above), make sure the relevant protein chain(s) are selected for the pocket search. This requires an understanding of the drug discovery project and/or the biological unit of the target protein (e.g., monomer, homodimer, or heterodimer). If needed, Balto can separate chains before running PocketFinder. In this example, we keep both chains because we want to target the functional unit of the SGC heterodimer.
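For the residue-based option in the list above, the docking box center is the geometric center of the selected residues. The sketch below assumes this means the unweighted mean of all atom coordinates in those residues. Balto's exact definition (for example, all atoms versus alpha carbons only) is not stated here, and the coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def geometric_center(points: List[Point]) -> Point:
    """Unweighted mean position of a set of 3D points."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


# Hypothetical atom coordinates (angstroms), keyed by residue number
atoms_by_residue: Dict[int, List[Point]] = {
    92: [(1.0, 2.0, 3.0), (3.0, 2.0, 3.0)],
    145: [(2.0, 6.0, 5.0), (2.0, 6.0, 9.0)],
}

selected_residues = [92, 145]
box_center = geometric_center(
    [xyz for res in selected_residues for xyz in atoms_by_residue[res]]
)
print(box_center)
```

The resulting point plays the same role as the x, y, z coordinates you would enter manually with the coordinate-based option.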
If you select the most comprehensive option (find all binding sites with PocketFinder), Balto will ask for your permission to use credits and, if you approve, will find all possible binding sites and their associated pocket parameters, such as druggability score and volume. Balto’s output includes a visual representation of the pockets found in the protein and a summary table of pocket parameters.
You can adjust docking box size or proceed with default parameters using one or multiple identified pockets.
Balto automatically protonates the supplied ligand at pH 7.4 and performs docking. Docking scores appear in the chat space (left), while poses are displayed in the 3D viewer (right). Click a molecule in the viewer for a close-up of the binding site. You can download the pose or expand the view to full screen using the button in the top-right corner.
You can also export docking results in SDF format from the file system.
After the docking poses are available, you can execute post-docking analysis to visualize the key interactions between the ligand atoms and the binding site amino-acid residues using the integrated ProLIF method.
9. Example 2: Find a protein structure and dock a list of ligands
In this example we will select a specific protein structure and dock multiple ligands listed in a column named SMILES in a CSV file.
We will explore PI3K alpha (phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform). PIK3CA (UniProt entry name PK3CA_HUMAN) is an important protein in the PI3K/AKT/mTOR signaling pathway and is often targeted in oncology indications.
First, ask Balto to find relevant available structures of the PIK3CA protein. Balto will generate a sample CSV file listing 10 proteins that match the search criteria. Clicking on the file name opens a table with detailed information, including X-ray or cryo-EM resolution, protein residues, bound ligand, and its binding affinity.
You can refine the search parameters to generate a new CSV file or request a larger list of protein structures (more than 10).
You can manually review the list of proteins and select a structure, or ask Balto for recommendations.
Once you have selected the protein structure using the steps above, you can upload your list of ligands and ask Balto to proceed with docking.
Balto will ask you to confirm the docking site. You can proceed with the crystal ligand's binding site (default) or explore additional pockets using Deep Origin’s proprietary PocketFinder (more comprehensive option).
Once you confirm the binding site selection, Balto will automatically prepare the ligands for docking, including protonation state adjustments at pH 7.4. You will be asked to approve the use of credits before proceeding. If you approve, Balto will begin the docking process. For ligand lists with 11 or more molecules, the task is added to the Jobs queue and executed in the background.
Once docking is complete, you can visualize each docked pose and overlay it with the crystal ligand by asking Balto to display docked poses. Alternatively, click the “Show models” button to review each docked pose in full-screen mode.
You can also download the docking results as SDF and CSV files.
Finally, you can perform post-docking analysis to identify key protein-ligand interactions for the batch of ligands and cluster molecules by poses.
These examples demonstrate how Balto can perform complex tasks using plain English while leveraging its domain knowledge when needed. To explore additional tasks and workflows, simply ask Balto for guidance.
Happy drug hunting!