Antibody development — end-to-end map (tools, outputs, parameters)

Self-contained HTML: diagrams are static scalable vector graphics (SVG) (no internet or JavaScript required).

Contents
  • Overall pipeline
  • Glossary of common terms
  • Phase 0: Problem framing
  • Phase 1: Epitope discovery
  • Phase 2: Paratope design
  • Phase 3: Developability and safety
  • Phase 4: Integrated analysis and learning

Overall pipeline

Diagram (nodes and edges)
Flow: Start → Phase 0 → Readiness gate → Phase 1 → Epitope gate → Phase 2 → Paratope gate → Phase 3 → Developability gate → Phase 4 → End
Nodes
  • Start: in silico antibody development pipeline
  • Phase 0 (P0), Problem framing: define target context, objectives, inputs, and success criteria
  • Readiness gate (Gate0): inputs, numbering, and schemas consistent?
  • Phase 1 (P1), Epitope discovery: predict, filter, and select epitopes (typical volume: hundreds → tens)
  • Epitope gate (Gate1): accessible, conserved, and functional?
  • Phase 2 (P2), Paratope design: design antibodies and score antibody–antigen complexes (typical volume: thousands → hundreds → tens)
  • Paratope gate (Gate2): binding plausible, diverse, low immunogenicity?
  • Phase 3 (P3), Developability and safety: parallel screens and liability fixing (typical volume: tens → few)
  • Developability gate (Gate3): pass thresholds and aggregation rules?
  • Phase 4 (P4), Integration and learning: multi-objective ranking, wet lab plan, validation, and updating
  • End: decision-ready candidates and updated priors
Feedback edges
  • Gate0 → P0 (revise); Gate1 → P0 (reframe) or P1 (refine); Gate2 → P2 (iterate design)
  • Gate3 → P2 (return to refinement) or P3 (fix and re-score)
  • P4 → P0 (update objectives) and P4 → P1 (update weights)

Glossary of common terms

Terms used in the map
  • three-dimensional (3D)
  • complementarity determining region (CDR)
  • variable heavy chain (VH)
  • variable light chain (VL)
  • Protein Data Bank format (PDB)
  • angstrom (Å)
  • carbon alpha (Cα)
  • molecular dynamics (MD)
  • root mean square fluctuation (RMSF)
  • predicted local distance difference test (pLDDT)
  • post-translational modifications (PTM)
  • solvent accessible surface area (SASA)
  • Basic Local Alignment Search Tool for proteins (BLASTp)
  • major histocompatibility complex (MHC)
  • surface plasmon resonance (SPR)
  • graph neural network (GNN)

Phase 0: Problem framing

Diagram (nodes and edges)
Flow: Start → target context → objectives → load inputs → schema conventions → data quality check → lab notebook → Phase 0 output
  • Start: Phase 0, Problem framing: establish target context and data structures
  • Target definition and biological context: isoform and construct choice; oligomeric state and binding partners; post-translational modifications (PTM) (for example glycosylation); native environment
  • Research objectives and constraints: mechanism of action, success criteria, safety constraints, format constraints
  • Load inputs: antigen sequences and structures; known complexes; off-target list
  • Numbering and schema conventions: residue mapping across structures; complementarity determining region (CDR) boundaries; canonical chain naming; consistent indices for downstream tools
  • Data quality check: assess structural uncertainty in functional regions. Parameters: confidence filters (for example predicted local distance difference test (pLDDT)); experimental confidence (for example B-factor)
  • Initialize lab notebook: define per-experiment records and per-candidate biography fields
  • Phase 0 output: well-defined goals, inputs, and bookkeeping ready for Phase 1
Scientist thinking (hypotheses and setup)
  • High-level hypothesis: Good outcomes depend on correct target context, well-defined objectives, and consistent data schemas; downstream results are only as reliable as the framing and inputs.
  • Target context: Record isoform and construct choice, oligomeric state and binding partners, post-translational modifications (PTM) (for example glycosylation), and native environment assumptions.
  • Schema conventions: Define residue numbering and complementarity determining region (CDR) boundary conventions so every tool uses compatible indices.
  • Bookkeeping: Decide what will be stored per candidate (scores, flags, complex models, paratope masks, and decisions) before running large-scale screening.
Candidate “biography” data structure (example)
candidate c:
    epitope_identifier: [E_i]
    origin_program: [Program 2A (assembly), Program 2B (generative)]
    variable_heavy_chain_sequence, variable_light_chain_sequence
    three_dimensional_complex_model
    paratope_protection_mask
    scores:
        binding: {per_tool_scores, ensemble_score}
        developability: {therapeutic_antibody_profiler, aggrescan3d, camsol, stability_proxy}
        safety: {cross_reactivity_docking, motif_mimicry}
    risk_flags: [String]
    probability_of_success: P(success | evidence)
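Rendered as code, the biography might look like the following Python dataclass; the field names and types are illustrative renderings of the schema above, not a prescribed format.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CandidateBiography:
    """Per-candidate record; field names and types are illustrative."""
    epitope_identifier: str
    origin_program: str                  # "2A (assembly)" or "2B (generative)"
    vh_sequence: str                     # variable heavy chain sequence
    vl_sequence: str                     # variable light chain sequence
    complex_model: str = ""              # path or identifier of the 3D complex model
    paratope_protection_mask: list = field(default_factory=list)  # interface residue indices
    binding_scores: dict = field(default_factory=dict)            # per-tool scores and ensemble score
    developability_scores: dict = field(default_factory=dict)     # TAP, Aggrescan3D, CamSol, stability proxy
    safety_scores: dict = field(default_factory=dict)             # cross-reactivity docking, motif mimicry
    risk_flags: list = field(default_factory=list)
    probability_of_success: Optional[float] = None  # P(success | evidence), filled in Phase 4

c = CandidateBiography("E_7", "2B (generative)", "EVQLV...", "DIQMT...")
c.risk_flags.append("deamidation_motif_in_CDR_H2")
```

Deciding these fields before large-scale screening (as Phase 0 recommends) means every later phase appends to the same record instead of inventing ad hoc score files.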

Phase 1: Epitope discovery

Diagram (nodes and edges)
Flow: Start → (BepiPred, Ellipro, Discotope in parallel) → candidate epitope pool → contextual filters → run MD? → (MD → re-score dominant states) → evidence integration → epitope gate → Phase 1 output
  • Start: Phase 1, Epitope discovery: interrogate the target and select epitopes
  • Experiment 1A-1 (BepiPred): permissive sequence-based epitope prediction. Parameter: epitope threshold
  • Experiment 1A-2 (Ellipro): permissive structure-based epitope clustering. Parameters: minimum score; maximum distance (angstrom (Å))
  • Experiment 1A-3 (Discotope): permissive structure-based per-residue epitope scoring. Parameter: optional structure prediction confidence filter
  • Candidate epitope pool (E_pool): collect patches predicted by multiple tools (typical volume: hundreds to thousands of patches)
  • Experiment 1B (contextual epitope filters, cheap first): filter and annotate epitopes in native context before expensive simulation: accessibility across conformations (solvent exposure); masking (glycan shielding and complex occlusion, if relevant); conservation and escape-risk screen; functional site annotation overlay (active sites, receptor interfaces); optional off-target surface-patch similarity screen. Typical result: shortlist of tens of epitopes. Parameters: accessibility threshold and masking rules; conservation cutoff and escape-risk weighting; off-target similarity threshold (optional)
  • Run molecular dynamics? Expensive step; run only for top N or when predictors disagree (yes → Experiment 1C; no → Experiment 1D)
  • Experiment 1C (molecular dynamics simulation): down-weight flexible or transiently exposed epitopes. Parameters: simulation length (approximately 100–500 nanoseconds); explicit solvent box
  • Re-score dominant conformational states: cluster the trajectory and re-run epitope scoring on dominant states. Parameter: number of clusters to re-score (for example top 3)
  • Experiment 1D (evidence integration and selection): rank epitopes using static consensus + contextual filters + dynamic stability + functional relevance. Parameter: weighting in combined epitope score
  • Epitope gate: meets accessibility, conservation, and functional criteria? On failure, expand the pool, add data, or adjust filters; on pass, emit output
  • Phase 1 output: selected epitopes with explicit binding hypotheses (typical: five to twenty epitopes)
Scientist thinking (hypotheses and experiments)
  • Epitope-level hypothesis: Epitopes that are exposed in the native context, conserved, and functionally relevant are more likely to yield effective binders and less likely to fail later.
  • Cheap first, expensive later: Run permissive static prediction and contextual filters on a broad pool, then run molecular dynamics only on a shortlist.
  • Experiment 1A: Permissive static epitope prediction using Experiment 1A-1 (BepiPred), Experiment 1A-2 (Ellipro), and Experiment 1A-3 (Discotope) to generate a broad candidate pool.
  • Experiment 1B: Contextual epitope filters: accessibility and masking, conservation and escape risk, functional site overlay, and optional off-target similarity screening.
  • Experiment 1C: Conditional molecular dynamics simulation for top candidates or when static predictors disagree.
  • Experiment 1D: Evidence integration and an explicit epitope gate; failures loop back to adjust filters or expand the pool.
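The weighting in the combined epitope score (Experiment 1D) can be sketched as a weighted sum with a hard accessibility constraint; the weights and threshold below are illustrative placeholders, not recommended values.

```python
def combined_epitope_score(e, w):
    """Combined epitope score: weighted sum of evidence channels.

    All scores are assumed normalized to [0, 1]; weights and the hard
    accessibility constraint are illustrative placeholders.
    """
    # Hard constraint from the epitope gate: a buried epitope fails outright.
    if e["accessibility"] < w["accessibility_threshold"]:
        return None
    return (w["static"] * e["static_consensus"]
            + w["accessibility"] * e["accessibility"]
            + w["conservation"] * e["conservation"]
            + w["functional"] * e["functional_relevance"]
            + w["dynamics"] * e["dynamic_stability"])

weights = {"accessibility_threshold": 0.25, "static": 0.3, "accessibility": 0.2,
           "conservation": 0.2, "functional": 0.2, "dynamics": 0.1}
epitope = {"static_consensus": 0.8, "accessibility": 0.6, "conservation": 0.9,
           "functional_relevance": 0.7, "dynamic_stability": 0.5}
score = combined_epitope_score(epitope, weights)
```

Returning None for a hard-constraint failure (rather than a low score) keeps the gate's non-negotiable criteria separate from the soft weighting, which matches the "hard constraints" row in the table below.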
Tools, inputs, outputs, and parameters
Each entry lists the tool or step, its role, its input and output, and its parameters and thresholds.
BepiPred
Epitope identification (sequence-based)
Input: Protein sequence
Output: Per-residue epitope probability score and predicted linear epitope segments
  • Epitope threshold: Higher threshold increases precision (typically fewer residues predicted)
Ellipro
Epitope identification (structure-based clustering)
Input: Protein three-dimensional (3D) structure
Output: Clusters of residues forming linear or conformational epitopes
  • Minimum score: Higher values are more stringent and return fewer epitopes
  • Maximum distance (angstrom (Å)): Clustering radius for discontinuous epitopes
Discotope
Epitope identification (structure-based scoring)
Input: Protein three-dimensional (3D) structure
Output: Per-residue epitope probability score
  • Structure confidence filter: Optionally ignore low-confidence regions
Accessibility and masking analysis (for example solvent accessible surface area (SASA) and glycan masking rules)
Contextual epitope filter
Input: Epitope patches on one or more conformational states; optional glycan model and complex partners
Output: Accessibility score per epitope; masking flags (glycan shielding, steric occlusion); exposure frequency across states
  • Accessibility threshold: Minimum solvent exposure required
  • Masking rules: What counts as occluded in the native context
  • State weighting: How to combine exposure over conformations
Conservation and escape-risk screen (multiple sequence alignment)
Contextual epitope filter
Input: Target sequence set (homologs, variants, or strains) and epitope residue indices
Output: Conservation score per epitope; predicted escape-risk annotation
  • Sequence set definition: Which homologs or variants to include
  • Conservation cutoff: Minimum acceptable conservation
  • Escape weighting: How strongly to penalize variable residues
Functional site annotation overlay
Contextual epitope filter
Input: Known or predicted functional residues (active sites, receptor-binding interfaces, allosteric sites)
Output: Functional relevance score per epitope and rationale notes
  • Functional distance threshold: How close an epitope must be to a functional region
  • Evidence weighting: How to treat curated versus predicted annotations
Off-target surface-patch similarity screen (optional)
Early cross-reactivity risk screen
Input: Epitope patch descriptors and an off-target surface library
Output: Similarity hits and a pre-design cross-reactivity risk flag
  • Similarity metric: Shape and physicochemical similarity definition
  • Hit threshold: Similarity cutoff for flagging
  • Library scope: Which off-targets to include
Molecular dynamics simulation (for example GROMACS, AMBER)
Dynamic interrogation (expensive; shortlist only)
Input: Static structure and a shortlist of epitopes or regions
Output: Trajectory; residue flexibility metrics (root mean square fluctuation (RMSF)); dominant conformational states
  • Simulation length: Approximately 100–500 nanoseconds (or longer if needed)
  • Solvent model: Explicit solvent box
  • Selection rule: Run only on top N epitopes or when predictors disagree
Evidence integration and epitope gate
Decision step
Input: All epitope scores and annotations (static, contextual, dynamic, functional)
Output: Ranked epitope list; explicit pass or fail decision with reason codes
  • Weighting scheme: Combine static consensus, accessibility, conservation, functional relevance, and dynamics
  • Hard constraints: Any non-negotiable criteria (for example must be exposed)
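The state weighting in the accessibility filter (combining exposure over conformations) can be sketched as an exposure frequency weighted by state populations; the relative solvent accessible surface area (SASA) cutoff of 0.25 is an illustrative assumption.

```python
def exposure_across_states(per_state_rel_sasa, state_populations, exposed_cutoff=0.25):
    """State-weighted exposure frequency for one epitope patch.

    per_state_rel_sasa: relative SASA of the patch in each dominant
    conformational state (for example from trajectory clustering).
    state_populations: fraction of the ensemble in each state.
    The 0.25 relative-SASA cutoff is an illustrative assumption.
    """
    total = sum(state_populations)
    exposed = sum(p for sasa, p in zip(per_state_rel_sasa, state_populations)
                  if sasa >= exposed_cutoff)
    return exposed / total

# A patch exposed in the two dominant states but buried in a minor one:
freq = exposure_across_states([0.40, 0.30, 0.10], [0.6, 0.3, 0.1])
```

A transiently exposed patch thus gets down-weighted in proportion to how much of the ensemble hides it, which is the intent of Experiment 1C feeding Experiment 1D.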

Phase 2: Paratope design

Diagram (nodes and edges)
Flow: Start → Program 2A (OptMAVEn-2.0 → RAbD) and Program 2B (generative design → diversity management) → cheap pre-filters → complex modeling → AbDesign refinement → humanization screen → paratope gate → Phase 2 output
  • Start: Phase 2, Paratope design: design antibodies for each selected epitope
  • Experiment 2A-1 (OptMAVEn-2.0): knowledge-based assembly of variable heavy chain and variable light chain modules. Parameters: epitope definition; allowable module libraries (germline choices); energy and score thresholds
  • Experiment 2A-2 (RosettaAntibodyDesign, RAbD): complementarity determining region grafting and redesign via Monte Carlo search. Parameters: which complementarity determining regions to design; interface physics options
  • Experiment 2B-1 (de novo generative design): generate novel complementarity determining region loops and sequences (typical volume: thousands per epitope). Parameters: number of generated candidates; plausibility and diversity filters
  • Experiment 2B-2 (diversity management): cluster by sequence and structure; keep distinct binding modes (typical result: hundreds of representatives). Parameters: clustering radius and diversity targets
  • Cheap pre-filters (before complex modeling): remove obvious problems before expensive refinement: internal clashes and geometry sanity checks; basic sequence liabilities and framework constraints (typical result: hundreds → tens)
  • Experiment 2C (complex modeling and interface scoring): generate antibody–antigen complexes and compute binding evidence: docking or complex prediction; interface plausibility checks (clashes, buried unsatisfied polar groups). Parameters: scoring function and clash thresholds; number of poses retained per candidate
  • Paratope definition artifact: store interface residues as a protection mask for Phase 3 optimization; the mask is updated after AbDesign refinement
  • Experiment 2D (AbDesign, constrained refinement): sharpen the interface and enforce key interactions (expensive). Parameters: docking constraints and key epitope residues; number of refinement iterations
  • Experiment 2E (humanization and immunogenicity screen, optional): confirm human germline proximity and reduce immune risk: germline similarity and humanization rules; major histocompatibility complex (MHC) class II peptide risk prediction. Parameters: maximum allowed non-germline changes; peptide-risk thresholds and allele set
  • Paratope gate: binding plausible, diverse, low immunogenicity? On failure, redesign (RAbD) or explore new modes (generative design); on pass, emit output
  • Phase 2 output: diverse panel of candidate antibodies with complex models (typical volume: tens per target)
Scientist thinking (hypotheses and programs)
  • Paratope-level hypothesis: For each selected epitope, designs with plausible antibody–antigen complexes and diverse binding modes are more likely to succeed than a large set of near-duplicates.
  • Cheap first, expensive later: Generate many candidates, cluster and pre-filter cheaply, then do complex modeling and constrained refinement on a shortlist.
  • Complex evidence is explicit: Binding scores come from explicit complex modeling and interface plausibility checks (not from refinement alone).
  • Paratope protection mask: Store interface residues so Phase 3 optimizations (for example solubility design) can avoid breaking binding.
  • Gate and loops: If binding is implausible, diversity collapses, or immunogenicity risk is high, loop back to redesign rather than proceeding forward.
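A minimal sketch of the paratope protection mask, assuming a carbon alpha (Cα) distance cutoff as the interface definition (real pipelines often use heavy-atom contacts) and showing the union-versus-intersection ensemble rule across poses:

```python
import math

def interface_mask(antibody_ca, antigen_ca, cutoff=8.0):
    """Indices of antibody residues whose Cα lies within `cutoff` angstroms
    of any antigen Cα. The 8 Å Cα cutoff is an illustrative contact
    definition; heavy-atom distances are a common alternative."""
    mask = []
    for i, a in enumerate(antibody_ca):
        if any(math.dist(a, b) <= cutoff for b in antigen_ca):
            mask.append(i)
    return mask

def ensemble_mask(masks, rule="union"):
    """Ensemble rule across poses: union protects any residue that contacts
    in any pose; intersection protects only consistently contacting ones."""
    sets = [set(m) for m in masks]
    combined = set.union(*sets) if rule == "union" else set.intersection(*sets)
    return sorted(combined)

# Toy coordinates: only the first antibody residue is near the antigen.
mask = interface_mask([(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)])
```

The union rule is the conservative choice for protection (never mutate anything that might contact), while intersection gives a tighter mask when poses are noisy.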
Tools, inputs, outputs, and parameters
Each entry lists the tool or step, its role, its input and output, and its parameters and thresholds.
OptMAVEn-2.0
Paratope design (knowledge-based module assembly)
Input: Antigen three-dimensional (3D) structure with epitope residues and a database of variable heavy chain (VH) and variable light chain (VL) modules
Output: Ranked variable heavy chain and variable light chain designs (sequences and assembled structures) with binding energies or scores
  • Epitope definition: Patch definition used for design
  • Module libraries: Germline and framework constraints
  • Energy thresholds: Cutoffs for retaining designs
RosettaAntibodyDesign (RAbD)
Paratope design (complementarity determining region redesign)
Input: Antigen structure with target epitope region and an antibody framework for complementarity determining region grafting
Output: Ranked designs as Protein Data Bank format (PDB) structures with new complementarity determining region sequences and structures
  • Complementarity determining regions to design: Choose among light chain complementarity determining regions 1–3 (L1–L3) and heavy chain complementarity determining regions 1–3 (H1–H3)
  • Interface options: Penalize unsatisfied polar atoms; enforce hydrogen bonds and salt bridges
De novo generative design (diffusion model or graph neural network (GNN))
Paratope design (generative)
Input: Epitope three-dimensional coordinates and a fixed antibody framework
Output: Large sets of de novo complementarity determining region loop structures and sequences
  • Number of generated designs: Typically thousands per epitope
  • Initial filters: Plausibility, diversity, and internal energy
Diversity management and clustering
Down-selection step (cheap first)
Input: Design set (sequences and structures) from one or more programs
Output: Cluster representatives covering distinct sequences and binding modes; redundancy-reduced set
  • Clustering metric: Sequence identity and structural similarity
  • Diversity targets: Minimum number of clusters or modes to retain
Cheap pre-filters (geometry and sequence sanity checks)
Down-selection step (cheap first)
Input: Candidate antibody models and frameworks
Output: Filtered set removing obvious steric clashes and out-of-distribution sequences
  • Clash thresholds: Maximum allowed internal clashes
  • Framework constraints: Allowed frameworks and complementarity determining region length bounds
Complex modeling and interface scoring
Binding evidence production
Input: Candidate antibodies, epitope structures, and optional conformational ensembles
Output: Antibody–antigen complex poses; binding scores; interface plausibility metrics
  • Pose count: Number of complex poses retained per candidate
  • Interface checks: Clash cutoff; buried unsatisfied polar groups penalty
  • Scoring function: Which binding score or ensemble score is used
AbDesign
Constrained refinement (expensive; shortlist only)
Input: Antigen structure and antibody backbone fragments or ensembles
Output: Refined antibody models with optimized interface and stability metrics
  • Docking constraints: Key epitope residues and desired interactions
  • Iteration budget: Number of refinement rounds
Humanization and immunogenicity screen (optional)
Risk-reduction gate
Input: Candidate sequences and germline reference sets; major histocompatibility complex (MHC) class II peptide risk models
Output: Humanization recommendations; immunogenicity risk flags
  • Germline proximity target: Maximum divergence allowed
  • Peptide-risk thresholds: Cutoffs and allele set used
Paratope definition artifact (protection mask)
Interface bookkeeping
Input: Complex model(s) for each candidate
Output: List of interface residues to protect during Phase 3 optimization (for example during CamSol design mode)
  • Interface definition: Distance cutoff for defining contact residues
  • Ensemble rule: Union or intersection across poses
Paratope gate
Decision step
Input: Binding score evidence, diversity statistics, and immunogenicity flags
Output: Pass or fail decision with reason codes; loops back to design if needed
  • Binding plausibility thresholds: Minimum interface quality requirements
  • Diversity requirements: Minimum cluster coverage
  • Immunogenicity constraints: Hard fail versus flag rules
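Diversity management can be sketched as greedy leader clustering on sequence identity; the identity cutoff is an illustrative clustering radius, and a real pipeline would also cluster on structural similarity to separate binding modes.

```python
def sequence_identity(a, b):
    """Fractional identity over positions; assumes sequences are already
    aligned or of equal length (a real pipeline would align first)."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

def greedy_cluster(sequences, identity_cutoff=0.9):
    """Greedy leader clustering: each sequence joins the first representative
    within the identity cutoff; otherwise it founds a new cluster. Returns
    the cluster representatives (the redundancy-reduced set)."""
    representatives = []
    for seq in sequences:
        if not any(sequence_identity(seq, rep) >= identity_cutoff
                   for rep in representatives):
            representatives.append(seq)
    return representatives

# Two near-duplicates collapse to one representative; the distinct sequence survives.
reps = greedy_cluster(["AAAA", "AAAT", "GGGG"], identity_cutoff=0.75)
```

Because candidates are processed in order, feeding the list sorted by score makes each representative the best member of its cluster.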

Phase 3: Developability and safety

Diagram (nodes and edges)
Flow: Start → (Experiments 3A, 3B, and 3C in parallel) → risk aggregation → gate → (fix-and-re-score loop on failure) → Phase 3 output
  • Start: Phase 3, Developability and safety: parallel screens and liability fixing
  • Experiment 3A (sequence-based developability screens, cheap first): Therapeutic Antibody Profiler (TAP) sequence liabilities; chemical liabilities (deamidation, isomerization, oxidation); variable-region glycosylation motif risk; self-interaction, polyspecificity, and viscosity risk proxies
  • Experiment 3B (structure-based developability screens, using the structure and paratope mask from Phase 2): Aggrescan3D aggregation hotspots; CamSol solubility prediction and variant suggestions (protect the paratope); stability proxies (for example thermal stability prediction)
  • Experiment 3C (safety cross-reactivity screens): assess off-target binding risk: AntiTarget-Dock docking against high-risk proteins; Proto-Blast-Search motif mimicry screen
  • Risk aggregation and decision thresholds: combine evidence from parallel screens into a single decision. Parameters: hard-fail thresholds per screen; weighted risk aggregation rule; flag versus fail policy; safety aggregation (docking and motif) thresholds
  • Developability and safety gate: pass thresholds and aggregation rules?
  • Fix liabilities and re-score loop (on failure): propose variants and rerun all three screens until the gate passes: apply CamSol suggestions with paratope protection; if the interface changes, return to Phase 2 refinement (optional)
  • Phase 3 output: refined candidate set that passes developability and safety (typical volume: few candidates)
Scientist thinking (hypotheses and gates)
  • Developability hypothesis: Strong predicted binders are viable only if they pass sequence liabilities, aggregation risk, solubility, stability proxies, and safety screens.
  • Parallel, not serial: Most screens are independent and can run in parallel, feeding a single gate with explicit aggregation rules.
  • Additional liabilities: Include chemical liabilities, variable-region glycosylation motifs, and self-interaction, polyspecificity, and viscosity risk proxies alongside aggregation and solubility.
  • Fix liabilities and re-score loop: Failures trigger mutation or variant proposals (protecting the paratope) and re-scoring; if the interface changes, return to Phase 2 refinement.
  • Decision clarity: The gate should distinguish “flag” from “fail” and state how docking and motif screens combine into a final safety decision.
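The flag-versus-fail policy and weighted aggregation can be sketched as follows; the screen names, weights, and cutoffs are illustrative policy knobs, not published thresholds.

```python
def gate_decision(risks, hard_fail, weights, flag_level=0.5, fail_level=1.0):
    """Combine parallel screen risks (each normalized to [0, 1]) into a
    pass / flag / fail decision with reason codes.

    hard_fail maps screen name -> non-negotiable cutoff; weights defines
    the weighted risk aggregation. All numbers here are illustrative."""
    # 1. Hard-fail thresholds per screen are checked first.
    tripped = [k for k, cutoff in hard_fail.items() if risks.get(k, 0.0) >= cutoff]
    if tripped:
        return "fail", tripped
    # 2. Weighted aggregation over the remaining evidence.
    total = sum(w * risks.get(k, 0.0) for k, w in weights.items())
    if total >= fail_level:
        return "fail", ["aggregate_risk"]
    # 3. Anything elevated but tolerable is flagged, not failed.
    flags = [k for k, v in risks.items() if v >= flag_level]
    return ("flag" if flags else "pass"), flags

decision, reasons = gate_decision(
    {"aggregation": 0.3, "solubility": 0.2, "cross_reactivity": 0.9},
    hard_fail={"cross_reactivity": 0.95},
    weights={"aggregation": 0.4, "solubility": 0.3, "cross_reactivity": 0.3},
)
```

Returning reason codes with the decision is what lets the fix-and-re-score loop target the right screen instead of mutating blindly.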
Tools, inputs, outputs, and parameters
Each entry lists the tool or step, its role, its input and output, and its parameters and thresholds.
Therapeutic Antibody Profiler (TAP)
Developability (sequence liabilities)
Input: Variable heavy chain (VH) and variable light chain (VL) sequences
Output: Pass, flag, or fail status against core sequence metrics with residue drivers
  • Structure usage: Whether to model or use a provided structure
  • Flag versus fail policy: How many flags are tolerated before failing
Chemical liability scan
Developability (sequence liabilities)
Input: Candidate sequences
Output: Liability sites (for example deamidation, isomerization, oxidation) and suggested mitigations
  • Motif definitions: Which sequence patterns to flag
  • Context filters: Structural exposure requirements, if used
Glycosylation motif risk scan
Developability (sequence liabilities)
Input: Candidate sequences
Output: Potential glycosylation motifs in variable regions and risk flags
  • Motif definition: How N-linked motifs are detected
  • Region scope: Which regions are evaluated (for example complementarity determining regions only)
Self-interaction, polyspecificity, and viscosity risk proxy
Developability (formulation and in vivo risk proxies)
Input: Candidate sequences and/or structures
Output: Predicted self-interaction or nonspecific binding risk, plus viscosity risk proxies, with flags and drivers
  • Proxy model choice: Sequence-only versus structure-informed predictor
  • Thresholds: Risk cutoffs used for gating
Aggrescan3D
Developability (aggregation risk in three-dimensional context)
Input: Candidate structure in Protein Data Bank format (PDB); optional chain or region selection
Output: Aggregation-prone surface patches and suggested point mutations
  • Exposure and sphere radius: Around carbon alpha (Cα) for exposure calculations
  • Mutation scan options: Which residues and substitutions to try
CamSol
Developability (solubility prediction and optimization)
Input: Candidate sequences and an optional paratope protection mask from Phase 2
Output: Solubility scores and ranked mutation suggestions or variant libraries
  • Regions to protect: Protect the paratope during design
  • Mutation budget: Number of mutations allowed per variant
  • Environment assumptions: pH and ionic strength
Stability proxy (for example thermal stability prediction)
Developability (stability)
Input: Candidate sequences and/or structures
Output: Stability score proxy and risk flags
  • Model choice: Which stability proxy is used
  • Cutoffs: Minimum acceptable stability proxy value
AntiTarget-Dock
Safety (cross-reactivity docking screen)
Input: Candidate antibody models and a structural library of high-risk proteins
Output: Predicted off-target binding energies and a docking-based risk score
  • Anti-target library: Which proteins are included
  • Docking thresholds: Binding energy or score cutoff for flagging or failing
Proto-Blast-Search (Basic Local Alignment Search Tool for proteins (BLASTp))
Safety (motif mimicry screen)
Input: Candidate sequences and a reference proteome database
Output: Motif similarity matches and a motif-based risk score contribution
  • Motif length and thresholds: Similarity cutoff and permitted mismatches
  • Aggregation rule: How motif and docking risk combine into pass, flag, or fail
Risk aggregation and developability gate
Decision step
Input: Outputs of all developability and safety screens
Output: Pass or fail decision; prioritized liability list; triggers a fix-and-re-score loop on failure
  • Hard-fail thresholds: Non-negotiable cutoffs per screen
  • Weighted aggregation: How multiple risks are combined
  • Safety aggregation: How docking and motif screens are thresholded and combined
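The chemical liability and glycosylation motif scans can be sketched with simple sequence patterns; the motif definitions below (Asn-Gly/Ser for deamidation, Asp-Gly/Ser for isomerization, and the N-X-S/T sequon with X not proline for N-linked glycosylation) are common choices, and real screens refine them with structural exposure filters.

```python
import re

# Illustrative motif definitions; real screens use broader pattern sets.
LIABILITY_MOTIFS = {
    "deamidation": r"N[GS]",          # Asn followed by Gly or Ser
    "isomerization": r"D[GS]",        # Asp followed by Gly or Ser
    "n_glycosylation": r"N[^P][ST]",  # N-X-S/T sequon, X not proline
}

def scan_liabilities(sequence):
    """Map liability name -> 0-based start positions of each motif hit.
    Lookahead matching keeps overlapping hits."""
    found = {}
    for name, pattern in LIABILITY_MOTIFS.items():
        positions = [m.start() for m in re.finditer(f"(?={pattern})", sequence)]
        if positions:
            found[name] = positions
    return found

hits = scan_liabilities("QVNGTDGKNST")
```

Positions rather than booleans matter here: the fix-and-re-score loop needs to know which residue to mutate, and the paratope mask decides whether it may.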

Phase 4: Integrated analysis and learning

Diagram (nodes and edges)
Flow: Start → dossier synthesis → probability of success model → Pareto ranking → selection policy → measurement model → wet lab validation → feedback and updating → Phase 4 output (with a re-calibration loop from updating back to the probability model)
  • Start: Phase 4, Integrated analysis and learning: turn in silico evidence into a ranked wet lab plan and learn from outcomes
  • Candidate dossier synthesis: summarize epitope choice, complex models, developability, and safety per candidate
  • Probability of success model: estimate a calibrated probability of wet lab success per candidate. Parameters: logistic meta-model features and calibration method; report P(success | evidence) per candidate
  • Multi-objective ranking (Pareto selection): separate probability from constraints and trade-offs: probability of success is one axis; developability and safety are constraints or additional axes
  • Wet lab selection policy: choose candidates for both validation and learning: exploit top-ranked candidates; explore high-uncertainty candidates for expected information gain. Parameters: selection budget and diversity constraints
  • Measurement model and data quality plan: design experiments so outcomes update the model reliably: replicates and controls; assay variance and limits of detection; batch randomization and batch-effect correction
  • Wet lab validation: run binding, functional, and expression assays. Parameters: surface plasmon resonance (SPR) or equivalent binding assay; functional assay matched to mechanism of action; expression and basic developability readouts
  • Feedback and updating: update calibration, tool weights, design priors, and (if needed) objectives. Parameters: re-fit the calibration model; revise Phase 0 constraints if assays reveal new requirements
  • Phase 4 output: ranked wet lab plan, validated outcomes, and updated evidence base
Scientist thinking (ranking and learning loop)
  • Separate prediction from selection: The probability of success model produces calibrated probabilities; multi-objective ranking then applies developability and safety constraints and trade-offs.
  • Multi-objective ranking: Use a Pareto-style view so candidates are compared across probability, developability, and safety rather than collapsed into a single score without explanation.
  • Explorer candidates: Select a small number of high-uncertainty candidates for expected information gain, not just “one or two by habit.”
  • Measurement model: Replicates, assay variance, controls, and batch handling determine how much the outcomes should update calibration.
  • Feedback: Outcomes update priors and tool weights, and can also trigger changes in Phase 0 objectives and constraints when reality disagrees with assumptions.
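Pareto selection over the probability axis and the other axes can be sketched as computing the non-dominated set; the axes and values below are illustrative.

```python
def pareto_front(candidates):
    """Return names of non-dominated candidates. Each candidate is
    (name, objectives) with every objective oriented 'higher is better'
    (for example P(success), developability margin, safety margin)."""
    def dominates(a, b):
        # a dominates b if it is at least as good everywhere and strictly better somewhere.
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))
    return [name for name, obj in candidates
            if not any(dominates(other, obj) for _, other in candidates)]

# Illustrative two-axis example: C is dominated by A and drops out,
# while A and B represent different trade-offs and both survive.
front = pareto_front([("A", (0.8, 0.6)), ("B", (0.6, 0.9)), ("C", (0.5, 0.5))])
```

Keeping the whole front, rather than collapsing to one scalar, is exactly what lets the selection policy explain why each shortlisted candidate is there.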
Tools and decision steps
Each entry lists the tool or step, its role, its input and output, and its parameters and thresholds.
Candidate dossier synthesis
Evidence packaging
Input: All candidate artifacts (epitope rationale, complex models, scores, flags)
Output: Per-candidate dossier suitable for review and decision making
  • Dossier fields: What is included and how it is versioned
Probability of success model
Calibration and prediction
Input: In silico features and prior wet lab outcomes (if any)
Output: Calibrated probability estimates per candidate
  • Calibration method: Platt scaling, isotonic regression, or Bayesian calibration
  • Feature set: Which in silico signals are included
Multi-objective ranking (Pareto selection)
Decision support
Input: Probability estimates plus developability and safety evidence
Output: Shortlist based on trade-offs (Pareto-efficient set) and constraints
  • Constraint definition: What must be satisfied before ranking
  • Trade-off policy: How to handle competing objectives
Wet lab selection policy with expected information gain
Experiment design
Input: Ranked shortlist plus uncertainty estimates
Output: Final candidate set to test: exploit for success and explore for learning
  • Selection budget: Number of candidates that can be tested
  • Explorer rule: How expected information gain or uncertainty is quantified
Measurement model and data quality plan
Reliability plan
Input: Assay definitions and operational constraints
Output: Replicate plan; controls; batch randomization; data normalization strategy
  • Replicate count: Number of repeats per candidate
  • Batch policy: Randomization and correction approach
Feedback and updating
Learning loop
Input: Wet lab outcomes and metadata
Output: Updated calibration, tool weights, and design priors; potential updates to Phase 0 constraints
  • Update schedule: When and how models are re-fit
  • Change management: How updated priors and constraints are recorded
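The logistic meta-model with calibration can be sketched as Platt-style scaling of a raw in silico score against binary wet lab outcomes; the plain gradient-descent fit below is a stdlib-only illustration, not a production calibrator (Platt scaling, isotonic regression, or Bayesian calibration from the table would normally come from a library).

```python
import math

def fit_platt(scores, outcomes, lr=0.1, steps=2000):
    """Fit p = sigmoid(a*score + b) to binary wet lab outcomes by gradient
    descent on the average log loss; a real calibrator would use a library
    optimizer or isotonic regression instead."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, outcomes):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s / n
            grad_b += (p - y) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

def calibrated_probability(score, a, b):
    """P(success | evidence) for a raw in silico score."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

# Toy history: low raw scores failed in the wet lab, high raw scores succeeded.
a, b = fit_platt([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1])
```

Re-running the fit as each batch of wet lab outcomes arrives is the "re-fit calibration model" step in the feedback loop above.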