
Multispecific Antibody Naming is Solved
The multispecific antibody field has long lacked a unified naming infrastructure. BioGlyph solves this by embedding building block provenance, chain identity, and format topology directly into the design workflow, capturing structural identity automatically from the moment a molecule is conceived.
The naming problem in multispecific antibody R&D has finally been solved
Every pharma company we talk with approaches the antibody naming problem differently. Below, we outline how we arrived at the BioGlyph solution. Having worked at Gilead, Genentech, and BMS before BioGlyph, our team simply could not ignore it any longer: the multispecific antibody field has a naming problem.
It's not a secret. For years, scientists across the industry have been pointing to the same frustrating reality: a single molecule can carry a dozen different names depending on which lab, database, company (or even company hallway) you pull it from. The problem compounds as formats get more complex.
Naming complexity isn't unique to antibodies
In small molecule chemistry, a single compound routinely carries an IUPAC name, a CAS number, a generic name, a brand name, and a handful of internal identifiers, but the field has built infrastructure to translate between all of them without losing track of what the molecule actually is.
Humans do the same thing. One person can be Dr. Dartayete, Facundo, Son, and darta@bioglyph.app depending on context, but we can navigate that without confusion because there are underlying systems that tie all those identities to the same individual. Tools like a master person index, LinkedIn, or even our contacts list connect these attributes. The existence of multiple names isn't the failure; the failure is when there's no structure connecting them. That's exactly where antibody R&D has been stuck.
Antibody-based formats are often referred to by names coined by their inventors, some of which are trademarks like BiTE, DART, and Nanobody, while others like scFv-Fc, Fab-IgG, or 2+1 are used inconsistently and can vary even across labs at the same company.
Scientists should not have to wait six months or pay thousands for an upgrade just to add another basic format. PipeBio and Benchling point out that the consequences extend well beyond aesthetics: inconsistent naming hinders clarity in patent literature, INN submissions, and regulatory filings, where precise format descriptions are critical. Without a standardized nomenclature, it becomes difficult to reliably compare formats, track design innovations, or assess therapeutic equivalence across sources. Yet there is still no solution.
Across pharma and vendors, distinct pain points often trace back to a single root cause
Dotmatics has written about the operational chaos this creates for tracking multi-format antibody R&D programs. Their team identified the core challenge clearly: as programs scale across multiple teams, the absence of consistent naming infrastructure turns data management into a mess that ELNs and spreadsheets were never designed to handle.
Every existing system, whether from a vendor or built internally, was focused on supporting traditional antibodies, and now we are limited to what can be made in these rigid systems. In an era where every IT group wants to support AI/ML, we pose the question: how is this possible without FAIR (Findable, Accessible, Interoperable, and Reusable) data?
And then there is Umesh Katpally at Bristol Myers Squibb, whose recent Substack post deserves particular attention. He went further than identifying the problem: he actually began building a solution. His VERITAS Antibody Format Classifier is a 7-stage sequential pipeline that takes any raw antibody sequence and returns a standardized format name, full domain architecture string, CDR sequences with exact positions, isotype, pairing status, and a confidence score. The rigor of his approach is impressive. Rather than using a black-box machine learning classifier, he chose a rule-based decision tree for three reasons: scientists need to understand why a sequence was classified a particular way, the system works on novel sequences that didn't exist when it was built, and new formats can be added simply by extending the decision tree rather than retraining a model.
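To make the rule-based idea concrete, here is a minimal sketch in Python. It is not VERITAS and does not reproduce its seven stages. It assumes the domains on each chain have already been annotated, and the domain labels, the `Rule` class, the helper functions, and the rules themselves are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumption: each chain arrives as an ordered tuple of domain labels.
# A real pipeline would first derive these labels from the raw sequence.
Chain = Tuple[str, ...]

SCFV_PATTERNS = (("VH", "linker", "VL"), ("VL", "linker", "VH"))


def scfv_units(chain: Chain) -> int:
    """Count non-overlapping V-linker-V units on one chain."""
    units, i = 0, 0
    while i + 2 < len(chain):
        if chain[i:i + 3] in SCFV_PATTERNS:
            units += 1
            i += 3  # skip past the matched unit so units never overlap
        else:
            i += 1
    return units


def has_fc(chains: List[Chain]) -> bool:
    return any("CH2" in c and "CH3" in c for c in chains)


def has_fab(chains: List[Chain]) -> bool:
    return any("CH1" in c for c in chains) and any("CL" in c for c in chains)


def total_scfv(chains: List[Chain]) -> int:
    return sum(scfv_units(c) for c in chains)


@dataclass(frozen=True)
class Rule:
    format_name: str                         # name returned when the rule fires
    reason: str                              # explanation shown to the scientist
    matches: Callable[[List[Chain]], bool]   # the test itself


# Illustrative rules only, evaluated in order; the first match wins.
RULES = [
    Rule("IgG-scFv", "Fc and Fab present, plus at least one scFv unit",
         lambda cs: has_fc(cs) and has_fab(cs) and total_scfv(cs) >= 1),
    Rule("scFv-Fc", "Fc present with scFv unit(s) and no Fab",
         lambda cs: has_fc(cs) and not has_fab(cs) and total_scfv(cs) >= 1),
    Rule("IgG", "Fc and Fab present, no scFv units",
         lambda cs: has_fc(cs) and has_fab(cs) and total_scfv(cs) == 0),
    Rule("tandem scFv", "no Fc, two scFv units on one chain",
         lambda cs: not has_fc(cs) and any(scfv_units(c) == 2 for c in cs)),
    Rule("scFv", "no Fc, a single scFv unit",
         lambda cs: not has_fc(cs) and total_scfv(cs) == 1),
]


def classify(chains: Sequence[Sequence[str]]) -> Tuple[str, str]:
    """Return (format name, reason) for a set of annotated chains."""
    cs = [tuple(c) for c in chains]
    for rule in RULES:
        if rule.matches(cs):
            return rule.format_name, rule.reason
    return "unclassified", "no rule matched"


if __name__ == "__main__":
    heavy = ("VH", "CH1", "hinge", "CH2", "CH3", "linker", "VH", "linker", "VL")
    light = ("VL", "CL")
    print(classify([heavy, light]))
```

Because the rules are an ordered list, supporting a new format in this sketch means adding one `Rule` entry, and every classification comes back with the reason it fired.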
What Umesh's work makes crystal clear is that the naming problem is tractable. The structural logic for describing any antibody format already exists; frameworks like VERITAS and AbML have laid the groundwork. The gap has never been the science of naming. The gap has been embedding that logic into the actual design workflow, so that names, chain identities, building blocks, and formats are captured automatically and consistently from the moment a molecule is conceived.
BioGlyph's Solution
BioGlyph unifies the different representations that Umesh and others have pointed out into a single design object:
Building block provenance
The variable regions, constant domains, and linkers you select are registered components. When you assemble a molecule, the system knows which building blocks went into it, and that record travels with the design through every iteration and permutation.
Chain tracking
Every chain in a BioGlyph design is named, typed, and linked to its sequence. From heavy and light chains to scFv components and fusion proteins, each component within a chain has a defined identity, not just a position in a sequence file.
Format naming
The structural topology of the molecule is captured explicitly. Each one is checked for uniqueness and given a unique ID (a BG number, for example BG-025). When you finalize a design, BioGlyph knows it is an IgG-scFv, a heteroFc fusion, a tandem scFv, or whatever format you have constructed, because it was built from defined structural components, not assembled by copy-paste.
Automated annotation from sequence
Drop in an amino acid sequence and BioGlyph can parse its domain architecture, identify its building blocks, and assign it to the correct format class. This is the same kind of logic Umesh so elegantly demonstrated with VERITAS, embedded directly across the entire design and registration workflow. With the right data model and business rules, BioGlyph helps anyone translate between each of these naming schemes and abstraction layers with ease.
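One way to picture a single design object that carries provenance, chain identity, and format together is the sketch below. It is an illustration under stated assumptions, not BioGlyph's actual data model: the class names, the block IDs such as `VH-001`, the uniqueness criterion (the sorted set of chain sequences), and the way the BG number is assigned are all hypothetical, and the sequences are short placeholder fragments.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BuildingBlock:
    block_id: str   # hypothetical registry ID, e.g. "VH-001"
    kind: str       # "VH", "VL", "CH1", "linker", ...
    sequence: str   # placeholder fragment, not a real domain sequence


@dataclass(frozen=True)
class Chain:
    name: str                          # human-readable chain name
    chain_type: str                    # e.g. "heavy", "light", "scFv"
    blocks: Tuple[BuildingBlock, ...]  # ordered N- to C-terminus

    @property
    def sequence(self) -> str:
        # The chain sequence is derived from its blocks, never pasted in.
        return "".join(b.sequence for b in self.blocks)

    @property
    def block_name(self) -> str:
        # A building block-based name for the chain.
        return "-".join(b.kind for b in self.blocks)


@dataclass(frozen=True)
class Design:
    format_name: str
    chains: Tuple[Chain, ...]

    def identity_key(self) -> Tuple[str, ...]:
        # Assumed uniqueness criterion: same chain sequences, in any order.
        return tuple(sorted(c.sequence for c in self.chains))

    def provenance(self) -> List[str]:
        # Every registered block that went into the molecule.
        return [b.block_id for c in self.chains for b in c.blocks]


class Registry:
    """Assigns one BG-style ID per unique design (illustrative only)."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, ...], str] = {}

    def register(self, design: Design) -> str:
        key = design.identity_key()
        if key not in self._ids:
            self._ids[key] = f"BG-{len(self._ids) + 1:03d}"
        return self._ids[key]


if __name__ == "__main__":
    vh = BuildingBlock("VH-001", "VH", "EVQL")
    linker = BuildingBlock("LNK-001", "linker", "GGGGS")
    vl = BuildingBlock("VL-001", "VL", "DIQM")
    design = Design("scFv", (Chain("chain A", "scFv", (vh, linker, vl)),))

    registry = Registry()
    print(registry.register(design), design.chains[0].block_name)
    print(design.provenance())
```

Registering the same design twice returns the same ID in this sketch, which is the behavior a uniqueness check needs; a production system would also have to decide how modifications, chain stoichiometry, and sequence variants affect identity.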
We have been focused on building solutions to the underlying problem for years now. You can sketch any antibody format, generate chain-based and building block-based names, and download publication-ready images, all in your browser.
This is just scratching the surface. We did not stop at cute schematics and standard naming. We built the ability to automate annotation for any molecule, and it has now been used on millions of antibody chains to correctly identify which molecules and formats are actually in a dataset.
From Sequence to Format, Automatically
When you design a multispecific antibody in BioGlyph, you are not just stitching sequences together. You are building a structured virtual molecule. Every chain has an identity. Every building block (variable region, constant region, linker) carries its annotation with it. The format topology you sketch on the design pad isn't a PNG in a PowerPoint file, a cartoon in a lab notebook, or a sketch on a napkin on your lab bench. It is a live structural description of what the molecule actually is, encoded in the platform.
This matters because it closes the loop that everyone else has been circling. The naming problem exists because sequence data and structural identity live in separate places.
Scientists generate sequences in one tool, sketch formats in another, write names in a spreadsheet, and hope it all stays coherent across a project that might span years and involve dozens of variants.
BioGlyph unifies all of it in a single design object.
Why this matters now
Since 2021, non-canonical antibodies (bispecifics, multispecifics, antibody-drug conjugates, and immunoconjugates) have accounted for nearly half of all antibodies entering first-in-human studies and a third of those receiving marketing approval each year. The era when a biologics program could get away with ad hoc naming conventions and scattered tracking is over. The complexity of modern pipelines demands infrastructure that captures structural identity automatically and consistently, at the point of design.
There are elegant proofs of concept showing that any raw sequence can be classified and named with high accuracy. Benchling, Dotmatics, and too many experts in our field to list have articulated why this matters for data integrity, regulatory filings, and cross-team collaboration. What has been missing is a platform where this logic is native to the design process itself: not a downstream annotation step, not a bioinformatics script running after the fact, but a first-class feature of how molecules get built and tracked from day one.
BioGlyph is that platform.
