AlphaFold - AI Protein Structure Prediction
What Is AlphaFold?
AlphaFold: An artificial intelligence system developed by Google DeepMind that predicts three-dimensional protein structures from amino acid sequences and, in AlphaFold 3, models biomolecular complexes involving proteins, nucleic acids, small molecules, ions, and chemical modifications.
AlphaFold is one of the most important applications of deep learning in bioinformatics and structural biology. It addresses a long-standing problem in biology: how the linear sequence of amino acids in a protein relates to the folded three-dimensional structure that helps determine molecular function.
AlphaFold 2 made high-quality sequence-based structure prediction practical for many proteins. AlphaFold 3 extends the approach from individual protein chains toward complexes that may include proteins, DNA, RNA, ligands, ions, and modified residues. In biotechnology, these predictions are useful because they provide fast structural hypotheses for proteins and molecular interactions that may be difficult, slow, or expensive to study experimentally.
AlphaFold predictions are not experimental structures. They are computational models that must be interpreted with confidence scores and biological context. Proteins can move, bind partners, switch conformations, undergo post-translational modification, or require membranes and cofactors. A prediction may be very informative, but it still needs validation when used to support mechanistic claims, drug discovery, enzyme engineering, or clinical interpretation.
Synonyms and related terms: AI protein structure prediction, computational protein structure prediction, structure prediction, AlphaFold 2, AlphaFold 3, AlphaFold DB.
Not to be confused with: experimental structure determination, such as X-ray crystallography, cryo-electron microscopy, and NMR spectroscopy. AlphaFold predicts structural models; experimental methods measure molecular structures under defined conditions.

AlphaFold uses artificial intelligence to predict how an amino acid sequence may fold into a three-dimensional protein structure, helping researchers generate structural hypotheses for biotechnology and drug discovery. (Image: Nanowerk)
Why AlphaFold Matters
Protein structure is central to biotechnology because enzymes, antibodies, receptors, transporters, and signaling proteins work through shape, motion, and molecular recognition. A structural model can help identify catalytic residues, binding pockets, interaction surfaces, flexible regions, and mutation-sensitive sites. This can make experiments more efficient by narrowing the number of hypotheses that need to be tested in the laboratory.
Before AlphaFold, researchers often relied on experimental structure determination, homology modeling, or lower-confidence computational prediction. Experimental methods remain essential, but they can be difficult for flexible proteins, membrane proteins, transient complexes, unstable proteins, and proteins that require a cellular context. AlphaFold has therefore become a practical starting point for many structural questions, especially when no experimental structure is available.
How AlphaFold Works
AlphaFold uses neural networks trained on known protein structures and sequence information. For AlphaFold 2, the main input is an amino acid sequence. The system uses related sequences, evolutionary patterns, and learned structural features to predict atomic coordinates and confidence estimates.
A key source of information is the multiple sequence alignment. If two amino acid positions tend to change together across related proteins, this can suggest that the residues are physically or functionally linked. AlphaFold combines such evolutionary signals with deep learning models that reason about residue relationships and three-dimensional geometry.
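The co-variation idea can be illustrated with a small sketch. AlphaFold does not compute the statistic below explicitly; it learns from the alignment with neural networks. Mutual information between alignment columns is a classical co-variation measure, used here only as a stand-in to show what "positions that change together" means. The sequences in `toy_msa` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 1 and 3 always change together
# (K pairs with E, R pairs with D); column 2 varies independently of column 1.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "AKVE",
    "ARVD",
]

coupled = mutual_information(toy_msa, 1, 3)      # co-varying pair
independent = mutual_information(toy_msa, 1, 2)  # no co-variation
print(f"MI(1,3) = {coupled:.2f} bits, MI(1,2) = {independent:.2f} bits")
```

In real alignments, raw mutual information is confounded by phylogeny and by alignment depth, which is one reason learned models outperform simple pairwise statistics.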
AlphaFold 3 uses a substantially updated architecture, including diffusion-based structure generation, to model joint structures of biomolecular complexes. This makes it more directly relevant to questions about molecular recognition, protein-DNA binding, protein-RNA binding, ligand interactions, and drug discovery.
AlphaFold 2, AlphaFold Multimer, and AlphaFold 3
| Version or resource | Main purpose | Why it matters |
|---|---|---|
| AlphaFold 2 | Predicts structures of many single protein chains from amino acid sequence | Made AI-based protein structure prediction widely useful in biology and biotechnology |
| AlphaFold Protein Structure Database | Public resource containing large numbers of AlphaFold-predicted protein structures | Lets researchers inspect predicted structures before designing experiments |
| AlphaFold Multimer | Extension for modeling protein-protein complexes | Supports hypotheses about oligomers, binding interfaces, and protein assemblies |
| AlphaFold 3 | Models complexes containing proteins, nucleic acids, small molecules, ions, and modified residues | Extends prediction toward molecular interactions relevant to cell biology and drug discovery |
AlphaFold 2 is the version associated with the major protein-structure-prediction breakthrough. It was the top-performing method at CASP14 in 2020, the community-wide blind assessment of structure-prediction methods, and was released in 2021 with open-source code and a large database of predicted structures. AlphaFold Multimer adapted the method for protein complexes, where the challenge includes both the fold of each chain and the relative placement of chains at an interface.
AlphaFold 3 broadens the problem from protein chains to biomolecular interactions. As of the most recent Google DeepMind information reviewed for this page, the AlphaFold 3 model code and weights are available for academic use under specific terms. Because these terms can change, users should check the current licensing and access rules before using it in academic, commercial, or regulated workflows.
Understanding Confidence Scores
AlphaFold predictions should always be read together with confidence metrics. The best-known AlphaFold 2 score is pLDDT (predicted local distance difference test), a per-residue confidence score from 0 to 100. Higher values generally indicate stronger confidence in the local structure around a residue. Lower values may indicate uncertainty, disorder, flexibility, missing context, or insufficient evolutionary information.
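As a minimal sketch of how the per-residue score can be read in practice: AlphaFold model files in PDB format store pLDDT in the B-factor column, so it can be extracted with plain string slicing. The bands used below (above 90, 70 to 90, 50 to 70, below 50) follow the colour legend used by the AlphaFold Protein Structure Database. The helper `_toy_atom_line` and the three-residue `toy_pdb` are hypothetical and exist only to make the example self-contained.

```python
def plddt_per_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT}, read from each residue's CA atom."""
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, chain 22,
        # residue number 23-26, B-factor (pLDDT here) 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            residue_number = int(line[22:26])
            scores[(chain, residue_number)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT value to the band names used by the AlphaFold DB legend."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def _toy_atom_line(serial, resname, resnum, plddt):
    """Hypothetical helper: build one correctly aligned CA ATOM record."""
    return (
        "ATOM  {:>5d} {:<4s} {:>3s} {}{:>4d}    {:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}"
        .format(serial, "CA", resname, "A", resnum, 0.0, 0.0, 0.0, 1.0, plddt)
    )


toy_pdb = "\n".join([
    _toy_atom_line(1, "MET", 1, 95.5),
    _toy_atom_line(2, "GLY", 2, 72.25),
    _toy_atom_line(3, "SER", 3, 41.0),
])

for key, value in plddt_per_residue(toy_pdb).items():
    print(key, value, confidence_band(value))
```

Reading the score from the CA atom is a simplification; in AlphaFold 2 output every atom of a residue carries the same value, so any atom would do.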
Another important metric is PAE, or predicted aligned error. For each pair of residues, PAE estimates the expected position error, in ångströms, at one residue when the predicted and true structures are aligned on the other. Lower values indicate more confidence in the relative placement of the two residues or regions. This is important because individual domains may be predicted well while their orientation relative to one another remains uncertain. For multi-domain proteins and complexes, PAE is often essential for deciding how much to trust the overall arrangement.
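A short sketch of how a PAE matrix might be summarised for two domains. The matrix `toy_pae` and the domain boundaries are hypothetical, and averaging is only one possible summary. Both `pae[i][j]` and `pae[j][i]` are included because PAE is not symmetric.

```python
def mean_pae_between(pae, region_a, region_b):
    """Mean PAE over all residue pairs between two regions, in both directions."""
    values = []
    for i in region_a:
        for j in region_b:
            values.append(pae[i][j])
            values.append(pae[j][i])
    return sum(values) / len(values)


# Hypothetical 4-residue model: residues 0-1 form domain 1, residues 2-3 form domain 2.
# Values are in ångströms; low within each domain, high between domains.
toy_pae = [
    [0, 2, 20, 22],
    [2, 0, 21, 19],
    [18, 20, 0, 3],
    [22, 20, 3, 0],
]

domain_1 = range(0, 2)
domain_2 = range(2, 4)

within = mean_pae_between(toy_pae, domain_1, domain_1)
between = mean_pae_between(toy_pae, domain_1, domain_2)
print(f"within domain 1: {within:.2f} A, between domains: {between:.2f} A")
```

In this toy case each domain is internally well defined while the inter-domain PAE is high, which is the pattern expected when the relative orientation of two domains should not be trusted.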
A common mistake is to treat the whole predicted structure as equally reliable. One model may contain a high-confidence catalytic domain, a medium-confidence regulatory region, and a low-confidence disordered tail. Each region should be interpreted according to its own confidence and biological context.
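One way to avoid treating a model as uniformly reliable is to split it into contiguous segments that share a confidence band and then interpret each segment separately. The sketch below assumes a plain list of per-residue pLDDT values and the band cutoffs used in the AlphaFold Protein Structure Database legend; the `toy_plddt` values are hypothetical.

```python
from itertools import groupby


def confidence_band(plddt):
    """Band names following the AlphaFold DB legend cutoffs (90, 70, 50)."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def confidence_segments(plddt_values):
    """Group consecutive residues (1-based numbering) that share a confidence band."""
    segments = []
    start = 1
    for label, run in groupby(plddt_values, key=confidence_band):
        length = len(list(run))
        segments.append((start, start + length - 1, label))
        start += length
    return segments


# Hypothetical protein: a well-predicted domain, a less certain region, a disordered tail.
toy_plddt = [95, 93, 92, 75, 72, 40, 35, 30]

for first, last, label in confidence_segments(toy_plddt):
    print(f"residues {first}-{last}: {label}")
```

Real profiles are noisier than this example, so a practical version would likely smooth the scores or ignore very short runs before reporting segments.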
Applications in Biotechnology
Drug discovery and target assessment
AlphaFold models can suggest binding pockets, allosteric sites, protein-protein interaction surfaces, and mutation-sensitive regions. They can help researchers evaluate a target, design constructs, interpret resistance mutations, and generate structural hypotheses for screening or medicinal chemistry. They do not by themselves prove that a compound binds or works as a drug.
Enzyme engineering
For industrial and therapeutic enzymes, structure can guide mutation choices. AlphaFold can help identify active-site residues, substrate channels, metal-binding sites, stabilizing interactions, and flexible loops. These predictions are most useful when combined with directed evolution, high-throughput screening, biochemical assays, and computational design.
Variant interpretation
Disease-associated variants may destabilize a protein, disrupt an active site, weaken a binding interface, or alter regulation. AlphaFold models allow scientists to map amino acid substitutions onto a structural context. This can support rare-disease research, cancer biology, pharmacogenomics, and personalized medicine, but pathogenicity cannot be inferred from structure alone.
Vaccines, antibodies, and synthetic biology
Structural models can support antigen design, epitope mapping, antibody-interface analysis, protein-fusion design, biosensor development, and synthetic biology pathway engineering. In each case, the model is a guide for experimental design rather than a substitute for biological testing.
Limitations and Common Pitfalls
AlphaFold often predicts a likely structural state, while many proteins function through motion or multiple conformations. Kinases switch between active and inactive states, receptors rearrange after ligand binding, transporters alternate access across membranes, and intrinsically disordered proteins may fold only when bound to partners. A static prediction can miss these features.
Context also matters. A protein may require a membrane, metal ion, chaperone, glycosylation, phosphorylation, proteolytic processing, or assembly partner to adopt its functional state. Low-confidence regions may be biologically meaningful flexible segments rather than failed predictions. High local confidence does not prove that a proposed ligand, interface, or mechanism exists in living cells.
For drug discovery, AlphaFold structures are useful starting points but not replacements for medicinal chemistry and biophysics. Binding pockets may shift, water networks may matter, induced-fit movements may be absent, and ligand poses must be tested experimentally.
Typical Research Workflow
A practical workflow starts with a biological question: What fold does this protein have? Which residues may form an active site? Could this mutation disrupt stability? Might two proteins interact? The researcher retrieves or generates an AlphaFold model, checks pLDDT and PAE, compares the prediction with known structures, and designs experiments to test the most plausible hypotheses.
For an enzyme, this may mean mutating predicted catalytic residues and measuring activity. For a disease variant, it may mean combining structural mapping with genetics, clinical data, biochemical assays, and cell models. For a drug target, it may mean using the model to guide construct design, docking hypotheses, or experimental structure determination.
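The retrieval step of such a workflow can be sketched as follows. The file-naming pattern (`AF-<UniProt accession>-F<fragment>-model_v<version>`) reflects how the AlphaFold Protein Structure Database has named its downloads, but the current model version and URL layout are assumptions that should be checked against the database documentation before use. The function names are illustrative, and the `version` argument is deliberately left without a default.

```python
from urllib.request import urlopen

# Assumed base URL for AlphaFold DB file downloads; verify against the database documentation.
AFDB_FILES = "https://alphafold.ebi.ac.uk/files"


def afdb_model_url(uniprot_accession, version, fragment=1, fmt="pdb"):
    """Build a download URL for an AlphaFold DB model file (assumed naming pattern)."""
    return f"{AFDB_FILES}/AF-{uniprot_accession}-F{fragment}-model_v{version}.{fmt}"


def fetch_model(uniprot_accession, version):
    """Download a model as text. Requires network access; raises on HTTP errors."""
    url = afdb_model_url(uniprot_accession, version)
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


# Example: the URL for human hemoglobin subunit alpha (UniProt P69905),
# assuming database version 4. No request is made here.
print(afdb_model_url("P69905", version=4))
```

Once a model has been downloaded, its pLDDT values and PAE matrix can be inspected before any region is used to design experiments.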
Key Terms Related to AlphaFold
| Term | Meaning |
|---|---|
| Protein folding | The process by which an amino acid chain adopts a three-dimensional shape |
| Multiple sequence alignment | An alignment of related protein sequences used to reveal conserved and co-evolving residues |
| pLDDT | AlphaFold’s per-residue local confidence score, scaled from 0 to 100 |
| PAE | Predicted aligned error, a metric for confidence in the relative placement of residues or domains |
| Structural biology | The study of molecular structures using experimental and computational methods |
| Protein design | The engineering or creation of protein sequences intended to fold into desired structures or functions |
Frequently Asked Questions
What is AlphaFold? AlphaFold is an AI system developed by Google DeepMind that predicts protein structures from amino acid sequences. AlphaFold 3 also models complexes that may include proteins, DNA, RNA, ligands, ions, and modified residues.
Does AlphaFold solve the protein-folding problem? It solved a major practical version of protein structure prediction for many proteins. It did not solve every aspect of folding, including folding pathways, dynamics, disorder, aggregation, context-dependent conformations, or all molecular interactions.
Can AlphaFold predict whether a drug will work? No. AlphaFold can help generate structural hypotheses about a target or binding site, but drug efficacy depends on binding, selectivity, pharmacokinetics, toxicity, disease biology, and clinical response.
What are pLDDT and PAE? pLDDT is a local confidence score for each residue in an AlphaFold 2 model. PAE estimates confidence in the relative positioning of residues or regions. Together, they help researchers decide which parts of a prediction are likely to be reliable.
Selected References
Jumper J., Evans R., Pritzel A., et al. “Highly accurate protein structure prediction with AlphaFold.” Nature 596, 583–589 (2021). DOI: 10.1038/s41586-021-03819-2
Varadi M., Anyango S., Deshpande M., et al. “AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models.” Nucleic Acids Research 50(D1), D439–D444 (2022). DOI: 10.1093/nar/gkab1061
Abramson J., Adler J., Dunger J., et al. “Accurate structure prediction of biomolecular interactions with AlphaFold 3.” Nature 630, 493–500 (2024). DOI: 10.1038/s41586-024-07487-w
AlphaFold Protein Structure Database. Developed by Google DeepMind and EMBL-EBI: https://alphafold.ebi.ac.uk/