Posted: October 23, 2008

Putting protein pieces together with algorithms: Solving the 'mass spec data mess'

(Nanowerk News) A new proteomics project promises to transform routine blood tests, vaccine development, and cancer diagnostics, and to help address many other important biomedical challenges.
UC San Diego engineers and scientists have received a five-year $4.94M grant from the National Center for Research Resources (NCRR), a part of the National Institutes of Health (NIH), to develop algorithms and software for deciphering all the proteins that are present in biological samples. This “proteomics” work is led by Pavel Pevzner, a UC San Diego Jacobs School of Engineering computer science professor.
The proteomics pioneers who received the nearly $5M NIH grant. (l-r) Nuno Bandeira, executive director of the Center for Computational Mass Spectrometry, and Jacobs School of Engineering computer science professors Pavel Pevzner, Ingolf Krueger and Vineet Bafna. (Not pictured: Steven Briggs, a professor of biology at UCSD’s Division of Biological Sciences.)
The new grant will also support development of the software infrastructure required to share these cutting-edge computational mass spectrometry tools with researchers around the nation and the world. This effort will combat a global computational bottleneck that is currently holding back the field of proteomics, which by definition strives to glean biological insights from looking at all the proteins present in biological samples. While there are traditional tools to do some of this proteomics work, they are time-consuming and expensive and have contributed to the computational bottleneck.
“Unanalyzed data from mass spectrometers is piling up in laboratories around the world. Our algorithms can turn much of these ‘dark’ data into the lists of modified proteins that researchers are looking for,” says Nuno Bandeira, the first executive director of the Center for Computational Mass Spectrometry at UCSD’s Jacobs School of Engineering, which is made possible by the new grant.
A wide variety of biomedical research projects will benefit from development of these computational resources including:
  • Elucidation of cancer biomarkers
  • Extensive characterization of changes in aged cataractous lenses
  • Understanding how bacteria adjust to antibiotics and other harsh conditions
  • Addressing the need to constantly reformulate vaccines to keep them effective
  • De novo protein sequencing of antibodies and snake venoms that proved instrumental in drug design
UCSD bioinformatics experts have already pioneered computational methods for teasing out exactly what proteins are in biological samples such as blood and cancer tumors and have published extensively on this work. While they have only been able to share these tools with close collaborators so far, the $4.94M from NCRR will fund further development of the algorithms, as well as the software and computational infrastructure that will enable the researchers to offer their computational services to researchers at UCSD and around the world through open-access software platforms.
Key collaborators on the new grant are Jacobs School of Engineering computer science professors Vineet Bafna and Ingolf Krueger as well as Steven Briggs, a professor of biology at UCSD’s Division of Biological Sciences.
Bafna will oversee the development of algorithms for peptide identification (including modifications), proteogenomics, and protein quantification.
Krueger will lead the team that is developing the service-oriented software architecture that will robustly integrate proteomics research tools into a single public service. Krueger’s software expertise will allow for synergistic interactions with other available proteomics tools as well as Web services, data, and compute clusters to form a community cyberinfrastructure for proteomics research and applications. Krueger directs the “Software & Systems Architecture & Integration” (SAINT) functional area at Calit2.
Nuno Bandeira will develop new algorithms for revealing the modified proteome and its myriad interactions. As executive director of the center, Bandeira will also coordinate development of the infrastructure to easily transition research-grade software to user-friendly tools accessible to biologists worldwide.
Insights from Snake Venom Studies
The tools that the researchers are looking to further develop and share with the world have been, and continue to be, developed primarily at UCSD’s Jacobs School of Engineering and Calit2, in collaboration with scientists from the UCSD Division of Biological Sciences, the UCSD School of Medicine, and the UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences. Bandeira, for example, did some of this work while looking for better ways to sequence the proteins in snake venom as a part of his UCSD computer science Ph.D.
The insights that have arisen from the snake venom sequencing and other work at UCSD may even change the way the pharmaceutical industry generates antibody drugs. Today’s primary approach to sequencing antibodies relies on low-throughput, labor-intensive Edman degradation techniques. “We are proposing to completely replace this approach with Shotgun Protein Sequencing, which is a combination of software and experimental protocols that capitalizes on fast-developing high-throughput mass spectrometry and automatically sequences mixtures of proteins,” says Bandeira.
This ability to quickly determine antibody sequences and automatically characterize their diversity has the potential to further accelerate discovery and facilitate engineering and manufacturing processes.
Blood tests could change as well. Today’s blood tests generally track just a small number of proteins even though there are thousands of proteins in any blood sample that could provide important information about a person’s health. Moreover, each of these proteins can be modified or simply cut somewhere in the middle, and uncovering these modifications provides important clues about the health of the individual.
Decoding all the proteins in a blood sample is a difficult computational puzzle that still awaits an automated solution, and UC San Diego’s computational mass spectrometry experts are working to resolve this bottleneck.
Interdisciplinary by Design
The new center will be highly interdisciplinary. Important collaborations already exist at UC San Diego, the Burnham Institute, and 16 U.S. universities, as well as hospitals, biotechnology companies, and foreign research institutions. Further development of robust open-access mass spectrometry software will catalyze exchanges between experimental and computational researchers in proteomics.
The researchers will also develop educational activities including short courses, a seminar program, and an annual conference.
Source: UC San Diego