
Bioinformatics

Connecting bioinformatics and proteomics


The ever-increasing flow of data from proteomics experiments has created a strong demand on bioinformatics to develop the necessary infrastructure for the storage, processing, analysis and visualisation of proteomics data. Three years ago the Gaining Momentum initiative was started to integrate bioinformatics in proteomics research. The theme leaders Bas van Breukelen and Peter Horvatovich assess the results so far.
Bas van Breukelen (Biomolecular Mass Spectrometry and Proteomics, Utrecht University) is the NPC coordinator of the Bioinformatics Research Hotel and coordinator of NPC-NBIC cooperation.

Peter Horvatovich (Analytical Biochemistry, University of Groningen) is coordinator of the NPC Theme Bioinformatics in Proteomics.

NPC Highlights Special | March 2011

The aim of the Gaining Momentum initiative, which was started in 2007 by the NPC together with the Netherlands Bioinformatics Centre (NBIC), is to create a high-quality workflow for the analysis of proteomics data. "For some time now, the progression in proteomics has been hampered by what we call the data processing bottleneck," explains Peter Horvatovich, who together with Bas van Breukelen leads the initiative. Both combine their chemistry and biochemistry education with ample experience in programming and bioinformatics. It is an ideal combination for the task of leading the Gaining Momentum initiative, which needs a global outlook to connect people from both sides.

Closing the gap

"People in wet labs, such as NPC researchers, are producing more and more complex data using advanced technologies," says Horvatovich. "Most of these researchers are using very basic bioinformatics methods to analyse their data. They have neither the time nor the knowledge to learn, develop and test new software. On the other hand," he says, "bioinformatics software developers, such as the people working within NBIC, are experts at writing code but have limited knowledge of the problems within proteomics research. This is exactly the gap we want to close," Horvatovich explains. "We are experts on both."

The high-quality workflows are built in practice from a set of bioinformatics tools. Based on the workflows developed within the Gaining Momentum initiative, the goal is to provide user-friendly, high-throughput data processing services to analyse proteomics LC-MS data. The programs are developed by members of the platforms or are obtained from open source code repositories, and they must comply with a set of rules defined by the Gaining Momentum initiative to ensure that the various parts are modular and can be combined.

"There is a tendency to reinvent the wheel again and again," observes Bas van Breukelen. He explains: "By setting the standard for data representation following, for example, the HUPO Proteomics Standards Initiative, we made sure that data is described in a uniform manner. It is now possible to connect the software building blocks so people can build their own workflow by combining modules written by others. Achieving this technical uniformity was very challenging indeed."

Full suite of powerful software

The second important aspect is the documentation and visual interface that the programs and workflows must have when published on the tool repository or when made available online as easy-to-use data processing services. Without proper documentation it is almost impossible for others to use the software. The academic goal in bioinformatics is not only to develop the code and publish it, but also to make it available to others via easy-to-use data processing services. Otherwise no one will use it. "This is exactly what our platform is doing," says Horvatovich. The NBIC software repository hosts the source code of the developed tools. The NBIC Galaxy server and the web page of the Netherlands Bioinformatics for Proteomics Platform (NBPP) are used to make online data processing services available.

The first workflows have recently been put online. "This is just the beginning. The ultimate goal is to provide a full suite of powerful software," van Breukelen says. "But on a small scale, things are happening. For example, the StatQuant program, which also has a graphical user interface, has been available from the NBIC repository for some time. We have more than a thousand users already, but this program also draws people to the repository to download other programs."

Ensuring continuation

The cooperation between NBIC and NPC has always been very good, both project leaders agree. Their main concern now is continuation of funding. "One of the strategies to ensure continuation is to connect with other initiatives," says Horvatovich. "For example, metabolomics can use the tools we are developing," he explains. Van Breukelen agrees: "We are set up with a solid core. It has been hard to find the right people to work on the projects. It would be a shame to waste the efforts. It is our task now to show that proteomics cannot live without us."
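The article's idea of modular building blocks, where tools agree on a uniform data representation so that modules written by different people can be chained into a workflow, can be illustrated with a minimal Python sketch. This is not code from the NBIC repository or the NBPP services, and it does not implement any HUPO Proteomics Standards Initiative format. The `Peak` type, the two example modules and `build_workflow` are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Peak:
    """Hypothetical shared data representation: one peak in a mass spectrum."""
    mz: float
    intensity: float


# Every module obeys the same rule: take a list of peaks, return a list of peaks.
Module = Callable[[List[Peak]], List[Peak]]


def remove_low_intensity(threshold: float) -> Module:
    """Return a module that drops peaks below the given intensity threshold."""
    def step(peaks: List[Peak]) -> List[Peak]:
        return [p for p in peaks if p.intensity >= threshold]
    return step


def normalise_to_base_peak(peaks: List[Peak]) -> List[Peak]:
    """Scale intensities so that the most intense peak has intensity 1.0."""
    if not peaks:
        return []
    base = max(p.intensity for p in peaks)
    return [Peak(p.mz, p.intensity / base) for p in peaks]


def build_workflow(*modules: Module) -> Module:
    """Chain modules in order; the result is itself a module."""
    def workflow(peaks: List[Peak]) -> List[Peak]:
        for module in modules:
            peaks = module(peaks)
        return peaks
    return workflow


# Made-up example spectrum, for illustration only.
spectrum = [Peak(400.2, 50.0), Peak(512.3, 1000.0), Peak(733.4, 250.0)]
workflow = build_workflow(remove_low_intensity(100.0), normalise_to_base_peak)
result = workflow(spectrum)
```

Because each module accepts and returns the same representation, a chained workflow is itself a module and can be reused inside a larger workflow without either author knowing about the other's code.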

Protein data processing services


Galaxy server

The NBIC Galaxy server is a development server where the implementation of new tools is tested. Hence, this is not a production server! The Galaxy server is maintained by the Netherlands Bioinformatics Centre (NBIC). For more info please visit: http://galaxy.nbic.nl.

Gaining Momentum Initiative

Together with the Netherlands Bioinformatics Centre (NBIC), the NPC has joined forces to set up a large task force: Gaining Momentum at the Proteomics-Bioinformatic Interface. This task force has the goals of building a platform for proteomics based on bioinformatics and of providing tools, workflows and high-throughput data processing services for the proteomics community.

Bioinformatics in Proteomics

The ever-increasing flow of data from proteomics experiments has led to a large demand on bioinformatics to develop the necessary infrastructure for the storage, processing, analysis and visualisation of proteomics data. The NPC Enabling Technology Program Bioinformatics in Proteomics (E4) focuses especially on proteomics-driven bioinformatics. The E4 program aspires to cover as many proteomics bioinformatics topics as possible, with maximum cross-connectivity between the NPC research themes.


Software repository

Trac is used as a software repository system. In this repository you can develop your programs and collaborate with other NBIC-BioAssist programmers. The default versioning system is Subversion for new projects, but it is possible to use CVS for existing projects (http://trac.nbic.nl).

Netherlands Bioinformatics for Proteomics Platform

The Netherlands Bioinformatics for Proteomics Platform (NBPP, http://nbpp.nl) is a joint initiative of NBIC and the NPC. The goal of NBPP is to provide user-friendly, high-throughput data processing services to analyse proteomics LC-MS data, based on open source tools or tools developed by and available to the Platform members. These services are provided by the Data Analysis Framework using the Dutch Life Science Grid and other computational clusters of the platform members.
