First maps of the human proteome to unlock ‘mystery of life’

Following the first complete sequencing of the human genome in 2001, a second key to unlocking the mystery of life has now been found with the mapping of the proteome – all the proteins in the human body.

Photo of Bernhard Küster standing in front of a blackboard with the figure 18,097 written on it.

The human organism is made up of at least 18,097 different proteins, according to a study led by Bernhard Küster from Technische Universität München (TUM) and published in the May 2014 issue of the scientific journal Nature. Another study published in the same journal issue by a team led by Akhilesh Pandey from Johns Hopkins University in the US arrived at the slightly lower figure of 17,294 proteins.

The two teams have produced by far the most comprehensive map of the human proteome to date. Küster’s group maintains that they have so far accounted for 92 per cent of all proteins in the human body. Nature dedicated its front cover to the finding, confirming the human proteome as one of the great scientific discoveries of the year.

The human genome codes for the proteome

The term proteome, coined in 1994 by the Australian Marc Wilkins, describes the entire set of proteins in an organism. It is modelled on a better-known word ending in ‘ome’: the genome, or the genetic blueprint of an organism.

As far back as the 1990s, the sequencing instruments of genome researchers were already running at full speed. By 2003, the DNA sequence of humans was more or less correctly decoded.

Geneticists soon announced that they had also solved the human protein puzzle. In simple terms, they calculated that each gene encodes a protein, so the number of genes corresponds to the number of proteins. Based on this interpretation, the human proteome would comprise 19,629 proteins, all stored and named in freely accessible databases. Many, however, are just predictions derived from computer analysis of the human genome.

Some 15 years after the decoding of the genome, 2014 could well become known as the year the human proteome was decoded. Scientists have mapped the proteome and, as a result, arrived at a more solid and accurate prediction for the number of human proteins.

The catch is that, unlike the genome, which barely changes in the course of a person’s life, the proteome is highly dynamic. Proteins are constantly generated, transformed and broken down, depending on the organism’s exposure to stimuli, environmental factors, diseases and drugs. The proteome is highly complex, as it reflects all facets of our life and our environment.

Back in 2001, an international research project called the Human Proteome Organization (HUPO) was set up along the same lines as the genome research project. The ultimate target is to analyse every protein in every tissue – including their changes over time and variants of the basic form. This undertaking would mean analysing 500,000, possibly even a million proteins. “That is a lot of proteins,” noted the journal Nature, sceptical of the hugely ambitious plan in 2001.

The researchers are still a long way off their ultimate target, but two research groups have now at least reached the first milestone. The basic number of human proteins has been determined; the variants of these proteins will be identified in further research.

Infographic depicting the human body as layers of all its major systems, annotated with information about important proteins in each system.

Illustration: ediundsepp Gestaltungsgesellschaft München.

Mapping the human proteome

Progress in mapping the human proteome was aided by advances in mass spectrometry. This method is as important to proteome researchers as sequencing instruments are for genome researchers. Küster has five mass spectrometers available at his chair, each one worth some EUR 750,000.

Over the last 18 months, these machines were tasked with drawing a map of the human proteome. Bernhard Küster’s team analysed 60 different tissues, 13 bodily fluids and 147 cancer cell lines.

The Munich-based scientists identified some 80 per cent of all human proteins. This work thus comprises the largest single data set of the human proteome. In addition, the team re-analysed several dozen individual mass spectrometry data sets on tissues and cell lines that other groups had uploaded to public databases.

They also developed ProteomicsDB – an all-in-one database and software system to perform a comprehensive analysis of the human proteome.

Küster sees the platform as a tool for all scientists engaged in human proteome research: “It’s available to anyone.” New data sets are continuously being added to ProteomicsDB. In mid-September 2014, 18,248 of the 19,629 predicted human proteins were available, or 93 per cent of the number forecast by the genome researchers.
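The percentages quoted in this article can be checked with simple arithmetic. The following Python lines are a minimal sketch for illustration only. The function name `coverage_percent` and the labels are hypothetical, and dividing the 18,097 figure by the 19,629 genome-based prediction is an assumption, since the article states that denominator explicitly only for the ProteomicsDB figure.

```python
# Genome-based prediction of the number of human proteins, as quoted in the article.
PREDICTED_PROTEINS = 19_629


def coverage_percent(identified: int, predicted: int = PREDICTED_PROTEINS) -> float:
    """Return the share of predicted proteins that were identified, in per cent."""
    if predicted <= 0:
        raise ValueError("predicted must be positive")
    return 100.0 * identified / predicted


# Protein counts quoted in the article (labels are illustrative).
counts = {
    "Nature study (May 2014)": 18_097,
    "ProteomicsDB (mid-September 2014)": 18_248,
}

for label, identified in counts.items():
    missing = PREDICTED_PROTEINS - identified
    print(
        f"{label}: {coverage_percent(identified):.1f} per cent covered, "
        f"{missing} predicted proteins not yet found"
    )
```

Rounded to whole numbers, the two results agree with the 92 and 93 per cent given in the text, and the remainder corresponds to the roughly seven per cent of predicted proteins that are still missing.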

Photo of a gel electrophoresis apparatus being loaded with protein samples

Bernhard Küster’s group analysed a large number of proteins. Most of these came from tissue samples provided by TUM’s pathology labs – but a number of saliva and even ear wax samples were provided by Küster’s own team members. Here, a gel electrophoresis apparatus is being loaded with protein samples to separate them according to size. (Photo: Wolfgang Filser/TUM.)

Missing proteins

Some details in the huge trove of data confirm previous expectations. For instance, a core set of 10,000 to 12,000 proteins can be identified in most cell types and tissues. In addition, many tissues are characterised by the presence of specific proteins.

The proteome map also raises some fundamental questions for scientists, such as how to find the seven per cent or so of all predicted proteins that have remained elusive until now. This is undoubtedly due, in part, to the detection limits of mass spectrometry.

To identify this last seven per cent, the TUM scientists are appealing to the world’s experts in the field for help – one of the sections in ProteomicsDB is called “Adopt a protein”. “There are very good labs out there doing in-depth research on a small region of the map, and we invite them to add their data to ProteomicsDB,” explains Küster.

It may turn out that some of the currently grey areas in the proteome map don’t even exist. A number of genes are permanently inactive, having been switched off as humans evolved.

One such inactive area can be found in the human nose. In order to smell, organisms from mice to men need special receptors on the surface of sensory cells in the olfactory epithelium located far back in the nose. Based on the predictions, we should have 853 of these olfactory receptors. But Küster and his group have been able to account for only one in four of these. Perhaps this is one reason why our sense of smell is much less well developed than that of, say, mice and dogs.

Another surprising finding is that the term “gene” might have to be interpreted more broadly than it has been to date. It is clear that our bodies create proteins whose genes have not yet been recognised by the genome researchers.

Photo of vials being moved from the liquid nitrogen tank in which they are stored

Using proteins to develop cancer therapies. Bernhard Küster and his team were able to show that the efficacy of 24 cancer drugs on 35 cancer cell lines correlated clearly with the cells’ protein profiles. Here, vials of cancer cells are removed from the liquid nitrogen tank in which they are stored. (Photo: Wolfgang Filser/TUM.)

Protein patterns can determine drug efficacy

“Our map is only the starting point,” says Küster. Scientists are sure to find more surprises while filling in the remaining spaces on the human proteome map and asking questions of the database.

But what is the point of proteome research, apart from gaining a better understanding of basic biological principles? According to Küster, there are already indications of possible medical implications, as proteins are the targets for almost all medicines. It seems that certain protein patterns influence the effectiveness of such drugs.

“This edges us a bit closer to even more individualised treatments for patients. If we knew the protein profile of, for instance, a tumour, we might be able to administer drugs in a more targeted way. This would also create a rationale for investigating new drug combinations and, generally, aligning treatments more closely with a patient’s individual needs,” says Küster.

Abbreviated version of an article by Bernhard Epping/Barbara Wankerl (TUM). Read the original, full-length article in TUM’s magazine Faszination Forschung.

