MLSB 2008




The scientific program will be composed of four sessions of contributed talks, each one starting and ending with an invited talk. Poster sessions will be held during the lunch and coffee breaks. On Saturday evening, there will be a conference dinner at the restaurant "l'Atelier" in Brussels.

Schedule (provisional)


Saturday 13 September
Morning
* Registration desk open
09:30 09:45
* Welcome
09:45 10:45
Invited talk
* Pamela Silver (Harvard Medical School). Designing Biological Systems
10:45 11:15
* Coffee break
11:15 12:30
Session 1
11:15 11:40
* Saso Dzeroski and Ljupco Todorovski. Equation Discovery for Systems Biology
11:40 12:05
* Karoline Faust, Jérôme Callut, Pierre Dupont, and Jacques van Helden. Metabolic Pathway Inference using Random Walks and Shortest-Paths Algorithms
12:05 12:30
* Alexandre Irrthum and Louis Wehenkel. Predicting gene essentiality from expression patterns in Escherichia coli
12:30 12:55
Invited talk
* Alain Chariot (ULg). Deciphering the molecular mechanisms underlying human diseases through interactome studies: a molecular approach
12:55 15:00
* Lunch break and poster session
Afternoon
15:00 16:00
Invited talk
* Lukas Käll (University of Washington). Semi-supervised machine learning for shotgun proteomics
16:00 16:30
* Coffee break and poster session
16:30 17:50
Session 2
16:30 16:55
* Artem Sokolov and Asa Ben-Hur. A Structured-Outputs Method for Prediction of Protein Function
16:55 17:20
* Omer Sinan Sarac, Rengul Cetin-Atalay, and Volkan Atalay. GOPred: Combining classifiers on the GO
17:20 17:45
Invited talk
* Heribert Hirt (URGV Plant Genomics Institute & University of Vienna). Phosphoproteomic approaches to study stress signal transduction networks in plants
* Conference dinner at the restaurant "l'Atelier"

Sunday 14 September
Morning
09:00 10:00
Invited talk
* Yoav Freund (UCSD). From microscopy images to models of cellular processes
10:00 10:30
* Coffee break
10:30 11:45
Session 3
10:30 10:55
* Koenraad Van Leemput and Alain Verschoren. Modeling Networks as Probabilistic Sequences of Frequent Subgraphs
10:55 11:20
* Michalis Titsias, Neil Lawrence, and Magnus Rattray. Sampling for Gaussian Process Inference
11:20 11:45
* Selpi, Christopher H. Bryant, and Graham Kemp. Using mRNA Secondary Structure Predictions Improves Recognition of Known Yeast Functional uORFs
11:45 12:10
Invited talk
* Marc Muller (ULg). The zebrafish as a small vertebrate model system for bone development and homeostasis
12:10 14:15
* Lunch break and poster session
Afternoon
14:15 15:15
Invited talk
* Lodewyk Wessels (Netherlands Cancer Institute). Outcome prediction in breast cancer
15:15 15:45
* Coffee break and poster session
15:45 16:35
Session 4
15:45 16:10
* Fan Shi, Geoff Macintyre, Christopher Andrew Leckie, Izhak Haviv, Alex Boussioutas, and Adam Kowalczyk. A Bi-ordering Approach to Linking Gene Expressions with Clinical Annotations in Cancer
16:10 16:35
* Vincent Botta, Sarah Hansoul, Pierre Geurts, and Louis Wehenkel. Raw genotypes vs haplotype blocks for genome wide association studies by random forests
16:35 17:00
Invited talk
* Bernard Thienpont (KU Leuven). Endeavour pinpoints genes causing cardiac defects in regions identified by aCGH
17:00 17:15
* Closing


Invited Speakers


Yoav Freund (Professor, Computer Science and Engineering, UCSD)

Title: From microscopy images to models of cellular processes

Abstract: The advance of fluorescent tagging and of confocal microscopy is allowing biologists to image biochemical processes at a level of detail that was unimaginable just a few years ago. However, as the analysis of these images is done mostly by hand, there is a severe bottleneck in transforming these images into useful quantitative data that can be used to evaluate mathematical models.

One of the inherent challenges involved in automating this transformation is that image data is highly variable. This requires a recalibration of the image processing algorithms for each experiment. We use machine learning methods to enable the experimentalist to calibrate the image processing methods without having any knowledge of how these methods work. This, we believe, will allow the rapid integration of computer vision methods with confocal microscopy and open the way to the development of quantitative spatial models of cellular processes.

For more information, see


Lukas Käll (Department of Genome Sciences, University of Washington)

Title: Semi-supervised machine learning for shotgun proteomics

Abstract: Shotgun proteomics refers to the analysis of protein mixtures by cleaving the proteins with an enzyme, detecting the resulting peptides with tandem mass spectrometry and subsequently identifying the peptides with database search algorithms. The approach is currently considered the most accurate way to determine the protein content of a complex biological mixture. A limitation of existing machine learning efforts to improve peptide identification in shotgun proteomics datasets is that they are based on fixed training sets and are hence unable to compensate easily for variations in mass spectrometry conditions. Instead of curating representative training sets for individual conditions, which in most cases is not practically feasible, we have devised algorithms that are capable of learning directly from the individual shotgun proteomics datasets that we want to classify. Using semi-supervised learning to discriminate between correct and incorrect spectrum identifications, we correctly assign peptides to up to 77% more spectra, relative to a fully supervised approach.


Pamela A. Silver (Professor, Department of Systems Biology, Harvard Medical School and Director of the Harvard University Graduate Program in Systems Biology)

Title: Designing Biological Systems

Abstract: Biology presents us with an array of design principles that extend beyond what is normally found in silico. However, we don't yet know how to make facile use of what we know, and there is a lot more to learn. As a start, we are interested in using the foundations of biology to engineer cells in a simple and logical way to perform certain functions. In doing so, we learn more about the fundamentals of biological design as well as engineer useful devices with myriad applications. For example, we are interested in building cells that can perform specific tasks, such as counting, measuring and remembering past events. Moreover, we design and construct proteins and cells with predictable biological properties that not only teach us about biology but also serve as potential therapeutics, cell-based sensors and factories for generating bio-energy.


Lodewyk Wessels (Professor, Bioinformatics and Statistics group, Netherlands Cancer Institute, Amsterdam, The Netherlands)

Title: Outcome prediction in breast cancer


Background: Michiels et al. (Lancet 2005; 365: 488-92) employed a resampling strategy to show that the genes identified as predictors of prognosis from resamplings of a single gene expression dataset are highly variable. The genes most frequently identified in the separate resamplings were put forward as a "gold standard". On a higher level, breast cancer datasets collected by different institutions can be considered as resamplings from the underlying breast cancer population. The limited overlap between published prognostic signatures confirms the trend of signature instability identified by the resampling strategy. Six breast cancer datasets, totaling 947 samples, all measured on the Affymetrix platform, are currently available. This provides a unique opportunity to employ a substantial dataset to investigate the effects of pooling datasets on classifier accuracy, signature stability and enrichment of functional categories.

Results: We show that the resampling strategy produces a suboptimal ranking of genes, which cannot be considered to be a "gold standard". When pooling breast cancer datasets, we observed a synergetic effect on the classification performance in 73% of the cases. We also observe a significant positive correlation between the number of datasets that are pooled, the validation performance, the number of genes selected, and the enrichment of specific functional categories. In addition, we have tested five hypotheses that have been postulated as an explanation for the limited overlap of signatures.

Conclusions: The limited overlap of current signature genes can be attributed to small sample size. Pooling datasets results in more accurate classification and a convergence of signature genes. We therefore advocate the analysis of new data within the context of a compendium, rather than analysis in isolation.


Alain Chariot (Laboratory of Medical Chemistry, Unit of Signal Transduction, GIGA-research, University of Liège, Belgium)

Title: Deciphering the molecular mechanisms underlying human diseases through interactome studies: a molecular approach

Abstract: Establishing the interactome of any given signalling protein is a powerful way to better understand its biological roles and to determine the extent to which this interactome is specifically altered in human diseases. We have been using the yeast two-hybrid approach to decipher the signalling pathways regulated by two families of transcription factors, namely NF-κB and IRFs. Both families have deregulated, constitutive activities in a variety of solid and haematological cancers as well as in chronic inflammatory and neurodegenerative disorders.

Our recent interactome data not only highlighted where, when and how these signalling proteins are involved in signal transduction but also helped us to better understand how the post-translational modifications of those proteins regulate their function. We will present examples of ongoing research projects in our laboratory dedicated to the establishment of interaction networks and demonstrate how those networks help to explain why their deregulation leads to disease.


Heribert Hirt (URGV Plant Genomics Institute, Paris, France & Department of Plant Molecular Biology, University of Vienna, Austria)

Title: Phosphoproteomic approaches to study stress signal transduction networks in plants

Abstract: We are interested in studying protein kinase networks that function in environmental stress responses. As such, we have identified the MEKK1-MKK2-MPK4 signalling pathway, which plays a role in resistance to both biotic and abiotic stresses (Teige et al., 2004, Nakagami et al., 2006, Brader et al., 2007). To obtain a more global view on signalling, the state of multiple signal pathways under any one condition and time is monitored by phosphoproteomics and phosphosite-specific microarrays (de la Fuente van Bentem et al., 2007). On the basis of these data, system hypotheses are developed to undergo reiterative experimental testing and remodeling. As an example of the usefulness of this approach, I will discuss recent work on the plant-microbe interaction system of Agrobacterium and Arabidopsis.

  • Teige, M., Scheikl, E., Eulgem, T., Doczi, R., Ichimura, K., Shinozaki, K., Dangl, J.L., and Hirt, H. (2004) The MKK2 pathway mediates cold and salt stress signaling in Arabidopsis. Mol. Cell 15, 141-152.
  • Nakagami, H., Soukupova, H., Schikora, A., Zarsky, V., and Hirt, H. (2006) A mitogen-activated protein kinase kinase kinase mediates reactive oxygen species homeostasis in Arabidopsis. J. Biol. Chem. 28, 3267-78.
  • Brader, G., Djamei, A., Teige, M., Palva, T., and Hirt, H. (2007) The MAP kinase kinase MKK2 affects disease resistance in Arabidopsis. Mol. Plant Micr. Int. 20, 589-596.
  • de la Fuente van Bentem, S. and Hirt, H. (2007) Using phosphoproteomics to reveal signalling dynamics in plants. Trends Plant Sci. 12, 404-409.
  • Djamei, A., Pitzschke, A., Nakagami, H., Rajh, I., Hirt, H. (2007) Trojan horse strategy in Agrobacterium transformation: Abusing MAPK defense signaling. Science 318,


Marc Muller (Unit of Molecular Biology and Genetic Engineering, GIGA-research, University of Liège, Belgium)

Title: The zebrafish as a small vertebrate model system for bone development and homeostasis

Abstract: Small fish models, mainly zebrafish (Danio rerio) and medaka (Oryzias latipes), have been used for many years as powerful model systems for vertebrate developmental biology. Moreover, these species are increasingly recognized as valuable systems to study vertebrate physiology, pathology, pharmacology and toxicology. In recent years, analysis of gene function by mutation or genetic manipulation has shown that the homologs of many genes previously described to be involved in bone development and homeostasis in mammals also play very similar roles in small fish species. Bone physiology is affected by homologous genes in mammals and zebrafish. Thus, small fish models represent a valuable tool to investigate bone development and pathology.

Small fish species present many advantages for studying development: transparent embryos, external and rapid development, and the possibility of large-scale mutagenesis screening. Further advantages include the large number of embryos from a single clutch, small size, and easy containment in water tanks. Many technologies for visualizing and characterizing bones, such as specific staining or fluorescent transgenic animals, have been adapted to small fish species and can be routinely performed on large numbers of larvae. Furthermore, sequencing and annotation of the zebrafish genome are close to completion, making whole-genome analysis feasible.

Our principal objective is to study bone pathologies in zebrafish, such as osteoporosis induced by menopause or prolonged space flight. We investigate the changes induced by mutations, bone-metabolizing drugs or microgravity in small fish species. One approach is to combine whole-genome methods, such as microarray expression analysis, chromatin immunoprecipitation (ChIP) or proteomics, with a special emphasis on bone-related genes. Data are obtained by microgravity simulation on the ground and compared to the changes observed in space. A complementary strategy is to carry out automated in vivo real-time observations of transgenic larvae expressing a fluorescent reporter protein in bone-related structures.


Bernard Thienpont (Center for Human Genetics, University of Leuven, Belgium)

Title: Endeavour pinpoints genes causing cardiac defects in regions identified by aCGH

Abstract: Array Comparative Genomic Hybridisation (aCGH) is a novel tool for high-resolution detection of submicroscopic chromosomal insertions or deletions (indels). It opens opportunities in diagnostics as well as in the identification of novel loci involved in the patient's phenotype. We analysed 130 patients with an idiopathic syndromic congenital heart defect (CHD) by aCGH at 1 Mb resolution, resulting in the detection of causal imbalances in 22 patients (17%).

All indels, as well as indels and gene mutations described in the CHD literature, were collected in a centralized repository, CHDWiki, which allows collaborative annotation of the genome. In 50% of the cases (11/22), the indel affects a gene annotated to cause CHDs.

The other indels pinpoint regions that contain novel candidate genes for CHD. To identify these genes, an /in silico/ prioritisation algorithm (based on Endeavour) was developed. Extensive /in silico/ testing demonstrated a high discriminative power. The results of prioritizing genes from the indel regions were further verified by analysing the expression of 45 high-ranking genes by /in situ/ hybridisation on developing zebrafish embryos. These analyses supported the involvement of two novel genes in human CHD: /BMP4/ and /HAND2/.

In conclusion, we show that aCGH can provide an etiological diagnosis in 17% of patients with a syndromic CHD. It can moreover contribute to the discovery of genes causing CHD in humans, and drive research on how they contribute to normal and pathogenic cardiovascular development.