At the end of the course the student is able to:
- detect homology between sequences using multiple techniques, ranging from pairwise sequence comparisons to comparisons at the level of sequence profiles.
- build a hidden Markov model (HMM) from a sequence alignment and explain how it is constructed.
- derive phylogenies and interpret their results in the context of what is known about the sequences and the species they come from.
- predict RNA secondary structures and use pattern recognition techniques on DNA sequences.
- compare genomes at the level of shared orthologous genes and interpret the results.
- predict functional relationships between proteins by integrating genomics data using Bayesian methods.
- explain the theories about metabolic pathway evolution and how they can be tested using sequence data.
The topic of this course is how to convert genomic sequence data and other types of genomics data into biological knowledge and understanding. Sequence analysis and prediction of protein function are performed on multiple levels, ranging from the comparison of two protein sequences to the prediction of their function by comparing multiple genomes and analyzing the topology of protein interaction networks.
The course starts with the elementary steps of sequence analysis, including gene prediction (which stretches of DNA encode proteins or RNA), sequence comparison (how do we know that two sequences are homologous, and what is the best method to determine sequence homology), and RNA structure prediction (e.g. how do we predict miRNAs).
After covering the most elementary steps and their applications, we will move to the analysis of complete genomes and their comparison. Here we will show how one can predict which pathways are encoded in a genome, and also how genomes can be used to predict new pathways. Furthermore, we cover aspects of systems biology directly related to genomics data: how can we integrate genomics data to obtain reliable predictions about e.g. the function of a protein, how can genomics data be represented as networks, and how can such networks be analyzed? As molecular sequences are a direct result of evolution, the course also teaches how to use evolutionary information in the analysis of sequence data, and how to separate the effects of selection from those resulting from mutation pressures.
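
As an illustration of what integrating genomics data with Bayesian methodology can look like, the sketch below combines several evidence types into one probability that two proteins are functionally related. It is a minimal naive Bayes example, not the method taught in the course: it assumes the evidence types are independent, and the function name `posterior_probability`, the prior and all likelihood ratios are hypothetical values chosen for illustration.

```python
from math import prod
from typing import Iterable


def posterior_probability(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Naive Bayes update on odds: posterior odds = prior odds * product of LRs.

    Assumes the evidence types are conditionally independent (an assumption
    of this sketch, not a claim about any particular data set).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical likelihood ratios: how much more often each observation is
# seen for functionally related protein pairs than for unrelated pairs.
evidence = {
    "co-expression": 5.0,
    "conserved gene neighbourhood": 8.0,
    "phylogenetic co-occurrence": 3.0,
}

# Hypothetical prior: 1% of random protein pairs are functionally related.
p = posterior_probability(0.01, evidence.values())
print(f"Probability of a functional relationship: {p:.3f}")
```

Each evidence type alone is weak here, but multiplying the likelihood ratios shows how several independent observations together can shift a low prior substantially.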