Practical Course in Bioinformatics/Software lab 2017

Location: MTZ Seminar Room, Pauwelsstr. 19, 3rd Floor, Corridor B, Room 3.04.

Dates: Monday 9:30-12:30 (starting 24.04.2017)

Language: English

Prerequisite (desirable): Introduction to Bioinformatics

Credits: 7 (10 with extra work)

Lecturers: Ivan G. Costa & Fabio Ticconi

Evaluation: 20% prototypes / 60% final project / 20% presentation

Campus Description: Bioinformatik Praktikum

New sequencing technologies based on nanopores improve sequencing capabilities by allowing the sequencing of large DNA molecules. Moreover, nanopores can also capture modifications in DNA, such as DNA methylation. These capabilities come with a higher sequencing error rate (1-2%) than previous next generation sequencing methods (0.1%). As usual, bioinformatics approaches are required to differentiate between technical noise and biologically meaningful variants.

This practical seminar will focus on the use of Hidden Markov models (HMMs) for sequence alignment and error correction of Oxford Nanopore sequencing reads. The lab task will be the implementation of a bioinformatics pipeline that includes the integration of low-level analysis tools (e.g. a short-read aligner) and the development of a method for high-level analysis (e.g. error correction and detection of single-nucleotide polymorphisms). We will use real data provided by the CAMDA 2017 challenges, which include metagenome sequences from both Illumina and Nanopore technologies.
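As a rough illustration of the kind of HMM computation the seminar builds on, the following is a minimal sketch of Viterbi decoding in Python. It is not the course's method or project code: the two hidden states ("AT" and "GC"), all probabilities, and the names `viterbi`, `STATES`, `START_P`, `TRANS_P` and `EMIT_P` are invented here purely as an example.

```python
import math

# Toy two-state HMM over DNA bases. All values below are illustrative
# assumptions, not parameters from the course or from real data.
STATES = ["AT", "GC"]
START_P = {"AT": 0.5, "GC": 0.5}
TRANS_P = {
    "AT": {"AT": 0.9, "GC": 0.1},
    "GC": {"AT": 0.1, "GC": 0.9},
}
EMIT_P = {
    "AT": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for `observations`.

    Works in log-space so that long sequences do not underflow.
    Assumes every probability that is looked up is strictly positive.
    """
    if not observations:
        return []

    # best[t][s]: log-probability of the best path that ends in state s at position t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    # back[t][s]: predecessor of state s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximises the path score into s.
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi("AAAAGGGG", STATES, START_P, TRANS_P, EMIT_P))
```

Calling `viterbi("AAAAGGGG", STATES, START_P, TRANS_P, EMIT_P)` labels each base with its most likely hidden state. In an alignment or error-correction setting the states and emissions would be defined differently (for example match, insertion and deletion states), but the dynamic-programming recursion has the same shape.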

Schedule:

24.04.2017 – Introduction to Bioinformatics and Next Generation Sequencing

8.05.2017 – Intro to HMMs [example]

15.05.2017 – Practical Example of NGS [instructions, data]

21.05.2017 – Introduction to the Project [Problem 1 data, Problem 2 data, E. coli K12 MG1655 Reference]

22.05.2017 to 10.07.2017 – Project Development

17.07.2017 – Project Presentation

Literature

Richard Durbin, A. Krogh, G. Mitchison, S. Eddy, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1999.

Online material on NGS

Video courses

The genomics data science course on Coursera covers many interesting aspects of this course. We recommend the following lectures, which introduce HMMs (Course 13, Course 14, Course 15, Course 16).