Location: MTZ Seminar Room, Pauwelsstr. 19, 3rd floor, corridor B, room 3.04.
Dates: Mondays, 9:30-12:30 (starting 16.04.2018)
Prerequisite (desirable): Introduction to Bioinformatics
Credits: 7 (10 with extra work)
Lecturers: Ivan G. Costa & Zhijian Li
Evaluation: 20% prototypes / 60% final project / 20% presentation
Campus Description: Bioinformatik Praktikum
Next-generation sequencing (NGS) allows the measurement of molecular characteristics of individuals on a genome-wide scale. Applying NGS methods to large patient groups enables precision medicine, i.e. finding genetic features that guide medical treatment. The low-level analysis of NGS data poses substantial computational and statistical challenges. NGS data sets are typically large (1 to 100 GB per sample/patient) and require efficient computational strategies for analysis and storage. Moreover, NGS data contain artifacts and noise, which reduce the reliability of predictions and lead to errors.
In this practical course, we will focus on the problem of detecting protein binding sites from open chromatin data. Groups will propose and implement statistical models or machine learning methods for the detection of putative binding sites within high-dimensional genomic data. The proposed tools will be used to analyze public medical genomic data from the Human Epigenome, ENCODE or the ENCODE DREAM challenge. Students will learn the computational pipelines necessary for the analysis of sequencing data, including quality control, alignment and post-processing steps. We will use the high-performance computing cluster of the ITC RWTH Aachen as the computational platform for this course.
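To give a feel for the kind of problem involved, the sketch below scores candidate binding sites on a toy vector of per-base cleavage counts. It assumes, purely for illustration, that a bound protein shows up as a short stretch of low cleavage flanked by higher cleavage. The function names, the window sizes and the toy signal are hypothetical and are not the method used or required in the course.

```python
from typing import List, Tuple


def footprint_scores(cuts: List[int], center: int = 10, flank: int = 5) -> List[Tuple[int, float]]:
    """Slide a flank/center/flank window over per-base cleavage counts.

    Returns (center_start, score) pairs, where score is the mean flank
    count minus the mean center count. Higher scores mean a deeper dip.
    """
    width = center + 2 * flank
    if len(cuts) < width:
        raise ValueError("signal is shorter than one scoring window")
    scores = []
    for start in range(len(cuts) - width + 1):
        left = cuts[start:start + flank]
        mid = cuts[start + flank:start + flank + center]
        right = cuts[start + flank + center:start + width]
        flank_mean = (sum(left) + sum(right)) / (2 * flank)
        mid_mean = sum(mid) / center
        scores.append((start + flank, flank_mean - mid_mean))
    return scores


def best_footprint(cuts: List[int], center: int = 10, flank: int = 5) -> Tuple[int, float]:
    """Return the (center_start, score) pair with the highest score."""
    return max(footprint_scores(cuts, center, flank), key=lambda s: s[1])


# Toy signal (made up): high cleavage, a 10 bp protected region, high cleavage.
signal = [8] * 10 + [1] * 10 + [8] * 10
position, score = best_footprint(signal)
print(position, score)
```

On real data the counts would come from aligned DNase-seq or ATAC-seq reads, and a statistical or supervised model would replace this simple difference of means.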
16.04.2018 – Introduction to Bioinformatics and Next-Generation Sequencing
- ChIP-seq protocol
- Open chromatin protocols (DNase-seq and ATAC-seq)
- Transcription factor binding sites
23.04.2018 – Practical Course in NGS data analysis
30.04.2018 – Introduction to the Project
- cancer clustering (data are not public; permission is required)
- DREAM challenge (too much data)
- supervised learning for footprinting
07.05.2018 to 09.07.2018 – Project Development
16.07.2018 – Project Presentation
R. Durbin, S. Eddy, A. Krogh, G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1999.
Online material on NGS