Advanced de novo assembly and resolving complex genomic regions


Monday, 1 June 2015 – Wednesday, 3 June 2015


Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, the Netherlands


Yahya Anvar, Johan den Dunnen


This course is made possible by the generous support of 3Gb-TEST and Generade.





This course is targeted at PhD students and Postdocs in Life Sciences or Bioinformatics with basic knowledge of next-generation sequencing (NGS) technology and data analysis. It covers what every biologist should know about de novo assembly and study design, an overview of available technologies and sampling strategies, methodologies and frameworks for de novo assembly (haploid and polyploid genomes), gap closure, quality assessment and functional annotation. In addition, we will showcase recent achievements, novel discoveries and current limitations in resolving complex genomes. The practical sessions are designed to promote lively discussions on how to design your study, best practices, and dos and don’ts, and to provide you with an outlook on how to move from sequencing data to a high-quality assembly and meaningful biological interpretation. After following this course, participants will understand how to design a genome assembly study and will have an overview of the challenges and the various techniques available, so that they can choose the most fitting strategy to produce a high-quality genome assembly.

NB: EPS Postgraduate course ‘Genome Assembly’: The EPS Graduate school in Wageningen is organising a Postgraduate course ‘Genome Assembly’ on April 28-29, 2015. It is a compact, 2-day course offering practical guidelines for genome assembly. It is aimed at PhDs in the green life sciences working on eukaryotic genomes. Both days consist of 4 blocks in which theory is combined with computer exercises in Galaxy. Basic knowledge of NGS data analysis is recommended but not necessary.

Target Audience

This course is primarily targeted at academic researchers such as PhD students and Postdocs in Life Sciences or Bioinformatics with experience in genomics or NGS data analysis. However, participants from the private sector are also welcome. Participants are expected to have experience in NGS data analysis or to have followed the NGS data analysis course.

Course Description

De novo genome assembly is the process of reconstructing a genome from a collection of much shorter sequencing reads. In spite of its importance to biology, high-quality genome assembly is challenging and often requires careful study design and complementary strategies to reliably resolve complex genomes, such as those of polyploid or repeat-rich eukaryotes. This course aims to provide a framework for sampling and study design, the choice of sequencing strategy and appropriate assembly approaches, insightful assessment and evaluation of the draft assembly, as well as potential complementary approaches for resolving complex regions and closing persistent gaps in the genome. In addition, we cover functional annotation of the genome, de novo transcriptomics and comparative genomics to further aid biological interpretation.

This three-day course is structured according to four main themes:

  • What every biologist should know about de novo assembly
  • Technologies
  • Achievements, novel discoveries and current limitations
  • Available methodologies, strategies and common mistakes

Each day will conclude with a keynote from a distinguished scientist in the field to leave the audience with an outlook on success stories and future directions. Practical sessions are designed to encourage lively discussions on how to perform the different steps of the genome assembly process, how to interpret the outcomes, and how to spot common but often overlooked mistakes.

Keynote speakers:

  • Mark Chaisson (University of Washington, USA) – Resolving the complexity of the human genome by single-molecule sequencing
  • Alexandre de Kochko (Centre IRD de Montpellier, France) – Sequencing and assembling the genome of the allotetraploid Coffea arabica

Find more detailed information about the course programme here (PDF file).


The number of participants for lectures and discussion sessions is limited to 60. However, we can only accommodate 30-35 participants during the practical sessions. So, if you would also like to attend the practical sessions, we strongly advise you to register soon! Please note that coffee, tea, soft drinks and lunch will be provided.

For more information and advice on travel/hotel, please contact Ms. Anita Remmelzwaal (LUMC).

Registration for the course is closed.

Previous edition

You can find the website of the previous edition of this course here