Course: Managing and Integrating Information in the Life Sciences (3rd edition)


17-21 June 2013, LUMC, Leiden


Marco Roos, Katy Wolstencroft, Jesse van Dam, Frank van Harmelen, Paul Groth and Egon Willighagen

Course Coordinators

Marco Roos and Katy Wolstencroft

Credits and grading

The total study load of the course is 3 EC. There will not be a written examination.


Amsterdam, Netherlands


None. A computer will be provided for the computer sessions, but a personal laptop would be an advantage.


The Life Science data available in the public domain is a vast and growing resource for bioinformatics research. There are over 20 million papers in PubMed and over 1600 biological databases. In many cases, finding and applying the information from these resources is far from trivial. This course will show you new techniques for working with these distributed resources, including the Semantic Web, Linked Data and scientific workflows. It will also focus on methods for using your own data in, or linking it into, this large distributed web of resources.

Target audience

This course is for bioinformaticians who would like to learn about leading-edge data and knowledge integration solutions. You will learn (1) powerful and flexible approaches to data and information management for your bioinformatics application (Semantic Web and Linked Data), (2) how to work with data across remote locations, for instance by applying Web Services and workflows, and (3) how to publish your own data to get the most credit and make it available and reusable for the rest of the community.
We assume a basic understanding of bioinformatics programming for the hands-on sessions.

Course description

This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. After following this course, students should be able to start creating their first applications based on these technologies or make more informed design decisions for their current applications.

In this course you will learn:

  1. how the ‘Linked Data’ principle works and how it can be applied for ‘meaningful’ data integration.
  2. how to expose your local data with rich metadata for use in other systems.
  3. how Web Services and workflows can be used to analyse distributed data.
  4. how to make publishable artefacts from your data for which you can get scientific credit.

Concept programme

Day 1: Introductions to Data integration

  • An introduction to the latest techniques in data and knowledge management
  • A semantic web primer
  • Hands on: A practical introduction to the semantic web – RDF and querying RDF

Day 2: Data to understandable data (generating and sharing data for reuse)

  • An introduction to minimum information models, identifiers and data standards
  • Hands on: A practical introduction to data standards with RightField and BioPortal
  • RDF and dataset guidelines

Day 3: Data to understandable data, part 2 (publishing and sharing)

  • An introduction to Nanopublications – a new way of publishing your data and results
  • Hands on: Creating and using Nanopublications

Day 4: Integrating and using data (part 1)

  • An introduction to workflows and distributed
  • Hands on: A practical introduction to using the Taverna workbench
  • An introduction to Research Objects – describing the how and why of your experiments

Day 5: Integrating and using data (part 2)

  • An introduction to provenance – recording the how and why of your experiments
  • The OpenPhacts project: using the Semantic Web for large-scale research projects
  • Hands on: Exploring OpenPhacts data

More information

For more information about the course programme, contact Marco Roos or Katy Wolstencroft.
For more information about registration or logistics, contact Celia van Gelder.


More information about registration can be found on the enrollment page. You can pre-register for the course via the pre-registration form.
