BioSB/ELIXIR-NL course: Managing and Integrating Life Science Information (5th edition): Approaches using Linked Data and Semantics

With endorsements for FAIR* data stewardship ranging from Nature Genetics to the G7, and increasing pressure from funders for much stricter data management, FAIR data stewardship skills will be among the most wanted for the next decade. By following this course, you add these skills to your CV and learn cutting-edge semantic techniques to search and integrate health and life science data for efficient, reproducible data science.

* FAIR: Findable, Accessible, Interoperable and Reusable for humans and computers


22-26 January 2018


Flyer of the 2015 edition of the course


Confirmed lecturers: Frank van Harmelen, Andre Dekker

Course Coordinators

Marco Roos and Katy Wolstencroft

Credits and grading

The total study load of the course is 3 EC. There will not be a written examination.


Amsterdam Science Park, The Netherlands

CWI, Science Park 123, Amsterdam (L017)

Hosted by: Amsterdam Data Science


None. We assume participants will bring their own laptop. Please indicate in the remarks box of the registration form if this is not the case.


The amount of Life Science data available in the public domain is a vast and growing resource for bioinformatics research. There are over 20 million papers in PubMed and over 1600 biological databases. In many cases, finding and applying the information from these resources is far from trivial. This course will show you techniques for working with these distributed resources, including the web of Linked Data and scientific workflows. It will also focus on methods for using your own data in, or linking it into, this large distributed Semantic Web of resources, so that your data is FAIR (Findable, Accessible, Interoperable and Reusable).

Target audience

This course is for bioinformaticians who would like to learn about leading-edge data and knowledge integration solutions. You will learn (1) powerful and flexible approaches to data and information management for your bioinformatics application (Semantic Web and Linked Data), (2) how to work with data across remote locations, for instance by applying Web Services and workflows, and (3) how to publish your own data to make it available and reusable for the rest of the community. We assume a basic understanding of bioinformatics programming for the hands-on sessions. The course would suit previous participants of BYOD meetings who would like more hands-on experience of data integration. It would also suit data providers who would like to explore new ways of serving their data or integrating it with other resources.

Course description

This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. After following this course, students should be able to start creating their first applications based on these technologies, or make more informed design decisions for their current applications.

In this course you will learn about:

  1. Linked Data and the Semantic Web technologies that underpin it
  2. How you can use Linked Data for data and knowledge integration in the Life Sciences
  3. Available Linked Data resources in the public domain and large-scale projects that use these resources
  4. How you can integrate your own data with Linked Data resources
  5. How you can combine data integration and analysis over distributed resources, using Web Services and workflows
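
As a rough illustration of the integration idea behind topics 1, 2 and 4, the sketch below represents two data sources as sets of subject-predicate-object triples and merges them through shared identifiers. It is a minimal toy example in plain Python, not course material: the `http://example.org/` namespace, the predicates and all values are hypothetical, and real Linked Data work would normally use an RDF library and SPARQL instead.

```python
# Minimal sketch of Linked Data integration: two independent sources
# use the same URIs, so their triples can be merged without any
# schema mapping. All URIs and values below are hypothetical.

EX = "http://example.org/"  # made-up namespace, for illustration only

# Source A: a (hypothetical) gene database
source_a = {
    (EX + "gene/G1", EX + "label", "Gene one"),
    (EX + "gene/G1", EX + "encodes", EX + "protein/P1"),
}

# Source B: a (hypothetical) disease-annotation resource
source_b = {
    (EX + "protein/P1", EX + "associatedWith", EX + "disease/D1"),
    (EX + "disease/D1", EX + "label", "Disease one"),
}

# Integration is a plain set union because both sources share identifiers.
graph = source_a | source_b


def objects(graph, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?)."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def diseases_for_gene(graph, gene):
    """Follow gene -> protein -> disease links and return disease labels."""
    labels = []
    for protein in objects(graph, gene, EX + "encodes"):
        for disease in objects(graph, protein, EX + "associatedWith"):
            labels.extend(objects(graph, disease, EX + "label"))
    return labels


print(diseases_for_gene(graph, EX + "gene/G1"))
```

Neither source alone can answer the gene-to-disease question; the answer only appears after the two sets of triples are combined, which is the point of agreeing on shared identifiers.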

Tentative programme

Day 1 (January 22)

09:30 Coffee/tea/arrival
10:00 Welcome and course goals (Marco Roos/Katy Wolstencroft)
10:15 Introduction (Frank van Harmelen)
11:15 Coffee
11:30 Define course projects/participants’ interests (for hands-on)
12:00 Lunch
13:00 Conceptual modelling (Marco Roos)
14:00 Presentation of conceptual models (participants)
14:30 Coffee
15:00 RDF tutorial
16:30 Drinks

Day 2 (January 23)

09:30 Recap of day 1
10:00 Introduction to bio-ontologies (Marco Roos)
10:30 Ontology hands-on
11:30 Coffee
11:30 EBI ontology tools and their application (Katy Wolstencroft)
12:00 Introduction to Wikidata (Andra Waagmeester)
12:30 Lunch
13:30 Hands-on with SPARQL (on Wikidata resources)
15:30 Coffee
16:00 Case-time: Define how to apply Ontologies and RDF to your own cases
17:00 End

Day 3 (January 24)

09:30 Recap of day 2
10:00 Guest speaker: Andre Dekker
11:00 Coffee
11:30 OpenPHACTS (Ronald Siebes)
12:30 Lunch
13:30 Workflows introduction (Katy Wolstencroft)
14:00 Workflow hands-on using OpenPHACTS resources
15:30 Coffee
16:00 Case-time: Define how to use workflows (and OpenPHACTS) for your case
17:00 End

Day 4 (January 25)

09:30 Recap of day 3
10:00 FAIRDOM (Katy Wolstencroft)
10:30 Annotating and describing data the FAIR way with FAIRDOM (hands-on)
11:15 Coffee
11:30 Hands-on continued
12:00 Lunch
13:00 FAIR data services introduction (Mark Thompson)
13:45 Hands-on FAIRifying biobank data
15:30 Coffee
16:00 Case-time: FAIRifying data of your case (alternative for health participants: join NIH webinar)
17:00 End (NIH webinar continues)

Day 5 (January 26)

09:30 Recap of day 4
10:00 Case-time: apply what you have learned to your case
11:15 Coffee
11:30 Prepare presentation
12:30 Lunch
13:30 Participant presentations
15:00 Wrap-up and closing

More information

For more information about the course programme you can contact Marco Roos or Katy Wolstencroft.
For more information about the registration or logistics you can contact the BioSB office.