17-21 June 2013, LUMC, Leiden
Marco Roos, Katy Wolstencroft, Jesse van Dam, Frank van Harmelen, Paul Groth and Egon Willighagen
Marco Roos and Katy Wolstencroft
Credits and grading
The total study load of the course is 3 EC. There will not be a written examination.
None. A computer will be provided for computer sessions, but a personal laptop would be an advantage.
The amount of life science data available in the public domain is a vast and growing resource for bioinformatics research. There are over 20 million papers in PubMed and over 1600 biological databases. In many cases, finding and applying the information from these resources is far from trivial. This course will show you new techniques for working with these distributed resources, including the Semantic Web, Linked Data and scientific workflows. It will also focus on methods for using your own data in, or linking it into, this large distributed web of resources.
This course is for bioinformaticians who would like to learn about leading-edge data and knowledge integration solutions. You will learn (1) powerful and flexible approaches to data and information management for your bioinformatics application (Semantic Web and Linked Data), (2) how to work with data across remote locations, for instance by applying Web Services and workflows, and (3) how to publish your own data to get the most credit and make it available and reusable for the rest of the community.
We assume a basic understanding of bioinformatics programming for the hands-on sessions.
This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. After following this course, students should be able to start creating their first applications based on these technologies or make more informed design decisions for their current application.
In this course you will learn:
- how the ‘Linked Data’ principle works and how it can be applied for ‘meaningful’ data integration.
- how to expose your local data with rich metadata for use in other systems.
- how Web Services and workflows can be used to analyse distributed data.
- how to make publishable artefacts from your data for which you can get scientific credit.
Day 1: Introduction to data integration
- An introduction to the latest techniques in data and knowledge management
- A semantic web primer
- Hands on: A practical introduction to the semantic web – RDF and querying RDF
Day 2: Data to understandable data (generating and sharing data for reuse)
- An introduction to minimum information models, identifiers and data standards
- Hands on: A practical introduction to data standards with RightField and Bioportal
- RDF and dataset guidelines
Day 3: Data to understandable data, part 2 (publishing and sharing)
- An introduction to Nanopublications – a new way of publishing your data and results
- Hands on: Creating and using Nanopublications
Day 4: Integrating and using data (part 1)
- An introduction to workflows and distributed
- Hands on: A practical introduction to using the Taverna workbench
- An introduction to Research Objects – describing the how and why of your experiments
Day 5: Integrating and using data (part 2)
- An introduction to provenance – recording the how and why of your experiments
- The OpenPhacts project: using the Semantic Web for large-scale research projects
- Hands on: Exploring OpenPhacts data