The OpenAIRE datathon aims to encourage developers and data scientists to analyse the OpenAIRE Information Space, with the intent of improving its consumption by users and third-party services. The OpenAIRE Information Space consists of a scholarly communication graph interlinking publications, datasets, software, research organizations, funders, and projects. The graph is the result of harvesting metadata from around 3000 data providers, harmonizing that metadata, and keeping or inferring links between the graph objects it describes. Links are inferred by text-mining a pool of around 6 million Open Access article full-texts. The graph counts around 60M objects, is openly accessible via APIs and a web portal, and is used today to offer research impact statistics (e.g. number of products linked to given funders), Open Access trends (e.g. Open Access ratio of products published by given funders), and discovery of interlinked scholarly products (e.g. articles linked to datasets, software linked to articles for communities).
The datathon encourages teams of computer scientists, data scientists and experts from other fields to join the challenge of studying and analysing the OpenAIRE graph to enhance its discovery and statistical capabilities, in order to better serve the mission of Open Science. Buzz-topics leading the challenge are:
- Enabling multi-disciplinary or discipline-specific discovery or stats functionality
- Novel techniques to enable measurement of scientific impact, e.g. counters, links, provenance
- Novel techniques to measure scientific impact, e.g. measures of quality
- Enabling reproducibility, e.g. re-use oriented metadata, meaningful interlinking of objects
- De-duplication of the information space, e.g. disambiguation of authors, disambiguation of organizations
Data analysis can be performed using any cutting-edge methodology and technology, with the intention of deriving higher-quality statistics and enrichments from the original data collection, in order to make it more usable and interesting to the intended users. Teams can take advantage of the data analysis tools made available through this platform.
The two teams with the most outstanding and innovative solutions will win the datathon and be awarded a prize.
To register, a team provides:
- The name of the Team (if none is provided, the name of the contact will be used);
- The name and email of the Team Contact, who will be granted access to the datasets and be the reference for the team;
- The names and emails of the rest of the team members.
Registered members will have privileged access to the datathon portal functionalities, be able to exchange messages with the organizers or other participants, and have access to the datasets. Note that, as a registered user, your profile can be automatically enriched with LinkedIn information.
Teams can register any time before the 15th of January 2018.
This portal is dedicated to the participants of the first OpenAIRE datathon. Here they can find input datasets and instructions, ask the organizers questions, and deposit their results by the end of the datathon. For any information you may need, contact the organizers at firstname.lastname@example.org
The OpenAIRE datathon will start on the 30th of November 2017 and end on the 28th of February 2018. During these three months, Teams can register at any time until the 15th of January 2018; after that date, registrations will be closed.