Capturing the Provenance of Data Analysis Using the PROV Standard

Description

Documenting how a data analysis result was obtained means recording the software, software settings, and datasets used to produce it, so that the result can be properly explained. The current ASSET interface lets users document the provenance of a data analysis regardless of the infrastructure they used (R scripts, scikit-learn, etc.). This project will focus on capturing provenance records for data science projects and exporting those records using the W3C PROV standard. It will also develop tools that mine provenance data to find common patterns of use.
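
As a rough illustration of what an exported record might look like, the sketch below assembles one analysis run into a dictionary that loosely follows the PROV-JSON layout (entities, an activity, an agent, and the relations between them). It is not the ASSET export format, which is not described here. The function name build_prov_record, the ex: prefix and namespace URL, the ex:settings attribute, and the sample run are all hypothetical and chosen only for this example.

```python
import json


def build_prov_record(run_id, software, settings, inputs, output):
    """Sketch of a PROV-JSON-style record for a single analysis run.

    Assumption: all names are simple identifiers (no spaces), so they can
    be used directly as local names under the hypothetical "ex" prefix.
    """
    activity = f"ex:{run_id}"
    agent = f"ex:{software}"
    result = f"ex:{output}"

    doc = {
        # Hypothetical namespace, used only for this illustration.
        "prefix": {"ex": "http://example.org/"},
        "entity": {result: {"prov:label": output}},
        # The software settings are stored as a JSON string attribute.
        "activity": {
            activity: {"ex:settings": json.dumps(settings, sort_keys=True)}
        },
        "agent": {agent: {"prov:label": software}},
        "used": {},
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": result, "prov:activity": activity}
        },
        "wasAssociatedWith": {
            "_:a1": {"prov:activity": activity, "prov:agent": agent}
        },
    }

    # Each input dataset becomes an entity that the activity used.
    for i, name in enumerate(inputs, start=1):
        doc["entity"][f"ex:{name}"] = {"prov:label": name}
        doc["used"][f"_:u{i}"] = {
            "prov:activity": activity,
            "prov:entity": f"ex:{name}",
        }
    return doc


if __name__ == "__main__":
    record = build_prov_record(
        run_id="run1",
        software="scikit-learn",
        settings={"n_estimators": 100},
        inputs=["train.csv"],
        output="model_report",
    )
    print(json.dumps(record, indent=2))
```

Keeping the record as a plain dictionary means it can be serialized with json.dumps and stored or queried later, for example to count which software and dataset combinations recur across runs.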

Advisors

Skills Required by the team

  • Knowledge Graphs
  • JavaScript
  • Firebase
  • UI Development