A framework for enabling software comparison and classification

Description

The number of scientific products, including scientific software, has grown steadily in recent years. This growth makes it difficult for researchers to keep up with the latest code and publications available. A large body of research has attempted to classify similar papers and literature. However, to date there are no good approaches for finding similar or related code. In this project, the students will analyze different unsupervised methods for finding similarities between scientific software based on a) an automated analysis of their dependencies; and b) a classification of the main functionality of software components.
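To make the dependency-based idea concrete, here is a minimal sketch of one possible unsupervised similarity measure: the Jaccard index over each package's set of declared dependencies. This is only an illustration under stated assumptions, not the method the project will necessarily adopt. The package names (`tool_a`, `tool_b`, `tool_c`), their dependency lists, and the helper names `jaccard` and `rank_pairs` are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two dependency collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two packages with no dependencies share nothing measurable.
        return 0.0
    return len(a & b) / len(union)


def rank_pairs(dependencies):
    """Score every pair of packages and sort from most to least similar."""
    pairs = []
    for left, right in combinations(sorted(dependencies), 2):
        score = jaccard(dependencies[left], dependencies[right])
        pairs.append((score, left, right))
    # Highest score first; ties broken alphabetically for stable output.
    pairs.sort(key=lambda item: (-item[0], item[1], item[2]))
    return pairs


# Hypothetical input: package name -> list of declared dependencies.
example_dependencies = {
    "tool_a": ["numpy", "pandas", "scikit-learn"],
    "tool_b": ["numpy", "pandas", "matplotlib"],
    "tool_c": ["requests", "flask"],
}

ranked = rank_pairs(example_dependencies)
for score, left, right in ranked:
    print(f"{left} ~ {right}: {score:.2f}")
```

In this toy input, `tool_a` and `tool_b` share two of four distinct dependencies, so they score 0.50, while `tool_c` shares none with either. A real analysis would first need to extract dependencies automatically (for example from requirements files) and would likely weight very common dependencies less heavily.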

Students

Advisors

Skills required by the team

  • Python
  • Machine Learning
  • scikit-learn (sklearn)
  • Data Manipulation