Content-based recommendation system for the examples in sphinx-gallery
2023-08-17, HS 120

The gallery of your project might group the examples by module, by use case, or by some other logic. But as examples grow in complexity, they may become relevant to several groups. In this talk we discuss some possible solutions and their drawbacks to motivate the introduction of a new feature to sphinx-gallery: a content-based recommendation system.


Imagine a scikit-learn example on text clustering using silhouette scores. As a maintainer, would you assign it to the sklearn.cluster, the sklearn.feature_extraction.text or the sklearn.metrics group of examples? As a user, where would you look for it?

Some solutions, such as adding manually curated tags, have been proposed to cross-link examples that can be grouped by different logics, but they have the disadvantage of requiring maintenance and consensus. Instead, we could have a recommender system based on content similarity (a nearest-neighbors model on tf-idf features) that automatically links to the most relevant related content. These links could be added at the end of each example.
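
To make the idea concrete, here is a minimal sketch of a tf-idf nearest-neighbors recommender. It uses only the Python standard library and is not the implementation proposed for sphinx-gallery. The example names, the example texts, and the helper functions `tfidf_vectors` and `recommend` are all hypothetical and chosen for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy corpus: example name -> text of the example (made up for illustration).
examples = {
    "plot_text_clustering": "clustering text documents with kmeans and silhouette score",
    "plot_kmeans_silhouette": "silhouette score analysis for kmeans clustering",
    "plot_text_features": "text feature extraction with tfidf vectorizer for documents",
    "plot_roc_curve": "roc curve metrics for a binary classifier",
}


def tfidf_vectors(docs):
    """Return one L2-normalised tf-idf vector (as a dict) per document."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        # Smoothed idf: a term present in every document still gets weight 1.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(weight * weight for weight in vec.values()))
        vectors[name] = {term: weight / norm for term, weight in vec.items()}
    return vectors


def recommend(name, vectors, n_recommendations=2):
    """Rank the other examples by cosine similarity to the example `name`."""
    query = vectors[name]
    # Vectors are unit length, so the dot product is the cosine similarity.
    scores = {
        other: sum(weight * vec.get(term, 0.0) for term, weight in query.items())
        for other, vec in vectors.items()
        if other != name
    }
    return sorted(scores, key=scores.get, reverse=True)[:n_recommendations]


vectors = tfidf_vectors(examples)
related = recommend("plot_text_clustering", vectors)
print(related)
```

In this toy corpus the text clustering example is linked to the two examples that share vocabulary with it, whichever group they were filed under, while the unrelated ROC curve example is left out. A real gallery would presumably need proper tokenization and stop-word handling, which this sketch omits.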

Libraries with large example galleries, such as scikit-learn and matplotlib, may benefit from this new feature.

For more information visit https://github.com/sphinx-gallery/sphinx-gallery/pull/1125


Expected audience expertise: Python

none

Category [Community, Education, and Outreach]

Other

Expected audience expertise: Domain

none

Abstract as a tweet

Keep the example gallery of a project easy to navigate with the help of an example-recommender system.

Project Homepage / Git

https://sphinx-gallery.github.io/stable/index.html

I did my PhD in theoretical quantum physics at the National Autonomous University of Mexico (UNAM). I currently work at the INRIA foundation as part of the scikit-learn consortium, mostly in charge of maintaining the scikit-learn documentation.