Emma Garlock


Interventions

04/06
11:05
5 minutes
Pirates vs Paywalls: Preliminary Investigation into the Utility of Sci-Hub Download Logs for Identifying Trends in User Behaviour.
Emma Garlock

Introduction
Sci-Hub is a well-known pirate repository that allows users to circumvent paywalls and download academic articles relating to various health sciences and STEM subjects. While Sci-Hub may increase access to information in a timely manner and may flag issues in established publishing practices, there are ethical, legal and security risks associated with the platform's use. This research uses data made public by Sci-Hub to better understand Canadian Sci-Hub user behaviour, which can help inform potential approaches for discussing Sci-Hub usage with users in need of health sciences and biomedical information.
Methods
This research analyzed the Sci-Hub download log for 2017. IP information was used to identify Canadian downloads. Other information analyzed includes the date of download, user city, and DOI of the accessed article. The DOIs of top articles were loaded into Zotero to retrieve publication dates and titles for further analysis.
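The filtering and counting steps described above could be sketched roughly as follows. This is an illustrative Python sketch, not the author's code. It assumes a tab-separated log whose columns are timestamp, DOI, IP identifier, user identifier, country, city, latitude and longitude; that layout, the sample rows, and the function name `summarize_canadian_downloads` are all assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows in the ASSUMED column layout:
# timestamp, DOI, IP id, user id, country, city, latitude, longitude
SAMPLE_LOG = (
    "2017-01-03 10:15:00\t10.1000/aaa\tip1\tu1\tCanada\tToronto\t43.65\t-79.38\n"
    "2017-01-04 11:20:00\t10.1000/aaa\tip2\tu2\tCanada\tVancouver\t49.28\t-123.12\n"
    "2017-02-10 09:05:00\t10.1000/bbb\tip1\tu1\tCanada\tToronto\t43.65\t-79.38\n"
    "2017-02-11 14:45:00\t10.1000/aaa\tip3\tu3\tGermany\tBerlin\t52.52\t13.40\n"
    "2017-03-01 16:30:00\t10.1000/ccc\tip4\tu4\tCanada\tToronto\t43.65\t-79.38\n"
)


def summarize_canadian_downloads(lines, country="Canada", top_n=3):
    """Count downloads per city, per month, and per DOI for one country."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    by_city, by_month, by_doi = Counter(), Counter(), Counter()
    for row in reader:
        # Skip malformed rows and downloads from other countries.
        if len(row) < 6 or row[4] != country:
            continue
        timestamp, doi, city = row[0], row[1], row[5]
        by_city[city] += 1
        by_month[timestamp[:7]] += 1  # "YYYY-MM" prefix of the timestamp
        by_doi[doi] += 1
    return {
        "cities": by_city.most_common(top_n),
        "months": sorted(by_month.items()),
        "dois": by_doi.most_common(top_n),
    }


summary = summarize_canadian_downloads(io.StringIO(SAMPLE_LOG))
```

The most frequent DOIs in `summary["dois"]` would then be the ones handed to a reference manager such as Zotero to look up titles and publication dates.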
Results
Results will showcase top Canadian cities for Sci-Hub use, temporal trends for 2017, and bibliographic information of frequently accessed articles.
Discussion
This research provides some of the first data-driven insights into Sci-Hub user behaviour in a Canadian context. Currently, only the 2017 data is publicly available, but Sci-Hub is clear about its intentions to make full download logs available in the future. The preliminary analyses shown here provide a blueprint for others who may be interested in conducting their own analyses on Sci-Hub usage in their own contexts.

Teaching & Learning
2306/2309
05/06
14:10
20minutes
Investigating the Impact of the NLM Automatic Indexer on Information Retrieval Using Citation Metadata
Emma Garlock

Introduction
As the implementation of automatic indexing for MeSH terms becomes better known, concerns are being raised in the field of health sciences librarianship about how these changes affect established searching practices. While other ongoing research investigates the accuracy and reliability of automatic indexing on a per-citation basis, this work analyzes overall trends and performance of the algorithm for chemistry and genetics research.
Methods
The relevant information fields of 4302 citations published between November 2020 and March 2023 were extracted via NLM’s efetch and xtract tools on July 25th, 2024. These data were combined with MeSH data downloaded from the Ontology Lookup Service on September 4th, 2024. All analyses were completed in R. To evaluate the potential impact of stemming on term overlap, Porter Stemming was applied using the Tokenizer and SnowballC packages.
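To illustrate why stemming can change term overlap between two fields, here is a small sketch. It is written in Python for illustration only; the original analysis was done in R. The overlap measure (Jaccard similarity), the example strings, and the function names are assumptions, and `crude_stem` is a toy suffix stripper standing in for Porter stemming, not the Porter algorithm itself.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def crude_stem(token):
    """Toy suffix stripper: a stand-in for Porter stemming, NOT the real algorithm."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def jaccard(a, b):
    """Jaccard similarity of two token collections (assumed overlap measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def field_overlap(field_a, field_b, stem=None):
    """Overlap between two text fields, optionally after stemming each token."""
    tokens_a, tokens_b = tokenize(field_a), tokenize(field_b)
    if stem is not None:
        tokens_a = [stem(t) for t in tokens_a]
        tokens_b = [stem(t) for t in tokens_b]
    return jaccard(tokens_a, tokens_b)


# Hypothetical title and MeSH-style term list, for illustration only.
title = "Sequencing mutations in cancer genes"
mesh_terms = "Gene; Mutation; Neoplasms"

raw_overlap = field_overlap(title, mesh_terms)
stemmed_overlap = field_overlap(title, mesh_terms, stem=crude_stem)
```

In this toy case the raw tokens share nothing, while the stemmed tokens share "gene" and "mutation". A real analysis would pass an actual Porter stemmer as the `stem` argument, for example `wordStem` from the SnowballC package in R.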
Results
Results generally support the claim that automatic indexing decreases the time required for a citation to receive indexing. How well search fields overlap varies between indexing methods, but overall topic harmonization increases as terms are tokenized and stemmed. Manually indexed citations tend to have a higher degree of field overlap, which aligns with the finding that the average number of MeSH terms is higher for manually indexed citations.
Discussion
This work builds on feedback from previous presentations and provides a more detailed, larger-scale investigation into the automatic indexing algorithm and its impact on health librarianship.

AI
2306/2309