Representing Gender in Library Catalogue: Developing Multilingual Homosaurus for Automated Subject Indexing
Global, general-purpose knowledge organization systems such as LCSH and DDC have been criticized by researchers and critics since the 1970s for sexist and homophobic bias, and this bias has left LGBTQ+ resources extremely hard to find in India. This research builds on the Homosaurus, a comprehensive domain-specific vocabulary that is currently available only in English and only in non-interactive form. It develops a multilingual, collaborative software framework to host the Homosaurus (starting with Hindi and Bengali) as an interactive, participatory global vocabulary tool. The main deliverable of the study is a multilingual Homosaurus in RDF serialization formats, which serves as the vocabulary backend for a machine learning framework that supports semi-automated indexing of LGBTQ+ documentary resources. The research aims to address the shortage of indexed resources in the LGBTQ+ knowledge domain in Indian libraries by designing and implementing a semi-automated subject indexing system. The prototype uses the following open source tools and open access data sources: (i) Annif as the AI/ML framework; (ii) machine learning backends such as FastText, Omikuji and a neural network; (iii) VocBench to host the multilingual Homosaurus; (iv) Skosify to curate the RDF serializations of the multilingual Homosaurus; (v) Skosmos to provide the user interface for the multilingual Homosaurus; and (vi) open access databases (CrossRef, CoRE, Lens, OpenAlex and Semantic Scholar) to collect and process the required training datasets. The research has three parts: developing the semi-automated indexing framework, evaluating its operational efficiency, and exploring the feasibility of a REST API call-based approach for rapidly indexing a large volume of records in the LGBTQ+ domain.
The proposed semi-automated subject indexing system aims to improve access to LGBTQ+ knowledge and to challenge the biases inherent in existing knowledge organization paradigms.
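To illustrate the REST API call-based approach mentioned above, the following is a minimal sketch of how a client might request subject suggestions from an Annif instance. It is not the authors' implementation. The base URL, the project identifier `homosaurus-en`, and the helper function names are assumptions made for illustration. The sketch further assumes that the instance exposes Annif's `/v1/projects/{project_id}/suggest` endpoint, accepts a form-encoded `text` field with optional `limit` and `threshold`, and returns a JSON object whose `results` list contains `uri`, `label` and `score` entries.

```python
import json
import urllib.parse
import urllib.request

# Assumptions for illustration only: a locally running Annif instance and a
# hypothetical project trained on the English Homosaurus.
ANNIF_BASE_URL = "http://localhost:5000/v1"
PROJECT_ID = "homosaurus-en"


def build_suggest_request(text, project_id=PROJECT_ID, limit=5, threshold=0.0):
    """Build a form-encoded POST request for the suggest endpoint."""
    body = urllib.parse.urlencode(
        {"text": text, "limit": limit, "threshold": threshold}
    ).encode("utf-8")
    url = f"{ANNIF_BASE_URL}/projects/{project_id}/suggest"
    return urllib.request.Request(url, data=body, method="POST")


def parse_suggestions(payload):
    """Return (label, uri, score) tuples sorted by descending score."""
    results = json.loads(payload).get("results", [])
    ranked = sorted(results, key=lambda item: item["score"], reverse=True)
    return [(item["label"], item["uri"], item["score"]) for item in ranked]


def suggest_subjects(text, **kwargs):
    """Send one document's text to the service and return ranked suggestions."""
    request = build_suggest_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return parse_suggestions(response.read().decode("utf-8"))
```

Calling `suggest_subjects(abstract_text, limit=5)` in a loop over bibliographic records would give a cataloguer a ranked list of candidate terms per record to accept or reject, which is the sense in which the indexing stays semi-automated.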
Keywords
Annif, Automated Indexing, Gender Bias, Gender Spectrum, Homosaurus, Machine Learning, Semantic Annotation, Skosmos, VocBench