Determining subject headings of documents using information retrieval models

Evi Yulianti, Laksmita Rahadianti

Abstract


A subject heading is a controlled-vocabulary term that describes the topic of a document, and it is important for finding and organizing library resources. Assigning appropriate subject headings to a document, however, is a time-consuming process. We therefore conduct a novel study on the effectiveness of information retrieval models, i.e., the language model (LM) and the vector space model (VSM), for automatically generating a ranked list of relevant subject headings, with the aim of giving librarians recommendations that help them determine subject headings effectively and efficiently. Our results show that a high proportion of our queries (up to 61%) have relevant subject headings among the ten top-ranked recommendations and that, on average, the first relevant subject heading is found at an early position (3rd rank). This indicates that document retrieval methods can help the subject heading assignment process. LM and VSM show comparable performance, except when the search unit is the title, where VSM outperforms LM by 8-22%. Our further analysis reveals three faculty pairs with potential for research collaboration, because their students' theses often have overlapping subject headings: i) economy and business with social and political sciences, ii) nursing with public health, and iii) medicine with public health.
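
To make the two retrieval models concrete, the following is a minimal sketch, not the authors' implementation. It assumes each candidate subject heading is represented by a short bag of words (the `docs` dictionary is hypothetical), that VSM is realized as TF-IDF cosine similarity, and that LM is realized as query likelihood with Dirichlet smoothing. The function names, the smoothing parameter `mu`, and the toy data are all illustrative assumptions; the abstract does not specify these details.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplistic whitespace tokenizer (assumption; a real system would do more).
    return text.lower().split()


def vsm_rank(query, docs):
    """Rank heading names by TF-IDF cosine similarity to the query."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokenized)
    df = Counter(t for toks in tokenized.values() for t in set(toks))
    idf = {t: math.log((n + 1) / (d + 1)) + 1.0 for t, d in df.items()}

    def vec(tokens):
        # Terms never seen in the collection are dropped.
        return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(tokenize(query))
    scores = {name: cosine(q, vec(toks)) for name, toks in tokenized.items()}
    return sorted(scores, key=scores.get, reverse=True)


def lm_rank(query, docs, mu=10.0):
    """Rank heading names by Dirichlet-smoothed query log-likelihood."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    coll = Counter(t for toks in tokenized.values() for t in toks)
    coll_len = sum(coll.values())
    q_terms = tokenize(query)

    def score(tokens):
        tf = Counter(tokens)
        total = 0.0
        for term in q_terms:
            p_coll = coll[term] / coll_len
            if p_coll == 0.0:
                continue  # skip query terms unseen in the whole collection
            total += math.log((tf[term] + mu * p_coll) / (len(tokens) + mu))
        return total

    scores = {name: score(toks) for name, toks in tokenized.items()}
    return sorted(scores, key=scores.get, reverse=True)


def first_relevant_rank(ranking, relevant):
    """1-based rank of the first relevant heading, or None if absent."""
    for i, name in enumerate(ranking, start=1):
        if name in relevant:
            return i
    return None


# Toy data, purely illustrative.
docs = {
    "Information retrieval": "search engines ranking documents queries retrieval",
    "Public health": "disease prevention community health policy",
    "Nursing": "patient care nursing hospital health",
}
query = "ranking documents for retrieval queries"
```

With a document title as the query, calling `vsm_rank(query, docs)` or `lm_rank(query, docs)` returns heading names in ranked order, and `first_relevant_rank` gives the position of the first relevant heading, which is the kind of quantity the abstract reports on average.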


Keywords


Document retrieval; Information retrieval; Language model; Subject heading; Vector space model



DOI: http://doi.org/10.11591/ijeecs.v23.i2.pp1049-1058



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
