Optimizing resume information extraction through TSHD segmentation and advanced deep learning techniques

Anmar Abuhamdah, Mohammed Al-Shabi, Sana Jawarneh

Abstract


This research addresses a central problem in natural language processing: extracting information from unstructured text efficiently, so that useful insights and structured representations can be derived from it. The broader aim is to improve the effectiveness of information retrieval systems through computational analysis. We approach the problem with extractive question answering models, a modern information extraction approach, and propose a new methodology that combines the topic segmentation based on headings detection (TSHD) algorithm with deep learning methods. The TSHD algorithm splits a document into sections, each addressing a particular topic. Refined extraction models then process these disjoint segments, which yields more accurate and context-aware extraction than naive whole-document extraction. We validate the approach empirically using the Stanford Question Answering Dataset (SQuAD) 1.1, with a specific adaptation to resumes. Experimental results show improvements of 7.4% in exact match (EM) and 7.8% in F1-score. These results demonstrate the feasibility of the proposed approach for automated information extraction frameworks such as resume processing.
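The segment-then-extract idea can be illustrated with a minimal Python sketch. It is not the authors' TSHD implementation, whose details the abstract does not give. The heading vocabulary, the heading heuristic (a short line that is either a known section name or written in capitals), the word-overlap rule for picking a segment, and the sample resume are all assumptions made for illustration.

```python
import re

# Hypothetical heading vocabulary (an assumption, not taken from the paper).
KNOWN_HEADINGS = {"summary", "education", "experience", "skills", "projects"}


def is_heading(line: str) -> bool:
    """Heuristic: a short line that is a known section name or all capitals."""
    text = line.strip().rstrip(":").strip()
    if not text or len(text.split()) > 4:
        return False
    return text.lower() in KNOWN_HEADINGS or text.isupper()


def segment_by_headings(document: str) -> dict[str, str]:
    """Split a document into {heading: body}; text before the first heading
    is stored under the key 'preamble'."""
    segments: dict[str, str] = {}
    current, buffer = "preamble", []

    def flush() -> None:
        if buffer:
            body = "\n".join(buffer)
            # Merge bodies if the same heading appears more than once.
            segments[current] = (segments[current] + "\n" + body) if current in segments else body

    for line in document.splitlines():
        if is_heading(line):
            flush()
            current, buffer = line.strip().rstrip(":").strip().lower(), []
        elif line.strip():
            buffer.append(line.strip())
    flush()
    return segments


def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_segment(question: str, segments: dict[str, str]) -> str:
    """Return the heading whose title and body share the most words with the
    question (a simple stand-in for a learned routing step)."""
    q = tokenize(question)
    return max(segments, key=lambda h: len(q & tokenize(h + " " + segments[h])))


# Made-up sample resume, for illustration only.
RESUME = """Jane Doe
jane@example.com

EDUCATION
B.Sc. Computer Science, Example University, 2019

Experience:
Software Engineer at Example Corp, 2019-2023

Skills
Python, SQL
"""

segments = segment_by_headings(RESUME)
chosen = select_segment("Which university did the candidate attend?", segments)
print(chosen)
print(segments[chosen])
```

In a full pipeline, the selected segment, rather than the whole resume, would be passed as the context to an extractive question answering model, so the model searches for the answer span within a shorter and more topically focused text.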

Keywords


Deep learning; Information extraction; Natural language processing; Textual; Topic segmentation; Transformer models


DOI: http://doi.org/10.11591/ijeecs.v40.i3.pp1453-1465



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
