Concept2Vec: concept vector generation for biomedical literature using concept modelling

Suneetha Vazrala, Thayyaba Khatoon Mohammed


Extensive medical research produces a very large number of biomedical papers every year. To process this volume of textual data, information retrieval techniques are combined with machine learning methods such as natural language processing (NLP) to extract useful insights from the data. In this work, we address the challenges of existing NLP techniques such as Word2Vec, Doc2Vec, and BioSentVec, and then classify articles using Concept2Vec with a random forest classifier. Concept vector techniques are emerging methods in NLP and in information retrieval systems. Pretrained concept encoders are available for general machine learning, but none exists for the biomedical domain. This paper proposes Concept2Vec, trained on biomedical text. Our work emphasizes the quality of the semantic relationships defined between medical terms, which can help medical practitioners make decisions with fewer errors. We used 20 thousand biomedical documents to build the Concept2Vec embeddings, and the proposed approach showed the best concept similarity, in terms of the relevant measures, compared with existing approaches such as Word2Vec, global vectors (GloVe), fastText, BioSentVec, bidirectional encoder representations from transformers (BERT), and biomedical-BERT.


Biomedical text mining; Concept modelling; Concept similarity; Concept2Vec; N-grams; Word2Vec



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
