Improved interactivity and automated response for visual question answering
Abstract
Visual question answering (VQA) systems have made substantial progress, yet they still struggle with complex or ambiguous queries and with real-time interaction. Their reliance on large, computationally expensive models increases latency and restricts practical deployment, particularly in educational contexts. This study aims to develop an efficient, interactive VQA system that improves answer accuracy while enabling natural two-way communication with users. To this end, we propose a lightweight multimodal framework built on pre-trained vision-language models such as BLIP and a fine-tuned T5 model, combined with prompt engineering to improve question understanding and answer generation. The system further incorporates conversational context memory and a feedback mechanism that generates clarification questions when user input is ambiguous, thereby strengthening its interaction capabilities. Experiments are conducted on the public Flickr8k benchmark dataset in a single-GPU setting to evaluate accuracy, response latency, and interaction effectiveness. The results show that the proposed approach achieves accuracy competitive with or superior to heavier baseline models while significantly reducing inference time and enabling real-time interaction. The main contributions of this work are a lightweight, prompt-driven VQA architecture, an interactive strategy for resolving ambiguous queries, and empirical evidence that efficient models can support accurate, conversational VQA for education and other real-world applications.
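The interaction loop described in the abstract (context memory, prompt construction, and a clarification question for ambiguous input) can be illustrated with a minimal sketch. The abstract does not say how ambiguity is detected or how prompts are formatted, so everything below is an assumption made for illustration: the pronoun-based heuristic, the prompt layout, and the names `ConversationMemory`, `needs_clarification`, `build_prompt`, and `respond` are hypothetical. The `answer_fn` argument stands in for the answer generator (for example, a fine-tuned T5 model), and `caption` stands in for the image description produced by a vision-language model such as BLIP.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumed heuristic, not taken from the paper: a pronoun with no prior
# conversation to resolve it against is treated as ambiguous.
AMBIGUOUS_WORDS = {"it", "they", "them"}


@dataclass
class ConversationMemory:
    """Keeps the most recent (question, answer) turns."""
    max_turns: int = 3
    turns: deque = field(default_factory=deque)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))
        # Drop the oldest turns once the window is exceeded.
        while len(self.turns) > self.max_turns:
            self.turns.popleft()


def needs_clarification(question: str, memory: ConversationMemory) -> bool:
    """Flag questions that use a pronoun when there is no history."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    return len(memory.turns) == 0 and bool(words & AMBIGUOUS_WORDS)


def build_prompt(caption: str, question: str, memory: ConversationMemory) -> str:
    """Combine image caption, past turns, and the new question (assumed layout)."""
    history = " ".join(f"Q: {q} A: {a}" for q, a in memory.turns)
    return f"context: {caption} history: {history} question: {question}"


def respond(caption: str, question: str, memory: ConversationMemory, answer_fn) -> str:
    """Ask for clarification if needed; otherwise answer and store the turn."""
    if needs_clarification(question, memory):
        return "Could you clarify what you are referring to in the image?"
    answer = answer_fn(build_prompt(caption, question, memory))
    memory.add(question, answer)
    return answer
```

In this sketch, a first question such as "What color is it?" triggers the clarification message, while the same question asked after an earlier turn about the dog is passed to `answer_fn` with that turn included in the prompt.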
Keywords
Automatic response; Interaction; Model evaluation; Natural language processing; Visual question answering
Full Text: PDF
DOI: http://doi.org/10.11591/ijeecs.v42.i3.pp742-752

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).