An efficient frequent itemsets finding in distributed datasets with minimum communication overhead
Abstract
Finding frequent itemsets is a widely studied and challenging task in data mining. Traditional approaches to distributed frequent itemset mining incur heavy communication overhead among the distributed datasets. In this paper, we adopt a new strategy for reducing the time spent on communication and synchronization over large datasets, and we present a novel algorithm, called finding efficient distributed frequent itemsets (FEDFI), for discovering frequent itemsets in distributed datasets held on slave sites. The proposed algorithm generates the important frequent itemsets by applying an efficient technique for pruning the candidate itemsets. The experimental results confirm that FEDFI outperforms the Apriori and candidate distribution (CD) algorithms in terms of communication and computation costs.
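The abstract does not specify FEDFI's pruning rule or message protocol, so the following is not the proposed algorithm. It is a minimal, generic sketch of the setting described above: each site counts candidates locally, only the counts are merged, and candidates are pruned with the standard Apriori property. All function names, the in-memory "sites", and the absolute `min_support` threshold are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Count, on one site, how many local transactions contain each candidate."""
    counts = Counter()
    for t in transactions:
        items = set(t)
        for c in candidates:
            if c <= items:
                counts[c] += 1
    return counts


def next_candidates(frequent, k):
    """Build k-itemset candidates and prune any that has an
    infrequent (k-1)-subset (the Apriori property)."""
    items = sorted({i for s in frequent for i in s})
    out = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            out.add(frozenset(combo))
    return out


def mine_distributed(sites, min_support):
    """sites: one list of transactions per site (hypothetical in-memory stand-in
    for remote datasets). min_support: absolute global count threshold."""
    candidates = {frozenset([i]) for site in sites for t in site for i in t}
    result = {}
    k = 1
    while candidates:
        # Only per-candidate counts cross site boundaries, never raw transactions.
        total = Counter()
        for site in sites:
            total.update(local_counts(site, candidates))
        frequent = {c for c in candidates if total[c] >= min_support}
        for c in frequent:
            result[c] = total[c]
        k += 1
        candidates = next_candidates(frequent, k)
    return result


if __name__ == "__main__":
    sites = [
        [["a", "b", "c"], ["a", "b"]],
        [["a", "c"], ["b", "c"], ["a", "b", "c"]],
    ]
    for itemset, count in sorted(mine_distributed(sites, 3).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

In this sketch the communication cost per round is proportional to the number of surviving candidates, which is why tighter candidate pruning directly reduces the volume of counts exchanged between sites.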
DOI: http://doi.org/10.11591/ijeecs.v38.i1.pp496-507
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).