An efficient frequent itemsets finding in distributed datasets with minimum communication overhead

Houda Essalmi, Anass El Affar

Abstract


Finding frequent itemsets is an essential researched technique and a challenging task of data mining. Traditional approaches for distributed frequent itemsets require massive communication overhead among different distributed datasets. In this paper, we adopt a new strategy for optimizing the time of communications/synchronizations from large datasets and, we present a novel algorithm for discovering frequent itemsets in different distributed datasets on the slave sites called finding efficient distributed frequent itemsets (FEDFI). The proposed algorithm is capable of generating the important frequent itemsets by applying an efficient technique for pruning the candidate itemsets. The experimental results confirm that our algorithm FEDFI performs better than Apriori and candidate distribution (CD) algorithms in terms of communication and computation costs.


Keywords


Apriori; Communication scheme; Computation costs; Distributed database; Generation of candidates

Full Text:

PDF


DOI: http://doi.org/10.11591/ijeecs.v38.i1.pp496-507

Refbacks

  • There are currently no refbacks.


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).

shopify stats IJEECS visitor statistics