An efficient frequent itemsets finding in distributed datasets with minimum communication overhead

Houda Essalmi; Anass El Affar

doi:10.11591/ijeecs.v38.i1.pp496-507

Abstract

Finding frequent itemsets is a widely studied and challenging data mining task. Traditional approaches to distributed frequent itemset mining incur massive communication overhead among the distributed datasets. In this paper, we adopt a new strategy that reduces the time spent on communication and synchronization when mining large datasets, and we present a novel algorithm, finding efficient distributed frequent itemsets (FEDFI), for discovering frequent itemsets in distributed datasets held on slave sites. The proposed algorithm generates the important frequent itemsets by applying an efficient technique for pruning candidate itemsets. Experimental results confirm that FEDFI outperforms the Apriori and candidate distribution (CD) algorithms in terms of communication and computation costs.

Keywords

Apriori; Communication scheme; Computation costs; Distributed database; Generation of candidates

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
