Sequence Clustering Algorithm Based on Weighed Sequential Pattern Similarity

Di Wu, Jiadong Ren

Abstract


Sequence clustering has become an active research topic. However, clustering quality is heavily affected when the initial cluster centers are selected at random. In this paper, a new sequence similarity measure based on weighted sequential patterns is defined, and the SCWSPS (Sequence Clustering Algorithm Based on Weighed Sequential Pattern Similarity) algorithm is proposed. The sequences with the largest weighted similarity are chosen as the merge objects. The last K-1 synthesis results are deleted from the sequence database, and the other sequences are divided into K clusters. Moreover, in each cluster, the sequence that has the largest sum of similarities with the other sequences is taken as the updated center. Experimental results and analysis show that SCWSPS achieves better clustering quality than KSPAM and K-means. When the scale of the sequence data is very large, the execution efficiency of SCWSPS is slightly worse than that of KSPAM and K-means.
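The center-update rule described in the abstract (within a cluster, take the sequence with the largest sum of similarities to the other sequences) can be illustrated with a minimal Python sketch. The abstract does not give the weighted sequential pattern similarity itself, so the sketch accepts any similarity function and uses a simple item-set Jaccard similarity purely as a stand-in. The names `update_center` and `jaccard_similarity` are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

Seq = Tuple[str, ...]


def jaccard_similarity(a: Seq, b: Seq) -> float:
    """Stand-in similarity (NOT the paper's weighted measure): Jaccard over item sets."""
    items_a, items_b = set(a), set(b)
    if not items_a and not items_b:
        return 1.0
    return len(items_a & items_b) / len(items_a | items_b)


def update_center(cluster: Sequence[Seq], sim: Callable[[Seq, Seq], float]) -> Seq:
    """Return the member whose summed similarity to the other members is largest."""
    if not cluster:
        raise ValueError("cluster must not be empty")

    def total_similarity(i: int) -> float:
        # Sum of similarities between member i and every other member.
        return sum(sim(cluster[i], cluster[j]) for j in range(len(cluster)) if j != i)

    best_index = max(range(len(cluster)), key=total_similarity)
    return cluster[best_index]


# Toy cluster of item sequences (illustrative data only).
cluster = [("a", "b", "c"), ("a", "b"), ("a", "b", "d"), ("x", "y")]
center = update_center(cluster, jaccard_similarity)
print(center)  # ('a', 'b')
```

Passing the similarity as a parameter keeps the sketch independent of how the weighted measure is actually computed; substituting the paper's measure would only require a function with the same two-argument signature.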


Keywords


Data mining; Sequence clustering; Sequential pattern; Weighted similarity



DOI: http://doi.org/10.11591/ijeecs.v12.i7.pp5529-5536



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
