CluSandra: A Framework and Algorithm for Data Stream Cluster Analysis

Jose R. Fernandez; Eman M. El-Sheikh

doi:10.14569/IJACSA.2011.021115

Journal article

Open access

Peer reviewed

International journal of advanced computer science & applications, Vol.2(11), pp.87-99

01/01/2011

DOI: https://doi.org/10.14569/IJACSA.2011.021115

Web of Science ID: WOS:000219133400015

Metrics

81 Record Views

2 Times Cited - Web of Science

Abstract

Clustering, the partitioning of a dataset's records into groups of similar records, is an important aspect of knowledge discovery from datasets. A considerable amount of research has been devoted to identifying clusters in very large, multi-dimensional, static datasets. However, the traditional clustering and pattern recognition algorithms that have resulted from this research are inefficient for clustering data streams. A data stream is a dynamic dataset characterized by a sequence of data records that evolves over time, arrives at extremely fast rates, and is unbounded. Today, the world abounds with processes that generate high-speed, evolving data streams. Examples include click streams, credit card transactions, and sensor networks. The data stream's inherent characteristics present an interesting set of time- and space-related challenges for clustering algorithms. In particular, processing time is severely constrained, and clustering must be performed in a single pass over the incoming data. This paper presents both a clustering framework and an algorithm that, combined, address these challenges and allow end users to explore and gain knowledge from evolving data streams. Our approach integrates open source products that are used to control the data stream and to facilitate the harnessing of knowledge from it. Experimental results from testing the framework with various data streams are also discussed.

Files and links (1)

CluSandra: Published (Version of record), link to article, CC BY 4.0, Open

Details

Title: CluSandra
Publication Details: International journal of advanced computer science & applications, Vol.2(11), pp.87-99
Resource Type: Journal article
Publisher: Science & Information (SAI) Organization Ltd
Format: link
Copyright: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
Identifiers: WOS:000219133400015; 99380461296206600
Academic Unit: Office of Teaching, Learning, and Technology; College of Arts, Social Sciences, and Humanities; Center for Cybersecurity and AI
Language: English
