
A Genetic Algorithm Approach to Cluster Analysis

Cowgill, Marc C and Harvey, Robert J and Watson, Layne T (1998) A Genetic Algorithm Approach to Cluster Analysis. Technical Report ncstrl.vatech_cs//TR-98-16, Computer Science, Virginia Polytechnic Institute and State University.

Full text available as:
Postscript - Requires a viewer, such as GhostView
TR-98-16.ps (1176446)

Abstract

A common problem in the social and agricultural sciences is to find clusters in experimental data; the standard attack is a deterministic search terminating in a locally optimal clustering. We propose here a genetic algorithm (GA) for performing cluster analysis. GAs have been used profitably in a variety of contexts in which it is either impractical or impossible to directly solve for a globally optimal solution to complex numerical problems. In the present case, our GA clustering technique attempted to maximize a variance-ratio (VR) based goodness-of-fit criterion defined in terms of external cluster isolation and internal cluster homogeneity. Although our GA-based clustering algorithm cannot guarantee recovery of the cluster solution that exhibits the global maximum of this fitness function, it does explicitly work toward this goal (in marked contrast to existing clustering algorithms, especially hierarchical agglomerative ones such as Ward's method). Using both constrained and unconstrained simulated datasets, Monte Carlo results showed that under some conditions the genetic clustering algorithm did indeed surpass the performance of conventional clustering techniques (Ward's and K-means) in terms of an internal (VR) criterion. Suggestions for future refinement and study are offered.
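
The abstract does not give the exact form of the VR criterion or the GA operators, so the following is only a minimal sketch under stated assumptions. It assumes a Calinski-Harabasz style variance ratio (between-cluster sum of squares over within-cluster sum of squares, each divided by its degrees of freedom) as the fitness, and a plain GA with truncation selection, one-point crossover, and random label mutation. The names variance_ratio and ga_cluster and all parameter defaults are hypothetical and are not taken from the report.

```python
import random


def variance_ratio(points, labels, k):
    """Assumed VR fitness: (between_SS / (k - 1)) / (within_SS / (n - k)).

    Requires k >= 2 and more points than clusters.
    """
    n, dim = len(points), len(points[0])
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            return 0.0  # assumption of this sketch: empty clusters score worst
        cen = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # External isolation: centroid distance from the grand mean.
        between += len(members) * sum((cen[d] - grand[d]) ** 2 for d in range(dim))
        # Internal homogeneity: spread of members around their centroid.
        within += sum((p[d] - cen[d]) ** 2 for p in members for d in range(dim))
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def ga_cluster(points, k, pop_size=30, generations=60, mutation_rate=0.05, seed=0):
    """Search label vectors with a simple GA; returns the fittest one found."""
    rng = random.Random(seed)
    n = len(points)

    def fitness(labels):
        return variance_ratio(points, labels, k)

    # Each chromosome assigns one cluster label per data point.
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    best = ga_cluster(data, k=2)
    print(best, variance_ratio(data, best, 2))
```

Because the best half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next, but, as the abstract notes for the authors' own algorithm, nothing guarantees that the global maximum is reached. One-point crossover on raw label vectors is also a crude choice, since the same partition can be encoded with permuted labels.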

Item Type: Departmental Technical Report
Subjects: Computer Science > Historical Collection (Till Dec 2001)
ID Code: 495
Deposited By: User autouser
Deposited On: 05 December 2001
Alternative Locations: URL: ftp://ei.cs.vt.edu/pub/TechnicalReports/1998/TR-98-16.ps, URL: http://historical.ncstrl.org/tr/ps/vatech_cs/TR-98-16.ps