JSM 2012 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 295
Type: Contributed
Date/Time: Tuesday, July 31, 2012 : 8:30 AM to 10:20 AM
Sponsor: IMS
Abstract - #305484
Title: An Efficient Initialization of the K-Means Algorithm
Author(s): Igor Melnykov*+ and Volodymyr Melnykov
Companies: Colorado State University Pueblo and North Dakota State University
Address: 2200 Bonforte Blvd, Pueblo, CO, 81001, United States
Keywords: cluster analysis ; K-means ; initialization ; Mahalanobis distance

K-means is a popular clustering method thanks to its simplicity and high computational speed. In its basic form, the algorithm minimizes the average squared Euclidean distance from points to their cluster centers. The quality of the solution it returns therefore depends on both the initial set of cluster centers and the shapes of the clusters, since Euclidean distance works best when the data form homogeneous spherical groups. One challenge in choosing the initial centers is to have every cluster represented, which makes convergence to the optimal solution more likely. Our proposed technique aims to achieve proper cluster representation in the initial set of points. We also make the method more flexible by considering the Mahalanobis distance in the computations.
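
The abstract does not spell out the proposed initialization, so the following is only a minimal sketch of the general idea it describes: choose starting centers that are spread across the data so that each cluster is likely to be represented, then run basic K-means. The farthest-point (maximin) rule, the function names, and the optional `cov_inv` argument are assumptions made for illustration and should not be read as the authors' technique. Passing the inverse of a covariance matrix as `cov_inv` switches the initialization from squared Euclidean to squared Mahalanobis distance.

```python
import numpy as np


def mahalanobis_sq(X, center, cov_inv):
    """Squared Mahalanobis distance from each row of X to `center`."""
    diff = X - center
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


def farthest_point_init(X, k, cov_inv=None, seed=0):
    """Pick k starting centers with a generic maximin heuristic.

    Illustration only: this is NOT the method proposed in the abstract.
    """
    rng = np.random.default_rng(seed)
    if cov_inv is None:
        cov_inv = np.eye(X.shape[1])  # identity gives squared Euclidean distance
    centers = [X[rng.integers(len(X))]]
    # Distance from every point to its nearest center chosen so far.
    nearest = mahalanobis_sq(X, centers[0], cov_inv)
    for _ in range(k - 1):
        centers.append(X[np.argmax(nearest)])
        nearest = np.minimum(nearest, mahalanobis_sq(X, centers[-1], cov_inv))
    return np.array(centers)


def sq_euclidean(X, centers):
    """Matrix of squared Euclidean distances, shape (n_points, n_centers)."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def kmeans(X, centers, n_iter=100):
    """Basic K-means (Lloyd iterations) from the given starting centers."""
    for _ in range(n_iter):
        labels = sq_euclidean(X, centers).argmin(axis=1)
        # Move each center to the mean of its points; keep it if the cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = sq_euclidean(X, centers)
    labels = d2.argmin(axis=1)
    objective = d2.min(axis=1).mean()  # average squared distance to assigned center
    return centers, labels, objective
```

A call might look like `centers, labels, objective = kmeans(X, farthest_point_init(X, k=3))`, where `X` is an array with one observation per row. To initialize with Mahalanobis distance under a single pooled covariance (again an assumption), pass `cov_inv=np.linalg.inv(np.cov(X, rowvar=False))`.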

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

For information, contact jsm@amstat.org or phone (888) 231-3473.
