The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
            
             
	Abstract Details
	
	
		
			| 
				
					
						| Activity Number: | 174 |  
						| Type: | Contributed |  
						| Date/Time: | Monday, July 30, 2012 : 10:30 AM to 12:20 PM |  
						| Sponsor: | Section on Statistical Computing |  
						| Abstract - #305991 |  
						| Title: | Model-Based Clustering Analysis of Large Climate Simulation Data Sets |  
					| Author(s): | Wei-Chen Chen*+ and George Ostrouchov and David Pugmire and Mr Prabhat and Michael Wehner |  
					| Companies: | Oak Ridge National Laboratory and Oak Ridge National Laboratory and Oak Ridge National Laboratory and Lawrence Berkeley National Laboratory and Lawrence Berkeley National Laboratory |  
					| Address: | , Oak Ridge, TN, 37830, United States |  
					| Keywords: | model-based clustering; unsupervised learning; EM; APECM; CAM; SPMD |  
					| Abstract: | 
							We develop a parallel expectation-maximization (EM) algorithm for model-based clustering using high-performance computing techniques. The algorithm follows the single program, multiple data (SPMD) programming model to reduce communication between processors, and it scales to clustering ultra-large datasets (hundreds of terabytes). The same technique can improve the scalability of EM-like algorithms such as AECM and APECM. Moreover, these parallel algorithms generalize readily to fitting other finite mixture models. We demonstrate the performance of our parallel algorithm on a high-resolution climate dataset produced by the Community Atmosphere Model (CAM5). An accompanying R package, 'pmclust', for parallel model-based clustering has been released on CRAN.
						 |  
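
The SPMD idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation and not the pmclust API. It is a plain EM for a one-dimensional Gaussian mixture (not AECM or APECM), written under these assumptions: every "rank" runs the same code on its own chunk of the data, and the only values exchanged are a few per-component sums. All function names here (`local_e_step`, `allreduce_sum`, `m_step`, `parallel_em`) are hypothetical, and the all-reduce is simulated inside a single process rather than performed with MPI.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def local_e_step(chunk, weights, means, variances):
    """E-step on one rank's chunk; returns only local sufficient statistics.

    A production code would work in log space to avoid underflow.
    """
    k = len(weights)
    n_k = [0.0] * k  # sum of responsibilities per component
    s1 = [0.0] * k   # sum of r * x
    s2 = [0.0] * k   # sum of r * x^2
    loglik = 0.0
    for x in chunk:
        dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(dens)
        loglik += math.log(total)
        for j in range(k):
            r = dens[j] / total
            n_k[j] += r
            s1[j] += r * x
            s2[j] += r * x * x
    return n_k, s1, s2, loglik


def allreduce_sum(per_rank):
    """Stand-in for an MPI all-reduce: element-wise sum over ranks."""
    k = len(per_rank[0][0])
    n_k = [sum(s[0][j] for s in per_rank) for j in range(k)]
    s1 = [sum(s[1][j] for s in per_rank) for j in range(k)]
    s2 = [sum(s[2][j] for s in per_rank) for j in range(k)]
    loglik = sum(s[3] for s in per_rank)
    return n_k, s1, s2, loglik


def m_step(n_k, s1, s2):
    """M-step from global statistics; identical on every rank."""
    k = len(n_k)
    n = sum(n_k)
    weights = [n_k[j] / n for j in range(k)]
    means = [s1[j] / n_k[j] for j in range(k)]
    variances = [max(s2[j] / n_k[j] - means[j] ** 2, 1e-6) for j in range(k)]
    return weights, means, variances


def parallel_em(chunks, init_means, n_iter=50):
    """Run EM where each element of `chunks` plays the role of one rank's data."""
    k = len(init_means)
    weights, means, variances = [1.0 / k] * k, list(init_means), [1.0] * k
    loglik = float("-inf")
    for _ in range(n_iter):
        stats = [local_e_step(c, weights, means, variances) for c in chunks]
        n_k, s1, s2, loglik = allreduce_sum(stats)  # the only communication step
        weights, means, variances = m_step(n_k, s1, s2)
    return weights, means, variances, loglik


def make_data(n=2000, seed=0):
    """Synthetic two-component data, used only for illustration."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 1.0) if rng.random() < 0.5 else rng.gauss(3.0, 1.0)
            for _ in range(n)]


def split(data, n_ranks):
    """Deal the observations out to n_ranks chunks."""
    return [data[r::n_ranks] for r in range(n_ranks)]
```

In this sketch the communicated quantities per iteration are 3k + 1 numbers for k components, regardless of how many observations each rank holds, which is one way to see why an SPMD layout can keep communication small as the data grow. Calling `parallel_em(split(make_data(), 4), [-1.0, 1.0])` should give essentially the same estimates as running on a single chunk, up to floating-point summation order.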
 
					The address information is for the authors who have a + after their name. Authors who are presenting talks have a * after their name.
 
				 | 
	
	
	
	
		
		For information, contact jsm@amstat.org or phone (888) 231-3473. 