Abstract:
|
An important objective of data mining is to find features hidden in massive streams of data and to produce models that predict behavior and enable action. However, consumers, markets, industrial processes, and plant and operational reliability all exhibit behaviors that appear to follow the laws governing complex systems. High dimensionality, non-linearity, heterogeneity, noise, the sheer mass of data, a lack of the right knowledge, and the inherent difficulty of predicting changing behavior combine to make data mining a typically slow, costly, off-line analytical process that is at times unreliable. The following discusses the business needs and barriers facing data mining. It reviews statistical and probabilistic methods together with concepts from complexity theory and data engineering (i.e., data capture and data fusion). We examine a framework in which both causality and landscape structures may be extracted quickly and reliably as an integral part of data acquisition. We further explore the types of adaptive modeling techniques required to establish action strategies together with their associated risks and uncertainty, and to predict effectively in the face of changing behaviors and landscapes.
|