Statistical mining in data streams
Recent years have seen a steady rise of a new class of data management systems called Data Stream Management Systems (DSMS). These systems manage rapid, high-volume data streams with transient relations instead of static data with persistent relations. Data streams are common to applications such as network traffic and transaction monitoring systems, click-stream processors, industrial process control, and sensor networks. A DSMS operates on these continuous, time-varying data streams to facilitate on-the-fly query answering and to support data acquisition, monitoring, and analysis. In this dissertation, we present statistical stream mining solutions for effective online processing of streaming data. We focus on research issues related to adaptive stream resource conservation and online mining in a DSMS. We have developed statistical linear and non-linear filtering techniques based on the Kalman Filter to capture temporal correlations in the streaming data. Such correlations help in stream resource conservation. We also propose techniques that capture spatial correlations between the streaming sources, which further improve resource conservation and make it possible to answer group queries efficiently. In addition to resource management and query processing, a DSMS needs to address issues related to online stream mining. Once the data stream arrives at a central server, effective mining techniques are necessary for stream analysis before the data can be discarded. Since a stream continuously evolves with time, stream mining techniques need to be adaptive and should operate under a given memory constraint. We propose adaptive clustering solutions that use the kernel trick to capture non-linear relations in the streaming data. We also present OCODDS, a change-detection approach that can track evolutionary changes in the stream in both linear and non-linear settings.
Finally, we present our techniques for effective acquisition and processing of data streams common to video sensor networks.
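
As an illustration of how Kalman-filter-based temporal prediction can conserve stream resources, the following is a minimal sketch and not the dissertation's actual method. It assumes a scalar random-walk state model, a user-supplied error tolerance, and identical filters mirrored at the source and the server; the names ScalarKalman and simulate, and the noise parameters q and r, are hypothetical. The source transmits a reading only when the shared prediction misses it by more than the tolerance.

```python
class ScalarKalman:
    """1-D Kalman filter with a random-walk state model (an assumed model)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.25):
        self.x = x0  # state estimate
        self.p = p0  # estimate variance
        self.q = q   # process noise variance (assumed value)
        self.r = r   # measurement noise variance (assumed value)

    def predict(self):
        # Random walk: the predicted state equals the last estimate,
        # while the uncertainty grows by the process noise.
        self.p += self.q
        return self.x

    def update(self, z):
        # Standard scalar Kalman correction with measurement z.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k


def simulate(readings, tolerance):
    """Return (number of transmissions, server-side answers)."""
    source = ScalarKalman(readings[0])
    server = ScalarKalman(readings[0])  # mirror of the source filter
    sent = 0
    answers = []
    for z in readings:
        predicted = source.predict()
        server.predict()
        if abs(z - predicted) > tolerance:
            # Prediction is too far off: transmit the reading and
            # correct both filters so they stay in sync.
            source.update(z)
            server.update(z)
            sent += 1
            answers.append(z)          # server saw the true reading
        else:
            answers.append(predicted)  # server answers from its prediction
    return sent, answers


if __name__ == "__main__":
    data = [5.0] * 20 + [8.0] * 20
    sent, answers = simulate(data, tolerance=0.5)
    print(f"transmitted {sent} of {len(data)} readings")
```

Under these assumptions, every server-side answer stays within the tolerance of the true reading, and transmissions occur only when the stream departs from what the filter predicts.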