Current students


Section: Computer Science and Engineering

Major Research topic:
Online Learning from Time-dependent Evolving Data Streams

Data stream mining is a subject of much current interest in Machine Learning (ML), particularly given the increasing pervasiveness of sensor networks and the Internet of Things. Most ML methods implicitly or explicitly rely on a fundamental assumption: that data points are independent and identically distributed (i.i.d.). Although this assumption may hold for historical data, it is doubtful that it holds for data streams. Indeed, each data point is usually strongly correlated with those acquired at previous time steps (e.g., stock market prices or the human heartbeat). While Streaming Machine Learning (SML) and Time-Series Analysis (TSA) each address some aspects of the problem, a comprehensive solution is missing. SML copes with an unbounded data flow generated at a high rate, learning continuously from limited data. It relaxes the assumption that data points are identically distributed, referring to changes in the underlying distribution as concept drifts. However, it still assumes that data points are independent. TSA models sequences of observations collected over a specific past time interval and arranged in chronological order. It accounts for temporal dependence among data points, but it assumes that they are stationary, i.e., drawn from the same distribution.

This Ph.D. research investigates how to bridge the gap between TSA and SML. Our goal is to conceive, design, and evaluate a class of methodologies for learning from time-dependent evolving data streams, using adaptive techniques that exploit the temporal dependence present in the data.