BRONDOLIN ROLANDO
Cycle: XXXII
Section: Computer Science and Engineering
Tutor: BONARINI ANDREA
Major Research topic: Energy Efficient Architecture for Streaming sYstems
Advisor: SANTAMBROGIO MARCO DOMENICO
Abstract:
With the rise of Big Data, there is an increasing need to extract knowledge from the massive amount of data generated across several application domains. Among others, social network users, mobile applications, infrastructure monitoring logs and IoT sensors produce a huge volume of continuously generated data that can be exploited to provide value for researchers, investors and companies. The common factor across these diverse application domains is the timeliness with which the data must be processed, as this kind of content is often meaningful only at the time it is generated.
Tools and techniques like MapReduce, Hadoop, Spark and many others were developed in the last decade to cope with the increased amount of data processing required by these application domains. However, these tools operate only on batches of offline data, which makes real-time stream computing infeasible. In the last few years, tools such as Apache Storm, Apache Spark Streaming and Apache Flink have emerged to tackle stream computing in a distributed and scalable way, delivering continuous analytics on streaming data.
Streaming systems must be sufficiently over-provisioned to cope with sudden input spikes and, in general, must be resilient to the typical fluctuations of the input data rate. This situation poses novel challenges, as over-provisioned systems usually work sub-optimally from an energy efficiency point of view. The goal of this research project (named E2ASY - Energy Efficient Architecture for Streaming sYstems) is to introduce novel techniques in the resource management of streaming systems that directly address power and energy consumption as an optimisation metric. Within this research project, the streaming system should meet the performance and latency requirements of the hosted streaming applications while at the same time minimising the overall energy consumption.
This major goal poses novel challenges at the computer architecture level, the distributed system level and the software management level. At the software management level, in particular, E2ASY should seamlessly integrate the streaming system with the data sources and data sinks of each application, in order to explore power-saving strategies that require moving data and computation outside the streaming system itself.