ProbSlack is a probabilistic framework for analyzing the late arrival of events in distributed stream processing systems. We have implemented a prototype of the framework on top of Apache Flink 1.4.0 by extending the process function of the Flink operators.

More details can be found in the following publication: "Probabilistic Management of Events’ Late Arrival", Nicolo Rivetti, Nikos Zacheilas, Avigdor Gal and Vana Kalogeraki, Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems (DEBS), Hamilton, New Zealand, June 2018.

To access the code and the datasets, send an e-mail to: vana[*at*] or zacheilas[*at*]

DIsCO is a novel framework that runs as a module of Apache Storm 0.10.2 and automatically compresses the incoming streaming data of Storm topologies. DIsCO determines whether compression is beneficial for a given topology and, if so, which compression algorithm should be used.

To use the auto-compression features, users must extend two abstract classes provided by the DIsCO library: users’ spouts must extend the CompressionSpout class, and users’ bolts must extend the CompressionBolt class. Users still have to implement the nextTuple (for spouts) and execute (for bolts) methods so that the components can be used in their topologies.
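The extension pattern described above can be sketched roughly as follows. This is only an illustration: the CompressionSpout and CompressionBolt classes defined here are simplified, hypothetical stand-ins so the example compiles without Storm or DIsCO on the classpath, and their fields, the emit helper, and the String-based tuple type are assumptions rather than the actual DIsCO signatures. SentenceSpout, UppercaseBolt and DiscoExtensionSketch are likewise made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for DIsCO's CompressionSpout (real signature may differ).
abstract class CompressionSpout {
    protected final List<String> emitted = new ArrayList<>();

    // Users supply the logic that produces the next tuple.
    public abstract void nextTuple();

    // Assumed helper: records an emitted value instead of sending it downstream.
    protected void emit(String value) { emitted.add(value); }
}

// Hypothetical stand-in for DIsCO's CompressionBolt (real signature may differ).
abstract class CompressionBolt {
    protected final List<String> emitted = new ArrayList<>();

    // Users supply the per-tuple processing logic.
    public abstract void execute(String tuple);

    protected void emit(String value) { emitted.add(value); }
}

// A user spout: extends the base class and implements only nextTuple.
class SentenceSpout extends CompressionSpout {
    private final String[] sentences = {"the quick brown fox", "jumps over the lazy dog"};
    private int next = 0;

    @Override
    public void nextTuple() {
        emit(sentences[next]);
        next = (next + 1) % sentences.length; // cycle through the sentences
    }
}

// A user bolt: extends the base class and implements only execute.
class UppercaseBolt extends CompressionBolt {
    @Override
    public void execute(String tuple) {
        emit(tuple.toUpperCase(Locale.ROOT));
    }
}

class DiscoExtensionSketch {
    // Drives the spout for a number of tuples, then feeds each one to the bolt.
    static List<String> runOnce(int tuples) {
        SentenceSpout spout = new SentenceSpout();
        UppercaseBolt bolt = new UppercaseBolt();
        for (int i = 0; i < tuples; i++) {
            spout.nextTuple();
        }
        for (String tuple : spout.emitted) {
            bolt.execute(tuple);
        }
        return bolt.emitted;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(3));
    }
}
```

In this sketch the user code contains no compression logic at all; under the stated assumptions, that concern would live entirely in the base classes.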

Users have to specify in the Storm configuration object (i.e., StormConfig) the compression algorithms that should be considered. DIsCO supports both lossy and lossless compression techniques, and users can easily plug in their own custom compression algorithms. DIsCO currently supports the following well-known compression algorithms:

  • ZIP
  • LZ4
  • Snappy
  • JPEG
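To give a feel for what "beneficial" could mean for a lossless algorithm, the sketch below measures how much a payload shrinks under DEFLATE (the algorithm used by ZIP, available in the JDK as java.util.zip.Deflater) and compares the ratio against a threshold. This is not DIsCO's decision procedure: the class name, the isBeneficial rule and the 0.8 threshold are assumptions made for illustration, and the sketch looks only at size, ignoring the CPU cost of compressing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

class CompressionBenefitSketch {
    // Returns the size in bytes of the input after DEFLATE compression.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer); // count bytes, discard the output
        }
        deflater.end();
        return total;
    }

    // Hypothetical rule: compression is worthwhile if the compressed payload
    // is at most maxRatio times the original size.
    static boolean isBeneficial(byte[] input, double maxRatio) {
        if (input.length == 0) {
            return false; // nothing to gain on an empty payload
        }
        double ratio = (double) deflatedSize(input) / input.length;
        return ratio <= maxRatio;
    }

    public static void main(String[] args) {
        // Highly repetitive payload, e.g. similar sensor readings.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("sensor=42;");
        }
        byte[] repetitive = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Pseudo-random payload, which DEFLATE cannot shrink.
        byte[] noise = new byte[5000];
        new Random(7).nextBytes(noise);

        System.out.println("repetitive payload: " + isBeneficial(repetitive, 0.8));
        System.out.println("random payload: " + isBeneficial(noise, 0.8));
    }
}
```

The same measurement could be repeated per candidate algorithm on a sample of tuples to pick the one with the best ratio, although a real selection would probably also need to account for compression time.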

To access the code, send an e-mail to: vana[*at*] or zacheilas[*at*]