Code Release of the DIsCO Framework

DIsCO is a novel framework that runs as a module of Apache Storm 0.10.2 and automatically compresses the streaming data entering Storm topologies. DIsCO determines whether compression is beneficial for a given topology and, if so, which compression algorithm should be used.
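
To make the idea of "beneficial" compression concrete, here is a minimal sketch that compresses a sample of data with DEFLATE (the algorithm behind ZIP, available in the Java standard library) and checks the resulting size ratio. This is not DIsCO's decision logic, which is not described on this page; the class name, the method names and the ratio threshold are assumptions made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Illustrative only: NOT DIsCO's actual decision logic.
final class CompressionBenefitSketch {

    // Returns the size in bytes of the input after DEFLATE compression.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Hypothetical rule: compression counts as beneficial only if the
    // compressed sample is at most maxRatio times the original size.
    static boolean isBeneficial(byte[] sample, double maxRatio) {
        if (sample.length == 0) {
            return false; // nothing to gain on empty input
        }
        double ratio = (double) deflatedSize(sample) / sample.length;
        return ratio <= maxRatio;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("sensor=42;status=OK;");
        }
        byte[] repetitive = sb.toString().getBytes(StandardCharsets.UTF_8);
        System.out.println("beneficial: " + isBeneficial(repetitive, 0.8));
    }
}
```

A real decision would presumably also weigh the CPU cost of compressing and decompressing against the bandwidth saved; the sketch looks only at size.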

To use the auto-compression features, users must extend two abstract classes provided in the DIsCO library: users' spouts must extend the CompressionSpout class and users' bolts must extend the CompressionBolt class. Users must still implement the nextTuple (for spouts) and execute (for bolts) methods so that their components can be used in their topologies.
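
The extension pattern described above can be sketched as follows. The real CompressionSpout and CompressionBolt classes are not reproduced on this page, so the two abstract classes below are simplified, hypothetical stand-ins: their fields, the emit helper and the use of plain strings as tuples are assumptions. In Storm itself, bolts receive Tuple objects and components emit through collectors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// HYPOTHETICAL stand-ins for the DIsCO base classes; real signatures may differ.
final class ExtensionSketch {

    abstract static class CompressionSpout {
        final List<String> emitted = new ArrayList<>();

        // User code supplies the data source.
        public abstract void nextTuple();

        // Stand-in for emitting a tuple downstream.
        protected void emit(String value) {
            emitted.add(value);
        }
    }

    abstract static class CompressionBolt {
        final List<String> results = new ArrayList<>();

        // User code supplies the processing logic.
        public abstract void execute(String tuple);
    }

    // A user spout: extends the base class and implements nextTuple.
    static class SentenceSpout extends CompressionSpout {
        private int counter = 0;

        @Override
        public void nextTuple() {
            emit("sentence-" + counter++);
        }
    }

    // A user bolt: extends the base class and implements execute.
    static class UppercaseBolt extends CompressionBolt {
        @Override
        public void execute(String tuple) {
            results.add(tuple.toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        SentenceSpout spout = new SentenceSpout();
        UppercaseBolt bolt = new UppercaseBolt();
        spout.nextTuple();
        spout.nextTuple();
        for (String tuple : spout.emitted) {
            bolt.execute(tuple);
        }
        System.out.println(bolt.results);
    }
}
```

The point of the pattern is that user code contains only application logic, which leaves the base classes free to handle compression transparently.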

Users must specify in the Storm configuration object (i.e., StormConfig) the compression algorithms that should be considered. DIsCO supports both lossy and lossless compression techniques, and users can easily plug in their own custom compression algorithms. DIsCO currently supports the following well-known compression algorithms:

  • ZIP
  • LZ4
  • Snappy
  • JPEG
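
Selecting the candidate algorithms in the configuration might look roughly like the sketch below. The configuration key, the enum and the buildConfig helper are hypothetical, since DIsCO's actual configuration keys are not listed on this page. A plain map stands in for the Storm configuration object so that the example runs without Storm on the classpath. Of the four algorithms, JPEG is treated as lossy and the rest as lossless.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: key name and helper are assumptions, not DIsCO's API.
final class CompressionConfigSketch {

    // The four algorithms listed above, flagged as lossy or lossless.
    enum Algorithm {
        ZIP(false), LZ4(false), SNAPPY(false), JPEG(true);

        final boolean lossy;

        Algorithm(boolean lossy) {
            this.lossy = lossy;
        }
    }

    // HYPOTHETICAL configuration key.
    static final String CANDIDATES_KEY = "disco.compression.candidates";

    // Builds a map-based configuration listing the algorithms to consider,
    // dropping lossy ones unless the caller explicitly allows them.
    static Map<String, Object> buildConfig(boolean allowLossy, Algorithm... candidates) {
        List<String> names = new ArrayList<>();
        for (Algorithm algorithm : candidates) {
            if (allowLossy || !algorithm.lossy) {
                names.add(algorithm.name());
            }
        }
        Map<String, Object> conf = new HashMap<>();
        conf.put(CANDIDATES_KEY, names);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = buildConfig(false, Algorithm.values());
        System.out.println(conf);
    }
}
```

Filtering out lossy candidates by default is a design choice of this sketch: a lossy codec such as JPEG only makes sense for data (for example, images) where some loss of fidelity is acceptable.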

To access the code, send an e-mail to: vana[*at*]aueb.gr or zacheilas[*at*]aueb.gr