Scalable Distributed Stream Processing

Mitch Cherniack, Hari Balakrishnan, Magdalena Balazinska, Donald Carney, Ugur Cetintemel, Ying Xing, Stan Zdonik
CIDR 2003 - First Biennial Conference on Innovative Data Systems Research, Asilomar, CA, January 2003

Stream processing fits a large class of new applications for which conventional DBMSs fall short. Because many stream-oriented systems are inherently geographically distributed and because distribution offers scalable load management and higher availability, future stream processing systems will operate in a distributed fashion. They will run across the Internet on computers typically owned by multiple cooperating administrative domains. This paper describes the architectural challenges facing the design of large-scale distributed stream processing systems, and discusses novel approaches for addressing load management, high availability, and federated operation issues. We describe two stream processing systems, Aurora* and Medusa, which are being designed to explore complementary solutions to these challenges.

BibTeX Entry:

@inproceedings{cherniack2003scalable,
   author =       {Mitch Cherniack and Hari Balakrishnan and Magdalena Balazinska and Donald Carney and Ugur Cetintemel and Ying Xing and Stan Zdonik},
   title =        {{Scalable Distributed Stream Processing}},
   booktitle =    {CIDR 2003 - First Biennial Conference on Innovative Data Systems Research},
   year =         {2003},
   month =        {January},
   address =      {Asilomar, CA}
}