Article ID: 2024DAT0001
Stream processing engines must process multiple queries over streams simultaneously, and continuous window aggregation plays a critical role in many applications as part of data analysis pipelines. However, such systems suffer from scalability issues when handling a massive number of queries with different window and slide sizes over data streams with high input rates. To address this problem, we propose LSiX (longest-shortest-window-based indexing), which aggregates multiple queries over data streams efficiently. More precisely, we maintain two arrays based on the longest and shortest windows among all registered queries, and every query result is computed from the partial aggregations shared in these two arrays, using at most two operations per query. We have conducted extensive experiments, and the results show that LSiX can be at least 3 times faster than the compared methods, including the state-of-the-art method, MCQA.
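To make the idea of shared partial aggregation concrete, the following is a minimal Python sketch. It is not the LSiX algorithm: it assumes a sum aggregation (which is invertible) and replaces the two arrays with a single shared prefix-sum array, so that each window result costs one subtraction. The function name `shared_window_sums` and the `(window, slide)` query format are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def shared_window_sums(stream, queries):
    """Answer several sliding-window sum queries from one shared array.

    stream:  list of numbers, one value per time unit (assumption)
    queries: list of (window, slide) pairs in the same time units
    Returns a dict mapping each (window, slide) pair to its window sums.
    """
    # Shared partial aggregates: prefix[i] holds the sum of stream[0:i].
    prefix = [0] + list(accumulate(stream))

    results = {}
    for window, slide in queries:
        sums = []
        # A window ending at `end` covers stream[end - window:end].
        for end in range(window, len(stream) + 1, slide):
            # One subtraction per result, independent of the window length.
            sums.append(prefix[end] - prefix[end - window])
        results[(window, slide)] = sums
    return results


if __name__ == "__main__":
    stream = [1, 2, 3, 4, 5, 6]
    queries = [(2, 1), (4, 2)]  # (window size, slide size)
    print(shared_window_sums(stream, queries))
```

All queries read the same `prefix` array, so adding a query adds no further per-tuple aggregation work; only the per-result lookup is repeated. Non-invertible aggregations such as max cannot use subtraction and would need a different sharing structure.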