Abstract
An effective scheduling strategy is critical to the performance of real-time stream processing systems. Processing real-time data streams quickly and efficiently remains challenging, especially when clusters collaborate in a geo-distributed computing environment. To address these challenges, we propose Lc-Stream, an elastic scheduling strategy with latency constraints for geo-distributed stream computing environments. This article presents our work in three parts: (1) an optimized data stream redirection method based on a queuing network algorithm, together with a computing resource model, a latency-constrained scheduling model, and a communication energy consumption model; (2) an updated node selection method based on inter-layer task correlation, which reduces communication latency between groups at executor granularity; and (3) a network cluster distribution for geo-distributed computing environments that saves energy while keeping transmission latency low. Experimental results show that, compared with R-Storm, Lc-Stream reduces total latency by over 19% and increases throughput by over 37% in typical cross-domain multi-task topologies. Compared with Ts-Stream, Lc-Stream reduces total latency by over 15% and increases throughput by over 21%. Lc-Stream also helps balance the load among systems and avoids overuse of compute nodes.
Original language | English |
---|---|
Article number | e8085 |
Number of pages | 22 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 36 |
Issue number | 14 |
DOIs | |
Publication status | Published - 25 Jun 2024 |
Keywords
- geo-distributed stream computing
- latency constraints
- load balancing
- resource scheduling
- storm