Auto-scaling of Web applications in clouds: a tail latency evaluation

Mohammad S. Aslanpour, Adel N. Toosi, Raj Gaire, Muhammad Aamir Cheema

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

3 Citations (Scopus)


Mechanisms that dynamically add and remove Virtual Machines (VMs) to reduce cost while minimizing latency are called auto-scaling. Latency improvements are mainly achieved by minimizing the "average" response time, whereas unpredictable fluctuations in Web application load, also known as flash crowds, can result in very high latencies for users' requests. Requests affected by a flash crowd suffer from long latencies, known as outliers. Such outliers are largely inevitable as long as auto-scaling solutions keep improving the average rather than the "tail" of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on extensive evaluations on a real cloud platform, we identified the following sources of tail latency: 1) large, i.e., data-intensive, requests; 2) long-term scaling intervals; 3) instant analysis of scaling parameters; 4) conservative, i.e., tight, threshold tuning; 5) load-unaware surplus VM selection policies used to execute a scale-down decision; 6) the cooldown feature, although it is cost-effective; and 7) VM start-up delay. We also found that after auto-scaling mechanisms improve the average latency, the tail may behave differently, which calls for dedicated tail-aware auto-scaling solutions.
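The gap between "average" and "tail" latency described in the abstract can be illustrated with a minimal Python sketch. The response times below are hypothetical values chosen for illustration, not measurements from the paper, and the `percentile` helper (nearest-rank method) is an assumption of this sketch rather than the authors' evaluation method.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so that p = 0 returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds (NOT data from the paper):
# 98 requests served normally and 2 requests delayed by a flash crowd.
latencies_ms = [100] * 98 + [5000] * 2

mean_ms = statistics.mean(latencies_ms)   # the "average" latency
p50_ms = percentile(latencies_ms, 50)     # the median request
p99_ms = percentile(latencies_ms, 99)     # the "tail" latency

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

In this made-up sample the mean stays close to the typical request while the 99th percentile is dominated by the two outliers, which is why a mechanism tuned only against the average can leave the tail unaddressed.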

Original language: English
Title of host publication: Proceedings - 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing, UCC 2020
Editors: Ivona Brandic, Rizos Sakellariou
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 10
ISBN (Electronic): 9780738123943
ISBN (Print): 9780738123950
Publication status: Published - 2020
Event: IEEE/ACM International Conference on Utility and Cloud Computing 2020 - Virtual, Online, Leicester, United Kingdom
Duration: 7 Dec 2020 → 10 Dec 2020
Conference number: 13th


Conference: IEEE/ACM International Conference on Utility and Cloud Computing 2020
Abbreviated title: UCC 2020
Country/Territory: United Kingdom


  • Auto-scaling
  • Cloud computing
  • Performance evaluation
  • Resource provisioning
  • Tail latency
