Workload-aware incremental repartitioning of shared-nothing distributed databases for scalable cloud applications

Joarder Mohammad Mustafa Kamal, Manzur Murshed, Rajkumar Buyya

    Research output: Chapter in Book/Report/Conference proceedingConference PaperResearchpeer-review

    3 Citations (Scopus)

    Abstract

    Cloud applications often rely on shared-nothing distributed databases that can sustain rapid growth in data volume. Distributed transactions (DTs) that involve data tuples from multiple geo-distributed servers can adversely impact the performance of such databases, especially when the transactions are short-lived in and require immediate response. The k-way min-cut graph clustering algorithm has been found effective to reduce the number of DTs with acceptable level of load balancing. Benefits of such a static partitioning scheme, however, is short-lived in Cloud applications with dynamically varying workload patterns where DT profile changes over time. This paper addresses this emerging challenge by introducing incremental repartitioning. In each repartitioning cycle, DT profile is learnt online and k-way min-cut clustering algorithm is applied on a special sub-graph representing all DTs as well as those non-DTs that have at least one tuple in a DT. The latter ensures that the min-cut algorithm minimally reintroduces new DTs from the non-DTs while maximally transforming existing DTs into non-DTs in the new partitioning. Potential load imbalance risk is mitigated by applying the graph clustering algorithm on the finer logical partitions instead of the servers and relying on random one-to-one cluster-to-partition mapping that naturally balances out loads. Inter-server data-migration due to repartitioning is kept in check with two special mappings favouring the current partition of majority tuples in a cluster - the many-to-one version minimising data migrations alone and the one-to-one version reducing data migration without affecting load balancing. A distributed data lookup process, inspired by the roaming protocol in mobile networks, is introduced to efficiently handle data migration without affecting scalability. The effectiveness of the proposed framework is evaluated on realistic TPC-C workloads comprehensively using graph, hyper graph, and compressed hyper graph representations used in the literature. Simulation results convincingly support incremental repartitioning against static partitioning.

    Original languageEnglish
    Title of host publicationProceedings - 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing, UCC 2014
    EditorsShrideep Pallickara, Chunming Rong
    Place of PublicationDanvers MA USA
    PublisherIEEE, Institute of Electrical and Electronics Engineers
    Pages213-222
    Number of pages10
    ISBN (Electronic)9781479978816
    DOIs
    Publication statusPublished - 29 Jan 2014
    EventIEEE/ACM International Conference on Utility and Cloud Computing 2014 - London, United Kingdom
    Duration: 8 Dec 201411 Dec 2014
    Conference number: 7th
    https://ieeexplore.ieee.org/xpl/conhome/7027114/proceeding (Proceedings)

    Conference

    ConferenceIEEE/ACM International Conference on Utility and Cloud Computing 2014
    Abbreviated titleUCC 2014
    Country/TerritoryUnited Kingdom
    CityLondon
    Period8/12/1411/12/14
    Internet address

    Keywords

    • Cloud databases
    • Data migration
    • Distributed transactions
    • Incremental repartitioning
    • Load-balance
    • Workload

    Cite this