Cost effective dynamic data placement for efficient access of social networks

Hourieh Khalajzadeh, Dong Yuan, Bing Bing Zhou, John Grundy, Yun Yang

Research output: Contribution to journal › Article › Research › peer-review

18 Citations (Scopus)

Abstract

Social networks boast a huge number of worldwide users who join, connect, and publish various content, often very large, e.g. videos and images. For such very large-scale data storage, data replication using geo-distributed cloud services with virtually unlimited capabilities is suitable to fulfill the users’ expectations, such as low latency when accessing their own and their friends’ data. However, service providers ideally want to spend as little as possible on replicating users’ data. Moreover, social networks have a dynamic nature, and thus replicas need to be adaptable at runtime according to the environment, users’ behaviors, social network topology, and workload. Hence, it is not only crucial to have an optimized data placement and request distribution – meeting individual users’ acceptable latency requirements while incurring minimum cost for service providers – but the data placement must also be adapted based on changes in the social network to keep it efficient and effective over time. In this paper, we model data placement as a dynamic set cover problem and propose a novel approach to solve this problem. We have run several experiments using two large-scale, open Facebook and Gowalla datasets and real latencies derived from Amazon cloud datacenters to demonstrate our novel strategy's efficiency and effectiveness.
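To make the set cover framing concrete, the sketch below shows one way a static version of the problem could be expressed. It is not the approach proposed in the paper: it uses the standard greedy heuristic for weighted set cover, and the function name `greedy_placement`, the datacenter and user identifiers, the latency values, and the per-replica costs are all hypothetical, chosen only for illustration. Each datacenter "covers" the users it can serve within the latency bound, and the heuristic repeatedly picks the datacenter with the lowest cost per newly covered user.

```python
def greedy_placement(latency, users, max_latency, cost):
    """Greedy weighted set cover (illustrative only, not the paper's method).

    latency[dc][user] is the access latency in ms (hypothetical values),
    cost[dc] is the cost of storing one replica at dc (hypothetical).
    Returns the datacenters chosen, in the order they were picked.
    """
    # Each datacenter covers the users it can serve within the bound.
    coverage = {
        dc: {u for u in users if latency[dc][u] <= max_latency}
        for dc in latency
    }
    uncovered = set(users)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        # Sorted iteration makes tie-breaking deterministic.
        for dc in sorted(coverage):
            gain = len(coverage[dc] & uncovered)
            if dc in chosen or gain == 0:
                continue
            ratio = cost[dc] / gain  # cost per newly covered user
            if ratio < best_ratio:
                best, best_ratio = dc, ratio
        if best is None:
            raise ValueError("some users cannot be served within the bound")
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


# Hypothetical example: three datacenters, three users, 100 ms bound.
latency = {
    "dc_a": {"u1": 40, "u2": 60, "u3": 200},
    "dc_b": {"u1": 180, "u2": 90, "u3": 50},
    "dc_c": {"u1": 30, "u2": 250, "u3": 220},
}
cost = {"dc_a": 1.0, "dc_b": 1.0, "dc_c": 0.8}
placement = greedy_placement(latency, ["u1", "u2", "u3"], 100, cost)
print(placement)
```

In a dynamic setting, a naive baseline would be to rerun such a heuristic whenever users, friendships, or workloads change; the abstract indicates that the paper instead treats the problem as a dynamic set cover and adapts the placement over time.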

Original language: English
Pages (from-to): 82-98
Number of pages: 17
Journal: Journal of Parallel and Distributed Computing
Volume: 141
Publication status: Published - Jul 2020

Keywords

  • Access latency
  • Cost optimization
  • Data placement
  • Data replication
  • Social networks
