In a large distributed database, data are geographically distributed across several separate servers (or data centers). This helps in distributing load in the access network. It also helps to serve data locally where it is required. There are various approaches based on the granularity of data for efficient data distribution in a communication network. The file allocation problem (FAP) locates files to servers, the segment allocation problem (SAP) locates database segments, and the mirror location problem (MLP) locates replicas of the entire database. The placement of such data to multiple servers can be modeled as an optimization problem. The major decisions influencing optimization involves the location of servers, allocation of content and assignment of users. In this paper, we study the segment allocation problem (SAP), which is also known as the partial mirroring problem. This approach is more tractable than the file allocation problem in realistic cases and also eliminates the overhead of (constant) update costs that is incurred in the mirror location problem. Our contribution is two-fold: Firstly, earlier works on SAP assume pre-defined segments. We build a data partitioning method using well-known facility location models. We quantify the performance of the partitioning method. We show that the method partitions the database within a reasonable limit of error. Secondly, we introduce a new model for the segment allocation problem in which the segments are completely connected to each other by high-bandwidth links and contains a cost benefit for inter-segment traffic flows. We formulate this problem as an MILP and build exact solution approaches to solve large scale problems. We demonstrate some structural properties of the problem that make it solvable, using a Benders decomposition algorithm. Computational results validate the superiority of the decomposition approach.
Sen, G., Krishnamoorthy, M., Rangaraj, N., & Narayanan, V. (2015). Exact approaches for static data segment allocation problem in an information network. Computers and Operations Research, 62, 282 - 295. https://doi.org/10.1016/j.cor.2014.05.023