The effectiveness of a distributed system hinges on the manner in
which tasks and data are assigned to the underlying system resources. Moreover,
today's large-scale distributed systems must accommodate heterogeneity in both
the offered load and in the makeup of the available storage and compute
capacity. The ideal resource assignment must balance the utilization of the
underlying system against the loss of locality incurred when individual tasks
or data objects are fragmented among several servers. In this paper we
describe this
locality-maximizing placement problem and show that an
optimal solution is NP-hard. We then describe a polynomial-time algorithm that
generates a placement within an additive constant of two from optimal.
Pre-2018 CSE ID: CS2004-0777