Contains the title page, preface, acknowledgments, table of
contents, list of figures and list of tables.
|
Ch 1: Dipsea: An Overview
| pdf�ps
|
Dipsea is a modular architecture for Distributed Hash
Tables. Two important layers in its design are: ID
Management and Overlay Routing Network Management (which
consists of three modules: Emulation Engine, Ring
Management and Choice of Routing Network). Efficient
algorithms for ID Management are presented in Chapters 2
and 3. The Emulation Engine is described in Chapter 4.
Various deterministic and randomized routing networks are
studied in Chapters 5 through 9. A summary is presented in
Chapter 10.
|
Ch 2: Balanced Binary Trees
| pdf�ps
|
An efficient ID management algorithm for DHTs works as
follows: upon arrival, a new host sends a random probe,
followed by a local probe of size (c log n), and splits
the largest manager encountered. When a randomly chosen
host departs, a local probe of size (c log n) in its
vicinity identifies the smallest manager which is
re-assigned to occupy the position of the departed
host. The scheme requires O(R + log n) messages per
arrival/departure. The ratio between the largest and
smallest paritions is 4. The algorithm is independent of
the overlay routing network.
Variations of the algorithm diminish the ratio between the
largest and the smallest partition to (1+ε), for
any ε > 0, albeit at the cost of re-assigning the
IDs of O(1/ε) existing hosts per
arrival/departure. This is the first algorithm to allow
such fine-tuning.
|
Ch 3: Coupon Collection over Cliques
| pdf�ps
|
We study Structured Coupon Collection over n/b disjoint
cliques with b nodes per clique. Nodes are initially
uncovered. At each step, we choose d nodes independently
and uniformly at random. If all the nodes in the
corresponding cliques are covered, we do nothing.
Otherwise, from among the chosen cliques with at least one
uncovered node, we select one at random and cover an
uncovered node within that clique. We show that as long as
bd ≥ c log n, O(n) steps are sufficient to cover all
nodes w.h.p. and each of the first Ω(n) steps
succeed in covering a node w.h.p. These results are then
utilized to analyze a stochastic process for growing binary
trees that are highly balanced -- the leaves of the tree
belong to at most four different levels w.h.p. The
stochastic process constitutes the core idea underlying a
practical P2P load balancing scheme that beats earlier
proposals for the same, in terms of message complexity.
|
Ch 4: Scalable and Dynamic Emulation
| pdf�ps
|
The Emulation Engine handles issues pertaining to
dynamism, scale and physical network-proximity that
arise when we mimic/emulate arbitrary families of
routing networks defined over successive powers of
two. Families over non-powers of two are first
mapped to powers of two and then emulated.
Emulation is done for RANDOM and BALANCED
distribution of IDs. RANDOM distribution results
from each participant choosing a random number in
[0, 1) as its ID. BALANCED distribution results from
any of the ID management algorithms developed in
Chapters 2 and 3. Our approach for handling
physical network-proximity is unique in that it
handles arbitrary network families, not
just hypercubes or Chord.
|
Ch 5: Optimal Routing in Chord
| pdf�ps
|
Chord is an undirected graph on 2^b nodes arranged in a
circle, with edges between pairs of nodes that are 2^k
positions apart for any k ≥ 0. The standard Chord
routing algorithm uses edges in only one direction. Our
algorithms exploit the bidirectionality of edges for
optimality. At the heart of the new protocols lie
algorithms for writing a positive integer d as the
difference of two non-negative integers d′ and
d″ such that the total number of 1-bits in the binary
representation of d' and d″ is minimized. The
solution to this problem can be encoded as a finite state
machine. By treating the state machine as a Markov chain
and identifying its steady-state distribution, we are able
to compute the average length of shortest paths in Chord,
which turns out to be b/3 + Θ(1).
|
|
Ch 6: Papillon: Greedy Routing on a Circle
| pdf�ps
|
Papillon is a deterministic network that supports efficient
GREEDY routing over n nodes placed in a circle. The
distance metric for GREEDY routing is the clockwise or the
absolute distance along the circle. With d out-going links
per node, the longest GREEDY route is O(log n / log d)
hops, which is asymptotically optimal. Papillon is a
varaint of multi-butterfly networks.
|
Ch 7: Symphony: Routing in a Small World
| pdf�ps
|
Symphony is an adaptation of Kleinberg's small-world
construction to arrive at a randomized routing
network for DHTs. The idea is to place nodes along a
circle. Each node makes a short-distance link with
its successor and a few long-distance links with
other nodes. The challenge lies in maintaining the
network in the face of arrivals and departures of
nodes. With k out-going links per node, GREEDY
routes are are O( (log^2 n) / k) on average. We
experimentally demonstrate that GREEDY with
1-LOOKAHEAD with bi-directional links diminish
average route length by 40%. This observation
motivated the theoretical analysis in Chapter 8.
|
Ch 8: Greedy Routing with Lookahead
| pdf�ps
|
Several randomized routing networks permit efficient GREEDY
routing: Symphony, randomized hypercubes, randomized Chord,
skip-graphs and constructions based upon small-world
percolation networks. In each of these networks, a node
has out-degree Θ(log n), where n denotes the total
number of nodes, and GREEDY routing is known to take O(log
n) hops on average. We establish lower-bounds for GREEDY
routing for these networks, and analyze GREEDY with
1-LOOKAHEAD routing. The idea behind 1-LOOKAHEAD, as the
name suggests, is to take a neighbor's neighbors into
account for making better routing decisions.
The following picture emerges: Deterministic routing
networks like hypercubes and Chord have diameter
Θ(log n) and GREEDY routing is optimal. Randomized
routing networks like randomized hypercubes, randomized
Chord, and constructions based on small-world percolation
networks, have diameter Θ(log n / log log n) with
high probability. The expected diameter of Skip graphs is
also Θ(log n / log log n). In all of these networks,
GREEDY routing fails to find short routes, requiring
Ω(log n) hops with high probability. Surprisingly,
GREEDY with 1-LOOKAHEAD routing is able to diminish
route-lengths to Θ(log n / log log n) hops, which is
asymptotically optimal.
|
Ch 9: Mariposa: A Randomized Butterfly
| pdf�ps
|
In Chapter 7, randomized networks were constructed with the
following sequence of operations: (a) Generation of local
random bits, (b) Establishment of long-distance links, and
(c) Inspection of non-local random bits for routing. In
this Chapter, we study a variation with the following
sequence of operations: (a) Generation of local random
bits, (b) Inspection of non-local random bits for
establishment of long-distance links, and (c)
Routing.
Mariposa is a randomized adaptation of butterfly
networks. With 3k+3 out-going links per node, Mariposa can
route in O(log n / log k) hops in the worst-case, which is
asymptotically optimal. The construction improves upon
Viceroy, an earlier butterfly-based construction, which
routes in Θ(log n) hops with Θ(1) links per
node.
|
Contains a summary of Dipsea and an outline of
possible future research directions.
|
The appendix contains the proof of a Lemma in Chapter 2.
|
References and Index
| pdf�ps
|
List of references and index entries.
|
|
Dipsea: List of Publications
|
|
The Degree-Diameter Greedy Routing Problem ���by I Abraham, D Malkhi and G S Manku ���Unpublished manuscript. |
pdf ps |
Structured Coupon Collection over Cliques for P2P Load Balance ���by K Kenthapadi and G S Manku ���Submitted for publication, available as DB-Group TR 2004-38, Stanford University, 2004. |
pdf ps ppt |
Balanced Binary Trees for ID Management and Load Balance in Distributed Hash Tables ���by G S Manku ���PODC 2004 (23rd ACM Symposium on Principles of Distributed Computing), July 2004. |
pdf ps ppt |
Know thy Neighbor's Neighbor: the Power of Lookahead in Randomized P2P Networks ���by G S Manku, M Naor and U Wieder ���STOC 2004 (36th ACM Symposium on Theory of Computing), p 53-64, June 2004. |
pdf ps ppt |
Optimal Routing in Chord ���by P Ganesan and G S Manku ���SODA 2004 (15th Annual ACM-SIAM Symposium on Discrete Algorithms), p 169-178, Jan 2004. |
pdf ps ppt |
Routing Networks for Distributed Hash Tables ���by G S Manku ���PODC 2003 (22nd ACM Symposium on Principles of Distributed Computing), p 133-142, June 2003. |
pdf ps HTML ppt |
Symphony: Distributed Hashing in a Small World ���by G S Manku, M Bawa and P Raghavan ���USITS 2003 (Proc. 4th USENIX Symposium on Internet Technologies and Systems), p 127-140, Mar 2003. |
pdf ps ppt |
SETS: Search Enhanced by Topic Segmentation ���by M Bawa, G S Manku, and P Raghavan ���SIGIR 2003 (26th Annual Intl. ACM SIGIR Conference), p 306-313, July 2003. |
|
|