C1: Distributed Balls-and-Bins

April 24, 2024

Reminder: This post contains 4098 words · 12 min read · by Xianbin

Model
Problem 1
Algorithm 1
The Lower Bound
- Thinking before looking at the paper
- The answer in the paper
Reference

Balls and bins models are quite useful in a lot of areas. In this post, let us see distributed balls and bins model. This post is based on [1]. (By the way, I really admire one of the authors, i.e., Lenzen, for his work on distributed algorithms).

Model

The system consists of \(\textbf{n}\) bins and \(\textbf{n}\) balls, and we assume this system is fault-free. In each round, the behaviors of balls and bins are as follows.

Balls \(\sqcup\) do local computations and sends messages to arbitrary bins.
Bins \(\circ\) receive these messages, do local computations and send messages to any ball they have been touched by in this or previous rounds.
Balls receive messages and may commit to a bin.

You can think of the communication graph is a clique!

Besides, all balls and bins have access to unlimited source of unbiased random bits.

Problem 1

Parallel Balls-into-Bins. We put each ball into a bin. The goals are to minimize the total number of rounds until the task is finished, the maximum number of balls in a bin, and the total number of messages.

Algorithm 1

Let \(k(1) = 1, i = 1\).
Balls contact \(k_i = \lfloor k(i) \rfloor\) bins u.r.i, asking permission for being placed
A bin only admits permission to one of the requests
Any ball receiving at least one permission selects a bin to place arbitrarily
Set \(k(i+1)=\min\{ k(i) e^{k_i/5}, \sqrt{\log n}\}, i = i + 1\).

The above algorithm is quite simple. Also, we should notice that the cost is low.

\(\textbf{Theorem 2}\). Algorithm 1 can solve problem 1 within \(\log ^* n\) rounds with message complexity \(O(n)\).

The Lower Bound

\(\textbf{Theorem 1}\). Algorithm 1 is tight.

That is to say, using \(O(n)\) messages and the maximum load of each bin is \(L\) w.h.p, the lower bound of time complexity for problem 1 is at least \(\log ^*n-\log^* L\).

Actually, the above statement is not precise. The scope of algorithms is only on symmetric algorithms in which balls and bins identify each other by uniform, independent and random port numbering.

Thinking before looking at the paper

First, let us try to solve this problem.

Since we only consider symmetric algorithm, then the behavior of the algorithms should be symmetric, i.e., there are a lot of indistinguishable copies of subgraphs.

Also, we know that the maximum load is L .

The question is:

How to use the above fact to prove a lower bound?

The subgraph should like some trees. If the time complexity \(t\) is large, it is hard to control the size of maximum load. In round \(t\), the depth of the tree that we construct to mimic the progress algorithms is at most \(c^t\) where \(c\) is some parameter. The number of equivalent trees can be \(n/d^{t+1}\) where \(d\) is some parameter. The worst case is that the load is \(n/d^{t+1}\).

Since \(n/d^{t+1}\leq L\), we obtain that \(t \geq \log_{d} \frac{n}{L}\).

The conclusion is still not as the same as Theorem 1.

I guess to make the target lower bound, we need to use similar construction in the Algorithm 1. Let \(X(i)\) be the size of a subgraph in the \(i\) round. We should obtain something like \(X(i+1) = c^{X(i)}\). Then, we can show the lower bound.

That is to say, the constructed tree in the \((i+1)\)-th round is obtained by a tree in \(i\)-th round contacting other \(c^{X(i)}\) trees.

There are too many things to prove and define. Now, let us check the answer.

The answer in the paper

It is so hard to understand this paper because of the writing. I believe the key idea is simple.

\(\textbf{Definition}(\text{Symmetric balls-into-bins)}\).

An instance of the balls-into-bins problem is symmetric, if balls address bins by port numbering uniformly, independently and randomly. We name algorithms that can be implemented under this assumption to be symmetric.

\(\textbf{Key Observation}\). At any point of the execution of a symmetric algorithm, if a ball contacts a bin that it has not contacted yet, the contacted bin is drawn uniformly at random from all bins it has not contacted yet.

This holds when conditioning on arbitrary other events, if these do not constrain the port numbers of bins the ball has not contacted so far.

\(\textbf{Definition}(\text{Balls-into-bins Graph})\). The bipartite balls-into-bins graph \(G_A(t)\) associated with an execution of the symmetric algorithm \(A\) that has run for \(t\) rounds as the following.

The vertex set \(V:=B\cup U\) where \(\lvert B \rvert = \lvert U\rvert = n\).
In each round, each ball \(b\in B\) adds an edge connecting itself to a bin \(u\in U\) if \(b\) contacts a bin \(U\) for the first time in that round. We use \(E_A(i)\) to denote the added edges in round \(i\) and \(G_A(t) = (V, \cup E_A(i))\) is the whole graph containing all edges until the last round.

To be continued ...

Reference

[1]. Lenzen, Christoph, and Roger Wattenhofer. “Tight bounds for parallel randomized load balancing.” Proceedings of the forty-third annual ACM symposium on Theory of computing. 2011.