C1: Multi-party Communication Model Applications (1): Lower bounds for distributed sketching of maximal matchings and maximal independent sets
- Communication Model
- A Hard Distribution for Maximal Matching
- The Lower bound
- Exercises:
- Summary and Extension
- Reference
This post is based on [1]. We will see how to prove lower bonds in multi-party communication model using information theory.
We say that a graph is an -Ruzsa-Szemerédi graph if and only if the set can be partitioned into induced matchings (there is no edges between endpoints of non-matching edges) each of which has size .
(RS graph). For infinitely many integer , there are -RS graph on nodes where and .
RS graphs are used a lot in proof of lower bounds. It has an interesting property: sparse locally (disjoint edges), dense globally.
Communication Model
We consider distributed sketching model, or broadcast congested clique (one-round) model. Consider a undirected graph , each node has a player. Each player of node knows
- the ID of node
- the set of neighbors of , i.e.,
The referee has no input.
Players can simultaneously send a message to the referee and the referee outputs the solution.So, the round complexity depends on the length of the message sent by players!
Smart readers might find that this is exactly “one-round” lower bound!
A Hard Distribution for Maximal Matching
Roughly speak, the hard instance is a graph where . To be a little more specific, we have the following process:
- Each an modified graph from a fixed RS graph by randomly dropping edges with some probability.
- Let some vertices be public vertices, i.e., .
For the second one, we can use the same nodes in when making the union of all .
Therefore, we have two kinds of vertices
- A set of public vertices .
- A set of private vertices (with a much larger size than the public vertices set)
More details about . From , we randomly select an induced matching . Let denote the index. Then we have incident vertices (because the size of matching is ). We let the rest of be the public vertices.
The goal to the above construction is to guarantee the following property. We use to denote the distribution of graphs by the above process.
. With probability at least , every maximal matching of has at least edges whose both endpoints are private nodes.
Let denote the nodes in . and where .
.
In a word, the construct graph G is like this: . Then, for each use as private nodes . So, the total number of nodes is .
The Lower bound
To prove the lower bound, we can make any algorithm smarter or give algorithms some information. It is totally OK. In the following, we add some extra information (power) to the distributed sketching model.
- We have more than players, i.e., players.
- Each public player of a public vertex knows in G.
- Each has private players and each player of a node only knows in .
Note that, the public players have more power than private players.
Preparation of the Proof
Next, it comes to my favorite part.
The most important thing in probability theory and communication theory is to define events, variables correctly and efficiently.
Fix a deterministic protocol for finding a maximal matching on the graph sampled from with error probability at most 0.01.
We use to denote the the collective messages of all public players. For each , we use to denote the collective messages of each private players. We use to denote the messages of all private players, and denotes all the messages.
We use to denote the variables representing the value of in . Recall that we randomly drop edges in during construction of . Let indicate the states of edges in . For example, means that the first edge in is not removed, the second edge is removed, and so on.
In summary, we have
- : the total messages.
- : the messages of all public messages.
- : the messages of all private messages.
- represent the input of the referee ( we can make referee smarter when proving lower bounds).
Overview
Let be the number of bits each player communicates to the referee in the worst case.
.
. .
.
By Lemma 1, Lemma 2 and Lemma 3,
So, .
As , we get that
Exercises:
Prove Lemma 1, Lemma 2 and Lemma 3.
Hints:
Intuitively, Lemma 1 means that to reveal the most of matching that is between private nodes, any transcript needs to know which edge survives the random sampling. The size of matching is , so basically, it needs to edges for each . Then, the size is .
Summary and Extension
Actually, the instances of many matching problems are disjoint edges. In this post, if we remove all public nodes, they are disjoint edges.
Reference
[1]. Assadi, S., Kol, G. and Oshman, R., 2020, July. Lower bounds for distributed sketching of maximal matchings and maximal independent sets. In Proceedings of the 39th Symposium on Principles of Distributed Computing (pp. 79-88).
[2]. Ruzsa, I. and Szemerédi, E., 1978. Triple systems with no six points carrying three triangles. Combinatorics (Proceedings of Fifth Hungarian Colloquium, Keszthely, 1976), Vol. II, 18: 939–945, Mathematical Society Colloquium János Bolyai, 18.