Toolbox

C1: Combinatorial Correlation Clustering

Reminder: This post contains 1871 words · 6 min read · by Xianbin

This post is based on [1].

Efficient MPC Algorithms for Correlation Clustering

Local Search is a powerful method for clustering problems. Again, this tool shows us its strength in correlation clustering.

Algorithm 1 (Basic 2-approximation local search algorithm)

Let SS be the solution output by the algorithm and let OPT be the optimum solution.

  For each cluster C in OPT, 
             1) Consider S_C that is by inserting C 
             into S as a cluster and removing 
             elements from S. 
             2) Update S by inserting S_, reducing the cost.

The naive way of finding the SCS_C is using an exponential algorithm. Now, let us forget how to obtain SCS_C first and focus on the approximation ratio.

2-app

In the above picture, we use x1,x2x_1, x_2 with signal ++ to denote the increased cost by inserting CC into SS (for the nodes in CC). We use y1,y2y_1, y_2 with signal - to denote the saved cost by inserting CC into SS (for nodes in CC). Let commcomm to denote the common cost between the the optimal algorithm and an algorithm.

Due to local optimality, (it means that we cannot reduce the cost by inserting CC), we have that

comm+x1+x2comm+y1+y2comm + x_1 + x_2 \geq comm + y_1 + y_2

We use cost(OPT)cost(OPT) to denote the optimum cost and use cost(S)cost(S) to denote the cost of an algorithm. Now, we get that

cost(OPT)=Ccomm+x1/2+x2cost(OPT) = \sum_C comm+x_1/2 + x_2 cost(S)=Ccomm+y1+y2/2cost(S) = \sum_C comm+y_1 + y_2/2

Then, we have

cost(S)Ccomm+y1+y2Ccomm+x1+x22(Ccomm+x1/2+x2)=2cost(OPT).\begin{aligned} cost(S) \leq \sum_C comm+y_1 + y_2 & \\ \leq \sum_C comm+x_1+x_2 & \\ \leq 2(\sum_C comm + x_1/2 + x_2) &\\ =2\cdot cost(OPT). \end{aligned}

An improved version of 15/8-approximation

Let AlgiAlg_i be a local optimum clustering with weights wiw_i.

The Process:

  1. Set Algo1Algo_1 with weight function w1w_1;

  2. Double the weight of edges in E+EXT(Alg1)E^+ \cap EXT(Alg_1) to get a new weight function w2w_2 where EXT(?)EXT(?) is the edges across clusters.

  3. Set Algo2Algo_2 with weight function w2w_2;

  4. Output the best of Algo1Algo_1 and Algo2Algo_2.

A Complex Version


To be continued \color{red} \text{To be continued \ldots}

Reference

[1]. Combinatorial Correlation Clustering, Vincent Cohen-Addad, David Rasmussen Lolck, Marcin Pilipczuk, Mikkel Thorup, Shuyi Yan, and Hanwen Zhang. Proceedings of the Symposium on Theory of Computing (STOC) 2024.