by Xianbin
$$
I(A:B \mid C) = \sum_{c} p(c) \sum_{a,b} p(a,b \mid c) \log \frac{p(a,b \mid c)}{p(a \mid c)\, p(b \mid c)}
$$
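$$
\begin{aligned}
I(A:B \mid C) &= \sum_{c} p(c) \sum_{a,b} p(a,b \mid c) \log \frac{p(a,b \mid c)}{p(a \mid c)\, p(b \mid c)} \\
&= \sum_{c} p(c) \sum_{a,b} \frac{p(a,b,c)}{p(c)} \log \frac{p(a,b,c)}{p(a \mid c)\, p(b \mid c)\, p(c)} \\
&= \sum_{a,b,c} p(a,b,c) \log \frac{p(a,b,c)}{p(a \mid c)\, p(b \mid c)\, p(c)} \\
&= D_{KL}\big(p(a,b,c) \,\|\, p(a \mid c)\, p(b \mid c)\, p(c)\big)
\end{aligned}
$$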
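As a numerical sanity check of the derivation above, the following minimal sketch compares the definition of $I(A:B \mid C)$ with the joint KL form. The random joint distribution, the alphabet sizes, and all variable names are assumptions made for illustration; they do not come from the post.

```python
import numpy as np

# Hypothetical joint distribution p(a, b, c) on a 3 x 4 x 2 alphabet
# (axis 0 = a, axis 1 = b, axis 2 = c). Sizes and seed are arbitrary.
rng = np.random.default_rng(0)
p_abc = rng.random((3, 4, 2))
p_abc /= p_abc.sum()

p_c = p_abc.sum(axis=(0, 1))                            # p(c), shape (2,)
p_ab_given_c = p_abc / p_c                              # p(a,b|c), shape (3, 4, 2)
p_a_given_c = p_ab_given_c.sum(axis=1, keepdims=True)   # p(a|c), shape (3, 1, 2)
p_b_given_c = p_ab_given_c.sum(axis=0, keepdims=True)   # p(b|c), shape (1, 4, 2)

# Definition: sum_c p(c) sum_{a,b} p(a,b|c) log[ p(a,b|c) / (p(a|c) p(b|c)) ]
inner = (p_ab_given_c
         * np.log(p_ab_given_c / (p_a_given_c * p_b_given_c))).sum(axis=(0, 1))
cmi_definition = float((p_c * inner).sum())

# KL form: D_KL( p(a,b,c) || p(a|c) p(b|c) p(c) )
q_abc = p_a_given_c * p_b_given_c * p_c                 # shape (3, 4, 2)
cmi_kl = float((p_abc * np.log(p_abc / q_abc)).sum())

print(cmi_definition, cmi_kl)
```

The two printed values should agree up to floating-point error, and `q_abc` sums to one, so the last line of the derivation is a KL divergence between two proper distributions.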
Notice that
$$
\frac{p(a,b,c)}{p(a \mid c)\, p(b \mid c)\, p(c)} = \frac{p(a \mid bc)}{p(a \mid c)} = \frac{p(b \mid ac)}{p(b \mid c)}
$$
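Both equalities come from factoring the joint distribution with the chain rule in two different orders. The intermediate step, spelled out here as a short sketch, is:

$$
\begin{aligned}
p(a,b,c) &= p(a \mid bc)\, p(b \mid c)\, p(c)
&&\Longrightarrow&
\frac{p(a,b,c)}{p(a \mid c)\, p(b \mid c)\, p(c)} &= \frac{p(a \mid bc)}{p(a \mid c)}, \\
p(a,b,c) &= p(b \mid ac)\, p(a \mid c)\, p(c)
&&\Longrightarrow&
\frac{p(a,b,c)}{p(a \mid c)\, p(b \mid c)\, p(c)} &= \frac{p(b \mid ac)}{p(b \mid c)}.
\end{aligned}
$$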
So,
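$$
\begin{aligned}
I(A:B \mid C) &= \sum_{a,b,c} p(a,b,c) \log \frac{p(a,b,c)}{p(a \mid c)\, p(b \mid c)\, p(c)} \\
&= \sum_{a,b,c} p(a,b,c) \log \frac{p(a \mid bc)}{p(a \mid c)} \\
&= \sum_{a,b,c} p(a,b,c) \log \frac{p(b \mid ac)}{p(b \mid c)} \\
&= \sum_{a,b,c} p(bc)\, p(a \mid bc) \log \frac{p(a \mid bc)}{p(a \mid c)} \\
&= \mathbb{E}_{p(bc)}\, D_{KL}\big(p(a \mid bc) \,\|\, p(a \mid c)\big) \\
&= \sum_{a,b,c} p(ac)\, p(b \mid ac) \log \frac{p(b \mid ac)}{p(b \mid c)} \\
&= \mathbb{E}_{p(ac)}\, D_{KL}\big(p(b \mid ac) \,\|\, p(b \mid c)\big)
\end{aligned}
$$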
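The two expected-KL forms can also be checked numerically. The sketch below is an illustration under assumptions: the random joint distribution, the alphabet sizes, and the variable names are hypothetical and not part of the post.

```python
import numpy as np

# Hypothetical joint distribution p(a, b, c) on a 3 x 4 x 2 alphabet
# (axis 0 = a, axis 1 = b, axis 2 = c). Sizes and seed are arbitrary.
rng = np.random.default_rng(1)
p_abc = rng.random((3, 4, 2))
p_abc /= p_abc.sum()

# Marginals, kept broadcastable against p_abc.
p_c = p_abc.sum(axis=(0, 1), keepdims=True)   # p(c),   shape (1, 1, 2)
p_bc = p_abc.sum(axis=0, keepdims=True)       # p(b,c), shape (1, 4, 2)
p_ac = p_abc.sum(axis=1, keepdims=True)       # p(a,c), shape (3, 1, 2)

# Conditionals.
p_a_given_c = p_ac / p_c                      # p(a|c),  shape (3, 1, 2)
p_b_given_c = p_bc / p_c                      # p(b|c),  shape (1, 4, 2)
p_a_given_bc = p_abc / p_bc                   # p(a|bc), shape (3, 4, 2)
p_b_given_ac = p_abc / p_ac                   # p(b|ac), shape (3, 4, 2)

# Reference value: D_KL( p(a,b,c) || p(a|c) p(b|c) p(c) ).
cmi_joint = float(
    (p_abc * np.log(p_abc / (p_a_given_c * p_b_given_c * p_c))).sum()
)

# E_{p(bc)} D_KL( p(a|bc) || p(a|c) ): KL over a for each (b, c), then average.
kl_over_a = (p_a_given_bc * np.log(p_a_given_bc / p_a_given_c)).sum(axis=0)
cmi_via_a = float((p_bc[0] * kl_over_a).sum())

# E_{p(ac)} D_KL( p(b|ac) || p(b|c) ): KL over b for each (a, c), then average.
kl_over_b = (p_b_given_ac * np.log(p_b_given_ac / p_b_given_c)).sum(axis=1)
cmi_via_b = float((p_ac[:, 0, :] * kl_over_b).sum())

print(cmi_joint, cmi_via_a, cmi_via_b)
```

All three printed values should coincide up to floating-point error, which reflects the symmetry of $I(A:B \mid C)$ in $A$ and $B$.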