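$$
\begin{aligned}
\nabla_{v,\pi} \log p(y \mid X, v, \pi)
&= \frac{1}{p(y \mid X, v, \pi)} \int \nabla_{v,\pi} \log p(y, \alpha \mid X, v, \pi)\, p(y, \alpha \mid X, v, \pi)\, d\alpha \\
&= \frac{1}{p(y \mid X, v, \pi)} \int \nabla_{v,\pi} \log p(y, \alpha \mid X, v, \pi)\, p(y \mid X, v, \alpha)\, p(\alpha \mid \pi)\, d\alpha \\
&= \frac{1}{p(y \mid X, v, \pi)} \sum_{k=1}^{K} \nabla_{v,\pi} \log p(y, \alpha_k \mid X, v, \pi)\, p(y \mid X, v, \alpha_k) \\
&= \frac{1}{p(y \mid X, v, \pi)} \sum_{k=1}^{K} \nabla_{v,\pi} \bigl( \log p(y \mid X, v, \alpha_k) + \log p(\alpha_k \mid \pi) \bigr)\, p(y \mid X, v, \alpha_k) \\
&= \frac{1}{p(y \mid X, v, \pi)} \sum_{k=1}^{K} \Bigl[ \nabla_{v} \log p(y \mid X, v, \alpha_k)\, p(y \mid X, v, \alpha_k) + \nabla_{\pi} \log p(\alpha_k \mid \pi)\, p(y \mid X, v, \alpha_k) \Bigr] \\
&= \sum_{k=1}^{K} \frac{p(y \mid X, v, \alpha_k)}{p(y \mid X, v, \pi)}\, \nabla_{v} \log p(y \mid X, v, \alpha_k) + \sum_{k=1}^{K} \frac{p(y \mid X, v, \alpha_k)}{p(y \mid X, v, \pi)}\, \nabla_{\pi} \log p(\alpha_k \mid \pi)
\end{aligned}
$$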
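The last line of the derivation is a weighted sum of per-point gradients, with weights p(y∣X,v,α_k) / p(y∣X,v,π). The sketch below only illustrates that arithmetic. The function name `weighted_score` and all of its arguments are hypothetical, the marginal log-likelihood is assumed to be supplied by the caller, and the example numbers are made up. The ratio is formed in log space, which is an added numerical precaution and not part of the derivation.

```python
import math

def weighted_score(log_lik, log_marginal, grad_v, grad_pi):
    """Combine per-point gradients as in the final line of the derivation.

    log_lik[k]   : log p(y | X, v, alpha_k)                 (assumed given)
    log_marginal : log p(y | X, v, pi)                      (assumed given)
    grad_v[k]    : gradient of log p(y | X, v, alpha_k) w.r.t. v
    grad_pi[k]   : gradient of log p(alpha_k | pi) w.r.t. pi
    """
    # w_k = p(y | X, v, alpha_k) / p(y | X, v, pi), computed as exp of a log difference
    weights = [math.exp(ll - log_marginal) for ll in log_lik]
    # Weighted sums over k, one coordinate at a time
    score_v = [sum(w * g[i] for w, g in zip(weights, grad_v))
               for i in range(len(grad_v[0]))]
    score_pi = [sum(w * g[i] for w, g in zip(weights, grad_pi))
                for i in range(len(grad_pi[0]))]
    return weights, score_v, score_pi

# Hypothetical example with K = 2 points; the values are chosen for illustration only.
log_lik = [math.log(0.2), math.log(0.6)]
log_marginal = math.log(0.8)
grad_v = [[1.0, 0.0], [0.0, 2.0]]
grad_pi = [[4.0], [-4.0]]

weights, score_v, score_pi = weighted_score(log_lik, log_marginal, grad_v, grad_pi)
print(weights, score_v, score_pi)
```

With these inputs the weights come out close to 0.25 and 0.75, so the second point dominates both sums.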