Paper:
"A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning"
Algorithm description:
=====================================================
Mixing Policy:
=====================================================