Simple Preference Optimization (SimPO), by Yu Meng, Mengzhou Xia, and Danqi Chen, is a preference optimization algorithm that the authors present as simpler and more effective than DPO, and that does not require a reference model. The ...
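To make the "no reference model" point concrete, here is a minimal sketch of a SimPO-style loss for a single preference pair. It follows the general formulation in the SimPO paper: the implicit reward is the policy's length-normalized log-probability of a response, and a target margin separates the chosen response from the rejected one. The function name `simpo_loss`, its argument names, and the default values for `beta` and `gamma` are illustrative assumptions, not taken from this text or from any particular library.

```python
import math


def simpo_loss(chosen_logp_sum, chosen_len, rejected_logp_sum, rejected_len,
               beta=2.0, gamma=1.0):
    """Sketch of a SimPO-style loss for one (chosen, rejected) pair.

    chosen_logp_sum / rejected_logp_sum: summed token log-probabilities of
        each response under the policy being trained (no reference model).
    chosen_len / rejected_len: number of tokens in each response.
    beta, gamma: reward scale and target margin (illustrative defaults).
    """
    if chosen_len <= 0 or rejected_len <= 0:
        raise ValueError("response lengths must be positive")

    # Length-normalized implicit rewards: beta * average token log-probability.
    chosen_reward = beta * chosen_logp_sum / chosen_len
    rejected_reward = beta * rejected_logp_sum / rejected_len

    # The chosen reward should exceed the rejected reward by at least gamma.
    margin = chosen_reward - rejected_reward - gamma

    # -log(sigmoid(margin)) computed as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Chosen response has the higher average log-probability, so the loss is small.
    print(simpo_loss(-10.0, 10, -30.0, 10))
    # Swapping the two responses yields a larger loss.
    print(simpo_loss(-30.0, 10, -10.0, 10))
```

By contrast, a DPO-style loss would also need the log-probabilities of both responses under a frozen reference model; in this sketch only the policy's own log-probabilities appear.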