@article{liu2023sophia, title={Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training}, author={Liu, Hong and Li, Zhiyuan and Hall, David and Liang, Percy and Ma, Tengyu} ...