1. Improving Generalization Performance by Switching from Adam to SGD
