Scaling Laws for Neural Language Models (2020) Review

Scaling Laws for Neural Language Models: "We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude." (arxiv.org)

0. Key summary: When using a cross-entropy loss, Transformer-family language models ... model size, dataset si..

2024. 3. 20.
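As a quick orientation for the abstract's power-law claim, a minimal sketch of the functional forms the paper fits is given below. The symbols N (non-embedding parameters), D (dataset size in tokens), and C_min (minimum compute) follow the paper's notation; the exponent values are the paper's approximate fitted constants and are quoted here for illustration, not as exact figures.

```latex
% Approximate power-law scaling forms from Kaplan et al. (2020).
% N: non-embedding parameters, D: dataset tokens, C_min: minimum compute.
% Exponents are the paper's reported approximate fits.
\begin{aligned}
L(N)        &= \left(\frac{N_c}{N}\right)^{\alpha_N},             &\quad \alpha_N &\approx 0.076 \\
L(D)        &= \left(\frac{D_c}{D}\right)^{\alpha_D},             &\quad \alpha_D &\approx 0.095 \\
L(C_{\min}) &= \left(\frac{C_c}{C_{\min}}\right)^{\alpha_C^{\min}}, &\quad \alpha_C^{\min} &\approx 0.050
\end{aligned}
```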