I'm starting to think loss is harmful. Our loss has been flat at 2.2 for the past five days of training GPT-2 1.5B. Yet according to human testing, it's been getting noticeably better every day. It's now good enough to amuse /r/dota2: reddit.com/r/DotA2/comments/…
Linked thread: "I've been working on an /r/dota2 simulator. (It's like..." — posted in r/DotA2 by u/shawwwn • 96 points, 25 comments (reddit.com)
Dec 31, 2019 · 11:40 PM UTC
The dota2 data is only 0.73% of the overall training data (73MB out of 10GB). Yet the bot is adept enough to convince /r/dota2 that it's talking about dota. Again: Loss has been a flat 2.2 for the last five days. And five days ago, the model wasn't this good.
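One way a flat average could hide real improvement: the dota slice is only ~0.73% of the tokens, so even a huge drop in its loss barely moves the overall number. A toy sketch with made-up loss values (the per-slice losses here are hypothetical, chosen only to illustrate the arithmetic):

```python
# Toy illustration: overall mean cross-entropy can look flat while loss
# on a small data slice improves dramatically. All loss values are invented.

tokens_total = 1_000_000
tokens_dota = int(tokens_total * 0.0073)   # ~0.73% of tokens, matching the data mix
tokens_rest = tokens_total - tokens_dota

def mean_loss(loss_dota: float, loss_rest: float) -> float:
    """Token-weighted average loss over the two slices."""
    return (loss_dota * tokens_dota + loss_rest * tokens_rest) / tokens_total

early = mean_loss(loss_dota=3.0, loss_rest=2.2)  # dota slice still poorly modeled
late = mean_loss(loss_dota=1.0, loss_rest=2.2)   # dota slice now modeled well

print(round(early, 3), round(late, 3))  # → 2.206 2.191
```

A 2.0-nat improvement on the dota tokens shifts the overall loss by only ~0.015 nats, which is invisible on a training curve that reads "flat 2.2" — yet it's exactly the slice humans are judging.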
The history of science shows that when something seems out of place, we should pay close attention. Loss != quality. This topic deserves thorough analysis. And as far as I know, no one has done it yet. Otherwise people wouldn't be using loss as a quality metric.