Comments by "Jym Caballero" (@jymcaballero5748) on "This AI model just trolled the world" video.
-
4
-
Even if you use their numbers, if you create a variable named "time to response", it will come in last place in every test.
This is a problem that even human kids have.
Let me give you an example: if you give a child a 3-digit multiplication and, once he finds a result, force him to perform a check to verify that it is correct, what happens if the arithmetic of the check fails? The child is left wondering whether he failed in the operation or in the check, which leads to a vicious circle of checks, because even if the check succeeds, how reliable is it? ;D
The only benchmark that should be used is an AI arena with brackets, like a human championship, where each match is a head-to-head of questions and answers.
100 rounds of questions, with each LLM asking the other LLM a question.
The LLM that wins the championship is the king: no metrics, just which model is the best.
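A rough sketch of what that arena could look like in Python. Everything here is an assumption for illustration: the `Model` class, its `skill` number (a stand-in for actually asking a question and judging the answer), the coin-flip tie-break, and the bye for an odd model out are all made up, since the idea above only fixes the bracket format and the 100 rounds.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    skill: float  # hypothetical: chance of answering the opponent's question correctly


def answers_correctly(model: Model, rng: random.Random) -> bool:
    # Placeholder for the real step: opponent asks, model answers, someone judges.
    return rng.random() < model.skill


def play_match(a: Model, b: Model, rng: random.Random, rounds: int = 100) -> Model:
    score_a = score_b = 0
    for _ in range(rounds):
        # In each round both models ask one question and answer one question.
        score_a += answers_correctly(a, rng)
        score_b += answers_correctly(b, rng)
    if score_a == score_b:
        return rng.choice([a, b])  # assumed tie-break: coin flip
    return a if score_a > score_b else b


def run_bracket(models: list[Model], rng: random.Random, rounds: int = 100) -> Model:
    if not models:
        raise ValueError("need at least one model")
    current = list(models)
    while len(current) > 1:
        next_round = []
        # Pair neighbours; the winner of each pair advances.
        for i in range(0, len(current) - 1, 2):
            next_round.append(play_match(current[i], current[i + 1], rng, rounds))
        if len(current) % 2 == 1:
            next_round.append(current[-1])  # assumed: odd model out gets a bye
        current = next_round
    return current[0]
```

Calling `run_bracket(list_of_models, random.Random(0))` returns the champion. The hard part this skips is the judge: somebody still has to decide whether an answer to the other LLM's question is right.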
2