Comments by "bycloud" (@bycloudAI) on "A Slightly Technical Breakdown of DeepSeek-R1" video.
Google is not really transparent with its specs, so I don't know either tbh
29
I wanted to talk about GRPO, but that formula might be too scary for this video lol. And I think for R1 training they used DeepSeek-V3-Base (which they explicitly specified in the R1 report), not the SFT'd DeepSeek-V3. On page 23 (thanks for referencing btw), they were talking about applying SFT on the base model: "Following this, we conduct post-training, including Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) on the base model of DeepSeek-V3, to align it with human preferences and further unlock its potential." So ya, DeepSeek-V3-Base is not an SFT model, but DeepSeek-V3 is, and R1 used DeepSeek-V3-Base.
4
DeepSeek-R1 paper, page 3: https://arxiv.org/pdf/2501.12948 https://imgur.com/a/UvMhzbr
4
check out this guy's videos and blogs https://jalammar.github.io/
2