Comments by "Mikko Rantalainen" (@MikkoRantalainen) on the "DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters | Lex Fridman Podcast #459" video.
Self-domestication isn't that novel an idea. Studies from 2012 already discuss the self-domestication of bonobos, with a rationale very similar to what the AI emitted here.
Looking forward to the full interview. This short teaser clip looked really good!
3:03:30 One interesting way for future UIs to interact with AI is something I've only seen on video (I don't know if it was faked, but the idea seems solid): allow the user to interrupt R1's thinking process, modify the text on the fly, and then let R1 continue its thinking. That way you can guide its reasoning if you don't like some specific part of its chain of thought, or if you think it made a specific mistake. For example, if it wonders whether thing X is true and concludes "yes, it's true" but you know that's wrong, you can fix the mistake and let R1 continue from there. This is possible because these AI models are deterministic in the sense that the output depends only on the model, the system prompt, the user input, and the PRNG seed results so far. If you modify the "history of generated output" and let the AI continue from there, it will happily do so and take your correction into account.
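A minimal sketch of what that interaction could look like, assuming a local OpenAI-compatible completions endpoint (the endpoint URL, model name, and <think> tag format are illustrative assumptions, not something shown in the video):

```python
# Sketch of "edit the chain of thought, then let the model continue".
# Assumes a local OpenAI-compatible /v1/completions server (e.g. llama.cpp
# or vLLM serving an R1-style model); names and URL are placeholders.
import requests

def continue_thinking(prompt: str, edited_thoughts: str,
                      base_url: str = "http://localhost:8000/v1") -> str:
    # Feed the user prompt plus the *edited* partial chain of thought back
    # as a raw text prefix; the model deterministically continues from it
    # (same weights, same sampling settings, same seed => same output).
    prefix = f"{prompt}\n<think>\n{edited_thoughts}"
    resp = requests.post(f"{base_url}/completions", json={
        "model": "deepseek-r1-distill",   # placeholder model name
        "prompt": prefix,
        "max_tokens": 512,
        "temperature": 0.0,               # greedy decoding: reproducible
        "seed": 42,                       # fixed seed if sampling is used
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]

# Example: the model concluded "yes, X is true" mid-thought; we rewrite
# that step and let it carry on from the corrected assumption.
fixed = "Is X true? Checking the definition again... no, X is false, because"
print(continue_thinking("Solve the puzzle about X.", fixed))
```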
1:45:00 This nationwide investment into "low tech" technologies has been important in solar panels, too. The majority of solar panels are manufactured in China, so is it really a wonder that the majority of the people who can tweak the processes to manufacture improved parts are also there? When the US (or Europe) outsources stuff to China, it not only outsources the physical manufacturing of the given part, it also outsources the whole subvendor chain in the long run. And getting all of that back to the US or Europe later is next to impossible.
8:30 As a software developer, the MIT licensing is the biggest thing about DeepSeek. All the other published weights, including Llama, are too restrictive when it comes to licensing. If you cannot have totally free licensing, why would you spend your own resources to build something on top of the model? Do you really want to position yourself as a customer of a commercial vendor in a monopoly situation, and still spend extra money to make yourself fully dependent on that position?
A failed training run in AI training is not that different from a SpaceX Starship exploding before reaching orbit. So far a failed AI training run has been cheaper, but if we scale the models 10x more, the cost per try starts to be in the same range!
What we really need is a distilled model of R1 where the distilled model is not based on Llama or Qwen. That distilled model should be around 10B-16B parameters to be usable on real-world gaming GPUs. Hopefully it would be MIT licensed, too.
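As a rough sanity check on that size range, here is some back-of-the-envelope VRAM arithmetic (the quantization level and overhead factor are my assumptions, not from the video):

```python
# Why ~10B-16B parameters is the sweet spot for gaming GPUs: at 4-bit
# quantization the weights plus runtime overhead fit in 12-16 GB of VRAM.
def vram_gb(params_billion: float, bits_per_weight: int,
            overhead: float = 1.2) -> float:
    # weights + ~20% for KV cache and activations (rough rule of thumb)
    return params_billion * (bits_per_weight / 8) * overhead

for size in (7, 14, 16, 70):
    need = vram_gb(size, 4)
    print(f"{size}B @ 4-bit: ~{need:.1f} GB "
          f"(fits a 16 GB gaming GPU: {need <= 16})")
# 14B needs ~8.4 GB and 16B ~9.6 GB, while 70B needs ~42 GB --
# well beyond any consumer card.
```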
59:30 I think the new export rules are just stupid. If you want to limit the performance of AI supercomputers, you have to limit the interconnect technology, not the computing power per card. You can always run more GPUs in parallel to reach the same total GFLOPS as an unrestricted GPU, but if you limit the interconnects, you can only effectively use the GFLOPS of a single GPU, because you cannot combine the GPUs into big, fast clusters. If you allow fast interconnect technology, the workaround is simply "buy many small parts and use the interconnects to put them together into one big system". That is, unless the plan was just to make using Nvidia GPUs more expensive for China: when you allow a high-performance interconnect but cap max GFLOPS at 4x slower than what could otherwise be built, you instantly sell 4x more GPUs for the same use cases!
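A toy model of that argument, with all numbers and the communication-cost model as illustrative assumptions: capping per-GPU FLOPS is trivially offset by adding cards, while capping interconnect bandwidth slows the whole cluster:

```python
# Crude model: a fixed fraction of step time is communication whose cost
# scales inversely with bandwidth; compute scales with GPU count.
# All figures are made up for illustration, not real hardware specs.
def effective_tflops(n_gpus: int, tflops_per_gpu: float,
                     interconnect_gbps: float,
                     comm_fraction: float = 0.3) -> float:
    compute = n_gpus * tflops_per_gpu
    # penalty relative to an assumed 900 GB/s NVLink-class baseline link
    comm_penalty = comm_fraction * (900 / interconnect_gbps)
    return compute / (1 + comm_penalty)

# 4x-slower GPUs, 4x as many of them, full-speed interconnect:
# essentially the same cluster (and 4x the GPU sales).
print(effective_tflops(1000, 1000, 900))  # unrestricted GPUs
print(effective_tflops(4000,  250, 900))  # capped FLOPS, 4x the cards
# Same capped cards but the interconnect cut ~8x: the cluster itself
# slows down, and no amount of extra cards fixes it.
print(effective_tflops(4000,  250, 112))
```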