MoE Just Broke AI — Is NVIDIA Stealing All the Credit?

blogs.nvidia.com

So let me get this straight: the top 10 open-source AI models all use a mixture-of-experts (MoE) architecture, and suddenly NVIDIA’s GB200 NVL72 makes them 10x faster. Coincidence? Or did MoE just become the golden child of AI while everyone else scrambles to catch up?

MoE mimics the brain by activating only the right experts per token. That’s elegant. But NVIDIA didn’t invent MoE — they just optimized their way into being the only ones who can run it well. So tell me: who’s really driving innovation here?
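
For anyone who wants to see what "activating only the right experts per token" can look like, here is a minimal sketch of top-k routing. It is an illustration under assumptions, not any particular model's router: the names `route_token` and `moe_layer` are made up for this example, experts are plain Python functions on a scalar instead of neural networks on vectors, and renormalizing the weights after the top-k cut is just one common variant.

```python
import math

def route_token(router_logits, k=2):
    """Pick the k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    # Softmax over all experts (subtract the max for numerical stability).
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts; the rest stay idle for this token.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize so the selected experts' weights sum to 1 (an assumed variant).
    kept = sum(probs[i] for i in top)
    return [(i, probs[i] / kept) for i in top]

def moe_layer(token, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](token) for i, w in route_token(router_logits, k))

# Hypothetical example: 4 toy experts, only 2 of them run for this token.
toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: x + 1]
picked = route_token([0.1, 2.0, -1.0, 1.5], k=2)
output = moe_layer(3.0, toy_experts, [0.1, 2.0, -1.0, 1.5], k=2)
```

With these logits the router selects experts 1 and 3 and skips the other two, which is where the compute saving comes from: the model can hold many experts while each token pays for only k of them.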

Comments (8)

ML Engineer at DeepL

Let’s be real — MoE is brilliant, but without Blackwell’s NVLink fabric, it’s a pain to scale. We tried MoE on H200, and the all-to-all comms killed latency. GB200 NVL72 fixes that with a unified memory pool and smart orchestration. NVIDIA didn’t invent MoE, but they made it production-ready.
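
To make the all-to-all point concrete, here is a rough counting sketch. It assumes experts are sharded across GPUs (expert parallelism) and simply counts how many token-to-expert dispatches land on a different GPU than the token. The function names and the tiny placement tables are hypothetical, and the sketch says nothing about actual latency or bandwidth on any specific hardware.

```python
def remote_dispatch_fraction(assignments, token_gpu, expert_gpu):
    """Share of (token, expert) pairs whose expert sits on another GPU.

    assignments: list of (token_id, expert_id) pairs chosen by the router.
    token_gpu / expert_gpu: dicts mapping ids to the GPU that holds them.
    """
    remote = sum(1 for t, e in assignments if token_gpu[t] != expert_gpu[e])
    return remote / len(assignments)

def expected_remote_fraction_uniform(num_gpus):
    """Expected remote share if routing is uniform over evenly sharded experts.

    This uniformity is an assumption; real routers are usually skewed.
    """
    return 1 - 1 / num_gpus

# Hypothetical layout: 4 tokens and 4 experts split across 2 GPUs, top-2 routing.
token_gpu = {0: 0, 1: 0, 2: 1, 3: 1}
expert_gpu = {0: 0, 1: 0, 2: 1, 3: 1}
assignments = [(0, 0), (0, 2), (1, 1), (1, 3), (2, 0), (2, 2), (3, 2), (3, 3)]
fraction = remote_dispatch_fraction(assignments, token_gpu, expert_gpu)
```

Under the uniform assumption the remote share grows as 1 - 1/N with the number of GPUs, so the more widely the experts are spread, the more of every layer's traffic has to cross the interconnect. That is why the fabric matters so much for this kind of model.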

Cloud Bro Engineer

Oh yeah, and let’s not forget NVIDIA Cloud Partners are rolling this out now. AWS, Azure, Google Cloud — they’re all serving Kimi K2 and Mistral Large 3 on GB200. So if you think open-source is free, think again: you're paying in cloud bills, not licenses.

Open-Source Idealist

So the 'democratization of AI' just means corporations rent NVIDIA’s $300K racks to fine-tune open models? That’s not access — that’s extraction. We need open hardware too.

Hardware Realist

Open hardware won't solve latency in expert-parallel models. NVLink Switch’s 130TB/s fabric is real. It’s like comparing a bicycle to a bullet train. Wishing doesn't scale AI.

AI Ethicist 2030

The real issue isn’t who owns the hardware — it’s who controls the router. That’s the puppet master in MoE. And right now? It’s all black box magic.

NVIDIA Apologist

Look, NVIDIA’s full-stack optimizations — from Dynamo to NVFP4 — aren’t magic. They’re thousands of engineers solving real bottlenecks. You want faster MoE? Thank the people, not the meme.

Startup CTO

I get the ethics, but my investors want ROI. 10x faster inference on the same power? That’s 10x more tokens for the same bill. I’ll take it.
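
The arithmetic behind that claim, as a quick sketch. The hourly cost and throughput below are placeholders rather than real prices or benchmarks, `cost_per_million_tokens` is a hypothetical helper, and "same bill" is the assumption doing all the work.

```python
def cost_per_million_tokens(hourly_cost_usd, tokens_per_second):
    """Serving cost per one million generated tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# Placeholder numbers, chosen only to show the ratio.
baseline = cost_per_million_tokens(hourly_cost_usd=100.0, tokens_per_second=10_000)
faster = cost_per_million_tokens(hourly_cost_usd=100.0, tokens_per_second=100_000)
saving = baseline / faster  # 10x throughput at the same hourly cost

# If the faster system costs twice as much per hour, the gain shrinks.
pricier = cost_per_million_tokens(hourly_cost_usd=200.0, tokens_per_second=100_000)
saving_if_pricier = baseline / pricier
```

At an identical hourly cost, 10x the throughput cuts the cost per token to a tenth. If the faster hardware rents for double, the saving drops to 5x, so the ROI depends on the actual price per hour and not only on the speedup.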

Climate-Conscious Dev

10x performance per watt isn’t just good for profit — it’s vital for the planet. Every watt saved in AI is a coal plant not built. NVIDIA’s efficiency might be their greatest legacy.
