Hacker News

GMKtec, maker of the EVO-X2 mini-PC that uses a Ryzen AI Max+ 395, published a blog post comparing the DGX Spark with their EVO-X2.

from https://www.gmktec.com/blog/evo-x2-vs-nvidia-dgx-spark-redef... (text taken from https://wccftech.com/forget-nvidia-dgx-spark-amd-strix-halo-... since the GMKtec table was an image, but wccftech converted it to an HTML table. EDIT: reformatted to make the table look nicer in a monospace font without tabs)

  Test Model     Metric                         EVO-X2   NVIDIA GB10   Winner
  Llama 3.3 70B  Generation Speed (tok/sec)       4.90          4.67   AMD
                 First Token Response Time (s)    0.86          0.53   NVIDIA
  Qwen3 Coder    Generation Speed (tok/sec)      35.13         38.03   NVIDIA
                 First Token Response Time (s)    0.13          0.42   AMD
  GPT-OSS 20B    Generation Speed (tok/sec)      64.69         60.33   AMD
                 First Token Response Time (s)    0.19          0.44   AMD
  Qwen3 0.6B     Generation Speed (tok/sec)     163.78        174.29   NVIDIA
                 First Token Response Time (s)    0.02          0.03   AMD
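The two metrics in the table trade off against each other: total response time is roughly first-token latency plus output length divided by generation speed, so which machine "wins" depends on how long the response is. A quick sketch using only the Llama 3.3 70B row from the table (purely illustrative arithmetic, not a benchmark):

```python
# Approximate end-to-end latency for a response of n tokens:
#   total ≈ TTFT + n / generation_speed
# Numbers are the Llama 3.3 70B row from the table above.
def total_latency(ttft_s, tok_per_sec, n_tokens):
    return ttft_s + n_tokens / tok_per_sec

evo_x2 = {"ttft": 0.86, "tps": 4.90}  # AMD EVO-X2
gb10   = {"ttft": 0.53, "tps": 4.67}  # NVIDIA GB10

for n in (10, 100, 1000):
    t_amd = total_latency(evo_x2["ttft"], evo_x2["tps"], n)
    t_nv  = total_latency(gb10["ttft"], gb10["tps"], n)
    print(f"{n:5d} tokens: EVO-X2 {t_amd:7.2f}s  GB10 {t_nv:7.2f}s")
```

On these particular numbers the GB10's lower first-token latency only wins for very short responses (roughly 30 tokens or fewer); past that, the EVO-X2's higher generation speed dominates.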


Additionally, Framework apparently benchmarked GPT-OSS 120B (!) on the maxed-out Ryzen AI Max+ 395 Desktop and reached a generation speed of 38.0 tok/sec. Given that Nvidia can't even keep up on a 20B model, I assume it can't keep up on the 120B model either.

https://frame.work/nl/en/desktop?tab=machine-learning

So to me, the only thing that seems interesting about the Spark at the moment is the ability to daisy-chain several units together, so you can create an InfiniBand-ish network of Sparks at InfiniBand speeds.

But overall, for plain development and experimentation, and since I don't work at Big AI, I'm pretty sure I would not purchase Nvidia at the moment.


Unfortunately, comparing tok/sec in a vacuum right now, and especially across weeks of time, is kind of pointless. Everything is still evolving; there were patches within days that bumped GB10 performance by double-digit percentages in some frameworks. You just kind of have to accept that things are a moving target.

For comparison, as of right now I can run GPT-OSS 120B at 59 tok/sec, using llama.cpp (revision 395e286bc) and Unsloth dynamic 4-bit quantized models.[1] GPT-OSS 20B runs at 88 tok/sec.[2] The MXFP4 variant comes in about the same, at ~89 tok/sec.[3] It's probably faster on other frameworks; llama.cpp is known not to be the fastest, and I don't know what LM Studio backend they used. All of these numbers put the GB10 well ahead of Strix Halo, if only going by the numbers we see here.
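To put "well ahead" in rough numbers: comparing the GB10 figures here against the Strix Halo figures quoted upthread (the Framework 120B result and the GMKtec 20B result) gives a sizeable gap. A sketch of that arithmetic, using only the numbers quoted in this thread:

```python
# Relative difference between tok/sec figures quoted in this thread:
# GB10 llama.cpp runs vs. the Strix Halo figures upthread.
# Purely illustrative arithmetic on quoted numbers, not a benchmark.
def pct_faster(a, b):
    """How much faster a is than b, in percent."""
    return (a / b - 1) * 100

print(f"GPT-OSS 120B: GB10 {pct_faster(59, 38.0):.0f}% faster")   # 59 vs Framework's 38.0
print(f"GPT-OSS 20B:  GB10 {pct_faster(88, 64.69):.0f}% faster")  # 88 vs GMKtec's 64.69
```

Of course, per the point above, these are numbers from different points in time and different software stacks, so treat the gap as indicative at best.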

If the AMD software wasn't optimized by a comparable amount in the same timeframe, then the GB10 would be faster now. Maybe it was optimized just as much; I don't have a Strix Halo part to compare. But my point is: don't just compare numbers from two different points in time; it's going to be very misleading.

[1]: https://huggingface.co/unsloth/gpt-oss-120b-GGUF/tree/main/U...
[2]: https://huggingface.co/unsloth/gpt-oss-20b-GGUF/resolve/main...
[3]: https://huggingface.co/unsloth/gpt-oss-20b-GGUF/resolve/main...


These are valid points, but the numbers are still useful as a floor on performance.

Given that Strix Halo is so much cheaper, I'd expect more people to work on improving it, but the NVIDIA tools are better, so it's unclear which has more headroom.


Yeah, that's fair. Knowing it does 60 tok/sec on gpt-oss-120b is certainly useful when deciding whether to even think about it at all. I'm quite happy with it anyway.

The pricing is definitely by far the worst part of all this. I suspect the GB10 still has more perf left on the table; Blackwell has been a rough launch. But I'm not sure it's $2,000 better if you're just looking for a fun little AI machine to run embeddings/vision/LLMs on.


This is nonsense. NVIDIA will slightly win on all generation speeds and be _much_ faster on first-token response time.



