Conventional wisdom says you need a mountain of Nvidia GPUs to have any chance of running the latest AI models. Not so. Exo Labs (via Indian Defense Review) claims to have gotten a Llama 2 LLM running on a Windows 98 box from circa 1997, powered by nothing more than a Pentium II processor. Hooray! The catch? It runs roughly 20,000 times slower than on a modern GPU. Ouch.
Apparently, Exo Labs picked up the machine for less than $120 on eBay, after which perhaps the biggest headache was getting peripherals to work, given the legacy PS/2 ports and just one USB input.
In fact, even getting the required files onto the machine was a serious headache. The files then had to be compiled into a format compatible with the Pentium II's ancient instruction set.
With the code and hardware sorted, it was time to run Llama 2. Allegedly, the 260K parameter version of the model achieved 39.31 tokens per second on the Pentium II, while the 15 million parameter version managed only 1.03 tokens per second.
They even tried running a partial data model using the one billion parameter version of Llama 3.2, which returned a glacial 0.0093 tokens per second. To put that in context, Arm says the one billion parameter Llama 3.2 model hits 40 tokens per second on an Arm CPU and 200 tokens per second on a GPU.
In other words, it is running about 20,000 times slower on the Pentium II. But hey, it's running. The comparison is not perfect, as there are all kinds of variables in how the models are set up. But the 20,000 times figure probably gives the right idea of the delta in order-of-magnitude terms.
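The roughly 20,000 times figure can be checked with simple arithmetic. Here is a minimal sketch in Python, assuming the reference figures are 40 tokens per second on an Arm CPU and 200 tokens per second on a GPU, set against the 0.0093 tokens per second reported for the Pentium II. The constant and function names are illustrative and do not come from Exo Labs.

```python
# Throughput figures as quoted in the article, in tokens per second.
# Assumption: the Arm CPU reference is 40 tokens per second.
PENTIUM_II_TPS = 0.0093  # Llama 3.2 1B on the Pentium II
ARM_CPU_TPS = 40.0       # Arm's reference figure for an Arm CPU
GPU_TPS = 200.0          # Arm's reference figure for a GPU


def slowdown(reference_tps: float, measured_tps: float) -> float:
    """Return how many times slower measured_tps is than reference_tps."""
    return reference_tps / measured_tps


def seconds_per_token(tps: float) -> float:
    """Return the average time in seconds needed to produce one token."""
    return 1.0 / tps


print(f"Slowdown vs GPU:     {slowdown(GPU_TPS, PENTIUM_II_TPS):,.0f}x")
print(f"Slowdown vs Arm CPU: {slowdown(ARM_CPU_TPS, PENTIUM_II_TPS):,.0f}x")
print(f"Seconds per token on the Pentium II: {seconds_per_token(PENTIUM_II_TPS):.1f}")
```

On these numbers the gap works out to about 21,500 times against the GPU figure and about 4,300 times against the Arm CPU figure, so the "20,000 times" headline corresponds to the GPU comparison. At 0.0093 tokens per second, each token takes a little under two minutes to appear.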
Indeed, while getting a modern LLM running on such an old CPU is impressive, the performance gap is a reminder of how quickly things have moved on. In fact, it is a bit like 3D gaming.
Suitably compiled, you could no doubt get Cyberpunk running in full path-tracing mode at 4K on a Pentium II. But you would be looking at frame rates akin to the PII's 0.0093 tokens per second. At which point it is all a bit academic.
Still, it might be fun to watch those pixels being rendered. Then again, it might take years to complete a benchmark run. Maybe we will leave it at that for now.