Building a PC for Large Language Models: Prioritizing VRAM Capacity and Choosing the Right CPU and GPU
Building a PC for running large language models (LLMs) means balancing several components, but VRAM capacity is the single most important factor; a capable CPU, PSU, and RAM matter, but they come second. For GPU inference, a card with at least 24GB of VRAM, such as an Nvidia RTX 3090 or 4090, or a pair of Tesla P40s, is the practical target, paired with a modern CPU such as an AMD Ryzen 7 or 9. For CPU inference, a processor with AVX512 support and DDR5 RAM is what counts, and higher clock speeds help more than additional cores. At the high end, dual 3090s linked with NVLink alongside 128GB of system RAM is a strong configuration. VRAM requirements may shift as models evolve, and new GPU generations could add AI-specific features that change the calculus; rather than overspending to future-proof a build, it can make sense to wait for the next hardware generation.
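To make the 24GB figure concrete, here is a minimal back-of-the-envelope sketch in Python. The 20% overhead factor for the KV cache and runtime buffers is an assumption, not a measured figure, and real usage varies with context length and inference framework.

```python
# Rough VRAM estimate: weight storage plus an assumed ~20% overhead
# for the KV cache, activations, and framework buffers.
def estimate_vram_gb(params_billions: float, bits_per_weight: int,
                     overhead: float = 1.2) -> float:
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9  # decimal gigabytes

for size in (7, 13, 33, 70):
    fp16 = estimate_vram_gb(size, 16)
    q4 = estimate_vram_gb(size, 4)
    print(f"{size:>3}B params: ~{fp16:6.1f} GB at fp16, ~{q4:5.1f} GB at 4-bit")
```

On these assumptions, a 4-bit 33B model (about 20GB) just fits on a single 24GB card, while a 4-bit 70B model needs roughly two, which is where dual 3090s or dual P40s come in.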
Regarding future hardware, Nvidia and other manufacturers may offer high-VRAM GPUs with modest compute performance, designed to work with the CPU and system RAM while still running large models at adequate speed. For now, though, hardware limitations make it unrealistic to build a PC that will comfortably handle LLMs for the next five years; the practical approach is to build for current needs and upgrade the GPU when models with more VRAM become available. Efficient LLMs may eventually run in under 4GB of VRAM, but that remains speculative. Lastly, running LLMs in the cloud is an affordable option for those who prefer not to build a PC, and a MacBook Pro with an M2 Max and 96GB of unified memory is another alternative to a desktop build.
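The reason system RAM speed (and DDR5 in particular) matters so much is that single-stream token generation is largely memory-bandwidth bound: essentially all of the model's weights must be streamed from memory for every generated token. A minimal sketch of that upper bound, assuming a 4-bit 13B model of roughly 7.8GB and typical published peak bandwidth figures; real throughput falls well short of these ceilings.

```python
# Bandwidth-bound ceiling on single-stream generation speed: each token
# requires streaming all model weights through memory once.
def tokens_per_second(model_gb: float, bandwidth_gbps: float) -> float:
    return bandwidth_gbps / model_gb

configs = {
    "DDR4-3200 dual channel (~51 GB/s)": 51,
    "DDR5-6000 dual channel (~96 GB/s)": 96,
    "RTX 3090 GDDR6X (~936 GB/s)": 936,
}
model_gb = 7.8  # e.g. a 13B model quantized to 4 bits
for name, bandwidth in configs.items():
    print(f"{name}: ~{tokens_per_second(model_gb, bandwidth):.1f} tokens/s")
```

The gap between the CPU and GPU rows is the core argument for VRAM-first builds: faster system RAM narrows the gap but does not close it, which is also why a high-VRAM GPU paired with system RAM is only "adequate" rather than fast.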