For local inference, it is important to choose a graphics card (GPU) that offers a good balance between performance, memory, and price. Larger and more complex models require more memory, and a higher number of CUDA cores and Tensor Cores generally means better parallel processing performance. If budget is not a concern, the following NVIDIA cards are recommended for deep learning and local inference:

1. NVIDIA GeForce RTX 3080

Memory: 10 GB GDDR6X
CUDA Cores: 8704
Performance: Excellent performance for deep learning and inference.
Price: More affordable than professional GPUs. Available used starting at €450 (27/07/2024)

2. NVIDIA GeForce RTX 3090

Memory: 24 GB GDDR6X
CUDA Cores: 10496
Performance: Excellent performance for large deep learning models.
Price: More expensive, but offers a large amount of memory for very large models. Available used starting at €1450 (27/07/2024)

3. NVIDIA GeForce RTX 4070 Ti

Memory: 12 GB GDDR6X
CUDA Cores: 7680
Performance: Good balance between price and performance.
Price: Competitive compared to other 40-series GPUs. Available used starting at €650 (27/07/2024)

4. NVIDIA GeForce RTX 4090

Memory: 24 GB GDDR6X
CUDA Cores: 16384
Performance: The best consumer GPU for deep learning, excellent for any type of model that fits in its 24 GB of memory.
Price: Very expensive, suitable for those who need the best possible performance. Available used starting at €1500 (27/07/2024)

5. NVIDIA A100

Memory: 40 GB HBM2
CUDA Cores: 6912
Performance: Designed specifically for deep learning and artificial intelligence, it offers exceptional performance.
Price: Very expensive and generally used in data centers or for professional applications. Available used starting at €5800 (27/07/2024)
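
The link between model size and memory can be made concrete with a rough estimate. The sketch below is an approximation under stated assumptions, not a precise sizing tool: it assumes the weights dominate memory use, and the 20% overhead factor and the function names are illustrative choices rather than measured values. The memory sizes are those of the cards listed above.

```python
# Rough memory estimate for holding a model's weights during inference.
# Assumptions (illustrative, not measured): weights dominate memory use,
# and a flat 20% overhead covers activations and caches.

# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

# Memory sizes in GB of the cards listed above.
CARDS_GB = {
    "RTX 3080": 10,
    "RTX 3090": 24,
    "RTX 4070 Ti": 12,
    "RTX 4090": 24,
    "A100": 40,
}


def estimate_vram_gb(params_billions, precision="fp16", overhead=1.2):
    """Approximate memory need in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision] * overhead


def cards_that_fit(params_billions, precision="fp16"):
    """Names of the listed cards whose memory covers the estimate."""
    need = estimate_vram_gb(params_billions, precision)
    return [name for name, gb in CARDS_GB.items() if gb >= need]


if __name__ == "__main__":
    # Example: a 7-billion-parameter model at three precisions.
    for precision in ("fp16", "int8", "int4"):
        need = estimate_vram_gb(7, precision)
        fits = ", ".join(cards_that_fit(7, precision))
        print(f"7B model at {precision}: ~{need:.1f} GB, fits on: {fits}")
```

Under these assumptions, a 7-billion-parameter model stored in 16-bit precision needs roughly 17 GB and therefore only fits on the 24 GB and 40 GB cards, while lower-precision versions of the same model fit on all five. Real memory use also depends on context length and the inference software, so treat the result as a starting point.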

The RTX 3080 and the RTX 4070 Ti offer a good price-performance ratio. If your budget is generous and you need the best possible performance, the RTX 4090 or the NVIDIA A100 are the best options. If, on the other hand, your budget is limited, the following are the cheapest options for accelerating your computer and getting good performance on local inference:

1. NVIDIA GeForce RTX 3060

Memory: 12 GB GDDR6
CUDA Cores: 3584
Performance: Good performance for inference of moderately sized models.
Price: Relatively cheap, one of the best choices for quality/price ratio. Available used starting at €220 (27/07/2024)

2. NVIDIA GeForce RTX 2060

Memory: 6 GB GDDR6
CUDA Cores: 1920
Performance: Decent for inference and less complex models.
Price: Very cheap compared to the newer series. Available used starting at €150 (27/07/2024)

3. NVIDIA GeForce GTX 1660 Super

Memory: 6 GB GDDR6
CUDA Cores: 1408
Performance: Sufficient for basic inference and small models.
Price: One of the cheapest options with acceptable performance. Available used starting at €100 (27/07/2024)

4. NVIDIA GeForce GTX 1650 Super

Memory: 4 GB GDDR6
CUDA Cores: 1280
Performance: Good for light inference tasks and small models.
Price: Very cheap and affordable. Available used starting at €100 (27/07/2024)

For a limited budget with good performance, the RTX 3060 is probably the best choice. If you want to save even more, the GTX 1660 Super and GTX 1650 Super offer a good balance between cost and performance: they have less memory, but it is still sufficient for many inference models. More cores generally mean better performance, but cards with fewer cores can still be efficient for light inference.
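
One simple way to compare the cards on cost is the used price per gigabyte of memory. The sketch below uses the used prices and memory sizes quoted in the lists above (as of 27/07/2024). The function name is a hypothetical helper, and the metric deliberately ignores compute speed, so it is only one lens on value.

```python
# Used price (EUR, as of 27/07/2024) and memory (GB) from the lists above.
CARDS = {
    "RTX 3080": (450, 10),
    "RTX 3090": (1450, 24),
    "RTX 4070 Ti": (650, 12),
    "RTX 4090": (1500, 24),
    "A100": (5800, 40),
    "RTX 3060": (220, 12),
    "RTX 2060": (150, 6),
    "GTX 1660 Super": (100, 6),
    "GTX 1650 Super": (100, 4),
}


def euros_per_gb(cards):
    """Return (name, EUR per GB) pairs sorted from cheapest to most expensive."""
    rows = [(name, price / mem_gb) for name, (price, mem_gb) in cards.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, cost in euros_per_gb(CARDS):
        print(f"{name:<15} {cost:7.2f} EUR/GB")
```

With these prices, the GTX 1660 Super and the RTX 3060 come out cheapest per gigabyte, and the A100 is the most expensive. Used prices change quickly, so the table is worth recomputing with current listings.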