Introduction


Clients sometimes ask whether they can use GeForce GPUs instead of Quadro boards, even in external multi-GPU enclosures, as some other post-production manufacturers do to reduce costs. This short technical white paper explains the differences:


Note: The following points apply only when comparing GeForce and Quadro boards from the same generation.


Question: Does GeForce work with Mistika Ultima systems, or do I need a Quadro model?


Answer: Yes, it works, but we recommend NVIDIA Quadro for a number of important reasons:


- Mistika Ultima is faster with Quadro: In general, when a post-production application gets the same performance with Quadro and GeForce, it means that the application is poorly optimised, or only partially GPU-based, and cannot take full advantage of a Quadro board. That is not the case with Mistika:


A key factor in Mistika realtime performance is not GPU processing but bus management. For example, when you need to move several layers of 4K images in realtime at high frame rates, it is usually the transfer of input and output images between RAM and the GPU that sets the ultimate performance limit, not the GPU processing itself.
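As a rough illustration of why the bus can become the limit, the following sketch estimates the transfer load for realtime playback. The frame format (4096x2160, 4 channels, 16 bits per channel), the layer count, the frame rate and the PCIe 3.0 x16 theoretical rate are all assumptions chosen for the example, not measured Mistika figures.

```python
# Rough bus-load estimate. All numbers below are illustrative assumptions,
# not measured Mistika figures.

def frame_bytes(width, height, channels, bytes_per_channel):
    """Size of one uncompressed frame in bytes."""
    return width * height * channels * bytes_per_channel


def bus_load_gb_per_s(layers, outputs, fps, bytes_per_frame):
    """Return (upload, download) traffic in GB/s for realtime playback."""
    upload = layers * bytes_per_frame * fps / 1e9
    download = outputs * bytes_per_frame * fps / 1e9
    return upload, download


# Assumed frame format: 4096x2160, 4 channels, 16 bits (2 bytes) per channel.
FRAME = frame_bytes(4096, 2160, 4, 2)

# Assumed workload: 3 input layers, 1 output stream, 60 frames per second.
UPLOAD, DOWNLOAD = bus_load_gb_per_s(layers=3, outputs=1, fps=60,
                                     bytes_per_frame=FRAME)

# Theoretical PCIe 3.0 x16 rate per direction; real throughput is lower.
PCIE3_X16_GB_PER_S = 15.75

if __name__ == "__main__":
    print(f"frame size: {FRAME / 1e6:.1f} MB")
    print(f"upload:     {UPLOAD:.2f} GB/s")
    print(f"download:   {DOWNLOAD:.2f} GB/s")
    total = UPLOAD + DOWNLOAD
    print(f"total load: {total / PCIE3_X16_GB_PER_S:.0%} of one bus direction")
```

With these assumed values the uploads come to about 12.7 GB/s and the downloads to about 4.2 GB/s. Each direction fits within the assumed bus rate on its own, but their sum does not, so the same bus that copes when both directions run concurrently would fall short if they had to take turns.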


In a gaming application you only need to upload images to the GPU and then display them directly on a computer monitor. In a post-production system like Mistika Ultima, the results also need to be downloaded to the SDI video board for playback, and to the storage when rendering. In addition, some complex effects, plugins and IO codecs, as well as the decoding of complex formats like R3D, need several round trips of the images between RAM and GPU. These transfers can take place in parallel on a Quadro board, but not on a GeForce model.


That is the main performance advantage of a Quadro board over a GeForce model: the Quadro can move images in both directions at the same time and at higher speeds. For example, Mistika can upload the layers for the next frames while processing the current frame and downloading the finished images, all at the same time. This is not possible on a GeForce.
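The effect of overlapping upload, processing and download can be sketched with a simple timing model. This is only an idealised three-stage pipeline, assuming each stage has its own resource and handles one frame at a time. It is not a description of how Mistika or the drivers schedule work, and the stage times used in the example are hypothetical.

```python
def serial_time(frames, upload, process, download):
    """Total time when every frame is uploaded, processed and downloaded
    one step after another, with no overlap between frames."""
    return frames * (upload + process + download)


def overlapped_time(frames, upload, process, download):
    """Total time for an ideal three-stage pipeline in which upload,
    processing and download of different frames run concurrently.

    The first frame takes the full upload + process + download time.
    After that, one frame completes every max(upload, process, download).
    """
    if frames <= 0:
        return 0
    fill = upload + process + download
    slowest_stage = max(upload, process, download)
    return fill + (frames - 1) * slowest_stage


if __name__ == "__main__":
    # Hypothetical per-frame stage times in milliseconds.
    up, proc, down = 4, 5, 3
    n = 10
    print("no overlap:  ", serial_time(n, up, proc, down), "ms")
    print("with overlap:", overlapped_time(n, up, proc, down), "ms")
```

In this model the overlapped case is limited only by the slowest stage, while the non-overlapped case pays for all three stages on every frame.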


- Quadro has more memory:


The extra GPU memory in Quadro models is also important: to overlap transfers in this way, Mistika has to move extra images and keep multiple buffers on the GPU, where the realtime effects are running.


Please note that, unlike other applications where only a small part runs on the GPU, the whole Mistika rendering engine is GPU-based, so the extra memory provided by Quadro boards is a stability advantage. In complex projects, the amount of graphics memory is what ultimately makes the difference between a stable Mistika session and repeated crashes.


- Quadro is more stable and robust: The difference is the same as that between a proper workstation and a PC, where the workstation is much more robust and can last much longer. GeForce boards are designed for gaming, where stability is not critical. Some of them run pretty hot and could double as nice barbecues, but they are not really designed for client-attended sessions.


Finally, SGO maintains a close collaboration with NVIDIA, testing on both sides to make sure the Quadro drivers are free of bugs affecting the Mistika software.


However, there are still situations where you may want to use a GeForce model:


- When there are severe budget restrictions


- When upgrading very old systems, where a modern Quadro cannot deliver its full performance anyway. In these situations a GeForce can be a low-cost way to give the system a second life and extra performance.


- For generic render nodes not used for playback, except for very high-end formats (ask support if in doubt)



Question: What about external enclosures with several boards for multi-GPU support?


Answer:   It may work, but in the case of Mistika it is not efficient.  


The reason is the same as in the first point above. Applications that are only partially GPU-optimised can benefit from this approach up to a point, simply through brute force. But with software that is 100% GPU-optimised, like Mistika, the speed-limiting factor is usually not the processing but the bus bandwidth. Typical external enclosures use only one PCIe slot, which puts a hard limit on complex post-production tasks.


Instead, what we recommend for multi-GPU is to install each GPU in its own dedicated slot, always starting with the fastest model available rather than using several smaller units.


With this setup, Mistika Totem technology can send different frames to each GPU without any bandwidth interference, which is much more efficient than using SLI or external enclosures that need to share a common bus. For example, a Mistika system with two identical boards in two different slots can render at about 200% of the single-board speed in most situations, provided that there is enough disk speed to feed both of them.
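The difference between dedicated slots and a shared bus can be sketched as follows. The round-robin assignment and the throughput model are hypothetical simplifications written for this illustration, not the actual Totem scheduler, and they ignore disk speed. The frame rates in the example are assumed values.

```python
def assign_frames(frame_count, gpu_count):
    """Distribute frame numbers across GPUs in round-robin order.
    Hypothetical scheme, used only to illustrate per-frame distribution."""
    assignment = {gpu: [] for gpu in range(gpu_count)}
    for frame in range(frame_count):
        assignment[frame % gpu_count].append(frame)
    return assignment


def render_fps(gpu_count, gpu_fps, bus_fps, shared_bus):
    """Idealised aggregate frame rate.

    gpu_fps: frames per second one GPU can process.
    bus_fps: frames per second one PCIe slot can transfer.
    shared_bus: True when all GPUs sit behind a single slot
                (for example in an external enclosure).
    """
    if shared_bus:
        # All GPUs compete for the bandwidth of a single slot.
        return min(gpu_count * gpu_fps, bus_fps)
    # Each GPU has its own slot, so the bus limit applies per GPU.
    return gpu_count * min(gpu_fps, bus_fps)


if __name__ == "__main__":
    print(assign_frames(6, 2))
    # Assumed rates: each GPU processes 10 fps, each slot carries 12 fps.
    print("one GPU:        ", render_fps(1, 10, 12, shared_bus=False))
    print("two GPUs, own:  ", render_fps(2, 10, 12, shared_bus=False))
    print("two GPUs, shared", render_fps(2, 10, 12, shared_bus=True))
```

Under these assumptions the second GPU doubles the frame rate only when it has its own slot. Behind a shared slot the aggregate rate is capped by that slot's bandwidth.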


Question: What particular GPU model is recommended for Mistika Ultima systems?


Answer: As of the date of this document (April 2018), we especially recommend two models:


- NVIDIA Quadro P4000: For workflows up to 4K and UHD with effect stacks that are not too complex. The P4000 is very cost-effective and is also excellent as a second GPU for background rendering.


- NVIDIA Quadro P6000: For high-end performance, such as 4K Stereo3D and 8K, and for complex productions in general.



Question: What particular GPU model is recommended for Mistika render nodes?


Answer: Similar considerations apply, but in this case there is no SDI video board involved, which reduces the need for full-duplex bandwidth. Render speeds are also typically slower than realtime (that is why you render...), which makes full-duplex bandwidth less important still. A high-end GeForce board can therefore be used to reduce costs while still providing similar render performance in many cases.


But always check that the GPU has enough memory for the type of work you plan (at least 10GB of GPU memory is recommended for serious 4K rendering), as GeForce boards typically have half the memory of the recommended Quadro models, or less.


As of the date of this document, the only GeForce that can be recommended for serious rendering is the GeForce GTX 1080 Ti, which has 11GB of memory. Even so, it will be slower than a P6000 when decoding some formats that benefit from full-duplex transfers, like R3D. The P6000 can also render more complex jobs thanks to its 24GB of memory, so it can render jobs that would simply crash on the GeForce.
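The memory check described above can be written down as a small helper. This is a minimal sketch: the 10GB minimum and the two memory sizes come from this document, while the function name and the per-job memory estimate passed to it are hypothetical and would have to come from your own experience with the project.

```python
# Minimum recommended in this document for serious 4K rendering.
RECOMMENDED_MIN_GB = 10

# Memory sizes quoted in this document.
GPU_MEMORY_GB = {
    "GeForce GTX 1080 Ti": 11,
    "Quadro P6000": 24,
}


def cards_for_job(job_memory_gb, cards=GPU_MEMORY_GB,
                  minimum_gb=RECOMMENDED_MIN_GB):
    """Return the names of the cards whose memory covers both the
    recommended minimum and the (user-supplied) estimate for the job."""
    needed = max(job_memory_gb, minimum_gb)
    return sorted(name for name, memory in cards.items() if memory >= needed)


if __name__ == "__main__":
    # Hypothetical job estimates in GB.
    for estimate in (8, 16, 32):
        print(estimate, "GB ->", cards_for_job(estimate))
```

A job that fits in 11GB can go to either board, while a job estimated above that only fits on the P6000.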