Paul Steinberg wrote:
> Did something change in Resolve? Windows? If only 1x GPU is active during renders, would a single 4090 now provide more performance than the dual 3090's?

What we loosely call "rendering" is composed of two discrete aspects: processing the edit directives/Fx, then encoding that result to an output file. When troubleshooting "render" performance issues, it is important to distinguish which of those is the bottleneck. If you have (for example) background/user caching enabled, then during the automatic render to cache Resolve should use both GPUs, since that is a parallelizable task. However, when exporting, I think it will read any cached frames that exist to avoid recomputing them. If the cached frames are available, the primary remaining task is encoding.

Encoding uses a fundamentally different part of the GPU: the NVENC fixed-function logic. Technically it is a separate subsystem; it is not accessed via the normal GPU APIs and does not use the traditional GPU texture/shader hardware. If DNxHD is All-Intra, it might not benefit from hardware-accelerated encoding - I don't know. H.264 typically would benefit, IF (and only if) the exact export criteria match the parameters the fixed-function logic can handle. Note that this is whether the NVENC accelerator can be used at all, not whether two of them on different cards can be used simultaneously.

Splitting a single-stream H.264 encode task between two different accelerators is difficult and in some cases impossible. Long-GOP encoding can use either "independent" GOPs or "dependent" GOPs, also called closed GOPs and open GOPs. If independent, each GOP is a stand-alone entity and does not need to reference other GOPs for encoding. In that case, if multiple parallel encoders are available, the programmer could in theory segment the input file, dispatch the segments to encoders running on separate threads, write the output to an interim buffer, then splice the encoded H.264 segments together before writing the output file. If the format uses dependent GOPs, it's unclear how parallel encoding could be used at all.

So in either case, I would not expect both GPUs to be used when encoding a single output stream. If the output format is All-Intra, in theory it would be easier to segment the file, assign the segments to different encoders, then reassemble them before final output. However, All-Intra is normally less difficult to encode, so (even if it worked) the benefit would be smaller.

Re your observation that on your previous hardware both GPUs were active during rendering: unless you did very specific testing, it's impossible to say whether that activity was rendering or encoding. The Resolve caching system is quite complex, and it's possible what you observed was render processing, not encode processing.

I understand why you would like the best export performance. However, even an M1 Ultra Mac Studio running FCP does not normally use multiple encode engines in parallel for export. It has four ProRes encode engines and four regular encode engines, yet as far as I can tell it generally uses only one for export. For reading, it also seems to use only one video engine for a single stream; for multicam reading, it may use more than one.

Apple Compressor (if configured for multiple instances) may be capable of using all four M1 Ultra ProRes engines for a single-stream ProRes export. In that case, it segments the file, sends each segment to a separate encoding process, and combines the results for final output. By monitoring certain things, you can tell it is encoding the file segments in parallel, but there is a significant combining phase that eats up a lot of the savings, so it is not a huge benefit in the current software versions.

While traditional GPU functions can benefit from multiple GPU cards, video encode/decode is a separate issue, and harnessing parallel hardware encoders is technically very difficult. The Resolve developers have done a great job on performance - on both Intel and Mac.
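To make the closed/open GOP distinction concrete, here is a minimal Python sketch. The frame model and the `refs_prev_gop` flag are simplified illustrative assumptions, not how any real encoder or Resolve represents frames; the point is only that a closed GOP has no references reaching outside itself, which is what makes it safe to hand to a separate encoder.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    ftype: str                   # "I", "P", or "B" (simplified)
    refs_prev_gop: bool = False  # True if this frame predicts from the prior GOP

@dataclass
class GOP:
    frames: list

def is_closed(gop: GOP) -> bool:
    """A closed (independent) GOP decodes with no references outside itself."""
    return not any(f.refs_prev_gop for f in gop.frames)

# Closed GOP: an I-frame followed by P-frames that reference only within the GOP.
closed = GOP([Frame("I"), Frame("P"), Frame("P")])

# Open GOP: leading B-frames predict from the previous GOP's frames.
open_gop = GOP([Frame("B", refs_prev_gop=True), Frame("I"), Frame("P")])

print(is_closed(closed))    # True  -> safe to cut a segment boundary here
print(is_closed(open_gop))  # False -> cutting here would break references
```

Cutting segment boundaries only where `is_closed` holds is exactly why independent GOPs make the parallel scheme feasible, while dependent GOPs do not.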
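The segment/dispatch/splice scheme described above for independent GOPs can be sketched as follows. This is purely illustrative: `encode_segment` is a stand-in for submitting a segment to a real hardware encoder session (NVENC, a ProRes engine, etc.), and the names are hypothetical, not any actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def split_on_gop_boundaries(frames, gop_size):
    """Cut the input into closed-GOP segments that can be encoded independently."""
    return [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]

def encode_segment(segment):
    """Stand-in for one hardware encoder instance processing one segment."""
    return b"".join(f"<{f}>".encode() for f in segment)

def parallel_encode(frames, gop_size=3, workers=2):
    segments = split_on_gop_boundaries(frames, gop_size)
    # Dispatch each segment to its own encoder thread. map() preserves
    # segment order, so the splice below keeps the frames in sequence.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        encoded = list(pool.map(encode_segment, segments))
    # Splice the encoded segments into a single output bitstream.
    return b"".join(encoded)

frames = [f"f{i}" for i in range(7)]
print(parallel_encode(frames))  # b'<f0><f1><f2><f3><f4><f5><f6>'
```

The "significant combining phase" mentioned for Compressor corresponds to the final splice step here: the segments encode in parallel, but the reassembly is serial, which is why the overall speedup is smaller than the number of encoders would suggest.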