After several tests and experiments I have made a significant improvement in the OpenCL code.
One part of our HEVC encoder is the Motion Estimation module.
This module is used as a generator of quarter-pel motion vectors for Motion Compensation testing.
The numbers below are the actual performance of both modules.
Motion Estimation (executed on the CPU, one core of an i5-3570K, 3.4 GHz)
// copy and pad input image, some initializations, Scene Change Detection:
vsshevc::VssPreAnalyzerImpl::PreparePicture [ 10]: 68.146 ms, (6.815 ms each)
// perform Motion Estimation for one reference frame (fastest mode):
vsshevc::VssPreAnalyzerImpl::DoMotionEstimation [ 9]: 80.901 ms, (8.989 ms each)
Motion Compensation (OpenCL, AMD Radeon HD 7750)
HevcCL::FillMotionData [ 9]: 24.594 ms, (2.733 ms each) - copy motion vectors
HevcCL::RunFilter [ 9]: 64.640 ms, (6.960 ms each) - filter itself with prolog/epilog
Totals:
- Motion estimation: ~60 FPS for 4k* video.
- Motion compensation: ~145 FPS for 4k* video.
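The totals follow from the per-frame timings in the profiler output. The sketch below redoes that arithmetic under two assumptions of mine that the listing does not state outright: the Motion Estimation figure adds PreparePicture and DoMotionEstimation, and the Motion Compensation figure counts RunFilter only. It gives roughly 63 and 144 FPS, in line with the rounded totals.

```python
# Per-frame timings in milliseconds, copied from the profiler output.
PREPARE_PICTURE_MS = 6.815
DO_MOTION_ESTIMATION_MS = 8.989
RUN_FILTER_MS = 6.960


def fps(ms_per_frame: float) -> float:
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


# Assumption: the ME total covers picture preparation plus the search itself.
me_fps = fps(PREPARE_PICTURE_MS + DO_MOTION_ESTIMATION_MS)

# Assumption: the MC total covers the filter only, without the
# motion vector copy done by FillMotionData.
mc_fps = fps(RUN_FILTER_MS)

print(f"Motion estimation:   {me_fps:.1f} FPS")
print(f"Motion compensation: {mc_fps:.1f} FPS")
```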
Notes:
- Motion compensation is done on 8x8 luma blocks. Each block requires about 16x8x8 + 8x8x8 = 1536 multiplications, i.e. 24 multiplications per pixel. For real-world video this is close to the worst case.
- Input frame copying, output frame copying and motion vector filling are needed only for the test. In a real encoder or decoder these buffers are always prepared in advance.
*4k = 3840x2160, YUV420.
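The multiplication count from the notes can be checked with a few lines of arithmetic. This is only a sketch of my reading of the formula: I assume the 16x8x8 term is a horizontal 8-tap pass over the 8 block rows plus the extra rows needed by the vertical filter (rounded up to 16), and the 8x8x8 term is the vertical 8-tap pass. All constant names are mine, and the frame-level figures count luma only.

```python
TAPS = 8            # 8-tap luma interpolation filter
BLOCK_W = 8         # block width in pixels
BLOCK_H = 8         # block height in pixels
ROWS_HORIZONTAL = 16  # assumption: 8 block rows + extra rows, rounded up to 16

# Horizontal pass: every pixel of the extended block costs TAPS multiplications.
horizontal = ROWS_HORIZONTAL * BLOCK_W * TAPS
# Vertical pass: every pixel of the final 8x8 block costs TAPS multiplications.
vertical = BLOCK_H * BLOCK_W * TAPS

per_block = horizontal + vertical
per_pixel = per_block // (BLOCK_W * BLOCK_H)

# Luma plane of a 4k frame and the approximate filter rate from the totals.
LUMA_W, LUMA_H = 3840, 2160
FILTER_FPS = 145

mults_per_frame = per_pixel * LUMA_W * LUMA_H
mults_per_second = mults_per_frame * FILTER_FPS

print(f"{per_block} multiplications per block, {per_pixel} per pixel")
print(f"{mults_per_frame:,} luma multiplications per frame")
print(f"{mults_per_second / 1e9:.1f} billion luma multiplications per second")
```

At ~145 FPS this works out to roughly 29 billion luma multiplications per second, which gives a feel for the load the filter kernel sustains in this near-worst-case setup.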