Heterogeneous computing performance
In promoting its hybrid processors, AMD persistently reminds us that the
integrated graphics core can be used to accelerate general-purpose
computations. That is true. The OpenCL and DirectCompute frameworks, which
enable parallel computation on both x86 and graphics cores, are supported by
both AMD Trinity and Ivy Bridge. And while they used to be used by very few
specialized applications, the idea of heterogeneous computing has become much
more widespread. That's why we want to test the performance of applications
that can use all the resources offered by hybrid processors. There are many
such apps available, so we can choose a few popular ones, which gives our test
some practical value.
We want to start with the simple task of decoding and transcoding video.
Modern hybrid processors do this by taking advantage of their graphics core,
but not of the shader processors. There are special sub-units for that purpose.
Intel calls its unit Quick Sync, while in AMD's APUs these sub-units are called
UVD3 and VCE.
Current processors have no problems with HD video playback in multiple
formats. Hardware video decoding works perfectly even when it is forced to play
a 1080p stream at 60 fps with a high bit rate. However, as higher resolutions
and bit rates become more popular, inexpensive processors may find it difficult
to cope. For example, in this test we used a widescreen 4096x1744 clip at
24 fps, encoded in H.264 format with a bit rate of 34 Mbps. When it is played
via DXVA with hardware decoding enabled, frames are dropped, and the number of
dropped frames depends directly on the capability of the CPU. The graph below
shows the average number of frames displayed when the video is played in Media
Player Classic - Home Cinema version 1.6.5. We enabled subtitles to make the
test even more difficult.
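To put the two streams mentioned above side by side, here is a rough sketch of the raw pixel throughput each one demands from the decoder. It assumes that "1080p" means 1920x1080 and it ignores bit rate, codec profile and subtitle rendering, so it is only a first-order comparison, not a model of the hardware decoder. The function name `pixel_rate` is made up for this illustration.

```python
def pixel_rate(width, height, fps):
    """Return the number of decoded pixels per second for a video stream."""
    return width * height * fps


# Assumption: "1080p" is taken to be 1920x1080.
full_hd_60 = pixel_rate(1920, 1080, 60)   # 1080p stream at 60 fps
clip_4k_24 = pixel_rate(4096, 1744, 24)   # the 4096x1744 test clip at 24 fps

ratio = clip_4k_24 / full_hd_60

print(f"1080p at 60 fps:     {full_hd_60 / 1e6:.1f} Mpixel/s")
print(f"4096x1744 at 24 fps: {clip_4k_24 / 1e6:.1f} Mpixel/s")
print(f"ratio:               {ratio:.2f}")
```

By this measure the test clip pushes roughly 1.4 times as many pixels per second as a 1080p stream at 60 fps, even though its frame rate is much lower.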
We got a number of unusual results when playing our 4K video. The A10-5800K
and A8-5600K APUs do best, with the fewest dropped frames. The Core i3
processors do rather badly, closely followed by the A10-5700 and A8-5500. The
A6, A4, Pentium and Celeron processors fail this test, dropping half of the
frames in the test video.
In fact, there isn't any processor that can cope perfectly with decoding our
4K video. None can be used in a multimedia center. When the 4K and UHD formats
become more popular, optimizations may improve this situation, but it is safer
to rely on high-performance hardware components.
The other common video processing task is transcoding. Today, all graphics
core developers have recognized that special encoders should be integrated into
their solutions. We tested the transcoding ability of the Trinity and Ivy
Bridge processors with CyberLink MediaEspresso 6.7, which supports both Intel
Quick Sync and AMD VCE. In this test, a 1.5 GB 1080p H.264 video (a 20-minute
episode of a TV series) is transcoded to a lower-resolution format for viewing
on an iPad 2 (H.264, 1280x768 pixels, 6 Mbps).
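The figures for this test imply an approximate source bit rate and output size, which the sketch below works out. It assumes 1 GB = 10^9 bytes, a duration of exactly 20 minutes, and that the 6 Mbps target covers the whole output stream (audio and container overhead are ignored). The function names are made up for this illustration, and the results are estimates, not measurements.

```python
def average_bitrate_mbps(size_bytes, duration_s):
    """Average bit rate in Mbps of a file of the given size and duration."""
    return size_bytes * 8 / duration_s / 1e6


def size_megabytes(bitrate_mbps, duration_s):
    """Approximate size in MB (10^6 bytes) of a stream at a constant bit rate."""
    return bitrate_mbps * 1e6 * duration_s / 8 / 1e6


duration_s = 20 * 60                     # assumption: exactly 20 minutes
source_mbps = average_bitrate_mbps(1.5e9, duration_s)
target_mb = size_megabytes(6, duration_s)

print(f"source average bit rate: about {source_mbps:.0f} Mbps")
print(f"estimated output size:   about {target_mb:.0f} MB")
```

Under these assumptions the source averages about 10 Mbps and the 6 Mbps output comes to roughly 900 MB.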
The results of the Celeron and Pentium processors show how important hardware
acceleration is for this task. Intel disables Quick Sync in its junior CPU
models, so their transcoding time is comparable to the length of the original
video. The Core i3 series has Quick Sync and performs the job 10 times faster.
We also note that the advanced version of the graphics core, HD Graphics 4000,
is faster by a third, so it differs from HD Graphics 2500 in this respect as
well, not only in the number of execution units.
In any case, Quick Sync is still the fastest hardware transcoding solution in
every one of its implementations. The Trinity series with its VCE technology
delivers only one third of the speed of its rival in this test. By the way, VCE
delivers the same transcoding performance in every APU. The only exception is
the A4-5300 model, which was about 20% slower than its cousins.
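The ratios quoted in the text can be turned into rough transcoding times, as in the sketch below. These are back-of-the-envelope estimates derived only from the statements above (software transcoding takes about as long as the 20-minute video, Quick Sync on the Core i3 is 10 times faster, VCE runs at one third of Quick Sync's speed). They are not measured results, and the difference between HD Graphics 2500 and HD Graphics 4000 is ignored. The function name is made up for this illustration.

```python
def transcode_minutes(baseline_minutes, speedup):
    """Estimated transcoding time given a baseline time and a speed-up factor."""
    return baseline_minutes / speedup


# Assumption: software transcoding takes about as long as the video itself.
software_min = 20.0

# Quick Sync on the Core i3 is described as 10 times faster.
quick_sync_min = transcode_minutes(software_min, 10)

# One third of the speed means three times the duration.
vce_min = quick_sync_min * 3

print(f"software (no Quick Sync): about {software_min:.0f} min")
print(f"Quick Sync (Core i3):     about {quick_sync_min:.0f} min")
print(f"VCE (Trinity):            about {vce_min:.0f} min")
```

Even at one third of Quick Sync's speed, VCE would still finish the job several times faster than a CPU without hardware acceleration, if these ratios hold.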
Video transcoding and playback are undoubtedly the most important tasks for
the family computer, but we are also interested in how modern hybrid processors
perform in true heterogeneous applications that run on both x86 cores and
shader processors. A significant indicator that the hybrid processor concept
has been accepted by the software market is the fact that OpenCL support has
been added to the popular data compression tool WinZip. Its 17th edition can
use GPU resources to compress files. The x86 and graphics cores share the load
in the following manner:
According to the graph, the x86 cores do most of the job, but the GPU can help
a lot. Therefore, it is no wonder that the advanced graphics core ensures a
substantial performance boost for AMD's Socket FM2 processors in WinZip.
The Radeon HD graphics cores implemented in the Trinity APUs really help
improve their performance