The challenges ahead
In many respects, improving processors for
smartphones and tablets is no different to improving processors for desktops
and laptops. The biggest challenges are power consumption and heat, but on mobile
both issues become more critical. Up to a point, the same tricks that work on
desktop processors also work on mobile: new process technologies and a die
shrink.
“We’re continually designing new CPUs to
squeeze more performance and lower energy out of new process nodes,” says
Bryant, noting that ARM is now validating cores down at 20nm and below. Intel
is taking Atom from 32nm to 22nm and 14nm, and Nvidia’s Stam adds that die
shrinks will also help Tegra. “We have the process technologies that are allowing
us to go to smaller dimensions or put a lot more processing power in the same
dimensions,” he says.
However, on its own this won’t be enough.
“We can see the same mistake from the PC market being applied in mobile: higher
and higher clocks,” says Imagination’s Harold. “But it’s terrible for power
consumption, so everyone does DVFS [dynamic voltage and frequency scaling] to
bring those clocks back down. It isn’t a solution, it’s a market patch.” The
answer, he believes, isn’t in taking desktop processor designs and scaling them
down, but taking a highly scalable architecture and “integrating dynamic
scalability, intelligent power islanding, fast, context-sensitive save/restore
and many other techniques.”
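To see why chasing clock speed is so costly, and why DVFS claws so much of it back, it helps to run the numbers. The sketch below uses the widely quoted approximation for CMOS switching power, P = C x V^2 x f. The capacitance value and the two operating points are illustrative assumptions, not figures from any vendor mentioned here.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Approximate CMOS switching power in watts: P = C * V^2 * f."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical values chosen only to illustrate the scaling.
CAPACITANCE_F = 1e-9  # assumed effective switched capacitance
HIGH = {"voltage_v": 1.2, "frequency_hz": 1.5e9}   # assumed "fast" point
LOW = {"voltage_v": 0.9, "frequency_hz": 0.75e9}   # assumed "slow" point

p_high = dynamic_power(CAPACITANCE_F, **HIGH)
p_low = dynamic_power(CAPACITANCE_F, **LOW)
ratio = p_low / p_high

print(f"high: {p_high:.4f} W, low: {p_low:.4f} W, ratio: {ratio:.3f}")
```

Under these assumed numbers, halving the clock and dropping the voltage from 1.2V to 0.9V cuts switching power to roughly 28 per cent of its original level, because voltage enters the formula squared. Leakage power is ignored here, which is a simplification.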
This is the approach Imagination is pushing
with PowerVR, and it isn’t alone. ARM talks about the concept of dynamic range:
how you create an SoC that doesn’t waste energy while handling lightweight
background tasks, but can also ramp up to handle high-performance applications.
For ARM, the answer lies not in dynamic clock speeds but in what it calls big.LITTLE
processing technology, combining very small, highly tuned processors designed
to handle the basics, with larger, more powerful and more versatile processors
for the intensive work.
“You want to go further in both
directions,” says ARM’s Bryant. “You want to eke out your battery life when
you’re doing low-intensity workloads, but you want to be able to deliver great
user experiences and high performance throughput on bigger processors when
you’ve got an intense workload.” In a way, it’s a more sophisticated take on
the Variable SMP (4-PLUS-1) architecture Nvidia introduced with Tegra 3, where
four CPU cores handle heavy workloads and a lightweight single core handles
basic OS tasks.
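The idea common to both designs, routing light work to a frugal core and heavy work to a powerful one, can be sketched in a few lines. This is a toy model under stated assumptions, not ARM’s or Nvidia’s actual switching logic: the function names, the load scale of 0 to 1 and the two thresholds are all hypothetical.

```python
def pick_cluster(load, current, up_threshold=0.7, down_threshold=0.3):
    """Choose "big" or "little" for a load in [0, 1].

    Two separate thresholds (hysteresis) stop the choice from flapping
    when the load hovers around a single cut-off. Values are assumed.
    """
    if current == "little" and load > up_threshold:
        return "big"
    if current == "big" and load < down_threshold:
        return "little"
    return current


def simulate(loads, start="little"):
    """Replay a sequence of load samples and record the cluster used."""
    cluster = start
    history = []
    for load in loads:
        cluster = pick_cluster(load, cluster)
        history.append(cluster)
    return history


history = simulate([0.1, 0.5, 0.8, 0.5, 0.2])
print(history)
```

In this run the little cluster handles the first two samples, the big cluster takes over once the load passes 0.7, and control only drops back when the load falls below 0.3.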
Nvidia, meanwhile, is finding answers by
sharing technologies across its product lines, mixing the performance expertise
of its GeForce GPU and Tesla high-performance computing teams with the
power-engineering skills of the mobile and Tegra teams.
“If you can take some of that expertise and
cross it over into the desktop side, and take some of the desktop expertise of
how to build high-performance designs and bring it over into mobile, then you
get the best of both worlds,” says Stam. “Absolutely, we’ve been
cross-pollinating our engineers, and that’s been going on for a few years now
in earnest.”
Stam can’t be specific on what technology
is shared between Kepler and future Tegra product lines, but he will say that
circuit designs, power management features and DVFS concepts have moved up and
down the chain. “For fast paths, you have to use fast transistors, but these
tend to be a little leakier,” he says. “But for the paths that aren’t critical,
you can use transistors that are more power efficient. Our Kepler architecture
and certainly Tegra is engineered with different types of transistors that have
different behaviours from a power perspective, and we optimise the layout.”
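The trade-off Stam describes can be illustrated with a simple rule: use the slower, less leaky transistor wherever a path would still meet timing, and reserve the fast one for paths that would not. The sketch below is an assumption-laden toy, not Nvidia’s design flow. The path names, the delays and the 30 per cent slowdown penalty are all hypothetical.

```python
def assign_transistors(path_delays_ns, clock_period_ns, slow_penalty=1.3):
    """Pick a transistor type per path.

    path_delays_ns holds each path's delay when built from fast cells.
    A path gets "efficient" cells if it still fits in the clock period
    after the assumed slowdown, and "fast" (leakier) cells otherwise.
    """
    choices = {}
    for name, delay_ns in path_delays_ns.items():
        slowed_ns = delay_ns * slow_penalty
        choices[name] = "efficient" if slowed_ns <= clock_period_ns else "fast"
    return choices


# Hypothetical paths and delays, for illustration only.
paths = {"alu_carry": 0.9, "cache_tag": 0.8, "debug_bus": 0.4}
choices = assign_transistors(paths, clock_period_ns=1.0)
print(choices)
```

Only the two paths that would miss the 1ns clock period with slower cells keep the fast transistors; the slack-rich debug path switches to the efficient type.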
Nvidia’s hope is that its expertise in
desktop graphics will translate to the mobile sphere, but the approaches
differ. Imagination, ARM and Qualcomm have all adopted tile-based rendering
techniques, where the scene is analysed and broken down into blocks before it
goes into the rendering pipeline. “This approach minimises memory and, most
importantly, power while improving processing throughput,” says Imagination’s
Harold.
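The binning step at the heart of tile-based rendering can be sketched as follows. This is a minimal illustration, assuming 16-pixel square tiles and a conservative bounding-box test; it is not the implementation used by PowerVR or any other GPU, and the function name and tile size are hypothetical.

```python
from collections import defaultdict

TILE = 16  # assumed tile size in pixels


def bin_triangles(triangles, width, height, tile=TILE):
    """Map each (tx, ty) tile to the indices of triangles that may touch it.

    Each triangle is a list of three (x, y) screen-space points. The
    bounding-box test is conservative: it may list a tile the triangle
    does not actually cover, but never misses one it does.
    """
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Clamp the tile range to the screen.
        x0 = max(int(min(xs)) // tile, 0)
        x1 = min(int(max(xs)) // tile, max_tx)
        y0 = max(int(min(ys)) // tile, 0)
        y1 = min(int(max(ys)) // tile, max_ty)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


triangles = [
    [(2, 2), (10, 3), (5, 12)],       # fits inside one tile
    [(10, 10), (40, 12), (20, 30)],   # spans several tiles
]
bins = bin_triangles(triangles, width=64, height=32)
print(sorted(bins.items()))
```

Once the scene is binned, each tile can be rendered using only its own short list of triangles, which is what lets the working set stay in fast on-chip memory rather than going out to main memory.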
Stam is critical of the PowerVR technology
at the heart of Apple’s A5X and A6 chips. “That graphics horsepower isn’t
really being efficiently used, at least not from a gaming perspective,” he says.
“They use what is called tiling/chunking technology, and that’s good for
certain things but not good for certain other things.”
Nvidia takes a more direct approach, relying on
on-chip caches and a strong geometry rendering architecture to create an
efficient pipeline. For Stam, it’s this that will allow future Tegra processors
to enable “more advanced lighting effects, higher levels of geometry in
characters and games, more complex environments and physics between objects.”
It’s also what Stam claims will allow a
massive speed increase between the current generation Tegra 3 and the chip
three generations on, codenamed Stark. “Stark is a few years away, but that’s
going to be roughly 25 times the performance of Tegra 3,” says Stam. “Twenty-five
times the power in just a couple of years within a similar power envelope is
pretty amazing stuff.”
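To put that claim in perspective, a quick calculation shows what each generation would have to deliver. The sketch assumes the 25-fold gain is spread evenly across the three generations between Tegra 3 and Stark, which is a simplification; Nvidia gives no per-generation breakdown here.

```python
def per_generation_factor(total_speedup, generations):
    """Speed-up each generation needs if gains compound evenly."""
    return total_speedup ** (1.0 / generations)


# 25x over three generations, both figures taken from the text above.
factor = per_generation_factor(25.0, 3)
print(f"Required speed-up per generation: {factor:.2f}x")
```

Under that even-split assumption, each new chip would need to be a little under three times as fast as the one before it.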