Most people don’t really think about datacentres, but we all use Internet-connected apps, streaming services, and communication tools that rely on processing and storing massive amounts of information. As the world gets more connected and it becomes easier to create and distribute huge amounts of data, the systems and processes needed to handle all of it keep evolving. Sandra Rivera, Intel Executive Vice President and General Manager, Data Centre and AI Group, was recently in Bengaluru, and Gadgets 360 had the chance to hear her take on current trends and her vision for the future. A lot has changed thanks to the pandemic, and of course AI is a huge part of the story going forward.
We first brought you Sandra Rivera’s comments about Intel’s ongoing work in India and everything that the company is doing here. Now, here are some more excerpts from that conversation, about innovation in hardware and software, the evolving nature of datacentres, and competing with Nvidia.
How datacentres are becoming even more important, and how things have changed in the recent past:
Sandra Rivera: All our innovations and products are clearly being driven by our customers. We are in a large and growing TAM [Total Addressable Market] and as we drive forward, nowhere is that more evident than in India, with digital transformation and the digitisation of every part of our lives. We need more compute; we’re creating more data. It needs to be compressed, secured, delivered over a network, and stored. It needs to be served up, and you also need to get valuable insights out of that, which of course is where AI comes in.
One of the interesting things that happened during COVID is that because of supply chain constraints that we all struggled through, we saw customers lean into more utilisation of the infrastructure that they had. AI, networking, and security are very hungry for the latest innovations and solutions, but a lot of the Web tier, office applications that run in cloud infrastructure, ERP systems, accounting systems, and so on are actually very focused on utilisation.
The biggest growth is happening at what we call the edge of the network, or on premises. The compute is coming to the point of data creation and data consumption. A lot of the challenge for us there is partnering with our OEMs to simplify deploying applications on-premises to process that data and to run machine learning, AI, data analytics, networking capabilities, and security. That’s a lot of work both in hardware and of course in software.
That’s true here in India as well. [Some of it] is driven by power constraints and so if they can have power dedicated to those leading-edge applications and infrastructure and then cap the power on more mainstream applications, then that’s a smart use of the power budget, which is a big deal.
India has been so important for us from an R&D perspective; I mean we’ve been here for decades. We also see with all of the investments that the government is making in digital transformation and infrastructure, that India is going to be a huge consumption market for us as well. The opportunity to build out more infrastructure here, more datacentres, more enterprise solutions, software ecosystem solutions, and services, is very exciting. We continue to invest not only in the workforce but also in the market opportunities here.
The continued importance of CPUs even as GPUs are in demand, and how that is disrupting datacentre design:
Sandra Rivera: There are high-growth workloads like AI and networking driven by the continued proliferation of 5G, as well as security and storage. One of the dynamics we’re seeing in the market is that in the near term, there’s a lot of interest in accelerated compute, meaning GPUs and AI accelerators.
Customers are looking to shift a bit of their capital expenditure towards GPUs. The CPU is part of the equation, but in the near term, more of that capex spend is going to go to GPUs. We don’t think that that’s a permanent market condition. The CPU is quite good from a cost-performance-programmability perspective for many AI workloads. In many cases, customers already have a Xeon CPU, and so the fact that they can do AI and machine learning [with that] is a tailwind for our business.
[All that] everyone talks about right now is generative AI and large language models, but AI is much more than that, right? AI is all the data preparation that happens before you train the model; it’s the data management, filtering, and cleaning. So if you are trying to build an application to identify cats, [for example] you don’t want any dogs in those pictures. All of that is done upfront with the CPU and actually almost exclusively with the Xeon today. That’s part of the AI workflow. Then you get to the actual model training phase. The CPU is very well positioned to address small to medium-sized models – 10 billion parameters or lower – or mixed workloads where machine learning or data analytics is part of a broader application. The CPU is very flexible, highly programmable, and you probably have CPUs already.
When you talk about the largest models, with 100, 200, 300 billion parameters – there you need a more parallel architecture, which is what a GPU provides, and you also benefit from dedicated deep learning acceleration, like we have in Gaudi. After you train the model, you get to what we call the inference or deployment phase. Typically, you’re on-premises there. If you are in a retail organisation or a fast food restaurant, you will typically be running that on either a CPU or some less power-hungry, less expensive accelerator. In the inference stage, we can compete very effectively with our CPUs and some of our smaller GPUs and accelerators.
Right now, there’s a lot of interest around those largest language models and generative AI. We see more customers saying they want to make sure that they have some GPU capabilities. We do see that dynamic, but long-term, the market is complex. It’s growing. We’re in the early days of AI. We think that we have a very good opportunity to play with the breadth of capabilities that we have across our portfolio. So it’s not that I think that generative AI is small, but it’s not addressable only with a large-scale GPU.
How Intel sees Nvidia, and how it plans to compete
Sandra Rivera: Everyone knows that Nvidia is doing a great job of delivering GPUs to the market. It’s a giant player. Let me put that in perspective. The Gaudi 2 has better performance than the Nvidia A100, which is the most pervasive GPU today. It doesn’t have more raw performance versus H100 right now, but from a price-performance perspective, it’s actually very well positioned. One of the data formats supported in the Gaudi 2 hardware is FP8, and the software to support that is going to be released next quarter. We expect to see very good performance, but you’ll have to wait and see what we publish in November. Next year, we’ll have Gaudi 3 in the market which will be competing very effectively with H100 and even the next generation on the Nvidia roadmap. Our projections look very good. We’re priced very aggressively. Customers want alternatives and we absolutely want to be an alternative to the biggest player in the market. It’s going to be what we do, not what we say.
Intel’s roadmap for building sustainable datacentres
Sandra Rivera: We use over 90 percent and in some cases 100 percent renewable energy in all our manufacturing across the world. We are second to no one in renewable energy and total carbon footprint for the manufacturing of our products. The competition, like most of the world, is building their products in foundries either in Taiwan or in Korea. Of course Taiwan is the biggest, but the footprint that they have in renewable energy is actually quite small. It’s an island; everything gets shipped using diesel fuel. When we look at the datacentres that we’re building ourselves for our own fabs and our own IT infrastructure, again that’s 90 percent plus renewable energy. We also partner very closely with our OEMs as well as cloud service providers to help optimise around green and renewable energy.
With the 4th Gen Xeon we introduced a power-optimised mode where you can actually use 20 percent less energy by being smart about turning off cores during idle times and tuning the processor. We were able to do that with a very small performance impact, less than 5 percent, and customers like that because they don’t always need the processor to be running at full capability and they can save a lot of energy.
The current state and future potential of neuromorphic and quantum computing in datacentres
Sandra Rivera: Neuromorphic and quantum computing are leading-edge technologies. We’ve been an investor in quantum for at least a decade and a half. We’ve been investors in silicon photonics; optical networking and interconnects have become increasingly interesting, especially in these very high-end, large-scale computing platforms. We know that memory technologies are going to be critical for us going forward. We’ve been investors in memory technologies with partners and on our own. The commercial viability of those technologies is sometimes 10-20 years out, but innovation is the lifeblood of our business. We have extraordinary capabilities with Intel Labs. We have so many fellows, senior fellows and industry luminaries. The process technology is some of the most complex and exquisite engineering in the world.
We’ll continue to lead from an innovation perspective. Commercial viability all depends on how fast markets shift. We do think that AI is disruptive, and some of those technologies will probably be [developed] at an accelerated pace, particularly networking and memory. There are lots of innovations in power and thermals; these chips and systems are getting bigger and hotter. It’s not always easy to answer when the timing is [right]. Some of these technologies may not have commercial success, but you take parts of them and channel them into other areas. I think this is the business of innovation and we’re very proud of our history. Those [teams] get to do a lot of very fun things and they’re very energised.
Some responses have been condensed and slightly edited for clarity.
Disclosure: Intel sponsored the correspondent’s flights for the event in Bengaluru.