HPCwire presents our interview with Jeff McVeigh, vice president and general manager, Super Compute Group, Intel Corporation, and an HPCwire 2022 Person to Watch. McVeigh shares Intel’s plans for the year ahead, his perspective on HPC-AI synergies and the differentiation in Intel’s AI-directed products. Also covered are quantum computing, HPC trends, and more. This interview was conducted by email at the beginning of April.
Jeff, congratulations on your selection as a 2022 HPCwire Person to Watch and on your promotion! Summarize the major milestones achieved last year for your division and briefly outline your HPC/AI agenda for 2022.
Thank you, it’s truly an honor! Last year, when Pat Gelsinger returned to Intel as our CEO, we established ambitious goals regarding the future of computing using the superpowers of ubiquitous computing, pervasive connectivity, artificial intelligence, and cloud-to-edge infrastructure. To achieve our goals, we brought together our business and product portfolio covering our high-end Xeon processors, which power over 85% of today’s HPC deployments, with our upcoming high-end datacenter GPU products. We combine these hardware products with a comprehensive HPC software stack, from drivers to tools to applications, that provides our customers with complete solutions to help solve the world’s most challenging problems, faster.
Over the past twelve months, we launched the 3rd-generation Intel Xeon Scalable processor (previously named Ice Lake), which delivered up to 30% better performance gen-to-gen on a vast set of HPC workloads; we demonstrated world-record performance on the forthcoming Next Gen Intel Xeon Scalable processors, codenamed Sapphire Rapids, with integrated High Bandwidth Memory that offers applications up to 4x more raw memory bandwidth and up to 2.8x the workload performance compared to current platforms; and our flagship Intel datacenter GPU, codenamed Ponte Vecchio, which will power the Aurora supercomputer, started sampling to customers including Argonne National Laboratory. We also significantly ramped the oneAPI ecosystem with two new Centers of Excellence; oneAPI provides developers with a fully open programming model across diverse architectures and even vendors.
As you can see, we have a super year ahead as we get ready to ship compelling products and solutions targeted for HPC & AI.
How do you see the relationship between HPC and AI, both broadly and more specifically at Intel? Tie this to specific Intel products or goals.
At the most basic level, AI and HPC workloads need all the computing density and memory bandwidth that we can muster. While different workloads will have different needs and configurations (maximum core count, or frequency, or bandwidth), we must always provide balanced systems that match the workload characteristics. As AI models become more sophisticated, we’re seeing them infused into traditional HPC codes. At first, these are used to augment computational models to speed iterations, and over time they’re expected to fully merge. That’s why we are increasingly integrating AI into our HPC products and traditional HPC requirements into our AI products.
For example, Xeon processors are the only x86 CPUs with built-in AI acceleration optimized to analyze the massive data sets in HPC workloads. Next Gen Intel Xeon Scalable processors, codenamed Sapphire Rapids, will deliver nearly 2x performance gains for end-to-end AI workloads. Additionally, our flagship datacenter GPU – Ponte Vecchio – has industry-leading, full-rate double-precision support in addition to a massive L2 cache, high-bandwidth memory, dense vector/matrix compute, and glueless scale-up/out. This hardware portfolio is supported by the foundational oneAPI programming model to enable productive performance for HPC and AI applications.
Beyond the products in market today and those that will be launched later this year, we’re also working on a brand-new architecture codenamed Falcon Shores. Falcon Shores will bring x86 and Xe GPU acceleration together into the Xeon socket, taking advantage of next-generation packaging, memory, and I/O technologies for huge performance gains. We will also see efficiency improvements in systems computing large data sets and training gigantic AI models. We expect Falcon Shores to deliver more than 5x improvements in performance per watt, compute density, and memory capacity and bandwidth – all in a single socket with a vastly simplified GPU programming model. Products like Falcon Shores will accelerate the convergence of HPC & AI.
Intel has several different devices that offer “AI capabilities.” How do you differentiate these different product lines, including the Xe GPU?
Artificial intelligence is the fastest-growing compute workload from datacenter to mobile to the intelligent edge. Powering AI’s proliferation requires a holistic approach that accounts for vastly different use cases, workloads, power requirements and more. That’s why our silicon portfolio comprises a diverse mix of architectures – CPUs, GPUs, FPGAs and ASICs – that address specific customer requirements. Some of these products target power-constrained environments (client and IoT devices), others target cloud computing environments with diverse needs across AI and non-AI services, and others target dedicated AI training use cases. One size doesn’t fit all from a hardware perspective, but we need one unified and open programming model across them – the oneAPI industry initiative provides this foundation.
It has been proposed that quantum accelerators will be integrated into either an HPC architecture or workflow. How do you see HPC and quantum coming together?
Quantum computing has the potential to solve problems that conventional computing – even the world’s most powerful supercomputers – can’t address in reasonable compute times or at feasible levels of energy consumption. However, before we integrate quantum computing into scaled HPC deployments, we need to solve some of the fundamental technical hurdles.
One of these hurdles is the underlying physics of qubits. We’ve uniquely pioneered research in silicon spin qubit technology, an alternative approach to other qubit technologies being pursued in the quantum field. Spin qubits resemble the electron transistors Intel has spent the last 50 years manufacturing at very large scale.
Additional hurdles include the achievement of high-fidelity multi-qubit operations, efficient interconnect or wiring between qubits, fast and low-power qubit control electronics, plus high-performance, energy-efficient error correction to reduce the impact of qubit fragility.
While we’re still in the early stages of quantum computing research, we continue to make progress and produce positive outcomes, which will bring the world closer to realizing the true potential of quantum computing. Partnerships, like the ones we have with QuTech/Delft University and Argonne National Laboratory, will help address these technically complex challenges.
Computational milestones are inspiring and exciting. What is your vision for the exascale era?
While reaching a specific compute milestone can be treated as just another number, I feel that the exascale era will create a mind shift that will fundamentally change the way algorithms are designed to achieve a multiplicative effect. Instead of just speeding up an existing simulation by an order of magnitude, what if that simulation could be 10x more accurate to develop novel materials that help to reverse climate change, or 100x more variations could be run to quickly discover cures to chronic diseases?
Evolutions in architectures or algorithms bring about incremental change; revolutions occur when both advance simultaneously. This is the door that opens with the exascale era.
Intel’s top execs are emphasizing a new era of openness. Can you comment on this and offer a few examples?
An open approach has always been part of our DNA. Throughout Intel’s history we’ve taken the approach of driving open platforms and industry-shaping standards, along with the APIs that enabled them, for the good of the computing ecosystem. Open standards drive ubiquitous computing, which drives demand for more computing. A small sampling of these includes USB, Wi-Fi, and UCIe. These enable the entire ecosystem to be successful – end-users, developers, partners, and enterprises – instead of the few, as happens when constrained to a proprietary solution.
An open ecosystem is the north star that guides our roadmaps and how we bring products to market. Our mission for the Super Compute Group is to drive these standards and partner with the ecosystem to create the foundational hardware platforms the industry can rally behind and build upon in the service of helping our customers solve the world’s most challenging problems, faster.
One of our most recent examples of this is how we incubated the oneAPI ecosystem and ensured its broad adoption across architectures and vendors. This provides developers with choice through collaborations with the broad ecosystem and working together with our industry partners to solve systemic challenges of security and distribution. We will also collaborate with academia and developers to innovate on future challenges like neuromorphic and quantum computing – all in the open.
Where do you see HPC headed? What trends – and in particular emerging trends – do you find most notable? Any areas you are concerned about, or identify as in need of more attention/investment?
Delivering supercomputing for all requires scale, but it is equally important to reduce energy consumption. The increasingly open and distributed nature of compute, growth in AI/ML applications and heterogeneous computing will help HPC scale. The industry will have to make progress in lowering the power and cooling costs of these systems to make them more efficient and accessible to the broader community. I’m excited about the possibilities technologies like high bandwidth memory will bring to HPC and AI workloads. Intel has connected the dots between HPC and AI by building products that are optimized for these workloads.
To advance HPC from exascale to zettascale we need significant, revolutionary gains in architecture, power efficiency, thermal management, process and packaging technology, and memory and I/O capacity and bandwidth – all supporting rapid evolution of the software applications that run on top.
More generally, what excites you about working in high-performance computing?
Today, we are at the threshold of a new generation of HPC, where the technology’s scalability, ubiquity and accessibility can transform all our lives. I am very pleased that we’ve democratized HPC and put it in the hands of scientists, who now have easy access to supercomputers to advance cutting-edge research.
Nowhere has this been more obvious than with the battle against Covid-19. From the start of the pandemic, the scientific and research communities tapped these advanced supercomputers – both within research labs and in cloud HPC-as-a-service environments – to monitor, study, treat and eventually develop the drugs used to combat the SARS-CoV-2 novel coronavirus. The speed at which this was done was breathtaking and would have been impossible had it not been for the broad availability of HPC technologies.
I am excited and humbled to be leading Intel’s Super Compute Group responsible for building the technology foundations to advance HPC and make it accessible. We are energized with the new vision and strategy that Pat laid out and are committed to executing at a torrid pace to deliver our new products to customers this year.
What led you to pursue a career in the computing field and what are your suggestions for engaging the next generation of IT professionals?
I’ve always been fascinated with technology, even from my early childhood when I would disassemble our family telephone or TV to see how it worked – often to the frustration of my parents! My PhD thesis was on efficient compression of multi-view video signals – basically, how to use computational power and machine learning to predict different viewpoints. Now this was in the mid-1990s, so both the compute resources and AI algorithms were far from today’s capabilities, but that experience gave me the foundations on the interplay of architecture and algorithms.
My initial work at Intel carried this forward during the early days of video conferencing and streaming (well before we all used Zoom for work or YouTube even existed). I then worked on Intel’s integrated graphics products and then our developer tools. These experiences taught me about the importance of end-to-end solutions to solve customer problems and ensuring both performance and productivity – one without the other is always an uphill battle.
For the next generation, I recommend seeking out opportunities that match your passion as it’s always best to work on projects that you’re curious about and get you motivated to put in the effort every day. I also stress the importance of communication to convey the “so what” behind data or trends or issues that need to be fixed. If you can’t communicate clearly in all directions – peers, employees, or leaders – you’ll be fighting those battles alone.
Outside of the professional sphere, what can you tell us about yourself – family stories, unique hobbies, favorite places, etc.? Is there anything about you your colleagues might be surprised to learn?
I love the outdoors – skiing, running, hiking, mountaineering – anything that connects me with nature. To feed this thirst, I recently bought a sprinter van that I’m outfitting for the sole purpose of chasing powder for skiing. I’m also a big football fan. I enjoy these hobbies with my awesome wife, two terrific daughters and our sometimes-misbehaving dog Tillamook (who got his name because of his love of eating butter off the kitchen counter).
McVeigh is one of 12 HPCwire People to Watch for 2022.