The goal of chip design always has been to optimize power, performance, and area (PPA), but results can vary greatly even with the best tools and highly experienced engineering teams.
Optimizing PPA involves a growing number of tradeoffs that can vary by application, by availability of IP and other components, and by engineers' familiarity with different tools and methodologies. For example, higher performance may be achieved with a larger processor, but it also can be done using smaller, more specialized processing elements with tighter integration of hardware and software. So even in the same area and with the same power budget, there are different ways of achieving the same goal, and the optimum mix may vary depending upon a specific domain or vendor’s needs.
This is made even more complex by the rising demand for security. Depending upon the criticality of the design, security can be either active or passive, which affects power and performance. And it can impact IC manufacturing costs, time-to-market, lead time, and vendor competitiveness.
To sort through all of these possible permutations, EDA vendors increasingly are looking to AI/ML, integrating various AI functions into tool flows. Results, so far, are promising, as reported by researchers at MIT and the University of Texas at Austin in a recent paper. The researchers concluded that deep reinforcement learning (RL) tools can outperform humans for certain tasks.
During a six-hour experiment, the researchers matched a graph convolutional neural network approach using reinforcement learning against other techniques, including conventional black-box optimization methods (Bayesian optimization, evolutionary algorithms), random search, and a human expert designer with five years’ experience. The experiment concluded that RL with transfer learning can achieve better results. In other words, AI-based tools can make transistor sizing and design porting more effective and efficient (see Table IV of the paper).
Today, many companies, including Google, Nvidia, Synopsys, Cadence, Samsung, and Siemens, either have started employing or anticipate using AI in chip design.
How will AI change the chip design landscape?
Until recently, chips have been designed by humans using various automated design tools in circuit and logic design, routing, layout, simulation, and verification to minimize errors while reducing time and cost. The process can be quite tedious and time-consuming.
Fig. 1: Various steps in the semiconductor design flow. Source: eInfochips
There are many steps in designing a chip. The process starts with a chip specification or architectural definition, followed by various steps in the design flow. Following sign-off by the design team, a graphic design system (GDSII) file is sent to the foundry.
This process was fine-tuned when Moore’s Law was used as the main guide. But as the benefits of scaling began to diminish in the finFET era, chipmakers began looking for new ways to achieve PPA improvements. That significantly increased design complexity, making it more difficult to deliver working silicon on time and on budget.
“The average cost of designing a 28nm chip is US$40 million,” said Handel Jones, CEO of International Business Strategies (IBS). “By comparison, the cost of designing a 7nm chip is US$217 million, and the cost of designing a 5nm device is US$416 million. 3nm design will cost up to US$590 million.”
Moreover, while transistor counts have gone from thousands to billions at each new node, those designs are increasingly heterogeneous, and they often involve some form of advanced packaging. Now, instead of just cramming more transistors into the same space, there are issues to resolve involving power density, thermal dissipation, various types of mechanical and electrical stress, proximity effects, and contextual concerns that could affect overall chip behavior. All of that adds time to the design process, which in turn raises the cost. And to make matters worse, the constant pressure for chip manufacturers to introduce advanced-node designs in less time can lead to costly mistakes.
Improving efficiency with AI
Adding AI into chip design can help manage complexity, reduce errors, and shorten the development cycle. For example, using traditional tools to do routing in chip design can automate 90% of the work. An experienced designer is still needed to finish up the last 10%, and at the end more attention may be paid to getting a functioning chip out the door than to PPA optimization. AI can reduce the amount of time spent on that last 10%.
Fig. 2: Growing role for AI. Source: Cambrian AI Research
“It is all about efficiency,” said Steven Woo, fellow and distinguished inventor at Rambus. “Essentially, human designers use tools to achieve optimization. But AI can make it faster, in fewer cycles. The AI engine can be fed preset rules to achieve better inference. Applying reinforcement learning, the AI-based design tools will get better and better. It will help designers to achieve almost error-free solutions over time, with efficiency in optimizing PPA better than humans alone can achieve. Additionally, because speed is everything here, it is important to also consider the chip-to-chip memory speed, as AI needs to access a large database quickly.”
Others agree. “AI will automate chip design even further, especially in the layout process. It already has been demonstrated that machine learning increases productivity in analog circuit design. On layout, machine learning will be used to suggest optimal device placement in finFET nodes to minimize interconnect parasitics. When a chip design involves MEMS, such as accelerometers and gyroscopes, AI can be used in a parametric design flow to co-design the IC and the MEMS device. This will allow designers to integrate MEMS, IC, and software a lot quicker than using traditional design flows, making designers’ lives a lot easier,” commented John Stabenow, product engineering director for the IC Design group at Siemens Digital Industries Software.
How AI learns
AI machines can do a much better job than humans in pattern recognition and matching in a very short amount of time. AI does not start learning from ground zero. In most cases, the AI agent (processor) will be pre-trained or fed with a large amount of data, such as 15,000 samples of floor planning. By this point, AI algorithms already include some intelligence.
Additionally, AI will make use of reinforcement learning (RL) to optimize results. RL is a machine learning technique to help the agent learn in its interactive environment by trial and error based on its own experiences. The process uses a reward and punishment model. The AI model will start with an initial state (input), and deliver certain results (output).
Designers then will reward or punish the model, and the model will keep learning, delivering the best results based on the maximum rewards received. When an engineer accepts a suggestion from the AI model, the model treats that as a reward. Conversely, when the engineer rejects or overrules the suggestion because a better solution is available, the model treats that as a punishment. The RL process goes on, and over time the AI model gets better and better.
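The accept/reject loop described above can be sketched as a minimal example. This is a deliberately simplified multi-armed-bandit version of reinforcement learning, not the deep RL used in production EDA tools, and all the strategy names and acceptance rates here are hypothetical:

```python
import random

class DesignSuggestionAgent:
    """Toy RL agent that learns which placement strategy an engineer
    tends to accept. Illustrative only -- real tools learn deep policies
    over rich design state, not a handful of named strategies."""

    def __init__(self, strategies, epsilon=0.1):
        self.q = {s: 0.0 for s in strategies}  # estimated value per strategy
        self.n = {s: 0 for s in strategies}    # times each strategy was tried
        self.epsilon = epsilon                 # exploration rate

    def suggest(self):
        # Occasionally explore a random strategy; otherwise exploit the best.
        if random.random() < self.epsilon:
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def feedback(self, strategy, accepted):
        # Accepted suggestion -> reward +1; rejected/overruled -> punishment -1.
        reward = 1.0 if accepted else -1.0
        self.n[strategy] += 1
        # Incremental average: running estimate of each strategy's value.
        self.q[strategy] += (reward - self.q[strategy]) / self.n[strategy]

random.seed(0)  # fixed seed so the toy run is reproducible
agent = DesignSuggestionAgent(["cluster_macros", "spread_macros", "ring_io"])
for _ in range(200):
    s = agent.suggest()
    # Simulated engineer: accepts "cluster_macros" 90% of the time, others 20%.
    accepted = random.random() < (0.9 if s == "cluster_macros" else 0.2)
    agent.feedback(s, accepted)

print(max(agent.q, key=agent.q.get))  # strategy the agent learned to prefer
```

Over the 200 simulated interactions, the agent's value estimate for the frequently accepted strategy rises while the others fall, so its suggestions converge on what the engineer actually approves.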
“Machine learning is a subset of AI that refers to a machine’s ability to think without being externally programmed,” said Ravi Subramanian, senior vice president and general manager for Siemens Digital Industries Software. “Traditional devices are programmed with a set of rules for how to act, which often takes the form of if-then-else statements. But machine learning enables devices to continuously think about how to act based on data they intake.”
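The contrast Subramanian draws can be illustrated with a toy example. The function names, the timing data, and the threshold-fitting rule below are all hypothetical; the point is only that the first function encodes a fixed if-then-else rule, while the second derives its decision boundary from labeled examples:

```python
# Hand-written rule: acts only on what the programmer anticipated.
def is_timing_critical_rule(path_delay_ns):
    return path_delay_ns > 1.0  # fixed if-then-else threshold

# "Learned" rule: derives its threshold from labeled examples instead.
def fit_threshold(samples):
    """samples: list of (path_delay_ns, was_critical) pairs."""
    crit = [d for d, c in samples if c]
    ok = [d for d, c in samples if not c]
    # Simplest possible decision boundary: midpoint of the two class means.
    return (sum(crit) / len(crit) + sum(ok) / len(ok)) / 2

# Hypothetical labeled data from past designs.
data = [(0.4, False), (0.6, False), (0.7, False),
        (1.3, True), (1.5, True), (1.8, True)]
threshold = fit_threshold(data)

def is_timing_critical_learned(path_delay_ns):
    return path_delay_ns > threshold
```

If the character of the designs shifts, the hand-written rule must be edited by a person, whereas the learned rule only needs to be refit on new data.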
Subramanian said that for AI to learn, three things are necessary:
- A pool of data, which is the data lake. It can take the form of RTL IP, GDSII, C code, or a SPICE netlist.
- A model enabling the AI-based system to adapt, learn, improvise, and generalize, so it can make predictions on new inputs that are not in the data lake.
- A decision function based on some metric must exist, and a reward mechanism based on achieving the metric should be reliable.
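The three ingredients above can be sketched structurally. All the names, the toy records, and the metric below are illustrative assumptions, not any vendor's actual interface:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class LearningSetup:
    """Minimal skeleton of the three ingredients listed above."""
    data_lake: List[Dict]                        # e.g. parsed RTL/GDSII/SPICE records
    model: Callable[[Dict], float]               # maps a design input to a prediction
    metric: Callable[[Dict, float], float]       # scores a prediction against the goal

    def reward(self, sample, prediction):
        # Reward mechanism: a higher metric value is a stronger positive signal.
        return self.metric(sample, prediction)

# Toy instantiation: predict power from area, reward closeness to the target.
setup = LearningSetup(
    data_lake=[{"area": 2.0, "target_power": 1.0},
               {"area": 4.0, "target_power": 2.0}],
    model=lambda s: s["area"] * 0.5,                       # toy power predictor
    metric=lambda s, p: 1.0 - abs(p - s["target_power"]),  # closer -> higher score
)

avg_reward = sum(setup.reward(s, setup.model(s))
                 for s in setup.data_lake) / len(setup.data_lake)
```

A reliable metric matters because the reward signal is the only thing steering the model; if the metric is noisy or gameable, the system optimizes the wrong objective.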
“AI does not make decisions, per se,” he explained. “AI is about a system’s ability to adapt and improvise in a new environment, to generalize its knowledge and apply it to unfamiliar scenarios. This definition is taken from Francois Chollet, head of AI research at Google.”
Unlike an automobile, where there are standard ways to measure miles per gallon or distance per charge, there is no standard method of measuring the outcome of using AI. Each design is unique and the tools used vary. The industry across the board, however, has reported productivity improvements using AI-based chip design tools.
Google, for example, applied AI to floor-planning, and found that in less than six hours it could achieve what previously took engineers a few months. Both approaches delivered manufacturable chips with optimized PPA, but productivity was significantly higher with AI.
“Adding AI to the chip design process will definitely increase its efficiency,” said Rod Metcalfe, product management group director in Cadence’s Digital & Signoff Group. “For example, a 5nm mobile CPU using AI can increase performance by 14%, improve leakage power by 7%, and density by 5%. This can be significant.”
These kinds of improvements are echoed in other applications. “Using AI-based design technology, our customers have indicated that they were able to achieve significant reduction in power — as much as 25% or more compared to manual tuning,” said Stelios Diamantidis, senior director of artificial intelligence solutions at Synopsys. “This kind of improvement over already optimized designs is amazing.”
The future of AI in chip design
Squeezing 1 billion transistors into a die is unthinkable to most people. But in June 2021, Synopsys reported that the largest chip built so far contains 1.2 trillion transistors and 400,000 AI-optimized cores on a 46,225mm² die. Designing chips of this size is nearly impossible for human designers using traditional design tools.
“The benefits of using AI to accelerate and optimize chip design is now a given, at least as far as the major chip vendors are concerned,” said Karl Freund, founder and principal analyst, Cambrian AI Research. “Systems like Synopsys DSO.AI are saving companies time and money, and producing chips with lower power, higher performance, and less area. Now the industry is turning its attention to the next steps beyond optimizing physical designs, such as system-level optimization, software/algorithm optimization, and even design verification. The entire industry will benefit from these innovations, as will the consumers of faster, less power-hungry, and lower cost silicon.”
All of the major EDA companies are infusing AI capabilities into their tools. And rather than just cramming more transistors into a smaller space, chipmakers now also can cram much more into a much larger space.
Fig. 3: Cerebras’ wafer-size chip. Source: Cerebras Systems
Cerebras Systems’ second-generation chip, developed using a 7nm process, contains 2.6 trillion transistors and 850,000 AI-optimized cores. It is now the world’s largest chip for AI applications, at roughly the size of a dinner plate. By comparison, the largest GPU has only 54 billion transistors. Cerebras’ chip includes 40GB of on-wafer memory to support the AI computations. To design such a chip, AI-based chip design tools are required.
Additionally, integrated chip security will be required alongside PPA concerns in coming months and years, and AI can help with that, as well.
Siemens’ Subramanian pointed to four areas where AI already is in use. “They may be using AI as an alternative to a conventional method of solving a specific problem. They may be using AI to create a new methodology on how to design or verify their IC. They may be using an AI-powered tool to reduce errors or time-to-best results. Or, they may be building an AI chip, in which case the designer is creating a new computing architecture to solve a problem, and the architecture is based on using AI or machine learning principles.”
AI works best in design when the problem is clearly defined in a way that AI can understand. So an IC designer first must determine whether there is a problem that can be tied to a system’s ability to adapt, learn, and generalize knowledge and rules, and then apply them to an unfamiliar scenario.
“Understanding whether there is a problem that is well-suited to AI is the first and most important step,” Subramanian said. “This is perhaps the most important phase of the entire process.”
What has been shown, so far, is there are many areas where AI does apply, and more will undoubtedly be developed in the future. AI is here to stay. Now the question is, what else can it do?