With functional verification consuming more time and effort than design, the chip industry is looking at every possible way to make the verification process more effective and more efficient.
Artificial intelligence (AI) and machine learning (ML) are being tested to see how big an impact they can have. While there is progress, the technology still appears to be touching only the periphery of the problem. There are many views within the industry about what happens next.
“We see AI as a disruptive technology that will in the long run eliminate, and in the near term reduce the need for verification,” says Anupam Bakshi, CEO and founder of Agnisys. “We have had some early successes in using machine learning to read user specifications in natural language and directly convert them into SystemVerilog Assertions (SVA), UVM testbench code, and C/C++ embedded code for test and verification.”
Effective learning requires good data. “AI requires a lot of learning, and we don’t really have a whole lot of data to enable you to do that,” says Deepak Shankar, founder and vice president of technology at Mirabilis Design. “You need to run a huge number of scenarios, and you can’t do that with verification. Even if you have a server farm or you’re running on emulators, you’re still not going to be able to get that kind of data set.”
Maybe the problem statement needs to be changed. “We are engineers, so we are trying to be efficient, and that means we are making a lot of assumptions and a lot of reductions where possible,” says Alexander Petr, operating manager for device modeling at Keysight Technologies. “When we perform measurements, we basically know where the corners are, and we are trying to cover as much space in the middle as needed. The problem might need to be rephrased for neural networks since it’s a data-driven model.”
Until the problem statement is suitably changed, AI can be used to improve the efficiency of what is being done today. “AI and ML are aiding you to either improve your performance or improve your productivity, such as pointing to the root cause quickly, or getting your coverage up based on providing stimulus diversity (see figure 1),” says Kiran Vittal, product marketing director for verification at Synopsys.
Fig. 1: Opportunities for AI-Powered Verification. Source: Synopsys
Vittal explains the challenges associated with each of the stages of verification. He says that for decent-sized designs, structural verification can be very noisy. This noise causes large amounts of manual analysis, and it can result in problems being waived that are true bugs. For formal verification, a large number of proof engines exist and getting to a proof can take a long time, especially if the wrong engine is applied to a property. And when running dynamic verification, the runtime switches used can have a significant impact on simulator performance. Then, when simulators uncover errors, those problems have to be debugged. In addition, filling coverage holes and optimizing test suites can be difficult.
Attempts are being made to address all of these problems, and these attempts can have a significant impact on verification performance and effectiveness.
There is nothing worse than spending time and resources to not get the desired result, or for it to take longer than necessary. “In formal, we have multiple engines, different algorithms that are working on solving any given property at any given time,” says Pete Hardee, director for product management at Cadence. “In effect, there is an engine race going on. We track that race and see for each property which engine is working. We use reinforcement learning to set the engine parameters in terms of which engines I’m going to use and how long to run those to get better convergence on the properties that didn’t converge the first time I ran it.”
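The engine race Hardee describes can be pictured with a much simpler learner than a production tool would use. The sketch below is an epsilon-greedy bandit, a basic relative of reinforcement learning, and is not Cadence's algorithm. The class name, the reward formula, and the engine labels are assumptions made for illustration only.

```python
import random


class EngineSelector:
    """Epsilon-greedy choice of a proof engine (illustrative sketch only)."""

    def __init__(self, engines, epsilon=0.1, seed=0):
        self.engines = list(engines)
        self.epsilon = epsilon              # fraction of picks spent exploring
        self.rng = random.Random(seed)
        self.attempts = {e: 0 for e in self.engines}
        self.reward = {e: 0.0 for e in self.engines}

    def mean_reward(self, engine):
        n = self.attempts[engine]
        # Untried engines rank highest so that each one is sampled at least once.
        return self.reward[engine] / n if n else float("inf")

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.engines)           # explore
        return max(self.engines, key=self.mean_reward)     # exploit

    def update(self, engine, converged, seconds, budget):
        """Record one run; faster convergence earns a larger reward (assumed formula)."""
        self.attempts[engine] += 1
        if converged:
            self.reward[engine] += 1.0 - min(seconds, budget) / budget


# Hypothetical history for one property with a 100-second budget.
selector = EngineSelector(["bmc", "pdr", "induction"], epsilon=0.0)
selector.update("bmc", converged=False, seconds=100, budget=100)
selector.update("pdr", converged=True, seconds=10, budget=100)
selector.update("induction", converged=True, seconds=90, budget=100)
next_engine = selector.choose()   # the engine with the best average reward so far
```

A real flow would also have to learn how long to let each engine run, which this sketch reduces to a fixed time budget.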
This technique can be very effective. Synopsys’ Vittal provides one example. “One customer had a RISC-V design, and they were running formal verification. They wrote many SystemVerilog assertions. The tool started proving those properties on that design, and the first run took 20 hours. It converged on 700 properties, but other properties came back as partial proofs. When we used machine learning, tool efficiency improved, and the run finished in two hours. It also managed to converge on an extra 30 properties.”
As is often said, the best run is the one you don’t need to do at all. “There are things that we call plain old learning,” says Cadence’s Hardee. “For example, we have a proof cache capability. If everything was unchanged, if the cone of influence didn’t change, if the property didn’t change, if the constraints didn’t change, we can just restore existing proof results in regression. There is no need to rerun.”
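A proof cache of this kind can be sketched as a lookup keyed on a fingerprint of everything the proof depends on. The code below is a minimal illustration and not Cadence's implementation. It assumes the cone of influence, the property, and the constraints are each available as text, and the names `proof_key` and `ProofCache` are hypothetical.

```python
import hashlib
import json


def proof_key(cone_of_influence, prop, constraints):
    """Fingerprint the inputs a proof result depends on."""
    payload = json.dumps(
        {
            "coi": cone_of_influence,
            "property": prop,
            "constraints": sorted(constraints),  # order of constraints is irrelevant
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ProofCache:
    """Restore a stored result when nothing relevant has changed."""

    def __init__(self):
        self._results = {}
        self.engine_runs = 0   # counts how often the engine really ran

    def check(self, cone_of_influence, prop, constraints, run_engine):
        key = proof_key(cone_of_influence, prop, constraints)
        if key not in self._results:
            self.engine_runs += 1
            self._results[key] = run_engine(cone_of_influence, prop, constraints)
        return self._results[key]


def fake_engine(cone_of_influence, prop, constraints):
    """Stand-in for a formal engine; always reports success."""
    return "proven"


cache = ProofCache()
rtl = "module fifo(...); ... endmodule"   # placeholder text for the cone of influence
cache.check(rtl, "assert !(full && empty)", ["rst_n == 1"], fake_engine)
cache.check(rtl, "assert !(full && empty)", ["rst_n == 1"], fake_engine)  # cache hit
```

Any edit to the logic, the property, or a constraint changes the fingerprint, so the engine runs again only for the proofs that edit could affect.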
Many other techniques are being explored. One is looking at improving the performance of analog simulations. “We have something called an adaptive architecture, which looks at a given circuit,” says Sathish Balasubramanian, head of product management for the analog mixed-signal group at Siemens EDA. “There are a lot of different types of analog circuits, and they are different from the simulator perspective. Yet we have only one kind of simulator architecture — a matrix solver. With an adaptive architecture it looks at the design and comes up with the heuristic, saying this design might be of a certain type, such as a PLL. It adapts itself to use a different solver architecture. This takes out the overhead of having one architecture that needs to satisfy all different circuit types. We’ve seen speed-ups of 5X to 10X compared with legacy solutions. Every circuit simulation is a polynomial equation, but the way the polynomial equation is structured is different based on what the circuit types are.”
Those models even could be learned models. “You might see some of the SPICE solvers replacing the existing mathematical solvers with artificial neural networks (ANN),” says Keysight’s Petr. “Instead of solving the whole differential equation, we use the ANN to train the behavior, and then use interpolation to get the data points, which can be fairly accurate since it’s data-driven. If the data set is well defined for the inputs and outputs, it could be any kind of block. That says it’s not limited to a single device. It could be a group. It could be a larger function. The only limitation is it needs to be a continuous analog signal. As soon as you go discrete or digital the ANN will have problems, since the mathematical descriptions we are using are hyperbolic tangents (tanh), which describe a continuous form.”
Similar techniques are being used to create timing models. “A timing model is essentially an 8 x 8 or 16 x 16 table of different slews and loads with the associated timing delay,” says Siemens’ Balasubramanian. “To get this you need to run the SPICE level simulations for each of these cells across these different tables, across all different PVT points, which can be from 64 to 128 to 256, depending on what process node is being used, and then finally it spits out a Liberty model, which is like 1 or 2 gigabytes. This is required for static timing verification. It feeds your synthesis tool and drives many other things. Instead, we generate some anchor points, using the traditional approaches, and the rest can be automatically generated with very high accuracy.”
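The idea of characterizing a few anchor points and filling in the rest can be shown with ordinary bilinear interpolation over a slew/load grid. This is a deliberately simple stand-in and not the learned model Balasubramanian describes, which is presumably far more accurate. The function name and the numbers in the example are invented for illustration.

```python
from bisect import bisect_right


def bilinear_delay(slews, loads, table, slew, load):
    """Estimate delay at (slew, load) from characterized anchor points.

    slews and loads are ascending axis values, and table[i][j] is the
    delay measured at slews[i] and loads[j]. Points outside the grid are
    linearly extrapolated from the nearest cell.
    """

    def bracket(axis, x):
        # Index of the grid cell containing x, clamped to the table edges.
        hi = min(max(bisect_right(axis, x), 1), len(axis) - 1)
        lo = hi - 1
        t = (x - axis[lo]) / (axis[hi] - axis[lo])
        return lo, hi, t

    i0, i1, ts = bracket(slews, slew)
    j0, j1, tl = bracket(loads, load)
    # Interpolate along the load axis first, then along the slew axis.
    at_low_slew = table[i0][j0] * (1 - tl) + table[i0][j1] * tl
    at_high_slew = table[i1][j0] * (1 - tl) + table[i1][j1] * tl
    return at_low_slew * (1 - ts) + at_high_slew * ts


# Made-up 2 x 2 anchor grid: slew in ns, load in pF, delay in ns.
anchor_slews = [0.0, 1.0]
anchor_loads = [0.0, 2.0]
anchor_delays = [[1.0, 3.0],
                 [2.0, 4.0]]
midpoint = bilinear_delay(anchor_slews, anchor_loads, anchor_delays, 0.5, 1.0)
```

Real cell delays are not linear in slew and load, which is why a learned model fitted to SPICE data can do better than this kind of interpolation from the same number of anchor points.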
Running the right test
Verification engineers have trouble knowing exactly what a particular test does, or how effective it is, especially when compared with other tests. A company rarely throws away a testcase, even if it appears to provide no additional coverage. “Machine learning can be used to improve the ability of constrained random simulation to target various measures such as coverage bins or design bugs,” says Daniel Schostak, architect and fellow for Arm. “It can also help reduce the amount of verification required for design changes by predicting the best tests to run for the design changes. However, depending on testbench design, machine learning techniques require levels of information about, and control of, the random generation in a testbench that may not be readily available. This is something tools can potentially help with by extending support for reflective programming capabilities in a standard way.”
Some aspects of this are coming. “Verification engineers need to pay attention to coverage property density or sufficiency, especially for the functional blocks added,” says Bipul Talukdar, director of application engineering for SmartDV. “Machine learning algorithms can enable us to study the existing coverage data and improve our application of constrained random verification in terms of seed selection, constraint tuning, and reactive testbench development.”
Even if all tests eventually are re-run, there is an optimal order in which they should be executed. “You want to run the highest ROI tests first,” says Bradley Geden, director of product marketing at Synopsys. “That is one of the first levels of ML technologies, where we are learning what the most effective tests are, running those first so you can get to your coverage target quicker. That may be different based on the design change made. You need ML technology to figure out what parts of the design are connected to which tests so that instead of running the hundreds of thousands of tests, it can figure out the 100 tests that are going to encircle the changes you made to the RTL and fully verify that the change is not going to break your design.”
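Running the highest-ROI tests first can be approximated without any learning at all, by greedily ordering tests according to how many new coverage bins each one adds. The sketch below shows that baseline, not the Synopsys technology. The test names and coverage bins are hypothetical.

```python
def rank_tests(coverage_by_test):
    """Order tests so that each next test adds the most not-yet-covered bins.

    coverage_by_test maps a test name to the set of coverage bins it hits.
    Returns a list of (test name, number of new bins) in run order.
    """
    remaining = {name: set(bins) for name, bins in coverage_by_test.items()}
    covered = set()
    ranked = []
    while remaining:
        # sorted() makes ties deterministic: the alphabetically first test wins.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        gain = remaining.pop(best) - covered
        ranked.append((best, len(gain)))
        covered |= gain
    return ranked


# Hypothetical regression with three tests and four coverage bins.
order = rank_tests({
    "t_reset": {"a", "b"},
    "t_burst": {"b", "c", "d"},
    "t_idle": {"a"},
})
```

In this example `t_burst` runs first because it covers three bins by itself, and `t_reset` ends up last with a gain of zero, which marks it as a candidate to skip in a shortened regression.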
This requires linking tests, the coverage obtained by them, and the design itself. “You’ve got to have a very good coverage database, but you’ve also got to be able to relate the coverage results you’re getting back to the tests that accounted for that coverage,” says Hardee. “We see a lot of value in being able to trace which tests gave you the big increase in coverage, and being able to greatly reduce regressions by only repeating those tests. For high-reliability markets, it’s not just that I need a much higher level of coverage. I also need traceability. I need to be able to trace things all the way back to the specification.”
Part of the problem is with the coverage model itself. “You could specify functional coverage all day long, and if the model stinks, your coverage probably stinks,” says Paul Graykowski, senior technical marketing manager for Arteris IP. “Experience helps, and understanding the limitations of functional coverage is key. You’ve got to make sure that the model accurately represents the requirements of your design. It also must have some way to meet the performance goals as well. Many companies just create a functional coverage model and try their darndest to get to 100%, but few companies keep reviewing that functional coverage model to make sure it’s good. At the same point, that model itself has to be validated.”
Test generation could be improved by utilizing ML in test creation. “We see the use of AI and ML evolving from a tool supplemental approach to actually changing the fundamental mechanism of tool algorithms,” says Dave Kelf, CEO for Breker Verification Systems. “For example, a useful application of ML has been in simulation to achieve improved coverage faster by ranking existing tests. However, AI planning algorithms built into test synthesis take this to a new level, where tests are generated to target potential corner cases more effectively. Going from a task that could be achieved by an engineer with some time to one that goes way beyond human capability is the real potential of this technology.”
Rethinking verification is likely to lead to much greater gains. “Perhaps we’re coming up to a point where we need to think about what happens next,” says Stelios Diamantidis, senior director of artificial intelligence solutions at Synopsys. “Randomization will always give us uniform distribution of ability to discover bugs. But if we were to intelligently bias our coverage-driven optimization process, and the way that we construct stimuli, and the way that we pursue coverage holes, that’s the test that could help us bias systems toward finding more bugs. It essentially becomes a big search problem that could be treated like another game — or for design space optimization systems, for pursuing better PPA.”
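One very small step toward biasing stimulus at coverage holes is to weight each coverage bin by how rarely it has been hit and to sample targets accordingly. The sketch below illustrates only that weighting idea, not the search-based approach Diamantidis outlines. The weighting formula, the floor value, and the bin names are assumptions.

```python
import random


def bin_weights(hit_counts, floor=0.05):
    """Weight each coverage bin inversely to how often it has been hit.

    The floor keeps well-covered bins from being starved completely.
    """
    return {name: max(1.0 / (1 + hits), floor) for name, hits in hit_counts.items()}


def pick_targets(hit_counts, k, rng):
    """Sample k bins to aim the next stimuli at, favoring coverage holes."""
    weights = bin_weights(hit_counts)
    names = sorted(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


# Hypothetical hit counts: one bin is saturated and one has never been hit.
counts = {"fifo_full": 99, "fifo_overflow": 0}
targets = pick_targets(counts, k=1000, rng=random.Random(1))
```

With these counts the never-hit bin carries a weight of 1.0 against 0.05 for the saturated one, so most of the sampled targets point at the hole.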
That kind of disruption can impact models at all stages of the flow. “You can run thousands of use cases through a model, but that doesn’t really solve your problem because you’re not linking it to your requirements database,” says Mirabilis’ Shankar. “We should be linking up, extracting the data from requirements database, and then when you run these different iterations, starting to look at the quality of results for each run. Is this run a better run than this one? So first you have your AI reference, which is from your requirements database. Then you have intelligence, because you’re now assigning to each the requirements. What is the priority of those requirements?”
All systems rely on adequate training, but this also presents difficulties. “It is very difficult to aggregate knowledge from multiple customers,” says Hardee. “Even after consideration for customers’ secrecy, you have to train on applicable designs and make sure that the designs you’re training the supervised-learning algorithms on are representative of the design that you’re trying to get better results on. If you use non-typical designs in the training, you’ll see either no improvement, or potentially even worse results. You have to be very careful about the exact data being used for training. Results are very dependent on that, even within a customer.”
Training data also changes over time. “You have to measure their accuracy, versus reality, over time. And then at some point figure out they need to continue learning,” says Geden. “When you do make a change, if the prediction is completely off, the model has to start to learn again, and refresh its knowledge. It definitely would be a process of incremental learning. In most cases when you make a change, we would pick the relevant tests, run those, and then verify that change. But if those tests are not hitting the change, then we need to update the model.”
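The check Geden describes, noticing when selected tests stop hitting the change, can be sketched as a rolling hit-rate monitor that flags when the model should be refreshed. The window size, the threshold, and the class name are illustrative assumptions rather than anything taken from a shipping tool.

```python
from collections import deque


class SelectionMonitor:
    """Track whether predicted tests actually exercised each design change."""

    def __init__(self, window=20, min_hit_rate=0.8):
        self.outcomes = deque(maxlen=window)   # keeps the most recent results only
        self.min_hit_rate = min_hit_rate

    def record(self, selected_tests_hit_change):
        """Store True if the selected tests reached the changed logic."""
        self.outcomes.append(bool(selected_tests_hit_change))

    def hit_rate(self):
        if not self.outcomes:
            return 1.0                         # no evidence yet, assume healthy
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Wait for a full window so that one early miss does not trigger a retrain.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.hit_rate() < self.min_hit_rate


monitor = SelectionMonitor(window=4, min_hit_rate=0.75)
for hit in (True, True, False, False):
    monitor.record(hit)
retrain = monitor.needs_retraining()   # two misses in the last four changes
```

Using a bounded window means old successes age out, so the monitor reacts to recent drift instead of averaging it away over the life of the project.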
This requires that each run is related back to its previous runs. “What we’re doing is verifying the new results against what we got with the previous scenario run,” says Shankar. “The test case remains the same, but we are comparing it against the previous results. We are trying to justify that the new value is really an effect of this one change and not an effect of something else, caused by an error somewhere else in the system.”
Even then, the model may have restrictions. “One of the challenges with machine learning is you’ve got to train it,” says Arteris’ Graykowski. “It’s got to learn based on past datasets. You could theoretically build a model that could learn from what you’ve done in the past, but the question is, how do you extrapolate that to the new things that need to be done?”
AI and ML are being used effectively to improve verification processes, and tool vendors have perhaps only scratched the surface so far. But how far can the industry advance by basing everything on the process as it exists today? Perhaps it is time for a rethink.