Intel and AWS announced the availability of a new EC2 instance type powered by the Habana Gaudi AI accelerator. These instances are optimized for training deep learning models in the cloud.
First announced at re:Invent 2020, the DL1 instances join the accelerated computing family of EC2 instances, such as the GPU-powered P4, P3, and P2, making AWS one of the first cloud providers to offer VMs based on Intel's Habana AI accelerators.
The Amazon EC2 DL1 instance type delivers up to eight Gaudi accelerators, each with 32 GB of high-bandwidth memory (HBM), along with 400 Gbps of networking throughput and 4 TB of local NVMe storage. These instances have enough horsepower to train complex deep neural networks on large datasets.
Intel acquired Habana Labs in 2019 and subsequently discontinued its Nervana processor line. Habana Labs offers two AI accelerators, Gaudi and Goya: Gaudi is optimized for training, while Goya targets inference. AWS is expected to launch an EC2 instance type based on Goya as well.
Intel claims that the DL1 instance type delivers better price/performance than NVIDIA GPU-based EC2 instances. When training the popular ResNet-50 computer vision model on DL1, customers can realize cost savings of 44% compared to the p4d instance powered by NVIDIA A100 GPUs. Both instance types feature eight AI accelerators optimized for training.
Similarly, training the popular conversational AI model BERT-Large yields cost savings of 10% versus the p4d instance and 54% versus the p3dn instance.
Both benchmarks used the standard TensorFlow distributions available on NVIDIA GPU Cloud (NGC) and Intel's Habana Vault.
According to Intel, although the 16 nm Gaudi with HBM2 memory does not pack as many transistors as the 7 nm NVIDIA A100 GPU with HBM2e, Gaudi's architecture, designed from the ground up for efficiency, achieves higher resource utilization and requires fewer system components than a GPU-based design.
The rise of ML is driving the adoption of AI accelerators. AWS customers have the choice of NVIDIA GPUs, Intel Habana accelerators, AWS Trainium (yet to be launched), and AWS Inferentia. Each AI accelerator depends on a native software optimization layer that converts and optimizes TensorFlow and PyTorch models. Intel Habana provides the SynapseAI SDK for compiling models for Gaudi and Goya; NVIDIA encourages the use of its CUDA and TensorRT libraries to get the best from its GPUs; and AWS has invested in the Neuron SDK to target its Trainium and Inferentia chips.
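To illustrate how this software layer surfaces to developers, here is a minimal sketch of how a TensorFlow workload can target Gaudi through Habana's SynapseAI stack. The `load_habana_module` call follows Habana's public documentation; the fallback path is an assumption for machines where the SDK is not installed (e.g. anywhere other than a DL1 instance).

```python
# Hedged sketch: detecting and enabling the Gaudi ("HPU") backend.
# habana_frameworks ships with Habana's SynapseAI SDK; it is not on PyPI
# for generic machines, so the fallback branch is assumed behavior.
def select_backend():
    try:
        # Available inside Habana's TensorFlow distribution (e.g. on DL1).
        from habana_frameworks.tensorflow import load_habana_module
        load_habana_module()  # registers the HPU device with TensorFlow
        return "HPU (Gaudi via SynapseAI)"
    except ImportError:
        # SynapseAI stack not present; TensorFlow falls back to CPU/GPU.
        return "CPU/GPU (default TensorFlow devices)"

print(select_backend())
```

Once the HPU device is registered, standard Keras training code generally needs no changes, which is the point of these vendor software layers: the model stays framework-level while the SDK handles accelerator-specific compilation.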
Intel is under pressure to get its AI strategy right. After faltering a few times, it is hoping to regain mindshare and market share with Habana.