Efficient MLOps through AI application containerization – Embedded

At the end of 2021, the artificial intelligence market was valued at an estimated $58.3 billion. That figure is projected to grow more than fivefold over the next five years, reaching $309.6 billion by 2026. Given the popularity of AI technology, companies are eager to build and deploy AI applications for their businesses. In today’s technology-driven world, AI has become an integral part of our lives. According to a McKinsey report, AI adoption continues its steady rise: 56% of all respondents report AI adoption in at least one business function, up from 50% in 2020. This increase is due in part to evolving strategies for building and deploying AI applications, and AI application containerization is one such strategy.

Machine learning operations (MLOps) is becoming increasingly stable. For those unfamiliar with it, MLOps is a collection of principles, practices, and technologies that help increase the efficiency of machine learning workflows. It is based on DevOps, and just as DevOps has streamlined the software development life cycle (SDLC) from development to deployment, MLOps does the same for machine learning applications. Containerization is one of the most intriguing emerging technologies for developing and delivering AI applications. A container is a standard unit of software packaging that encapsulates code and all of its dependencies in a single package, allowing programs to move from one computing environment to another quickly and reliably. Docker is at the forefront of application containerization.

What Are Containers?

Containers are logical boxes that contain everything an application requires to execute. The application code, runtime, system tools, system libraries, binaries, and the user-space parts of the operating system are all included in this software bundle; the kernel is not. Optionally, some dependencies might be included or excluded based on the availability of specific hardware. Containers run directly on the host machine’s kernel and share the host’s resources (CPU, disks, memory, and so on), which eliminates the extra load of a hypervisor. This is why containers are described as “lightweight”.

Why Are Containers So Popular?

First, they are lightweight because containers share the host operating system’s kernel, so an application does not need an entire operating system of its own to run. Virtual machines (VMs), such as those created with VirtualBox, require installation of a complete guest OS, which makes them quite bulky.

Containers are also portable: they can easily be moved from one machine to another with all the required dependencies inside them. They enable developers and operators to improve the CPU and memory utilization of physical machines.

Among container technologies, Docker is the most popular and widely used platform. Not only have Linux vendors such as Red Hat and Canonical embraced Docker, but companies like Microsoft, Amazon, and Oracle also rely on it. Today, almost all IT and cloud companies have adopted Docker, and it is widely used to ship solutions together with all their dependencies.


Virtual Machines vs Containers (Source: Softnautics)

Is There Any Difference between Docker and Containers?

Docker has become widely used as a synonym for containers because it is open source, has a huge community, and is a quite stable platform. But container technology isn’t new. It has been incorporated into Linux in the form of LXC for more than 10 years, and similar operating-system-level virtualization has also been offered by FreeBSD jails, AIX Workload Partitions, and Solaris Containers.

Docker makes the process easier by merging OS and package requirements into a single package, which is one of the differences between containers in general and Docker.

It is often puzzling why Docker is employed in data science and artificial intelligence when it is mostly associated with DevOps. ML and AI, like DevOps, have cross-OS dependencies. With Docker, the same code can run on Ubuntu, Windows, AWS, Azure, Google Cloud, ROS, a variety of edge devices, or anywhere else.

Container Application for AI/ML

Like any software development, AI applications face SDLC challenges when they are assembled and run by various developers in a team or in collaboration with multiple teams. Because AI applications are constantly iterative and experimental, there comes a point where dependencies crisscross and cause problems for other dependent libraries in the same project.


The need for Container Application for AI / ML (Source: Softnautics)

These issues are real, and as a result each step needs proper documentation if you are handing over a project that requires a specific method of execution. Imagine you have multiple Python virtual environments for different models in the same project. Without up-to-date documentation, you may wonder: What are these dependencies for? Why do I get conflicts when installing newer libraries or updated models?
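To make the conflict problem concrete, here is a minimal Python sketch that compares the pinned dependencies of several environments and reports packages whose versions disagree. The environment names and version pins are hypothetical and chosen only for illustration; the function name `find_pin_conflicts` is likewise an assumption, not part of any tool mentioned in this article.

```python
from collections import defaultdict


def find_pin_conflicts(environments):
    """Return packages pinned to different versions across environments.

    `environments` maps an environment name to a list of "package==version"
    strings, the pin format used in a pip requirements file.
    """
    versions = defaultdict(dict)  # package -> {environment: version}
    for env_name, pins in environments.items():
        for pin in pins:
            package, _, version = pin.partition("==")
            versions[package.strip().lower()][env_name] = version.strip()

    # Keep only the packages whose environments disagree on the version.
    return {
        package: by_env
        for package, by_env in versions.items()
        if len(set(by_env.values())) > 1
    }


# Hypothetical pins for two models that live in the same project.
environments = {
    "detector": ["numpy==1.19.5", "opencv-python==4.5.5.64"],
    "classifier": ["numpy==1.22.3", "opencv-python==4.5.5.64"],
}
conflicts = find_pin_conflicts(environments)
print(conflicts)
```

Here the two environments agree on `opencv-python` but not on `numpy`, so only `numpy` is reported. Giving each model its own container keeps such pins from colliding in the first place.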

Developers constantly face the “It works on my machine” dilemma and constantly try to resolve it.

Why it’s working on my machine (Source: Softnautics)

Using Docker, all of this can be made easier and faster. Containerization can save a lot of time otherwise spent updating documentation, and it makes the development and deployment of your program go more smoothly in the long term. By pulling multiple platform-agnostic images, we can even serve multiple AI models using Docker containers.
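As a rough illustration of one image per model, the Python sketch below renders the text of a Dockerfile for each model from a small specification. The model names, requirements file names, and entry-point scripts are hypothetical placeholders, and the helper `render_dockerfile` is an assumption made for this example; only the Dockerfile instructions themselves (`FROM`, `WORKDIR`, `COPY`, `RUN`, `CMD`) are standard.

```python
def render_dockerfile(base_image, requirements_file, entrypoint):
    """Build the text of a Dockerfile for one model-serving image."""
    lines = [
        f"FROM {base_image}",
        "WORKDIR /app",
        # Copy and install dependencies first so Docker can cache this layer.
        f"COPY {requirements_file} ./requirements.txt",
        "RUN pip install --no-cache-dir -r requirements.txt",
        # Then copy the application code itself.
        "COPY . .",
        f'CMD ["python", "{entrypoint}"]',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical models, each with its own base image and dependency file.
models = {
    "detector": ("python:3.10-slim", "requirements-detector.txt", "serve_detector.py"),
    "classifier": ("python:3.8-slim", "requirements-classifier.txt", "serve_classifier.py"),
}
dockerfiles = {name: render_dockerfile(*spec) for name, spec in models.items()}
print(dockerfiles["detector"])
```

Because each model gets its own base image and requirements file, the two models can depend on different Python versions and library versions without affecting each other.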

An application written entirely on Linux can be run on Windows using Docker, which can be installed on a Windows workstation, making code deployment across platforms much easier.


Deployment of code using docker container (Source: Softnautics)

Performance of AI models on Containers vs Virtual Machine

Many experiments have compared the performance of Docker with various virtual machines on the market for AI deployments. The table below gives a general idea of the differences between VMs and Docker containers that affect the deployment of an AI model.

| Aspect | Virtual machine | Containers |
|---|---|---|
| Operating system | Needs a guest OS | Shared |
| Booting speed | Slower than a traditional machine | Faster than a VM |
| Standardization | OS-specific standards | Application-specific |
| Portability | Not very portable | Faster and easily ported |
| Servers needed | Needs more | Few servers |
| Security | Hypervisor defines security | Security is shared |
| Redundancy level | VM owns resources | Shared OS, so less redundancy |
| Hardware abstraction | Hardware abstracted | Hardware access can be achieved |
| Resource sharing | More resources needed | Fewer resources are needed and shared |
| Resource isolation | High | Moderate |
| Memory | High memory footprint | Smaller memory footprint, shared |
| File sharing | File sharing is not possible | Files can be shared |

Table 1: Virtual machines vs containers (Source: Softnautics)

The broad takeaways from all of the comparison experiments are as follows:

  • Containers have lower overhead than VMs, and their performance was as good as that of the non-virtualized version.
  • In high performance computing (HPC), containers perform better than hypervisor-based virtualization.
  • Deep learning compute workloads are primarily offloaded to the GPU, which results in resource contention; this contention is substantial when many containers share the hardware but low in VMs because of their strong resource isolation.
  • Serving large AI models is typically done via REST API containers.
  • Serving multiple models is done mostly with containers, as they scale easily with fewer resources.
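To illustrate what a REST API container might run, here is a minimal sketch of a prediction endpoint built only on Python's standard library. The `predict` function is a placeholder standing in for a real model, and the `/predict` path, port 8000, and the JSON field name `features` are assumptions made for this example rather than a convention from any particular serving framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Placeholder model: returns the mean of the input features."""
    return sum(features) / len(features)


def handle_predict(body):
    """Turn a JSON request body into an (HTTP status, JSON response) pair."""
    try:
        features = json.loads(body)["features"]
        return 200, json.dumps({"prediction": predict(features)}).encode()
    except (ValueError, KeyError, TypeError, ZeroDivisionError):
        error = {"error": "expected a JSON object with a non-empty 'features' list"}
        return 400, json.dumps(error).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def main():
    # Bind to all interfaces so the port can be published from a container.
    HTTPServer(("0.0.0.0", 8000), PredictHandler).serve_forever()
```

Calling `main()` starts the server. Inside a container, this script would typically be the image's start command, with the container's port 8000 published to the host. Keeping the request logic in `handle_predict` separate from the HTTP handler makes it possible to test the model path without opening a socket.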

Now, let’s confirm the advantage of containers over VMs with the experimental results below, collected by Kennedy Chengeta in his recent research. Based on the Prosper Lending and Lending Club deep learning datasets for classification, the following tables compare booting time, network latency, data download, and network delay for four distinct virtualization technologies (KVM, Xen, Docker, and Docker + Kubernetes). KVM (Kernel-based Virtual Machine) is the baseline for the others in the table.

Table 2: Lending Club Dataset Performance (Lower is better) (Source: Softnautics)

Table 3: Prosper Dataset (Lower is better) (Source: Softnautics)

As you can see, Docker and Docker managed by Kubernetes perform better than the KVM and Xen hypervisors.

Do large-size AI models pose a challenge to container deployments?

Because developers use containers for both training and inference of their AI models, the most critical factor for both is memory footprint. As AI architectures get bigger, the models trained on them also become larger, ranging from 100 MB to 2 GB. Such models are bulky to carry in containers, which are meant to be lightweight. Developers use model compression techniques to make models interoperable and lightweight. Model quantization is the most popular compression technique: it reduces the size of a model by changing the representation of its weights from float32 to float16 or int8. Most of the pre-trained, ready-to-consume AI models provided by leading platforms are quantized models in containers.
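The effect of quantization on size can be sketched in a few lines of Python. The example below applies simple symmetric int8 quantization (one scale factor for the whole tensor) to a handful of hypothetical weights. This is only one of several possible schemes and is not necessarily what any given framework uses; the function names and weight values are assumptions made for illustration.

```python
from array import array


def quantize_int8(weights):
    """Symmetric quantization of float weights to int8 values plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = array("b", (round(w / scale) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Hypothetical float32 weights; a real model would hold millions of them.
weights = array("f", [0.82, -1.27, 0.05, 0.0, 0.635])
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

float32_bytes = len(weights) * weights.itemsize
int8_bytes = len(quantized) * quantized.itemsize
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(float32_bytes, int8_bytes, max_error)
```

Each float32 weight takes 4 bytes and each int8 weight takes 1 byte, so the stored weights shrink by a factor of four, at the cost of a rounding error of at most half the scale per weight. By the same arithmetic, a 2 GB float32 model would need roughly 1 GB in float16 and roughly 0.5 GB in int8, ignoring the small overhead of storing scale factors.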


In summary, here are the benefits of moving the entire AI application pipeline, from development to deployment, into containers:

  • Separate containers for each AI model for different versions of frameworks, operating systems, and edge devices/platforms.
  • A container for each AI model to customize deployments. For example, one container can be developer-friendly while another is user-friendly and requires no coding to use.
  • Individual containers for each AI model for different releases or environments in the AI project (development team, QA team, UAT (User Acceptance Testing), etc.)

Container applications accelerate the AI application development-to-deployment pipeline and help maintain and manage multiple models for multiple purposes.

Rakesh Nakod is an Associate Principal Engineer at Softnautics who is proficient in AI and has experience in developing and deploying AI solutions across computer vision, NLP, audio intelligence, and document mining. He also has vast experience in developing AI-based enterprise solutions and strives to solve real-world problems with AI. He is an avid food lover, is passionate about sharing knowledge, and enjoys gaming and playing cricket in his free time.
