AI Academy Capstone Projects - ML Ops & AutoML

Jason and Oliver share their AI Academy Capstone project experiences and outcomes!
Date posted
18 October 2021

Machine Learning Operations (ML Ops) is a hugely and increasingly important aspect of machine learning (ML) projects, especially those in the Cloud.

Well-implemented ML Ops solutions can greatly simplify and speed up the process of training, deploying, and monitoring models in production. They additionally result in repeatable, reproducible workflows even for the most complex ML projects. At Kainos, our ML Ops delivery is a key element of ensuring customers benefit from our Data and AI offerings.

Automated Machine Learning (AutoML) is a newer field. It is slowly becoming important to some categories of ML projects, but it is likely to be less all-encompassing than ML Ops in the ML projects of the future.

Particularly at the beginning of a project, AutoML can reduce the time engineers spend on extensive model experimentation. This frees them up to focus on tasks in the workflow which add real value, such as deriving insights for data engineering or building a well-optimised platform architecture.

Today, we (Jason and Oliver) want to share a bit about our research into AutoML and ML Ops respectively, and how it has helped Kainos retain an edge in the field. Each of us has conducted research in the field of ML since joining Kainos in July and produced useful resources Kainos can use to assist our clients.

We also want to explore the importance of ML Ops and AutoML going forward, as they become an increasingly common part of a data science workflow.

Capstone Project Context

Let’s start by providing some context on our projects.

Graduates and placement students joining the Kainos Data & AI practice took part in a 7-week AI Academy. The culminating event of the programme was to plan, develop, and present an individual Capstone project. The project was intended to develop our programming, data science, and client engagement capabilities.

As well as upskilling ourselves, we also aimed to produce resources (code and documentation) which could be reused in future within the Kainos Data & AI practice.

We each chose a Capstone project integrating ML Ops or AutoML in some capacity, to expand our own skills and add value to Kainos’ Data & AI offerings.

Now we will each give a summary of our individual projects.

ML Ops with E-sports summary - Oliver

My project was to develop a set of ML Ops tools which could be used to rapidly spin up AWS-based Cloud infrastructures for ML projects, encompassing data processing, model training, deployment, secure web access, and live monitoring. The goal was to make the tools as reusable and general as possible, giving Kainos another edge in the ML Ops field.

Another key goal was modularity, allowing for the tools to be applicable to a wider range of projects through picking and choosing only required elements of functionality, without being forced to use them all. The tools were implemented in code via the AWS Python SDK.

To achieve these goals, I developed several iterations of Cloud infrastructures and tool functionality, each more feature-rich and flexible than the last. In doing so, I gained valuable experience with AWS as well as a greater general understanding of Cloud deployments.

The final product allowed for a project to point the tools at a dataset, change a few configuration values, and execute a pipeline including data processing, model training, evaluation, secure API deployment, and live monitoring.
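As a rough illustration of the modular, configuration-driven design described above, the sketch below runs only the stages named in a configuration dictionary. The stage names, the `CONFIG` keys, and the `run_pipeline` helper are all hypothetical and are not taken from the project code. The real tools called AWS services through the Python SDK, which is stubbed out here so the example runs anywhere.

```python
from typing import Callable, Dict, List

# Hypothetical stage functions. In the real tools each stage would call AWS
# through the Python SDK; here each one only records that it ran.
def process_data(ctx: dict) -> None:
    ctx["log"].append(f"processed {ctx['dataset']}")

def train_model(ctx: dict) -> None:
    ctx["log"].append("trained model")

def evaluate_model(ctx: dict) -> None:
    ctx["log"].append("evaluated model")

def deploy_api(ctx: dict) -> None:
    ctx["log"].append("deployed API")

def enable_monitoring(ctx: dict) -> None:
    ctx["log"].append("enabled monitoring")

# Registry of available stages, in execution order
# (dicts preserve insertion order).
STAGES: Dict[str, Callable[[dict], None]] = {
    "process": process_data,
    "train": train_model,
    "evaluate": evaluate_model,
    "deploy": deploy_api,
    "monitor": enable_monitoring,
}

def run_pipeline(config: dict) -> List[str]:
    """Run only the stages enabled in the config, in registry order."""
    unknown = set(config["stages"]) - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    ctx = {"dataset": config["dataset"], "log": []}
    for name, stage in STAGES.items():
        if name in config["stages"]:
            stage(ctx)
    return ctx["log"]

# A project picks the stages it needs and points at its own dataset.
CONFIG = {
    "dataset": "s3://example-bucket/matches.csv",  # placeholder URI
    "stages": ["process", "train", "deploy"],
}

if __name__ == "__main__":
    for line in run_pipeline(CONFIG):
        print(line)
```

Keeping the stages in a registry means a project can skip evaluation or monitoring simply by leaving them out of the configuration, without touching the pipeline code.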

Automated Machine Learning summary - Jason

My project was an investigation of AutoML. The main goal of the project was to expand Kainos’ expertise in AutoML. I explored what AutoML is, what benefits it provides to data scientists, and what tools are available. I then focused on one specific tool – Amazon SageMaker Autopilot.

AutoML is a general term which encompasses a wide range of technologies and paradigms. Some tools automate a single component in the ML workflow, such as hyperparameter tuning. Others automate the entire process, from data processing to model deployment. Autopilot is one of the latter.
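To make the "single component" case concrete, here is a minimal sketch of automated hyperparameter tuning using an exhaustive grid search. It is not how Autopilot or any particular tool works internally. The `grid_search` helper and the `toy_score` function are hypothetical, and `toy_score` stands in for the expensive step of training a model and measuring its validation score.

```python
import itertools
from typing import Callable, Dict, List, Tuple

def grid_search(
    score: Callable[..., float],
    grid: Dict[str, List],
) -> Tuple[dict, float]:
    """Try every combination in the grid and return the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score

# Toy stand-in for "train a model and return its validation score".
# Higher is better; the peak is at learning_rate=0.1, depth=4.
def toy_score(learning_rate: float, depth: int) -> float:
    return -((learning_rate - 0.1) ** 2) - (depth - 4) ** 2

if __name__ == "__main__":
    best, value = grid_search(
        toy_score,
        {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]},
    )
    print(best, value)
```

Tools that automate the entire process wrap a search like this together with data processing, model selection, and deployment.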

Autopilot makes the process of producing a high-quality model from a dataset amazingly easy. You simply upload your dataset of choice and Autopilot will spit out a model. No coding required. The simplicity of tools such as Autopilot raises the abstraction level in ML, reducing the time, skills, and grunt work required to produce effective models.

To demonstrate how Kainos could use Autopilot in future, I developed a program which automates the process of running an Autopilot job using the Python SDK. The tool allows a user to execute an Autopilot job with a single command. See the script help below.
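The original script is not reproduced here. As a rough sketch of what a single-command wrapper of this kind might look like, the example below assumes the AWS SDK for Python (boto3) and its SageMaker `create_auto_ml_job` call. The command-line flags and the `build_parser`, `build_request`, and `main` helpers are hypothetical, and running a real job needs AWS credentials, an IAM role, and S3 locations of your own.

```python
import argparse
import sys

def build_parser() -> argparse.ArgumentParser:
    """Command-line interface; all flag names are illustrative."""
    parser = argparse.ArgumentParser(
        description="Start a SageMaker Autopilot job (illustrative sketch)."
    )
    parser.add_argument("--job-name", required=True)
    parser.add_argument("--input-s3-uri", required=True,
                        help="S3 URI of the training data")
    parser.add_argument("--target-column", required=True,
                        help="name of the column to predict")
    parser.add_argument("--output-s3-uri", required=True,
                        help="S3 URI for job artefacts")
    parser.add_argument("--role-arn", required=True,
                        help="IAM role SageMaker should assume")
    parser.add_argument("--max-candidates", type=int, default=10)
    return parser

def build_request(args: argparse.Namespace) -> dict:
    """Translate parsed arguments into create_auto_ml_job keyword arguments."""
    return {
        "AutoMLJobName": args.job_name,
        "InputDataConfig": [{
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": args.input_s3_uri,
                }
            },
            "TargetAttributeName": args.target_column,
        }],
        "OutputDataConfig": {"S3OutputPath": args.output_s3_uri},
        "AutoMLJobConfig": {
            "CompletionCriteria": {"MaxCandidates": args.max_candidates}
        },
        "RoleArn": args.role_arn,
    }

def main(argv=None) -> int:
    argv = sys.argv[1:] if argv is None else argv
    parser = build_parser()
    if not argv:
        # With no arguments, show the help text instead of failing.
        parser.print_help()
        return 0
    args = parser.parse_args(argv)
    # Imported lazily so the help text works without boto3 installed.
    import boto3
    client = boto3.client("sagemaker")
    client.create_auto_ml_job(**build_request(args))
    print(f"Started Autopilot job {args.job_name}")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Separating `build_request` from the AWS call keeps the request easy to inspect and test without touching a real account.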

 

Reflection and Comparison

Both projects share a core similarity, aiming to create reusable code which developers could use to rapidly kickstart Cloud projects. The projects are each able to provide significant value in this area but differ in exactly where this value is delivered.

While the ML Ops with E-sports project focused on rapid deployment of a specific model to a wider infrastructure, the AutoML project focused on testing a range of models before deploying the best to an Endpoint. In this regard the projects are tailored to different use cases.

The AutoML project is well suited to a proof-of-concept deployment in a scenario without pre-existing or domain-specific ideas for the model development process. It can also provide a great place to start for further investigation, such as by testing a variety of models and then focusing more development on the best performing few.

The ML Ops project is better suited to a deployment where ideas for one or more stages of the model development process have already been formed. For instance, a project where domain knowledge or context from prior work is key in building a model.

The projects may in fact synergise in some cases. Starting from a point with little prior insight into the domain, the AutoML project could be used to generate a great starting point model for further investigation. The ML Ops project could then be used during further development, applying domain expertise to improve the initial model and managing deployment and monitoring in the Cloud.

Takeaways

What did we learn?


Our capstone projects allowed us to gain valuable experience working on ML projects using AWS. We were each able to increase our exposure to real-world ML use cases and how innovative tools can streamline the development process, saving time and effort. The knowledge and skills gained from the AI Academy have helped us to prepare for our transition from training to client projects.

What value do our projects provide to Kainos?

Used separately or in combination, these two tools can simplify the process of model generation and deployment, saving data scientists’ and AI engineers’ valuable time. We envision these tools being used to spin up proofs of concept, or systems needed on a quick turnaround for a demo or similar. However, these tools are not silver bullets. Effort would still be required to customise the resulting ML solution to specific use cases.

Acknowledgements

A huge thanks is in order for all those involved in running the AI Academy, including the lead Robert Chin and our mentors Liam Ferris, Jay Mistry, and Bernard Adabankah. Their help and guidance throughout our first couple of months at Kainos were fundamental to our success, and these projects wouldn’t have been possible without them.

Oliver Stanley
Graduate Artificial Intelligence Engineer · Kainos
Oliver is a Graduate Artificial Intelligence Engineer based in Birmingham. His background was originally in Economics, but he was drawn to the world of data science and is now working on internal projects within the Data and AI Practice in Kainos.
Jason Wright
Graduate Artificial Intelligence Engineer · Kainos
Jason joined Kainos as a Graduate Artificial Intelligence Engineer, based in Belfast. Prior to joining Kainos he completed a Master's in Computer Science at Queen's University Belfast, specialising in AI in his final year. Jason works to help create innovative AI & Machine Learning solutions.