Kainos and AI: One Simple Tool to Improve the Efficiency of your Machine Learning Systems


Date posted

18 April 2019

Reading time

12 Minutes

Conor McCormick


Have you ever been in a position where you didn't have enough data? Would that chatbot, recommendation system or fraud detector become possible if more data were available? If so, keep reading: this blog is for you!

Background

The Text Augmenter (henceforth TA) is a program that augments text records (generates new records from existing ones) to supply Machine Learning projects with additional training data. I used research done by Jake Young as a jumping-off point for TA. I wanted to use the methods he had come up with, but in a more user-friendly form that the team could use.

Now you know why I made this, so what does this actually look like?

[Figure: This is the original data value I provided.]

Due to the nature of Synonym Augmentation, the number of results can vary wildly depending on the length of the original values. Longer sentences contain more words to augment, and so return more data.

Validation: To ensure that the results of the Synonym Augmentation are usable for any given project, the user can specify a 'blacklist': a user-defined list of words that the Augmentation function will ignore.

For example, the word 'one' gives 5 different data records. Adding it to the blacklist means that none of these show up in the augmented dataset. This is useful because words that shouldn't be augmented, such as those in the name of a law, are left untouched.
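To make the idea concrete, here is a minimal sketch of synonym augmentation with a blacklist. It is not TA's actual code: the `SYNONYMS` table, the `synonym_augment` function and its parameters are hypothetical names chosen for illustration, and the tiny hand-written table stands in for whatever thesaurus a real tool would use.

```python
# Hypothetical synonym table; a real tool would draw on a proper thesaurus.
SYNONYMS = {
    "one": ["single", "1"],
    "quick": ["fast", "rapid"],
    "law": ["statute"],
}


def synonym_augment(sentence, synonyms=SYNONYMS, blacklist=()):
    """Return new sentences, each with exactly one word swapped for a synonym.

    Words listed in `blacklist` are never replaced.
    """
    blocked = {word.lower() for word in blacklist}
    words = sentence.split()
    augmented = []
    for i, word in enumerate(words):
        key = word.lower()
        if key in blocked:
            continue  # blacklisted words are left exactly as written
        for replacement in synonyms.get(key, []):
            # Rebuild the sentence with only position i changed.
            augmented.append(" ".join(words[:i] + [replacement] + words[i + 1:]))
    return augmented


records = synonym_augment("one quick law")
filtered = synonym_augment("one quick law", blacklist=["law"])
print(len(records), len(filtered))  # 5 4
```

Every augmentable word adds its own set of records, which is why longer inputs tend to return more data. Blacklisting a word simply removes its contribution.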

Translation Augmentation

This takes the data records and translates them into a different language, then back into English, using …
