Datactics has won the Most Innovative Data Quality Initiative award at the A-Team Group’s Innovation Awards 2022, which celebrate projects and teams that make use of new or emerging technologies to deliver high-value solutions for financial services.
Datactics won the A-Team Innovation award for the creation of Rapid Match, a system which allowed financial analysts to understand the geographical allocation of financial loans made by the UK government during the 2020/21 Covid crisis, augmented with a breakdown by industry type.
Datactics was selected and funded by Innovate UK to investigate the use of its self-service platform to solve a challenge for information analysts: how does an organisation reduce the time spent wrangling data to make it fit for good-quality analytics? The firm has reported that it is regularly asked this question by clients who spend too much time on manual preparation of data and see little return on investment (ROI) for their efforts.
The Datactics solution focused on the challenges associated with data preparation when an organisation seeks to create rapid data analytics but is faced with disparate data sources (often stored in different formats), siloed information and poor underlying data quality.
Summary of business benefits of the Datactics solution:
- Addresses the challenges of data quality and matching at scale that arise when joining large amounts of messy, incomplete data in varying formats from multiple sources.
- Provides a reliable ‘match engine’ allowing government and organisations to accurately and securely integrate diverse sources of data.
- Offers an automated, reproducible platform to ingest, cleanse, match and update datasets for downstream analysis.
- Provides systematic, reproducible pipelines to address time-consuming data quality, preparation and matching tasks to produce complete, high-quality and timely data for decision making.
Datactics’ Head of Software Development, Dr Fiona Browne, who led the project, said:
“We were very aware that Companies House information is used in almost every KYC/AML system in the UK and we wanted to develop automated techniques for improving the quality of this information and to make it easy to ingest and to query. I’m glad to say that we achieved this ambition through building systematic, reproducible data quality pipelines that address time-consuming tasks such as data wrangling and matching. We are augmenting our early work by looking at applications of machine learning and network analysis in this space for downstream tasks such as fraud detection and onboarding, which are underpinned by data quality.”
In detail: How it works
The system extracted information from multiple financial and geographical data sources and used automated data pipelines for preparing, scrubbing and matching multiple data types before presenting the data in an easy-to-query dashboard. These pipelines are systematic and reproducible, enabling the process to be re-run when datasets are updated or new datasets are added.
The process is transparent for auditability. Furthermore, the process provides a reliable ‘match engine’ allowing organisations to accurately and securely integrate diverse sources of data. A particular focus of the project was on using UK Companies House as a master source relating to company name/address validation and understanding the relationship between companies and their owners. A view of the resulting analytical dashboard is presented in Figure 1 below.
Figure 1: Analytics Dashboard Screen for Rapid Match illustrating the breakdown of loans across industries in England
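To illustrate the kind of cleanse-and-match step described above, here is a minimal sketch in Python. It is not the Datactics implementation: the function names, the suffix map, the 0.85 threshold and the sample company names are all hypothetical, and the standard-library `difflib` similarity score stands in for whatever matching technique Rapid Match actually uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix map for illustration; a real pipeline would need a fuller list.
SUFFIXES = {"limited": "ltd", "company": "co", "incorporated": "inc"}


def normalise(name: str) -> str:
    """Lowercase, strip punctuation and standardise common company suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [SUFFIXES.get(token, token) for token in cleaned.split()]
    return " ".join(tokens)


def best_match(name, master, threshold=0.85):
    """Return (master_name, score) for the closest master record, or None.

    The threshold is an assumed cut-off, not a recommended value.
    """
    target = normalise(name)
    scored = [
        (candidate, SequenceMatcher(None, target, normalise(candidate)).ratio())
        for candidate in master
    ]
    if not scored:
        return None
    candidate, score = max(scored, key=lambda pair: pair[1])
    return (candidate, score) if score >= threshold else None


def match_records(records, master, threshold=0.85):
    """Re-runnable step: match every input name against the master list."""
    return {record: best_match(record, master, threshold) for record in records}


if __name__ == "__main__":
    # Made-up names standing in for a master register and incoming loan records.
    master_register = ["Acme Widgets Ltd", "Northern Bakery Limited"]
    loan_records = ["ACME Widgets Limited.", "Nothern Bakery Ltd", "Zebra Holdings PLC"]
    for record, match in match_records(loan_records, master_register).items():
        print(record, "->", match)
```

Because the step is a pure function of its inputs, it can be re-run unchanged whenever the master list or the incoming records are updated, which is the reproducibility property the pipelines are described as having.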
Stuart Harvey, CEO of Datactics, commented:
“Ultimately we wanted to prove that it was possible to create an easy-to-use solution in which a data team can address time-consuming data quality, preparation and matching tasks in order to create complete, high-quality and timely data for decision making.
Rapid Match took advantage of existing strengths in the Datactics platform and added benefits from machine learning. We addressed data quality and matching at scale – joining large sets of messy data in varying formats from multiple sources. We implemented a reliable ‘match engine’ which allowed for fuzzy matching from these diverse sources. We made use of DataOps automation to ingest, clean and match data with minimum human input.
A key part of the challenge in building Rapid Match was seamless integration with Companies House information as a trusted data source. Companies House data is used in almost every KYC and AML process involving UK entities. It contains over 4 million companies and updates information on 500,000 of these per year. Since it’s a fairly raw source of open data, many end users have significant challenges with data quality. The Datactics solution provided easy access to clean Companies House data via a REST API, and we used sophisticated network analysis to understand the relationship between companies and their owners.
We are grateful to Innovate UK for their support of this work and we’d welcome further opportunities to collaborate with clients or partners if our recent work in this area is of interest to you.”
Angela Wilbraham, CEO of the A-Team Group, who hosted the A-Team Innovation Awards 2022, commented:
“Our A-Team Innovation Awards 2022 celebrate and reward those companies at the forefront of innovation within our industry. We congratulate Datactics on winning the Most Innovative Data Quality Initiative award in recognition of their excellence in driving forward progress in capital markets capabilities.”
To read more about the A-Team’s Innovation Awards and view the full report of worthy winners, please take a look here.