Master data cleansing brings data into a single standard, using templates and attributes to establish consistency and uniformity. It covers enriching and correcting data so that each item is defined unambiguously, deduplicating the catalogue (removing identical items) to improve stock and procurement performance, and aligning MDM processes and governance to safeguard data quality in the future.

Why choose automated master data cleansing?

Synopps cleans corporate master data more effectively than traditional manual data cleansing techniques. Drawing on years of experience, we have developed proprietary automation tools that support the clean-up process:
  • Items Standardization
    Using data recognition and data decomposition technologies, together with master data templates.
  • Catalogues Merging
    By mapping and reconciling two or more catalogues to deliver a master catalogue that is a single version of the truth.
  • Duplicates Treatment
    Via automatic comparison of data blocks, using synonyms, format, type and many other factors to resolve item replication.
  • Processes Improvement
    By implementing the right policies to manage master data based on best practices and the specifics of your organization.
  • Top Data Quality
    By applying automated cleansing tools and manually validating with our expert team in accordance with defined rules and procedures.
  • Increase in Profitability
    From accurate accounting of warehouse balances, stopping excess purchases, and consuming previously unaccounted stock.

Master data cleansing process

  1. Assess data quality and the effort required for cleansing
  2. Develop templates and attach each record to an appropriate one
  3. Standardize data and distribute it across templates and attributes
  4. Provide a list of items that should not be purchased
  5. Implement MDM processes to ensure sustainable data quality

Data cleansing: step-by-step


Audit your master data and plan resources

We will perform a fast analysis of the master data in order to define the project scope by excluding inactive or irrelevant items, identify potential duplicates, establish cleansing categories based on the identified classes, and determine the resources and timeline required for the project. This activity typically takes no longer than one week.

Develop templates and attach each record to an appropriate one

We will apply the following five-step methodology to define templates:

  1. Identify a noun for each record (the defining word of an item, usually the first word in the description)
  2. Attach a template to each noun
  3. Define attributes for each template (based on best practice, internal expert databases and the current catalogue)
  4. Validate the mandatory attributes that must be filled in for an item to be identified exactly
  5. Confirm the final templates as a result of cleansing and validation cycles
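
Steps 1 and 2 can be illustrated with a minimal Python sketch. It is not the Synopps implementation: the `TEMPLATES` table, the attribute names and both function names are hypothetical, and the noun is simply assumed to be the first word of the description.

```python
from collections import defaultdict

# Hypothetical templates: noun -> ordered attribute names (illustrative only).
TEMPLATES = {
    "BEARING": ["type", "inner_diameter", "outer_diameter", "part_number"],
    "VALVE": ["type", "size", "pressure_rating", "material"],
}

def identify_noun(description: str) -> str:
    """Return the defining word of an item, assumed here to be the first word."""
    words = description.replace(",", " ").split()
    return words[0].upper() if words else ""

def attach_templates(descriptions):
    """Group records by noun; records whose noun has no template go to review."""
    attached = defaultdict(list)
    unmatched = []
    for text in descriptions:
        noun = identify_noun(text)
        if noun in TEMPLATES:
            attached[noun].append(text)
        else:
            unmatched.append(text)
    return dict(attached), unmatched
```

Records that end up in `unmatched` would be candidates for a new template or for manual review in the validation cycles.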


Automatic correction of errors in master data records

We will scan the dataset with our Synopps platform and correct more than 10 000 errors in a typical catalogue of 60 000 records. Corrections include replacing non-typable characters, removing excess symbols, fixing syntax and spelling errors, replacing letters incorrectly used as numbers, and many more. This activity ensures better decomposition in the following step.
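
As a rough illustration of these kinds of corrections, the Python sketch below handles three of them: non-typable characters, repeated symbols, and letters used as digits. The look-alike mapping and the function names are assumptions made for this example and do not describe the rules of the Synopps platform.

```python
import re
import unicodedata

# Letters commonly typed in place of digits (assumed mapping for illustration).
LETTER_TO_DIGIT = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})

def fix_letters_in_numbers(token: str) -> str:
    """Swap look-alike letters only when the token becomes a plain number."""
    candidate = token.translate(LETTER_TO_DIGIT)
    becomes_number = re.fullmatch(r"\d+([.,]\d+)?", candidate)
    # Require at least one real digit so ordinary words are left alone.
    if candidate != token and becomes_number and re.search(r"\d", token):
        return candidate
    return token

def clean_description(text: str) -> str:
    # Normalize compatibility characters (non-breaking spaces, full-width forms).
    text = unicodedata.normalize("NFKC", text)
    # Drop control and other non-printable characters.
    text = "".join(ch for ch in text if ch.isprintable())
    # Collapse repeated punctuation such as ",," or "--" into one symbol.
    text = re.sub(r"([,;.\-/])\1+", r"\1", text)
    # Fix look-alike letters token by token; joining also collapses whitespace.
    return " ".join(fix_letters_in_numbers(t) for t in text.split())
```

For example, `clean_description("BOLT  M10,,  LENGTH 5O\u00a0MM")` yields `"BOLT M10, LENGTH 50 MM"`.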

Recognition of data blocks and records normalization

In a typical catalogue, an adaptive system of algorithms recognizes more than 220 000 significant blocks, such as item classes, characteristics and their values, part numbers and their variations, and values identified via keywords. To normalize and clean these recognized blocks, we apply the following steps:

  1. Confirm and validate data blocks
  2. Standardize attribute values per templates & cleansing document
  3. Redistribute the order of blocks according to the template
  4. Classify each record per agreed classifier
  5. Categorize each item by normalization type and, optionally, enrich its mandatory attributes
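
Steps 2 and 3 above can be sketched in Python as follows. The template order, the set of measured attributes and the unit aliases are hypothetical values chosen for the example. A real cleansing document would define its own rules.

```python
import re

# Hypothetical template: attribute order for one item class (illustrative only).
TEMPLATE_ORDER = {
    "BEARING": ["type", "inner_diameter", "outer_diameter", "part_number"],
}
# Attributes assumed to hold a number followed by a unit.
MEASURED = {"inner_diameter", "outer_diameter"}
# Hypothetical unit spellings mapped to one standard form.
UNIT_ALIASES = {"mm": "mm", "mm.": "mm", "millimeter": "mm", "millimetre": "mm"}

def standardize_value(attribute: str, value: str) -> str:
    """Render measured values as '<number> <unit>'; upper-case everything else."""
    value = value.strip()
    if attribute in MEASURED:
        match = re.fullmatch(r"(\d+(?:[.,]\d+)?)\s*([A-Za-z.]+)", value)
        if match:
            number = match.group(1).replace(",", ".")
            unit = match.group(2).lower()
            return f"{number} {UNIT_ALIASES.get(unit, unit)}"
    return value.upper()

def normalize_record(noun: str, blocks: dict) -> str:
    """Standardize each recognized block and emit them in template order."""
    order = TEMPLATE_ORDER[noun]
    parts = [noun] + [standardize_value(a, blocks[a]) for a in order if a in blocks]
    return ", ".join(parts)
```

Because the output order comes from the template rather than from the source text, two records that list the same blocks in a different order normalize to the same description.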

Duplicates search based on recognized blocks

Synopps algorithms identify over 7 000 potential duplicates in a typical catalogue of 60 000 records. They compare classes taking synonyms into account, compare part numbers and their variations regardless of formatting, and compare the values of characteristics rather than their textual representation. The results are then validated with our expert team and end users.
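
The three comparisons can be illustrated with a simplified Python sketch that builds a comparison key per record. The synonym table, the unit factors and the record fields (`class`, `part_number`, `size`) are assumptions made for this example. The actual Synopps algorithms are proprietary and are not reproduced here.

```python
import re
from collections import defaultdict

# Hypothetical synonym table: class name -> canonical class.
CLASS_SYNONYMS = {"SCREW": "BOLT", "BOLT": "BOLT"}
# Conversion factors to millimetres for the units handled in this sketch.
UNIT_TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0}

def canonical_class(name: str) -> str:
    key = name.strip().upper()
    return CLASS_SYNONYMS.get(key, key)

def normalize_part_number(part_number: str) -> str:
    """Ignore formatting: keep only letters and digits, upper-cased."""
    return re.sub(r"[^A-Z0-9]", "", part_number.upper())

def parse_length(text: str):
    """Compare values, not text: '2cm' and '20 mm' both become 20.0."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(mm|cm|m)\s*", text.lower())
    if not match:
        return text.strip().lower()
    return round(float(match.group(1)) * UNIT_TO_MM[match.group(2)], 6)

def find_potential_duplicates(records):
    """Return lists of record ids that share the same comparison key."""
    groups = defaultdict(list)
    for rec in records:
        key = (
            canonical_class(rec["class"]),
            normalize_part_number(rec["part_number"]),
            parse_length(rec["size"]),
        )
        groups[key].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Exact key matching is the simplest possible choice. Groups found this way are only candidates and would still go through the expert and end-user validation described above.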

Redesign of processes to ensure the quality of master data

We redesign Master Data Management processes, such as creating a new record or editing an existing one, based on global best practices and the specifics of your business. The deliverables include a gap analysis and to-be process design, implementation of a suitable IT solution or Excel-based tool, team training and documentation, and hand-over and support to ensure sustainable data quality.


Calculation of budget savings and blocking purchase requests

We forecast warehouse balances and provide a list of positions that should not be purchased, so that the project pays for itself several times over. We apply a decision matrix based on stock position, consumption and purchase requests to identify invisible stock that should be consumed and excess requests that should be blocked, delivering immediate cash savings.
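
A decision matrix of this kind might look like the Python sketch below. The field names, the decision labels and the thresholds are all hypothetical. The sketch only shows how stock position, consumption and purchase requests can be combined into a decision.

```python
from dataclasses import dataclass

@dataclass
class ItemPosition:
    sku: str
    stock_on_hand: float         # consolidated stock, including merged duplicates
    forecast_consumption: float  # expected usage over the planning horizon
    requested_qty: float         # quantity in open purchase requests

def decide(item: ItemPosition):
    """Return (decision, quantity of the purchase request to block)."""
    surplus = item.stock_on_hand - item.forecast_consumption
    if item.requested_qty <= 0:
        # Nothing to block; available stock should simply be consumed.
        return ("CONSUME_STOCK" if item.stock_on_hand > 0 else "NO_ACTION", 0.0)
    if surplus >= 0:
        # Existing stock covers the forecast, so the whole request is excess.
        return ("BLOCK_REQUEST", item.requested_qty)
    shortfall = -surplus
    if item.requested_qty > shortfall:
        # Only part of the request is needed to cover the shortfall.
        return ("REDUCE_REQUEST", item.requested_qty - shortfall)
    return ("APPROVE_REQUEST", 0.0)
```

For instance, `decide(ItemPosition("A", 100, 60, 50))` returns `("BLOCK_REQUEST", 50)`, because 100 units in stock already cover a forecast consumption of 60.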

Selected project in master data cleansing



An international mining company with several assets across various countries had a centralized catalogue with more than 75,000 SKUs. The lack of necessary information for existing items and misaligned processes for new item creation had resulted in several duplicated SKUs, inflated warehouse inventories and excessive purchases.


  1. Catalogue data was processed using automatic and manual methodologies
  2. A decision support matrix for handling duplicates was developed and applied, using inventories, requests and open orders as inputs
  3. Missing data in SKU data sets was added to the catalogue from several sources, such as manufacturers' catalogues and online resources, using automated tools


  • 60,000 SKUs were normalized according to the approved data standards and enriched with additional data
  • 4,000 items were identified as duplicates, half of which were removed as redundant SKUs. The process of defining reference items was automated, reducing it from one week of manual effort to three minutes
  • The items catalogue was reduced by 19% in size

Our clients around the world

  • Weatherford
  • Trident Energy
  • Nordgold
  • Nornickel
  • Highland Gold
  • Tiger Realm Coal
  • Statoil
  • Vale
  • Shell

Explore our App+ MDM
Smart IT solution to control and ensure high-quality catalogue with built-in optimization features
