New Product Feature: Intelligent Auto-Suggestions in Group Search-and-Replace

At Mammoth Analytics, we are always looking to provide a best-of-breed application for data preparation and cleansing, so that you can get to data analysis sooner and find insights faster. Today, we are excited to introduce Intelligent Auto-Suggestions in Group Search-and-Replace. This new functionality eliminates the repetitive, tedious tasks that are common in data cleansing.

Why is Data Cleansing Tedious?

As you may know, analysing raw data without cleaning it first is a recipe for disaster, and the rise of big data has only made this worse: more data means more problems. We at Mammoth Analytics understand that cleaning and standardising data is a crucial step before data analysis can take place. There are multiple facets to cleaning up raw data, such as eliminating invalid data, finding missing data, addressing data non-uniformity, and correcting typographical errors.

One of the most common anomalies in raw data is the typographical error, i.e. a spelling error or inconsistent letter case. These errors are common with manual data entry and with data imported from multiple sources, where customer names, product names, etc. may differ between systems. A seemingly insignificant typo can cause short- and long-term problems, leading to inaccurate records and analysis. We therefore recommend resolving these differences in letter case and spelling before the data is analysed for insights.

Intelligent Search-and-Replace Simplifies Data Prep

To address typographical errors, Mammoth has for some time offered the ability to drag and drop entries into buckets, grouping them manually so that each group can be replaced with a single word. In the past, however, the user had to repeat this process for each variation of the word (e.g. ABC vs. abc vs. a.b.c.), which can be tedious, time-consuming, and error-prone.

With Intelligent Auto-Suggestions for group search-and-replace, Mammoth removes the need to repeat the process for each word variation. Mammoth accomplishes this by analysing all the words in a particular column and then clustering them into buckets based on the computed “distances” between them. Of course, Mammoth still allows the user to modify the word buckets manually to ensure 100% accuracy. However, the intelligent auto-suggestions eliminate 80-90% of the manual work that is typically associated with this type of data cleansing.
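To make the idea of distance-based bucketing concrete, here is a minimal Python sketch. It is not Mammoth's implementation: this post does not say which distance measure or clustering method the product uses. The sketch assumes a simple string-similarity ratio from Python's standard difflib module, a greedy one-pass grouping, and an arbitrary threshold of 0.25. The function names (normalise, distance, suggest_buckets) are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def normalise(value: str) -> str:
    """Lower-case the value and keep only letters and digits ('A.B.C.' -> 'abc')."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def distance(a: str, b: str) -> float:
    """Return a distance in [0, 1]; 0 means identical after normalisation.

    Assumption: difflib's similarity ratio stands in for whatever
    measure the real product computes.
    """
    return 1.0 - SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def suggest_buckets(values, threshold=0.25):
    """Greedily group values that lie within `threshold` of a bucket's first entry.

    The threshold is an illustrative guess, not a documented setting.
    """
    buckets = []
    for value in values:
        for bucket in buckets:
            if distance(value, bucket[0]) <= threshold:
                bucket.append(value)
                break
        else:
            # No existing bucket is close enough, so start a new one.
            buckets.append([value])
    return buckets


column = ["ABC", "abc", "a.b.c.", "Acme Corp", "Acme Corp.", "acme corp", "Globex"]
for bucket in suggest_buckets(column):
    print(bucket)
```

With this sample column, the three spellings of ABC fall into one suggested bucket, the three spellings of Acme Corp into a second, and Globex into a bucket of its own. A user would then review the suggestions and pick one replacement value per bucket, which mirrors the manual override described above.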

Mammoth Addresses a Variety of Data Cleansing Issues

Along with intelligent group search-and-replace, Mammoth offers various data transformation capabilities to clean and prepare data (such as removing duplicates, checking data type accuracy, and search/replace), as well as an automation pipeline that makes the process repeatable week after week, month after month.

If you’re interested in learning more about the Mammoth Analytics data management platform, please sign up today for a free trial or reach out at hello@mammoth.io to schedule a demo.