How to combine two tables by time series without coding?

Here’s a scenario that happens to be typical in the manufacturing world - Dataset A (ERP data) contains a date range, and Dataset B (Sensor data) contains rows that fall within the date range of Dataset A.

Example:

time series

Fig. 191 Example datasets

Objective

Our goal is to combine the two datasets with “Time” as the common column. We need to place the Order ID next to the Sensor values so it looks like this:

time series combined

Fig. 192 Combined result

The challenges

At the surface, this may look like a simple JOIN (in SQL) or VLOOKUP (in Excel) exercise, but it’s not. Because the “Time” columns in both tables describe a range of time values but not exact times. This operation involves a complex, multi-step SQL operation. If you’re trying to do this with code, you need to:

  • First, perform an Outer JOIN with the two datasets

  • Then deal with missing or null values. (which you need to solve through another non-trivial piece of code.)

The code-free alternative

With Mammoth, you can acheive these results in a few simple steps, no coding involved of course. If you don’t know about Mammoth Analytics , it is a lightweight, code-free data management platform. Mammoth offers powerful tools for the entire data journey, including data retrieval, consolidation, storage, cleanup, reshaping, analysis, insights, alerts and more.

Let us try to solve the above problem with Mammoth and see how simple and efficient it is. Shall we?

Solved step-by-step

The following steps describe how we can perform this task, easily, and in a couple of minutes with Mammoth.

First, you need to bring your data into Mammoth. We offer a lot of easy ways to do that.

For example, I just uploaded our sample CSV files from my computer into Mammoth.

time series combined

Fig. 193 Importing example datasets to Mammoth

Now, our goal is to combine two separate datasets into one and then perform some transformations on the Combined Data.

time series combined

Fig. 194 Our goal

  • Step 1: Send the ERP data into another Dataset called “Combined Data”

To do this, open up any of the datasets, go to “Data Preparation” and use the “Merge & Branch out > Branch out to dataset” option.

time series combined

Fig. 195 Navigating to the Branch out to dataset function

Now send this data to a new Dataset called “Combined Data”.

time series combined

Fig. 196 Branching out the two datasets into one resultant dataset

If you’ve selected “Keep this task in the data pipeline”, you’ll now see a step in the pipeline indicating your action.

time series combined

Fig. 197 Branch out recorded as Task in the Pipeline

  • Step 2: Send the Sensor data into the same “Combined Data” Dataset

Open up the Sensor Dataset and perform similar steps as Step 1, only this time it needs to be sent to the existing “Combined Data” dataset.

time series combined

Fig. 198 Branching out the sensor DS to the same dataset

In your Data Library, you’ll see a new dataset called “Combined Data”.

time series combined

Fig. 199 Resultant dataset appears in Mammoth’s Data Library

When you open that file, you will see the following:

time series combined

Fig. 200 The resultant dataset

  • Step 3: Fill the missing or null values in the “Combined Data” Dataset.

First, let’s sort the data in ascending order by Time.

time series combined

Fig. 201 Sorting the “Time” column

Which gives you this:

time series combined

Fig. 202 Sorted dataset

Now, to fill the missing Order ID values, we just need to use the Fill Missing Values function.

time series combined

Fig. 203 Filling missing values

Select the Order ID column in “fill empty cells in“ drop-down and to fill the values from the cells above the empty cell, select “above“ in “with values from“ drop-down. Its worth noticing that the grouping is not applicable here, so we will fill the missing values without adding a grouping rule.

time series combined

Fig. 204 Fill missing values dialog box

Which gives you this final result:

time series combined

Fig. 205 Final result

And we’re done

We’ve achieved a code-free solution to combining two time-series datasets in a couple of minutes. An extra bonus — automation is built-in. New data coming into any of the original datasets automatically sends it to the “Combined Data” Dataset. Once you’re done with this task, you can explore other ways of getting your data in the right shape using Mammoth. Want to give it a spin, book a demo here.