Expensive professionals, low-value jobs

Data is big and time is scarce. How do we get the answers we need in time?

In our fast-paced economy, analytics is intrinsically constrained by data's "freshness." Stale answers are enemy number one of any business trying to become data-driven, and they are a particular pitfall for data-driven strategic planning, which follows these steps:

  • Set objectives
  • Set KPIs
  • Create a plan
  • Execute
  • Measure
  • Course correct
  • Repeat

In a data-driven business, actions are constantly measured and evaluated. As a result, actions that work well are reinforced, and those that don't are promptly replaced. In this sense, data allows businesses to enter a never-ending cycle of iteration, getting feedback on each action almost in real time. This constant, near-instant feedback is what nurtures a data-driven business.

Data at work

Successful enterprises live and die based on their ability to find insights and course correct on a timely basis. If Amazon or Google had to wait months or years to determine whether a product, service, plan, or strategy worked out, it would certainly be too late for the team to implement corrective measures.

Real-time measuring enables teams to identify non-performing products, services, and plans before they drain valuable resources. Above all, the more readily data is available, the easier it is for a team to conduct small experiments. This, in turn, encourages team members to try new ideas, analyze the results, and course correct.

The window to react is short because reaction time is the main competitive advantage data-driven businesses have. Consequently, these companies regularly test the market and their customers, discovering new trends and preferences before their competitors. They leverage these insights before competitors become aware of "the new thing" and follow their lead.

For teams to take advantage of instant feedback and react immediately, these big data-driven companies need lots of analysts and data scientists. Above all, their success relies on managing this talent and deriving the most value from it. This enables different teams to carry out their activities, continually measure, and iterate.

One effective data management strategy

Data management is less of an issue for these big tech companies. They certainly have enough in-house resources to deal with it. There are three possible approaches to data management:

Centralized Data Approach

The first is to have a big IT team capable of processing data and feeding the results to each department or team. IT provides each team with the information they need to work.

Decentralized Data Approach

The second approach, the opposite of the first, is to train and equip each team member with the right skills and tools to manage their data themselves.

Hybrid Approach

The third approach is a mix of the first two. In this scenario, the company has an IT team that deals with most of the technical heavy lifting. After this core data preparation, each team deals with the data they need to do their jobs.

Big tech companies usually operate using the third approach. This, in turn, allows for leaner operations, reducing dependency on IT as much as possible. They have lots of data scientists and analysts, but these work across teams within the organization. In contrast, most companies keep their analysts isolated in one department.

For many smaller companies, managing data becomes a significant challenge as soon as data starts flowing in. Finding people with the right skills and providing the best tools can be expensive and thus gets a lower priority than, say, hiring salespeople and deploying a functional CRM.

For better or for worse, big tech companies and disruptive startups have already changed the playing field. So, if you want your company to compete and eventually succeed in this environment, you will need to make sense of data and react quickly. Companies and managers know this: data infrastructure is a reported priority for over a quarter of EMEA companies (ComputerWeekly survey), and the trend shows no sign of retreating.

Most analysts' Achilles' heel

The importance of well-managed, valuable data as an asset is rarely contested. Because of this trend, some of the most valued professionals right now are data scientists and technical analysts. They enjoy high salaries for their specialized, technical work, and one rarely questions that: they do a job that companies need and that most people lack the skills and experience to do.

These professionals are paid to manage data assets and conduct data analysis. Above all, data analysis is a process of asking questions (Did campaign Z meet its targets? What were our primary growth drivers?) and getting answers. Analysts' work is thus supposed to influence or guide business decisions: achieving goals, reducing inefficiencies, and identifying friction points, among countless other aims. However, some estimates show that around 60-80% of data scientists' and analysts' time is devoted to cleaning and preparing data, leaving closer to 20% for actual data analysis. Companies are not getting the value they are paying for.

Just imagine paying a top construction worker (highly specialized, skilled, experienced, and expensive) to spend 60-80% of their time cleaning up the construction site just to be able to work the other 20%. If you were managing those operations, you would make sure they spent most of their time doing their very specialized and valuable job.

The same goes for any other profession. Take a lawyer, a consultant, or a manager: you would not pay any of them to spend 80% of their time cleaning their office or filing papers. It's hard to miss the problem when we put it in those terms. You might be paying top salaries to highly qualified people just to get information assets ready to be used, and that's costing you time and money. Why is this happening?

Dirty, dirty data

When data is too big and too messy, analysts and data scientists have to spend long hours creating workflows that leave the data analysis-ready. Sometimes it's a one-time thing, but most of the time it's never-ending, iterative work. Lines and lines of code stack up, tracking the actual changes to the data becomes difficult, and the operational cost and complexity increase.
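To make this concrete, here is a minimal sketch (in Python with pandas; every file and column name is hypothetical) of the kind of ad-hoc cleaning script that stacks up: each new data quirk adds another few lines, and before long nobody remembers why each step is there.

```python
import pandas as pd

# Hypothetical ad-hoc cleaning script; file and column names are made up.
df = pd.read_csv("raw_export.csv")

# Normalize inconsistent column names coming from different exports.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Parse dates that arrive as free-form strings; unparseable values become NaT.
df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")

# Drop duplicate records created by repeated exports.
df = df.drop_duplicates(subset="customer_id")

# Fill missing revenue with 0 and remove obviously invalid rows.
df["revenue"] = df["revenue"].fillna(0)
df = df[df["revenue"] >= 0]

df.to_csv("cleaned_export.csv", index=False)
```

Each individual step is trivial; the problem is that scripts like this multiply per source, per team, and per quarter, with no shared record of what was done to the data.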

This manual data-cleaning process can be hard to scale and becomes time- and labor-intensive. If we center the discussion on skills, it's clear that analysts and data scientists have the right skills to prepare and clean data, but it's equally clear that this is not the best use of their time and skillset. That's where equipping data scientists and analysts with the right tools becomes important.

Any tool that reduces the time your analysts and data scientists spend preparing and cleaning messy data adds value to data as an asset. Yet you need a tool that can perform these complex, multi-step operations and has the capabilities to automate and scale. If those requirements are not met, up goes the time your analysts spend on data preparation.

The ideal tool

In an ideal world, data management and analytics tools should:

Provide a robust toolkit

There are seemingly infinite operations an analyst might need to perform to get data into a ready-to-use state, and the right analytics tool should offer a comprehensive set of capabilities. If not, we are back to an inefficient process, where you might need a separate platform for each particular problem.

Record the steps the analyst took to prepare each dataset

With the preparation steps recorded, new batches of data from the same sources can be prepared and cleaned automatically, as the sketch below illustrates. Automation is key to scaling.
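A minimal sketch of what "recording" could look like, reusing the hypothetical column names from the earlier script: each preparation step becomes an ordinary named function, and the recorded recipe is just an ordered list that can be re-applied to any new batch.

```python
import pandas as pd

def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize inconsistent column names across exports."""
    df = df.copy()
    df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
    return df

def parse_dates(df: pd.DataFrame) -> pd.DataFrame:
    """Parse free-form date strings; unparseable values become NaT."""
    df = df.copy()
    df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")
    return df

def deduplicate(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate records produced by repeated exports."""
    return df.drop_duplicates(subset="customer_id")

# The "recording": an ordered, inspectable list of preparation steps.
PREP_STEPS = [normalize_columns, parse_dates, deduplicate]

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Apply every recorded step, in order, to a new batch of raw data."""
    for step in PREP_STEPS:
        df = step(df)
    return df
```

The point of the design is that the recipe, not the analyst, carries the knowledge: the same `prepare` call handles next week's batch without anyone re-writing the cleanup by hand.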

Offer a transparent and flexible data flow

It’s essential for organizations and individuals to see and understand the transformations made on a dataset to prevent biases or misleading conclusions from influencing business decisions.

Analysts set up as many pipelines as needed for each source and destination (teams within the company). Once set, these pipelines automatically process new batches of data. Anybody can own their data, review the transformations the analysts programmed, and modify them if they wish.
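As a rough illustration, reusing the hypothetical step functions from the previous sketch, one recorded pipeline per source might look like this:

```python
# Hypothetical continuation of the sketch above: one recorded pipeline
# per source, so each team's batches get the right preparation steps.
PIPELINES = {
    "sales": [normalize_columns, parse_dates, deduplicate],
    "marketing": [normalize_columns, parse_dates],
}

def process_batch(source: str, df: pd.DataFrame) -> pd.DataFrame:
    """Run a new batch through the pipeline recorded for its source."""
    for step in PIPELINES[source]:
        df = step(df)
    return df

# Anybody can review the recorded transformations for their source...
print([step.__name__ for step in PIPELINES["sales"]])
# ...and modify them, for example by appending a step of their own.
```

Because every transformation is named and listed, the data flow stays transparent: reviewing or changing a pipeline means reading or editing a short list, not reverse-engineering a pile of scripts.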

In this ideal world, all companies would operate with the hybrid approach to data management. Everybody in the organization manages and analyzes the information they need to boost their performance, and the IT team provides support with the most complex problems. Data cleaning and preparation are handled by the right tool, while analysts' and data scientists' skills are used to conduct actual data analysis.

We should all be analysts

Data empowerment is the solution to this pressing resource-allocation issue. Equipping data scientists, analysts, and non-technical business users with the right tools can completely reverse the time allocation for data-related jobs. In conclusion, keep your analysts and data scientists happy and productive by eliminating needless prep work and letting them focus on their actual job. Let machines clean the mess that is raw, unstructured data, and let humans ask questions and discover insights.

If you've enjoyed this post, please share it with others, and sign up for our regular newsletter to get similar articles delivered straight to your inbox.