A Claim: All Data Work Falls into One of Two Categories
The title of this post says it all, really. I’m a fan of simplicity, and this binary framing is something that came to me, I think, in the spring of 2022 when I was trying to visualize, at a high level, the different skills and activities that data workers could be involved with. That diagram was focused on a subset of all data workers, and it had a bunch of interconnected bubbles and had a particular slant towards the work of the consultancy where I was employed at the time.
What struck me from the exercise was not so much the specific bubbles, but the fact that all of those bubbles fell neatly into one of two different categories (which is how I ultimately arranged the diagram): data collection and management and data usage.
Earth shattering? Not exactly.
Useful? Perhaps. Perhaps…as a lens through which organizations can and should be viewing their investments1 in data.
There is NO inherent business value in the collection and management of data
Data collection only provides the potential for business value. Business value only gets realized through the degree to which the data is effectively put to use. This may seem obvious, but there are deeply entrenched beliefs that feel so logically based that they undermine this basic fact. With some light paraphrasing (I don’t want to be picking fights with anyone here), a few examples:
“Data collection is like the foundation of a house. If you don’t ensure that it is well-designed and of the highest quality, then everything built upon it (the data usage) will be inherently weakened.”
OR
“I think of data collection as potential energy. Like the water tower that provides the water pressure. Data usage (water from the tap) is impossible without it, but also less powerful if data collection isn’t done well.”
OR
“It’s absolutely critical to ensure the data collection is robust. That’s why it’s critical to outline all of the business questions that you want the data to answer up front, and then prioritize updates to your systems and processes to enable answering those questions.”
None of these perspectives are necessarily wrong, but they all lead organizations to gently kick the data usage can farther down the road: “We’ll get to that (actually using the data) as soon as we clean up our data collection.” The result? A perpetual investment in building potential to deliver business value, which winds up contributing $0 to the top line of the business.
Multiple forces push organizations to constantly increase their investments in data collection
If we step back and think about both economic forces and human nature, it’s easy to see how data collection sucks in resources like a Norwegian maelstrom:
Data collection is a tangible task–while Thought Leaders will go on at length about how data collection requires constant maintenance (that’s the “…and management” piece that I’ve dropped the farther this post goes along), the tangibility of data collection (“the data wasn’t there, and now it is”) lends itself to traditional prioritization and development processes. Organizations “know” how to do development.
Data collection requires technology–there’s money in that thar’ data! A key reason that Scott Brinker’s martech landscape has grown so dramatically since he first created it in 2011 is that “technology products” can readily scale–investors and entrepreneurs are drawn to the promise of a low variable cost business model like Odysseus to the sirens. The result? A lot of investment in new companies that collect (and manage) data, which leads to a flood of marketing and sales teams with polished messaging about how their tool, if purchased and implemented, will enable2 actionable insights and glory.
Consultants love data collection projects–this is really a result of the previous two items, but it’s worth calling out. Consultants thrive on crisply defined (tangible!) engagements, and they really thrive when they have developed repeatable expertise in a small set of technologies. While they still need to learn some of the basics of an organization’s operations, they can focus on implementing technology, and they can relatively rapidly scale their implementation chops by developing expertise and repeatable processes with supporting technologies.
Data collection work buys time–once a (tangible) data collection project is in work, the pressure to put data to use eases up in the short term: “Sure. We’re not getting meaningful value from our data now, but we identified the issue was [insert description of current data collection project], and we’re addressing that. Once that project is rolled out, we will surely be churning out actionable insights faster than we can act on them! The technology vendor we’re using as part of the work pretty much guaranteed it!”
It’s a lot of forces! Together, they wind up building a mentality that the data collection itself is the goal. It is freakily common to find technology vendors, consultants, and even data workers within an organization who sincerely believe that, once the data is available, “the business” will readily be able to put it to productive use. Yet…
The forces pushing organizations to invest in data usage are ethereal and fleeting
Effective data usage is hard. And, while there is some (tangible) technology involved (the tools being used to pull, join, munge, model, visualize, share, and deploy results), data usage is primarily a people and process operation. Both of these are…hard! People, for instance, are messy beings. They have to be found, managed, supported, and retained. Oh, to simply be able to buy a technology license and annually pay a renewal fee to keep the actionable insights flowing!
Processes are hard, too, and the ownership of those processes (or, really, frameworks) is murky at best. There are historical and systemic reasons for this, but we’ll call that a tangent for another time.
Ultimately, there is a belief/hope/wish that “if we have a robust set of data, then, surely, ‘the business’ will be able to get value from it!” When that doesn’t happen, phrases like “data literacy” and “self-service” get thrown around, but they typically have an undercurrent of, “What can I do to make these questions about the value we’re getting from our data just go away?! I have more data I need to collect, and these questions are a real nuisance!”
This imbalance is a problem
I’ll go ahead and be somewhat hyperbolic and claim that the imbalance in investment and focus between these two categories is one of the top reasons that organizations struggle to get the value they expect from their data.
I also recognize that I have a bias towards data usage. To me, that happens to be more challenging and more fun. And, it’s what I’m hoping I’ll be focusing the next phase of my career on (it’s what I’ve tried to focus the last couple of phases of my career on, but that means I’ve experienced the insidious strength of the data collection forces pulling myself, my colleagues, and my clients away from that work).
In my next post3, I’ll dig into data usage a bit more. In the meantime, I’d love to hear what you think about the simple delineation described in this post.
This is “investments” in the broadest sense possible, from technology licensing, data storage costs, process development and maintenance, and the people who are doing the data work.↩︎
As it turns out, “enable” is just a code word for “potential.”↩︎
Feel free to believe that I have a master content calendar with tightly established publication dates and a clearly defined set of topics. I mean, why not?↩︎