The Soft Underbelly of “Hard Data”

9 July 2015

What exactly are “hard data”? Rocks are hard, but data? Ink on paper or electrons in a “hard drive” are hardly hard. (Indeed, the latter are often called “soft copy.”)

If you must have a metaphor, try clouds in the sky. You can see them clearly from a distance, but up close they are obscure. You can poke your hand through them and feel nothing. “Hard” is the illusion of having turned something real into a number. That guy over there is not Simon, but 4.7 on some psychologist’s scale. The company didn’t just do well; it sold 49 trillion widgets. Isn’t that clear enough?

Soft data, in contrast, can be fuzzy, ambiguous, subjective—at least from a distance. They usually require judgment; like Simon, they can’t even be transmitted electronically. In fact, sometimes they are no more than gossip, hearsay, impression (for example, the rumor going around that most of those widgets are proving defective).

So the dice are loaded. Hard data win every time, at least until they hit the mushy brains of us human beings, living in our soft societies. Hence we had better consider the soft underbelly of these hard data.

1. Hard data can be too general. Alone, they can even be sterile, if not impotent. “No matter what I told him,” complained one of the subjects of Kinsey’s famous study of sexual behavior in the human male, “he just looked at me straight in the eye and asked ‘How many times?’”1 A little bit of the nuance lost, no? (For starters, what exactly constitutes a “time”? And whose?)

Hard data may provide the basis for description, but often not for explanation. So the sales went up. Why? Because the market was expanding? (You can probably get a number on that.) Because a key competitor was doing dumb things? (No numbers on that, just more gossip.) Because our own management was brilliant? (That’s objective! …says the management.) Or else because it lowered quality to cut the price? (Try to get the data on that.) All this suggests that we usually need soft data to explain what’s behind the hard numbers—for example, hearsay about what the competitor is doing, gossip about quality in our own factory.

2. Hard data can be too aggregated. How are these hard data presented? Not widget-by-widget. Usually all the widgets are added up to provide one number: total sales. Likewise with that quintessential bottom line: the whole company wrapped up in that one number. Think of all the life lost in that number, and all the reality. It is fine to see the forest from the trees…unless you are in the lumber business. Most managers are in the lumber business: they also need to know about the trees. Too much managing happens in a helicopter, where the trees look like a green carpet.

3. Much hard data arrives too late. Information takes time to “harden.” Don’t be fooled by the speed of those electrons racing around the Internet. Happenings first have to be recorded as “facts”—that can take time—and then aggregated into reports, which may have to await some predetermined schedule (like the end of a quarter). By then, fed up with the quality, the customers may have run off with the competitor. The gossip may have indicated this first, softly, and the grapevine may have carried it around, quickly. But in a world of hard data, that hardly counts.

4. Finally, a surprising amount of hard data is not reliable. They look good, all those definitive little numbers on a pretty screen. But where did they come from? Lift up the rock over hard data and have a look at what’s crawling underneath. “Public agencies are very keen on amassing statistics—they collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But what you must never forget is that every one of those figures comes in the first instance from the village watchman, who just puts down what he damn pleases.”2

And not only public agencies. Most organizations these days are obsessed with the numbers. Yet who goes back to find out what the watchman put down, especially today when he is some kind of automaton? Or what some manager in search of a promotion put down? Have you ever met a number that could not be gamed—a reject count in a factory or a citation count in a university (just cite your own articles), let alone that quintessential “bottom line”? Moreover, even if the recorded facts were reliable in the first instance, something is always lost in the process of quantification. Numbers get rounded up, mistakes get made, nuances get lost.3

All of this is not a plea for getting rid of hard data. That makes no more sense than getting rid of soft data. It is a plea to cease being mesmerized by the measures. We all know about using hard facts to check out the soft hunches. Well, how about using soft hunches to check out the hard facts (“eyeballing” the numbers)?

So what’s the bottom line? There’s an old joke that if you meet [someone from a country I can't mention], hit him in the face. He’ll know why. Well, if you meet a number, challenge it. You’ll find out why.


© Henry Mintzberg 2015. In fact, I sketched out these ideas long ago, before the Internet descended upon us (Impediments to the Use of Management Information [monograph of the National Association of Accountants (U.S.) and Society of Industrial Accountants (Canada), 1975]), and have revised them in various publications ever since. Related TWOGS include: “If you can’t measure it, you had better manage it”; “How National Happiness became gross”; “Downsizing as 21st Century bloodletting”; “Productive and Destructive Productivity”; and “What could possibly be wrong with efficiency? Plenty”.


1 From A. Kaplan The Conduct of Inquiry (Chandler, 1964)

2 Attributed to Sir Josiah Stamp 1928, cited in Maltz, M. D. (1997) Bridging Gaps in Police Crime Data: Discussion Paper, BJS Fellow Program, Bureau of Justice Statistics.

3 In his account of “statistics and planning” in the British Air Ministry during World War II, Ely Devons wrote that the collection of such data was extremely difficult and subtle, demanding “a high degree of skill,” yet it “was treated . . . as inferior, degrading and routine work on which the most inefficient clerical staff could best be employed” (p. 134). Errors entered the data in all kinds of ways, even just treating months as normal although all included some holiday or other. “Figures were often merely a useful way of summing up judgment and guesswork.” Sometimes they were even “developed through ‘statistical bargaining.’” But “once a figure was put forward . . . no one was able by rational argument to demonstrate that it was wrong.” “And when those figures were called ‘statistics,’ they acquired the authority and sanctity of Holy Writ.” (E. Devons Planning in Practice: Essays in Aircraft Planning in War-time, Cambridge University Press, 1950:155)