Massive Online Retail Data Error Uncovered

It’s common knowledge that Kodak’s demise came at the hands of digital. But few know the whole story of why that happened. Kodak relied, too much it turns out, on a lot of bad data about how much of a threat digital was to its business.

Kodak didn’t think it had much to worry about – until it did – and by then, it was too late.

Their “Kodak Moment” came in 2003 – seven years after its best year ever – and not long after a board report concluded digital was nothing to really worry about.

In 2016, retail is on the verge of its own “Kodak Moment” – and not because it doesn’t understand the impact of digital on its future.

But because it, too, has been relying on bad data about how much of an impact digital and eCommerce have had on its business.

And making decisions based on that bad data for years.

My colleagues at MPD and PYMNTS found a massive mistake in how retail and Census data is reported – a mistake that has severely underreported the impact of digital sales on physical retail sales.

I’m going to share that story with you now.

But first, a little context, because you know how much I like a good story put in the right context.


1996 was a very good year for Kodak.

It had a commanding market share: two-thirds of the global camera and film market.

It was a beloved brand – in fact, one of the most valuable in the world behind McDonald’s, Coke and Disney.

It was a sales and profit machine with annual revenues of $16 billion that year and profits of more than $1 billion. In fact, a Harvard Business School case study said that Kodak accounted for 90 percent of film sales and 85 percent of camera sales in the U.S. in 1976 – a position that didn’t waver all that much over the next two decades.


Life was grand, but Kodak’s CEO understood that digital was coming, knew it would threaten the business model, and wanted to be ready. Kodak’s “razor and blades” model banked on selling cameras at decent prices and then selling film for those cameras forever. No film camera sales – no film sales.

Kodak took that threat so seriously that it got to work producing a digital camera – and, in fact, produced the world’s first. And, in the 1990s the company commissioned research to better understand the growth of digital cameras and where the adoption might be toward the end of the decade. Those studies showed the projected growth of digital cameras to be relatively low, at least outside of Japan, something on the order of 400,000 units a year. Against the tens of millions of film cameras projected to be sold, Kodak concluded that digital was coming but not with the force once thought. Digital, they concluded based on that data, was a niche market for “power users” and therefore not much of a threat to Kodak’s core consumer. 

The Kodak Board and senior execs must have been pretty happy when that data was presented to them.

But, as it turns out, not for long.

Those conclusions, obviously, turned out to be very wrong – the unfortunate consequence of relying on bad data that drove some very bad decisions by Kodak management.     

In 1999, IDC reported that digital camera sales were projected to be 4.7 million that year – a far cry from the 400,000 that Kodak’s research disclosed. And in 2003, just seven years after one of Kodak’s best years ever, the sales of digital cameras completely swamped the sales of film cameras. That was only a few years after the research report, which I’m sure was filled with lots of gorgeous charts and perfectly formatted data, said, “No problem, don’t worry!”

Their “Kodak Moment” had arrived. 

Kodak’s decline continued – slowly but surely – with the shuttering of plants and film processing labs, and the shrinking of its workforce. In 2012, Kodak filed for Chapter 11 bankruptcy, was delisted from the New York Stock Exchange and sold its patent portfolio for $527M to a group of 15 companies including Apple, Google, Amazon, Adobe and Microsoft.

Although Kodak emerged from bankruptcy in 2013, it struggles today to recover from the digital tsunami that it, and others, seemed not to see coming. Kodak’s fall was really more of a long, slow slide: 36 years after Apple unveiled the Apple I, 16 years after the start of the commercial Internet, and 12 years after the first cellphone with a camera was launched. Today, its CEO is searching through the remnants of Kodak’s storied past to see what can be made of its IP. “We missed enormous opportunities,” says Jeff Clarke, Kodak’s CEO. “We’ll never be able to prosecute the value of our intellectual property with Kodak-branded sales.” Kodak’s market cap today is roughly $800 million.

And to think, in 1996, Kodak was riding high, even though in retrospect it is clear that its business was collapsing below it. Bad data masked the real threat to the business.


In 2016, I believe that physical retail may be facing its own “Kodak Moment.”

How many times have you heard retail experts assuage the concerns of physical retailers over the impact of online sales on their business by saying:

“Don’t worry – 94 percent of retail sales still happen in a physical store”?

This is from Forbes in July of 2014: 

“Indeed, despite the hubbub over digital commerce, 94 percent of total retail sales are still generated at brick-and-mortar stores, according to data from market research firm eMarketer.”

“The buzz given to Amazon, eBay and Alibaba far outweighs their true sway in the marketplace,” Mike Moriarty, co-author of the A.T. Kearney Omnichannel Shopping Preferences Study, told Forbes. “Particularly when you consider that Amazon increased their retail revenue from 2009 to 2013 by $50 billion, but their profits went up zero.”

Yeah, but … 

It turns out that Forbes and everyone else in the market that reports retail sales has been relying on data from the U.S. Census Bureau.

And that data is wrong. 

Massively wrong.


The Census Bureau is the source for reporting physical and online retail sales in the U.S. It’s the gold standard. Companies file reports with the Census Bureau according to strict guidelines. Depending on what the Census is covering, companies send this data in monthly, quarterly, or yearly. It’s why everyone uses them as a source. And, hey, while it’s not the IRS, it’s the main government statistical agency, so companies must take their data obligations pretty seriously, right?

But my colleagues at MPD, economists David Evans and Dick Schmalensee, found a flaw in that reporting, a flaw that was uncovered as they were writing the chapter on retail reinvention for their forthcoming Harvard Business School Press book on platform businesses, Matchmakers.  Scott Murray, Head of Data Analytics at PYMNTS, was working closely with them on their analysis.

As Evans, Murray and Schmalensee started to dig into the data on the impact of online sales on brick-and-mortar sales, something didn’t quite add up, literally. All of their field research showed that physical retail sales were cratering, but then the Census data said not much was happening.

Then they made a big discovery.

One of the biggest online retailers, Walmart, wasn’t in the Census figures on online sales for 2013. The Census had clearly missed more than $6 billion of online sales.


And that led them on a journey to find out what else was missing. 

Six months of analysis and some email correspondence with the Census folks later, they uncovered a data bombshell: online sales as a percentage of all retail sales has been undercounted – and by a lot.

Online sales as a percentage of all retail sales have been undercounted by about one-third each and every year for at least the last five.

Evans, Murray and Schmalensee have documented that reporting error back to 2010. They suspect that it’s been underreported for even longer than that – probably ever since the start of the online sales revolution more than two decades ago.

But let me put this in practical context.

This Census Data error means that in 2014, eCommerce as a percent of sales wasn’t 6.4 percent as the Census reported, but 8.2 percent.

But here’s the really big news, we think.

The Census Data for 2015 aren’t all in yet, but we’re pretty confident that they will report something in the neighborhood of about 7.3 percent. Correcting for the one-third mistake, Evans et al. think the right number is more like 9.3 percent.

If we ignore sales by gas stations, which currently aren’t selling anything online, online retail sales will hit 10.3 percent of all retail sales.

Which means that for the first time ever, in the United States, the percent of retail sales that take place online will hit double digits and the percentage of retail sales taking place in a physical store won’t start with a “9.”
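
For readers who like to check the arithmetic, here is a minimal Python sketch using only the percentages quoted above. The function names are illustrative and are not anything from the MPD analysis, and the last calculation assumes, as stated above, that gas stations sell nothing online. On these rounded figures the corrected share comes out roughly 27 to 28 percent above the reported one, and the gap between 9.3 and 10.3 percent implies that gas stations are a bit under a tenth of all retail sales. Both results are sensitive to rounding in the published percentages.

```python
# Percent of retail sales made online, as quoted in this article.
REPORTED_2014 = 6.4            # Census figure for 2014
CORRECTED_2014 = 8.2           # corrected figure for 2014
REPORTED_2015 = 7.3            # expected Census figure for 2015
CORRECTED_2015 = 9.3           # corrected estimate for 2015
CORRECTED_2015_EX_GAS = 10.3   # corrected estimate, gas stations excluded


def uplift_vs_reported(reported, corrected):
    """How much larger the corrected share is, relative to the reported share."""
    return (corrected - reported) / reported


def shortfall_vs_corrected(reported, corrected):
    """Fraction of the corrected share that the reported share leaves out."""
    return (corrected - reported) / corrected


def implied_excluded_share(share_all, share_excluding):
    """Fraction of all retail sales made up by the excluded category.

    Assumption: the excluded category (gas stations here) has zero online
    sales, so online sales are unchanged and only the denominator shrinks.
    """
    return 1 - share_all / share_excluding


if __name__ == "__main__":
    for year, reported, corrected in [
        (2014, REPORTED_2014, CORRECTED_2014),
        (2015, REPORTED_2015, CORRECTED_2015),
    ]:
        print(
            f"{year}: corrected is {uplift_vs_reported(reported, corrected):.1%} "
            f"above reported; reported misses "
            f"{shortfall_vs_corrected(reported, corrected):.1%} of corrected"
        )

    gas = implied_excluded_share(CORRECTED_2015, CORRECTED_2015_EX_GAS)
    print(f"Implied gas station share of all retail: {gas:.1%}")
    print(f"Physical share excluding gas stations: {100 - CORRECTED_2015_EX_GAS:.1f} percent")
```

The last line is where the “won’t start with a 9” claim comes from: 100 minus 10.3 leaves 89.7 percent for physical stores.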

Here’s a chart that shows the estimates of online retail sales for 2010-2015, with and without gasoline, just to show you visually what we’re talking about.

[Chart: E-Commerce as a Percentage of Retail Sales]


But these figures just report the average across all retail categories from motor vehicles to health products. 

The other big data error these guys discovered was in reporting what physical retail stores themselves are doing online. 

If you looked at the Census data, you’d think physical retailers were just sitting out the online revolution. It turns out Census missed a massive amount of online activity by the online divisions of physical retailers themselves, which of course, all physical retailers are doing now – and some more than others.

In some retail sectors, like general merchandise (think Walmart, Kohl’s, Target), clothing (think Macy’s, Nordstrom, Gap and all of those specialty retailers), home furnishings (think Williams-Sonoma and Crate & Barrel), building materials (think Home Depot, Lowe’s) and electronics and appliance stores (think Best Buy and Fry’s) the undercounting of online retail sales is orders of magnitude off – in the hundreds of percent.

For general merchandise, a conservative estimate is that the real amount of online sales by physical retailers is 23 times higher than what the Census reports. For electronics and appliance stores, the real amount of online sales by physical retailers is almost 14 times more.


So, you might ask, why does any of this matter? Everyone already knows that eCommerce is cutting into physical retail sales – what’s the big newsflash? Does it really matter that the Census was off by a third?

That depends on whether you’re the CEO of a massive physical retailer who might have wanted to know six years ago (or more) that not only was the growth of eCommerce accelerating  – which was well-known – but that the base upon which it was growing was a third larger than she thought.

Maybe she might have wanted to know that so that she could have reprioritized her digital initiatives – and moved them closer to the top of the list.

You see, if she’s looking at the Census numbers for her peers, and knows her own numbers, she’s probably thinking one of two things: she’s doing better than anyone else so there’s no real hurry to do more, or everyone’s in the same place so there’s no real threat. Both conclusions would be wrong, now with the benefit of 20/20 data hindsight.

Maybe she might have embraced mobile wallets and cloud-based point of sale solutions a few years earlier. Who knows, maybe we would be a whole lot further down the mobile/digital wallet field by now.

Maybe she would have adjusted her merchandising and pricing and loyalty strategies to get consumers tied closer to her store using the digital shopping media that her consumers clearly preferred and were using more often than she thought.

Maybe she would have known a little earlier where Amazon was cleaning her clock and had the time to adjust her strategy.

And then there’s the Board.

My guess is that many CEOs of physical retailers, probably echoing what their management consultants and execs were feeding them in their perfectly crafted and formatted decks, have been telling their boards and investors not to worry. “Online is just a drop in the bucket.” “People love shopping in physical stores.” “We’ve got plenty of time to adjust to online; after all, 94 percent of retail still happens in physical stores!”

Wrong, wrong, wrong.

And that brings us to retail’s “Kodak Moment.”


Yes, I’m sure you’re curious about how in the world this could have happened and what makes the MPD team feel confident about the missing Census data. The authors are releasing the technical details of their work in a couple of days – geeks who want to know the details, stay tuned. But if you can’t wait, here’s my best non-economist summary of what they did and the crux of the error at a high level.

The Census Bureau appears to count sales from pure-play etailers just fine. But that’s not the issue, nor the problem.

There are three contributors to the bad data problem at Census.

First, things get hairy — and wrong — when it gets down to counting online sales from physical retailers, which is now just about everyone.

Omnichannel is in full swing as most retailers, even some of the luxury brand holdouts, recognize the importance of having a synergistic online and physical retail presence.

As I mentioned, the Census Bureau relies on data that retailers report to it. MPD examined physical retailers at the 3-digit NAICS Industry code (using the most recent complete data set, which was for 2013). That resulted in the discovery of the undercounting error. Walmart’s more than $6 billion for 2013 was just a tad more than the $88 million reported by Census for general merchandisers, which is where Walmart should have been. A bit more digging and some emails with the Census Bureau pretty much nailed the fact that Walmart’s online sales were missing.

Now, according to Evans, the Census can’t actually say anyone is missing or they’d get their heads handed to them. But everything his team looked at, and the feedback from the Census, made it pretty clear that Walmart’s online sales weren’t anywhere in the Census online data. When they finished their digging and added in other missing physical retailers, they found that the Census had missed about $62 billion in online sales in the U.S. in 2013.

The second source of retail’s big data mistake is that the Census isn’t capturing, in the retail numbers it reports, the sales of big groups of nontraditional retailers, such as manufacturers like Apple or Nike that sell online.

I know, hard to believe.

But that turned out to be a big deal, too. That segment alone – retailers like Under Armour, Ralph Lauren, Apple, Kate Spade, Lululemon and many others – accounted for roughly $22 billion in 2013, and a projected $25 billion and $29 billion in 2014 and 2015. Yet all are completely missing in how the Census tabulates things.

There’s a third contributor to retail’s big data problem.

Most researchers, data analysts, and media who rely on the Census data use the average online sales as a percent of total retail sales across all retailers. That’s where the “online is still very tiny” conclusion has come from.

A lot of times the “average” isn’t a very good number to base decisions on. The average annual temperature in Boston is 58.7 degrees Fahrenheit. Last Saturday afternoon, when the Patriots won the AFC East Championship Title here, it was 37 degrees Fahrenheit. So, a fan in the stands dressing for 58.7 degrees would have been pretty cold (happy, but still cold). Generally, dressing for 58.7 degrees is a bad idea in Boston since it’s hardly ever 58.7 degrees. It can be a lot colder, like Saturday’s 37 and last year’s sub-zero temps, and a lot hotter, like over 100 degrees.

Further, reporting an average physical/online breakdown fails to show the impact of online sales growth by retail category – which is important for understanding what sectors will feel the online hit most acutely, and how soon that impact will be felt.

The bottom line? When the Census reports that 94 percent of all retail sales still happen in brick and mortar locations, everyone feels better. When that number is really 89.7 percent now, the storyline turns sour. And for retailers sitting in a category that is a lot lower than the “average,” it’s downright scary.
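
To see how an average can hide the scary categories, here is a small Python sketch with three made-up categories. The category names and sales figures are hypothetical, invented only to illustrate the point, and are not Census, MPD or PYMNTS numbers.

```python
# HYPOTHETICAL data, invented for illustration only (not Census or MPD figures).
# Each entry is (online_sales, total_sales) in the same arbitrary unit.
CATEGORIES = {
    "category_a": (2.0, 100.0),
    "category_b": (30.0, 100.0),
    "category_c": (8.0, 200.0),
}


def online_share(online, total):
    """Online sales as a percent of total sales."""
    return 100.0 * online / total


def overall_share(categories):
    """Online share across all categories combined (a sales-weighted average)."""
    online = sum(o for o, _ in categories.values())
    total = sum(t for _, t in categories.values())
    return online_share(online, total)


def shares_by_category(categories):
    """Online share computed separately for each category."""
    return {name: online_share(o, t) for name, (o, t) in categories.items()}


if __name__ == "__main__":
    print(f"Overall online share: {overall_share(CATEGORIES):.1f} percent")
    for name, share in shares_by_category(CATEGORIES).items():
        print(f"  {name}: {share:.1f} percent")
```

In this toy example the overall online share is 10 percent, yet one category is already at 30 percent while another sits at 2 percent. A retailer in the first of those would learn very little from the average.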


Now before you start sending hate mail to the Census Bureau, please don’t blame them.

These folks are understaffed and under-resourced, and can only report based on what they’re given. They also operate big, inflexible government databases that can’t account for the differences and nuances of our changing retail environment. Part of it is they are stuck using outdated NAICS codes that are based on treaties the U.S. has signed with other countries. Let’s not go there right now either.

It’s pretty shocking, though, that a government entity tasked with reporting such important information for the largest economy in the world, now some 20 years after the birth of the Internet and eCommerce, isn’t given the resources to do it properly.

But thank goodness we spent $856,000 last year to find out that it takes 3 months to teach a mountain lion to run on a treadmill. 

Of course, as the blurring of the online and offline worlds accelerates, accounting for online and physical retail sales will only get harder and hairier. Buy online and pick up in-store may be great for retailers and a sign that omnicommerce is in full swing, but Census Data isn’t set up to count those sales correctly.

And retail will just keep feeling better that physical retail is still such a large part of the retail economy. Yet retailers know instinctively that they must embrace digital and step up their online and omnicommerce games since that’s where the consumer is taking them.

They’re also starting to internalize the hit they take when they don’t.   

Our Checkout Conversion Index – a collaboration with BlueSnap – shows that most merchants today fail the most basic of all online tests: how easy it is for consumers to check out on their sites.

When we benchmarked 650 sites in the U.S. – representing about 75 percent of U.S. eCommerce sales – we found that there was so much friction across the board that most retailers were putting as much as 36 percent of their sales at risk. Of course, those sales aren’t lost forever; they simply move from merchants who can’t convert shoppers into buyers to those who can.

Like Amazon, for example.


Now that we’ve discovered this mistake, we’re going to share what we know on a regular basis. Coming soon, we’ll be publishing the “Whole Scoop And Nothing But The Scoop On Online/Offline Retail Sales Tracker” so that you have monthly updates on what’s really going on at the intersection of online and offline retail. 

We’re going to give you averages, of course, but also the more valuable data by three-digit NAICS code.

What you do with that data is your call.

Consider it our contribution to turning retail’s “Kodak Moment” into a picture truly worth framing. And a memory worth remembering – for all of the right reasons.