Monday, 23 September 2019
HOW HAVING MORE DATA MAY INCREASE ERROR
In his book The Black Swan: The Impact of the Highly Improbable, Nassim Nicholas Taleb [NNT] argues that it is better to start with ‘real life’ and then read books than to start with books and try to apply them to ‘real life’.
His point is that if we start with a theory and then go looking for an example to fit our model, we end up finding something that fits our preconceptions. Taking the opposite approach should mean we come up with an explanation for something that is true in ‘real life’.
I found his book clever, difficult, inspiring and rude but never dull.
He is scathing of people who create models and spreadsheets that predict next month, next year, or next decade when we know that it’s difficult to predict the weather tomorrow. Take financial predictions, for example.
Did you know that 50% of the price volatility in the markets over the last 50 years comes down to just 10 days? Can any of us predict market behaviour for every day of 50 years, knowing that if we miss just those 10 ‘special days’ we will be out by 50%? Here is another thought: did you know that for many banks, the cumulative gains of 50 years were lost in one week of trading?
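To see how a handful of days can dominate decades of results, here is a toy simulation. All the numbers are invented for illustration (not real market data): daily returns are mostly small, ten outsized ‘special days’ are mixed in, and we compare the cumulative outcome with and without those ten days.

```python
import random

random.seed(1)

# Synthetic daily returns: mostly small moves, plus a handful of
# extreme 'special days'. Purely illustrative numbers.
days = 252 * 50  # roughly 50 years of trading days
returns = [random.gauss(0.0003, 0.008) for _ in range(days)]
for i in random.sample(range(days), 10):
    returns[i] += 0.10  # ten outsized up-days

def cumulative(rets):
    """Grow 1.0 by compounding each daily return."""
    total = 1.0
    for r in rets:
        total *= 1 + r
    return total

full = cumulative(returns)
without_best = cumulative(sorted(returns)[:-10])  # drop the 10 best days

print(f"with all days:   {full:.1f}x")
print(f"missing 10 best: {without_best:.1f}x")
```

Even in this crude sketch, removing just ten days out of roughly 12,600 slashes the cumulative result, which is the point: the whole outcome hinges on a few days nobody can predict in advance.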
Nobody three years ago would have predicted how the Brexit saga would turn out. Indeed, it’s difficult to predict next week, let alone next month, next year, or next decade.
NNT notes that having more data confirms confidence, but not fact.
In his book Talking to Strangers: What We Should Know about the People We Don't Know, Malcolm Gladwell notes many instances where computers get facts right based on very little data (for example, offender record, age, offence) whereas people get it wrong, ostensibly because of too much data (they look innocent, they dress well, they speak with an accent).
Faced with all that extra data we are bound to find something to justify our notion. Again, more data confirms confidence, but not fact.
In his book Thinking, Fast and Slow, Daniel Kahneman explains both the merit in quick thinking and the risk. This simple categorisation based on past experience, values and beliefs means we suffer from “confirmation bias”: looking for things that confirm our assumptions rather than going through the hard work of fresh and independent thought (potentially challenging or changing our experience, values and beliefs).
So, what are we to do if too much data confirms confidence but not truth, and hasty thinking leads to false assumptions?
All this came to me today in a practical situation about risk. Looking at GDPR and cyber security, there is a real risk that we suffer analysis paralysis: lots of data confirming how scared we should be, much of it ostensibly coming from the people who earn a living from these risks. Warren Buffett is alleged to have said “Never ask a barber if you need a haircut”.
We need to understand the context, and maybe NNT is right. Maybe we need to start with ‘real life’ and then look at data, rather than become so obsessed with what the spreadsheet or checklist says that we are blind to the reality and practicality of it all.
For a guy who is data-driven and more about process than personality, this is a new way of thinking. Like I said, NNT’s book was interesting because it challenged me and my assumptions. But when you put it in the context of all the other literature, it seems clear: too much data blinds us.
I am interested in your thoughts and experiences. Maybe you agree. Maybe you do not. Maybe you can recommend some other books, blogs or videos. Do not hesitate to get in touch, and if you are in Jersey I will happily buy you a coffee if you would like to talk about your experience.
MBA (Management Consulting) Projects & Change Practitioner,
TEDx & Jersey Policy Forum, Public Accounts Committee,
Posted by timhjrogers at 12:14