Put an error bar on it

TL;DR: putting an error bar on it is a general principle that belongs in every type of analysis, not just data analysis with stats.

What do I mean, an error bar?

You have probably all seen a graph with error bars on it:

But sadly you’ve probably seen many more charts with no error bars on them at all.

Normally, in an experimental science, the error bar is intended to give a sense of the random error in making a measurement (e.g., you timed the tortoise running the race, but you didn’t exactly stop the timer at precisely the right moment).

But what do we do when we have no such timing error?

When I call up the access log for my data service, there is no random error on the timestamp. When I call up the price from Bloomberg, there is no random error on the number.

So what do we do? We add some error ourselves.

Why have an error bar?

Is 5 greater than 4?

If you don’t have an error estimate, you do not know that 5 is greater than 4.

So if you are making decisions, you better know the uncertainty in your data or you are guessing.
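To make the "is 5 greater than 4?" question concrete, here is a minimal sketch. It assumes the two errors are independent (so they combine in quadrature) and uses an arbitrary threshold of two combined errors; the function name `clearly_greater` and the `n_sigma` parameter are made up for illustration.

```python
import math

def clearly_greater(a, a_err, b, b_err, n_sigma=2.0):
    """Return True if a exceeds b by more than n_sigma combined errors.

    Assumption: the two errors are independent, so they add in quadrature.
    """
    combined = math.hypot(a_err, b_err)
    return (a - b) > n_sigma * combined

# Same two numbers, different error bars, different answers.
print(clearly_greater(5, 0.1, 4, 0.1))  # True
print(clearly_greater(5, 1.0, 4, 1.0))  # False
```

With an error of 0.1 on each number the gap of 1 is obvious; with an error of 1.0 on each it is not.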

Sliding dates to create errors

For example: if I want to answer the question “did the fund beat the benchmark?”, there is no natural error in that. We can’t justify just making up an error in the price, so let’s put an error on the date:

  • say I want a 5-year return
  • I take today’s date and spread it by +30 days and -30 days, so I have a 61-day window
  • I take the date 5 years ago and do the same thing.
  • Then I take all the combinations of those pairs: 61 × 61 gives me a population of 3,721 date pairs
  • I take the closing prices of fund and benchmark on those dates

Yes, this is creating a bit of a bogus measurement: 5 years + 30 days is not 5 years. You can correct for that if you want.
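The steps above can be sketched roughly as follows. Everything here is an assumption for illustration: the prices are a synthetic random walk standing in for real closing prices, the dates are arbitrary, and the helper names `synthetic_prices` and `sliding_returns` are made up. The returns are annualised, which is one way to correct for the windows not being exactly 5 years long.

```python
from datetime import date, timedelta
from itertools import product
import math
import random
import statistics

def synthetic_prices(first, last, drift, vol, seed):
    """Hypothetical daily closes (a random walk). Swap in real data here."""
    rng = random.Random(seed)
    prices, price, day = {}, 100.0, first
    while day <= last:
        prices[day] = price
        price *= math.exp(rng.gauss(drift, vol))
        day += timedelta(days=1)
    return prices

def sliding_returns(prices, start, end, spread=30):
    """Annualised return for every (start +/- spread, end +/- spread) pair."""
    offsets = [timedelta(days=d) for d in range(-spread, spread + 1)]
    out = []
    for ds, de in product(offsets, offsets):
        s, e = start + ds, end + de
        years = (e - s).days / 365.25
        out.append((prices[e] / prices[s]) ** (1 / years) - 1)
    return out

start, end = date(2019, 6, 30), date(2024, 6, 30)
first, last = start - timedelta(days=30), end + timedelta(days=30)

# Synthetic data has a price on every calendar day; real data would need
# weekends and holidays rolled to the nearest trading day.
fund = synthetic_prices(first, last, 0.0003, 0.010, seed=1)
bench = synthetic_prices(first, last, 0.0002, 0.009, seed=2)

fund_r = sliding_returns(fund, start, end)
bench_r = sliding_returns(bench, start, end)
excess = [f - b for f, b in zip(fund_r, bench_r)]

cuts = statistics.quantiles(excess, n=20)  # 19 cut points, 5% apart
p5, p95 = cuts[0], cuts[-1]
beat = sum(x > 0 for x in excess) / len(excess)
print(f"{len(excess)} date pairs")
print(f"annualised excess return, 5th to 95th percentile: {p5:.2%} to {p95:.2%}")
print(f"fund beat benchmark in {beat:.0%} of date pairs")
```

The interesting output is not the single "official" 5-year number but how wide the 5th-to-95th percentile band is, and whether it straddles zero.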

Is this a realistic error? Don’t know, but it represents something like a real-world “what if” that gives me some comfort.

Portfolios perturbed

Having a portfolio with more than 50 stocks in it is also a great way to create some real-world adjustments which might give us a range of outcomes:

  • leave one out: randomly leave out a security, backfill with the index
  • drop a sector: I’ve tried this with things like oil, gambling, tobacco, etc. Surprisingly, the portfolio returns change, but not that much
  • change the timing: you can choose to slide dates past each other, make a trade later or earlier
  • perturb the weights: did you really decide to buy that stock up to 2.5%, or would 2.4% be just as good? Let’s find out!
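Two of these perturbations, leave-one-out and weight jitter, might look something like the sketch below. The portfolio is entirely hypothetical (60 equally weighted stocks with random returns and an assumed index return of 35%), and the function names are made up. Unlike the bullet above, this version leaves out every stock in turn rather than one at random, which gives the full range in one pass.

```python
import random

rng = random.Random(42)

# Hypothetical portfolio: 60 stocks, equal weights, made-up 5-year returns.
n = 60
stock_returns = [rng.gauss(0.40, 0.30) for _ in range(n)]
weights = [1 / n] * n
index_return = 0.35  # assumed benchmark return used for backfilling

def portfolio_return(weights, returns):
    return sum(w * r for w, r in zip(weights, returns))

def leave_one_out(weights, returns, index_return):
    """Replace each stock in turn with the index; one outcome per stock."""
    outcomes = []
    for i in range(len(returns)):
        swapped = returns[:i] + [index_return] + returns[i + 1:]
        outcomes.append(portfolio_return(weights, swapped))
    return outcomes

def perturb_weights(weights, returns, rel_noise=0.05, trials=1000):
    """Jitter each weight by up to rel_noise, renormalise, recompute."""
    outcomes = []
    for _ in range(trials):
        jittered = [w * (1 + rng.uniform(-rel_noise, rel_noise)) for w in weights]
        total = sum(jittered)
        outcomes.append(portfolio_return([w / total for w in jittered], returns))
    return outcomes

base = portfolio_return(weights, stock_returns)
loo = leave_one_out(weights, stock_returns, index_return)
wp = perturb_weights(weights, stock_returns)

print(f"base return:        {base:.2%}")
print(f"leave-one-out:      {min(loo):.2%} to {max(loo):.2%}")
print(f"perturbed weights:  {min(wp):.2%} to {max(wp):.2%}")
```

The spread between the minimum and maximum of each list is a crude error bar on the headline portfolio return.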

Forecasts and predictions should respect the historic base rate

Anything that is human-generated can just be fiddled with to create an “optimistic” or “pessimistic” outcome, whether it’s the deadline for a project, the amount of disk space needed, or the future earnings of a company you want to invest in.

Yesterday’s weather is probably the best meta-algorithm for estimating across any domain, but you can put an error on that. 

But we don’t need to guess completely: we can find a suitable range of outcomes from the history of “similar” cases. If nothing else, we should expect our own estimates to have at least that much uncertainty.
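One way to borrow a range from history is to look at how far past estimates were from what actually happened, then apply that spread to the new estimate. The sketch below assumes you have such a history; the numbers in `history` are invented and `estimate_range` is a hypothetical helper.

```python
import statistics

# Hypothetical history of "similar" cases: (estimated, actual)
history = [(10, 12), (20, 19), (5, 9), (30, 45),
           (8, 8), (15, 21), (12, 13), (25, 40)]
ratios = sorted(actual / est for est, actual in history)

def estimate_range(point_estimate, ratios):
    """Scale a new estimate by the historic spread of actual/estimate.

    Returns (low, middle, high) using the 10th percentile, median and
    90th percentile of the historic ratios.
    """
    deciles = statistics.quantiles(ratios, n=10, method="inclusive")
    return (point_estimate * deciles[0],
            point_estimate * statistics.median(ratios),
            point_estimate * deciles[-1])

low, mid, high = estimate_range(40, ratios)
print(f"estimate of 40 -> plausible range {low:.0f} to {high:.0f}, middle {mid:.0f}")
```

The percentiles chosen are arbitrary; the point is only that the new estimate inherits the error bar that similar past estimates earned.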

Projects are estimations too 

The whole basis of project management is that it is easier to estimate a task than a whole project, so I can estimate the whole project from its tasks. But without some error rates, we are asking for trouble. This is why agile is so nice: lots of data, lots of bottom-up estimates with baked-in uncertainty.

As with investment decisions, in project management it’s the downside and the missed deadlines that get the attention. But really: if you estimate two projects and project A is 40 person-days and project B is 50 person-days, do you really have enough precision to tell the difference?
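A quick simulation can put a number on that question. The sketch below assumes each project is made of 5-day tasks and that any task can land anywhere from 0.8 times to 2 times its estimate, with the estimate itself as the most likely value. Those multipliers, the task split and the name `simulate_project` are all assumptions for illustration, not measured data.

```python
import random

rng = random.Random(7)

def simulate_project(task_estimates, low=0.8, high=2.0, trials=10_000):
    """Simulated total effort, one value per trial.

    Each task is drawn from a triangular distribution between
    low * estimate and high * estimate, peaking at the estimate.
    """
    totals = []
    for _ in range(trials):
        totals.append(sum(rng.triangular(est * low, est * high, est)
                          for est in task_estimates))
    return totals

project_a = [5] * 8    # 8 tasks, 40 person-days on paper
project_b = [5] * 10   # 10 tasks, 50 person-days on paper

a = simulate_project(project_a)
b = simulate_project(project_b)

# How often does the "smaller" project end up being the bigger one?
share = sum(x > y for x, y in zip(a, b)) / len(a)
print(f"A turns out bigger than B in {share:.0%} of simulated runs")
```

If that share is far from zero, the 40 versus 50 comparison is not telling you much.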

Using real stats

I don’t know enough real stats. Whatever I tell you will probably be a mish-mash of things I’ve partly understood and whatever I read last.

Sadly, R and Python have got so good that there are many semi-automatic libraries and apps out there which will happily train a model and make predictions.

No error in measurement? Simulate your errors

Suggestion: if you don’t have a random error, you should try creating some. Just to see what it does.

You might say “something something bootstrap”, but there are lots of ways of doing it.
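For the "something something bootstrap" option, a minimal percentile bootstrap looks roughly like this: resample the data with replacement many times, recompute the statistic each time, and read the error bar off the spread. The daily request counts are invented and `bootstrap_ci` is a hypothetical helper, not a library function.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data)."""
    rng = random.Random(seed)
    estimates = sorted(
        stat(rng.choices(data, k=len(data)))  # resample with replacement
        for _ in range(n_resamples)
    )
    lo = estimates[int(n_resamples * alpha / 2)]
    hi = estimates[int(n_resamples * (1 - alpha / 2)) - 1]
    return lo, hi

# Hypothetical daily request counts pulled from an access log
requests = [120, 135, 128, 160, 142, 119, 131, 155, 149, 138, 127, 144]

lo, hi = bootstrap_ci(requests)
print(f"mean {statistics.mean(requests):.1f}, 95% interval {lo:.1f} to {hi:.1f}")
```

There was no random error on any individual count, yet the mean now comes with an error bar that says how much it depends on which days happened to be in the sample.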
