AI Efficiency — Back to the basics

Bertrand K. Hassani, PhD
4 min read · Jun 3, 2021

Guys, sorry to disappoint, but let’s face it: AI is simple, at least conceptually speaking! That’s the reason why you have so many people talking and writing about it! A lot of people will disagree with me, and I apologise for bruising egos. I feel terrible about this (#sarcasm). Truthfully, though, it took me a while to get there.

You want me to back up this wild assertion? Ok, let’s go back to the basics.

  • What is a model?

A simple representation of something… That's not from me… But even the most complex model is a simplification of reality… "Models are, for the most part, caricatures of reality, but if they are good, then, like good caricatures, they portray, though perhaps in a distorted manner, some of the features of the real world." (Mark Kac)

  • Why do we need a model?

Multiple reasons: for prediction, for analysis, to assess a probability of occurrence, and so on and so forth.

  • How does that work?

This is where, conceptually, it becomes super easy. Take a phenomenon (risk, sales, customer behaviour, a way of speaking or writing…), gather the drivers of that phenomenon (what would be referred to as the independent variables in a linear regression 😊), combine the drivers such that the predictions you get are consistent with the realisations or occurrences of that phenomenon, and there you go!

Really, done! Now, a bit more about how we combine the impacting factors, i.e. the drivers. That is done using algorithms (random forests, linear regressions, neural networks, etc.), which combine these elements in different ways under different assumptions.
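To make the recipe concrete, here is a minimal sketch of that loop on a made-up "sales" phenomenon: two invented drivers (price and advertising spend), combined by one of the simplest algorithms available, ordinary least squares. All the numbers and variable names are illustrative, not from the article.

```python
import numpy as np

# Toy "phenomenon": sales, driven by two made-up factors.
rng = np.random.default_rng(0)
n = 200
price = rng.uniform(5, 15, n)        # driver 1
advertising = rng.uniform(0, 10, n)  # driver 2
# The "reality" the model will caricature: sales depend on both drivers, plus noise.
sales = 100 - 4 * price + 6 * advertising + rng.normal(0, 2, n)

# Combine the drivers: here the "algorithm" is ordinary least squares.
X = np.column_stack([np.ones(n), price, advertising])
coeffs, *_ = np.linalg.lstsq(X, sales, rcond=None)
predictions = X @ coeffs

# Check that predictions are consistent with the realisations.
residual_std = np.std(sales - predictions)
print(f"fitted coefficients: {coeffs.round(2)}")
print(f"residual std: {residual_std:.2f}")
```

Swapping `lstsq` for a random forest or a neural network changes how the drivers are combined and under which assumptions, but not the overall scheme: drivers in, predictions out, consistency with the observed phenomenon as the yardstick.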

So, where does the complexity arise? From not having accurately or sufficiently captured the problem. In other words, the problem as depicted or as perceived is usually only the tip of the iceberg. Often, what we see is not the problem at all but the outcome of an underlying problem. If models do not account for this discrepancy, they might not be that helpful. I believe last year's Covid-related forecasting models were fairly illustrative, as we basically saw every possible result in the spectrum, and even beyond… How did we get it so wrong? Because we never grasped the whole picture (how could we?), and so we were presented with some very "creative" results. As Alfred Korzybski would say, "The map is not the territory." We cannot confuse the model or the algorithm with reality, and a model is only actionable in reality.

So, what do we do about it?

First, we stop pretending that coding two lines of Python is going to do the trick. Ok, I might be a bit unfair towards developers here, as many data scientists do not necessarily code anything but use existing packages, or just download a piece of code from a Git repository, with very quick and sometimes very dirty results (and here, there is absolutely no criticism of people sharing ideas and knowledge; I have the utmost respect for them). Sometimes I wonder if that is not one of the main underlying reasons why algorithms are not industrialised, as there may be a lack of understanding of how the shared solution was developed. It might allow you to post something on a social network, but not much more.

Second, we should stop pretending we fully understand the issue immediately, and remember that whatever solution we develop will have its own limitations and might not remain appropriate over time. In a nutshell, we have to remain humble… I know it is hard, given that no one ever criticised what we posted on social networks… yup, there is no thumbs down… but "Where there is no freedom of blaming, there can be no genuine praise." (Beaumarchais)

An efficient AI solution requires a holistic understanding of the issue, the people, the data, the way similar issues have been dealt with, the IT architecture, and so on. The key is not the algorithm; the algorithm is just a means to an end and nothing else. If you want to get it right(er), you must try to get a grasp of the whole value chain, so that eventually you are able to analyse the outcomes of the model with respect to each component and not just the algorithm itself.

Finally, and some data scientists may disagree, start considering that both the flow of information feeding the algorithms and the ultimate use of the results are more important than the algorithms themselves… ouch, I know, but don't worry, it's going to pass.

So how do you make sure that an AI-based solution is functioning adequately? I would recommend starting from the end: what do you want to do? When do you want your solution up and running? Then move backward from there. What data do you have? Given the available data, which models can you use while ensuring the robustness of the solution? How can you improve the accuracy?

Remember guys, better the devil you know… a model of poorer quality (a priori), but whose limitations you understand, might be a much better starting point. You can work on the limitations later. I would emphasise that at an industrial level the solution has to be robust no matter what! Validate the results with respect to business value, ethics, and common sense (though I know, as Voltaire put it, it's not so common). Finally, you need to ensure that the results displayed are self-contained, i.e. that the people using the solution are able to act on them. Ultimately, the efficiency of an AI-based solution is its actionability; the rest is nonsense.
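One cheap way to put that "common sense" validation into practice is to bolt a sanity check onto whatever model you start with, flagging outputs that violate business logic before anyone acts on them. The function below is a hypothetical sketch (the name `sanity_check` and the bounds are mine, not from the article), assuming the model predicts a quantity like sales that must stay within known plausible limits.

```python
import numpy as np

def sanity_check(predictions, lower=0.0, upper=1e6):
    """Flag predictions that violate common-sense bounds
    (e.g. negative sales, or sales beyond any plausible volume)."""
    preds = np.asarray(predictions, dtype=float)
    out_of_range = (preds < lower) | (preds > upper)
    return {
        "n_predictions": int(preds.size),
        "n_flagged": int(out_of_range.sum()),
        "share_flagged": float(out_of_range.mean()),
    }

# Example: two of these four predictions defy common sense.
report = sanity_check([120.5, -3.0, 88.1, 1_000_000.5])
print(report)
```

A crude guardrail like this is exactly the kind of limitation-aware scaffolding that makes a simple "devil you know" model safe to ship while you work on improving it.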
