AI: Beware the Anecdotes

There is a schism in AI. The bull market continues to suggest something transformative is underway. Nvidia's shares jumped 9.3 percent in late May after another set of record quarterly results. Taiwan Semiconductor Manufacturing Co. (TSMC), sole chipmaker for Nvidia and Apple, recently surpassed $1 trillion in market capitalisation.

But questions are slowly emerging about the real utility of GenAI, the spark behind the fire. What is its killer use case?


In Same as Ever, Morgan Housel outlines enduring historical truths of human experience. One is to be wary when anecdotes and data diverge. He quotes Amazon founder Jeff Bezos: 

The thing I have noticed is when the anecdotes and the data disagree, the anecdotes are usually right. There’s something wrong with the way you are measuring it.


A recent Goldman Sachs report featuring contributions from MIT suggests GenAI may fall into this category. It weighs the technology's short- and long-term viability for investors. While several Goldman Sachs economists remained confident that a killer use case would arrive shortly, Jim Covello and Daron Acemoglu were more sceptical.

MIT economist Acemoglu said: 

Given the focus and architecture of generative AI technology today…truly transformative changes won’t happen quickly and few – if any – will likely occur within the next ten years.

With tech giants and others predicted to spend over $1 trillion on AI capex in the coming years, Jim Covello, Goldman Sachs' Global Co-Head of Single Stock Research, stated:

AI technology is exceptionally expensive, and to justify those costs, the technology must be able to solve complex problems, which it isn’t designed to do. 


The Financial Times' Undercover Economist, Tim Harford, illustrates this issue in a recent column using the Cheryl's Birthday logic puzzle. The popular conundrum asks the solver to deduce Cheryl's birthday from the handful of clues she provides to her friends Albert and Bernard.

ChatGPT answers impressively and correctly. It provides a reasoned answer, walking its prompter through the logic to the solution of July 16th. But changing the parameters ruins the illusion of rationality. Entering different names and dates, but with the same underlying logic, means ChatGPT can no longer rely on a response learned through trawling the entire internet. 
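The deductive chain behind that answer can be checked mechanically. Below is a minimal brute-force sketch in Python, using the ten candidate dates and three statements from the widely circulated 2015 version of the puzzle (illustrative only; this code is not from Harford's column):

```python
# Brute-force solver for the standard Cheryl's Birthday puzzle.
# Albert is told only the month; Bernard only the day.

DATES = [("May", 15), ("May", 16), ("May", 19),
         ("June", 17), ("June", 18),
         ("July", 14), ("July", 16),
         ("August", 14), ("August", 15), ("August", 17)]

# Statement 1: Albert doesn't know the date, and knows Bernard can't
# know it either => Albert's month contains no day that is unique
# across all ten dates.
unique_days = {d for _, d in DATES
               if sum(1 for _, x in DATES if x == d) == 1}
step1 = [(m, d) for m, d in DATES
         if not any(x in unique_days for mm, x in DATES if mm == m)]

# Statement 2: Bernard now knows => his day is unique among the
# dates that survived statement 1.
step2 = [(m, d) for m, d in step1
         if sum(1 for _, x in step1 if x == d) == 1]

# Statement 3: Albert now knows too => his month is unique among the
# dates that survived statement 2.
step3 = [(m, d) for m, d in step2
         if sum(1 for mm, _ in step2 if mm == m) == 1]

print(step3)  # [('July', 16)]
```

Swapping in new names and dates with the same structure changes nothing for a solver like this, which is exactly the variation that trips ChatGPT up.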

The GenAI tool consequently chooses bluff over humility. It gives a wrong answer but with a confidence and eloquence that would easily fool me. It shows, Harford says, that “large language models can be phenomenal bullsh*t engines”. And the danger is that the “bullsh*t is so terribly plausible”. 


Harford's example casts doubt on recent research hyping AI as a stock picker, such as a University of Chicago Booth School of Business paper in which three scholars showed ChatGPT performed better than humans at predicting company earnings from balance sheets and income statements. ChatGPT was accurate 60 percent of the time versus 57 percent for its human competitors, albeit only with carefully designed prompts.

And those prompts require knowledgeable human interaction. Financial analysts must correct the tool if it produces the same kind of errant analysis it gave when faced with the logic puzzle. Even then, with only a three-percentage-point edge over humans, the Financial Times' Robert Armstrong asks whether it is really a game changer compared with a low-fee Vanguard index fund, which has long outperformed the median analyst or stock picker.


Factoring in the energy costs of using these tools changes the equation further. A recent study from the Amsterdam School of Business and Economics found that AI applications could use as much power as the Netherlands by 2027. 


As a result, the Wall Street Journal reports that some of the tech giants are moving towards smaller models. Microsoft is publicising Phi, a set of language models roughly 1/100th the size of the model behind ChatGPT. They are geared towards more specific use cases and so don't need access to the same quantity of data. The Journal reports this as a reaction to the higher-than-anticipated operating costs of the company's multi-billion-dollar bet on AI.


The stock market reflects the bullish position big tech has taken on this new technology. Other companies pile in out of fear of being left behind. And while many pundits remain outwardly confident that the prices justify the hype, anecdotal evidence tells a different story. Already, companies are accepting that its use may be more limited than originally thought.


Others go further, arguing the whole thing is a busted flush. On Bloomberg’s Merryn Talks Money podcast, James Ferguson of the Macrostrategy Partnership said GenAI was effectively “useless” and had created a “fake it till you make it” bubble that could end in disaster. 

Ferguson points to those huge energy costs as well as the ongoing issue of hallucinations. As we've previously discussed in AI: Has the Bubble Burst?, sceptics such as Jeffrey Funk and Gary Marcus argue these errors may be impossible to fix. How can we find a compelling use for a tool that randomly lies?

Nevertheless, Ferguson believes the bubble may have longer to run. Investors are looking at the market and thinking it can’t last but also knowing that “if it lasts one more quarter and I’m not playing, I’ll lose my job”. 

The "fake it till you make it" attitude is a Silicon Valley staple. Its AI proponents fervently believe that the major business case will emerge as long as they keep riding the hype wave. But as progress plateaus, the pressure to admit that the Emperor has no clothes only grows.

