
AI Wants to be Free... but it also wants to be Expensive

February 2024

This piece was originally written and translated into Korean as a part of LG Technology Ventures' Monthly Newsletter to business units and strategic partners. It is republished here with permission. Sensitive information has been removed.

In 1984, writer and Whole Earth Catalog founder Stewart Brand famously remarked “Information wants to be free,” which became a rallying cry for the open-source movement, pirates, and hackers alike. However, there is context to the quote: “On the one hand, information wants to be expensive… on the other hand, information wants to be free.” Brand’s thesis was that information wants to be free because it is so easy to copy and move, and it wants to be expensive because it is so valuable. At the dawn of the internet, that tension became extremely apparent as companies built out monetization strategies amid rampant piracy.


I believe that a similar paradox is present in AI.


AI wants to be free because many of the models are open source, and the best benchmarks estimate that open-source models are roughly 6-9 months behind the best closed-source models (currently GPT-4 Turbo). Further, these models are portable: most can run locally on a laptop or even a smartphone, and some of the most advanced models are only a few gigabytes in size. Next-generation AI cores in SoCs from Qualcomm, Intel, Nvidia, AMD, and virtually every other chip provider will accelerate inference on inexpensive edge devices even further. Elon Musk even remarked, “Our digital god will be in the form of a csv file,” and in effect, that’s all these models are: a large list of weights that are matrix multiplied against a text input.
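To make the “list of weights” point concrete, here is a toy sketch of what inference boils down to: a couple of weight matrices (invented here purely for illustration; real models have billions of weights and far more machinery) applied to an input vector via matrix multiplication and a nonlinearity.

```python
import math

def matvec(W, x):
    """Multiply a matrix W (a list of rows) by a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

# The entire "model": two tiny weight matrices (made-up values).
W1 = [[0.5, -0.2],
      [0.1,  0.8]]
W2 = [[1.0, -1.0]]

def forward(x):
    # Hidden layer: matrix multiply, then a tanh nonlinearity.
    h = [math.tanh(v) for v in matvec(W1, x)]
    # Output layer: another matrix multiply.
    return matvec(W2, h)

print(forward([1.0, 2.0]))
```

Stored on disk, W1 and W2 are just a flat list of numbers — which is what makes these models so easy to copy, compress, and run on commodity hardware.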


AI tools may even be getting easier to develop and deploy across hardware. Nvidia’s CUDA toolchain is facing pressure from the open-source ZLUDA project, which allows CUDA code to run on Intel and AMD GPUs. ONNX Runtime promises a universal translation layer for AI developers regardless of hardware, much as DirectX enables graphics development regardless of GPU. And Windows now ships AI tools built into the OS that run locally, like Windows Studio Effects and Windows Auto HDR. These free tools, although not state-of-the-art compared to premium Nvidia features, push AI toward being cheaper and more available to consumers.


At the same time, AI wants to be expensive. AI models are extremely costly to train, usually requiring months on clusters of the world’s most advanced (and scarce) GPUs. These chips are extremely expensive, reserved years in advance, and have propelled Nvidia to become one of the world’s most valuable companies. US lawmakers have even banned the export of some of these advanced chips to China.


Chips are not the only thing keeping AI expensive - data is too. Model training is reaching diminishing returns from increasing parameter count and context window alone, and most models today already scrape the entire open web for training data, so there are natural limits to what the open web alone can provide. Instead, some players are paying to license high-quality content from data providers: Nvidia is licensing images from Getty to train its models, OpenAI is licensing data from the New York Times, and Google is licensing data from Reddit for training. These data sets are expensive to license, and those costs will be passed on to customers.

AI clearly has enormous potential for value creation. It remains unclear, however, whether value capture will be concentrated in a few AI specialist companies, just as cloud computing’s value capture concentrated in Google, Microsoft, and AWS, or spread broadly among the companies that use the technology, as happened with electricity, where most of the value accrued to factories rather than electricity providers. AI wants to be expensive because it costs so much to train and operate at large scale. It wants to be free because open-source models are quickly gaining in capability, are small enough to be portable, and can run on virtually any consumer device. This natural tension will create interesting new business models and massive opportunities. The subscription model is unlikely to remain the dominant business model in the future.

[The remainder of this article has been redacted]
