LLMs: Are a Few Terabytes Enough to Hold Human Knowledge?

Photo by Marta Branco on Pexels

A recent Reddit discussion on r/artificial has sparked debate about whether Large Language Models (LLMs) can represent the entirety of human knowledge. The core question is whether the relatively small size of models like GPT-4, reportedly a few terabytes of weights, implies a highly compressed and generalized encoding of all recorded human understanding. Participants acknowledge that model size is a rough proxy for knowledge capacity and that "knowledge" itself is difficult to define and quantify, yet the thread highlights both the potential and the limits of LLMs as stores of vast amounts of information. The original discussion can be found at https://old.reddit.com/r/artificial/comments/1odr0nh/can_we_measure_the_amount_of_written_human/
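
One way to ground the size comparison at the center of the thread is a back-of-envelope calculation that sets the storage occupied by a model's weights against the raw text it might have been trained on. The sketch below uses purely hypothetical figures (GPT-4's parameter count and training corpus have not been disclosed), so every constant is an assumption meant only to illustrate the kind of compression ratio the discussion is gesturing at.

```python
# Back-of-envelope comparison of model weight size vs. training text size.
# All constants are illustrative assumptions, not confirmed figures.

ASSUMED_PARAMS = 1.8e12          # hypothetical parameter count (~1.8 trillion)
BYTES_PER_PARAM = 2              # 16-bit weights (fp16/bf16)

model_bytes = ASSUMED_PARAMS * BYTES_PER_PARAM

ASSUMED_TRAINING_TOKENS = 13e12  # hypothetical training set (~13 trillion tokens)
BYTES_PER_TOKEN = 4              # rough average of ~4 bytes of text per token

corpus_bytes = ASSUMED_TRAINING_TOKENS * BYTES_PER_TOKEN

print(f"Model weights: ~{model_bytes / 1e12:.1f} TB")
print(f"Training text: ~{corpus_bytes / 1e12:.1f} TB")
print(f"Apparent compression ratio: ~{corpus_bytes / model_bytes:.1f}x")
```

Under these made-up numbers, a few terabytes of weights would stand in for tens of terabytes of text, which is why the thread frames LLMs as lossy compressors of their training data rather than literal archives of human knowledge.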