Studying large language models as compression algorithms for human culture

Abstract

Large language models (LLMs) extract and reproduce the statistical regularities in their training data. Researchers can therefore use these models to study the conceptual relationships encoded in that data (i.e., the open internet), which offers a remarkable opportunity to understand the cultural distinctions embedded in much of recorded human communication.

Publication
Trends in Cognitive Sciences
Date