Parrots and Octopi vs. Domain Models
Two Ways of Thinking about AI Tools and Their Implications
This article is the second in the “AI Is Not Smart” conversation. The first post briefly discussed, from Chalmers’ perspective, the idea that text generators (large language models/natural language processors) process text by categorizing and associating data and information. This contradicts the “language-not-context,” “statistical text processing” emphasis advanced by Bender et al. in their “stochastic parrot” article.
In this article I am going to compare these two perspectives on the nature of AI tools more directly and explicitly. The reality of generative AI tools, especially text generators, is probably a hybrid of the two models.
The Impetus: Copyright Implications of the Nature of AI
This post grows out of a thread that I put on BlueSky (I have an account there now: bringthehuman.bsky.social). I shared it on multiple other platforms, and it started a few conversations.
“The recently renewed exemption to the DMCA means that use of copyrighted materials to train generative AI tools for research purposes is more broadly considered ‘fair use’ than it was before.
PLEASE NOTE that there is a PROPOSED CLASS that was REJECTED that had to do with generative AI safety features. The APPROVED class, which still has to do with generative AI, was labelled "Text and Data Mining."
The important thing to note here is that researchers, whether internal or external, are not allowed to DOWNLOAD the raw data (the books or movies being used), called the "corpus." They can only download the values and manipulated data (analyses). However, they can ACCESS the works for analytics.
This is related loosely to David Wiley's brief presentation at the beginning of the AECT 2024 International Convention copyright panel: AI models do not rely on the data. They rely on the values assigned to data pieces AFTER the original data has been deleted. In other words, value-formatted METADATA.
I will also reference Chalmers' article again, in which he somewhat refutes Bender's and Gebru's claims that LLMs and genAI tools in general are trained through purely statistical, probabilistic methods. He notes that, according to them, LLM tools simply repeat communication patterns without analyzing those patterns. They are just repeating what a human user would say or do in a given situation.
However, according to multiple recent research projects that align with the “domain model” theory of Chalmers, AI tools CAN analyze concepts and create new connections. Does this mean they are conscious? OF COURSE NOT. But they ARE creating contextual and linguistic connections.
My point in bringing up these two disparate ideas in this one thread is that IF statistical and probabilistic analysis is just bringing about "stochastic parrots," then the copyright holders have very little to worry about unless users maliciously prompt the AI.
ON THE OTHER HAND...
If the machines are making contextual connections on their own and using user data to create things based on these new connections (out of data values and NOT the original works used in training), then while it may not be eligible for copyright protection of its own, it is probably NOT infringing.”
Let us examine these models more deliberately and think about their implications. I have to say, based on my work with AI tools, that I agree with Chalmers’ “domain models” point of view more than I do the “stochastic parrots” argument.
Bender: Stochastic Parrots and Hyperintelligent Octopi
Emily Bender’s analogy of "stochastic parrots" critiques large language models (LLMs) as entities that regurgitate statistical patterns without true understanding. According to Bender et al. (2021), these models are sophisticated in mimicking human-like text due to vast datasets and complex training algorithms, yet lack comprehension or reasoning. They do not have contextual information to aid in complex decisions.
Bender has expressed a similar way of looking at text generators in her “octopus test.” In a paper she wrote with Alexander Koller, “Climbing towards NLU,” two individuals are stranded on separate islands and communicate through a text-based system. Below them, a hyperintelligent octopus analyzes the text of these messages, but it lacks the island environment and the human life that would give the messages context. In generic conversations, the octopus can successfully mimic Island 1’s communications to Island 2, and vice versa. However, Bender and Koller argue, the octopus would be a useless advisor if a uniquely land-based or human challenge were to occur. In other words, without context, the octopus cannot produce truly useful text that responds to the meaning and subtext of prompts.
These two animal analogies highlight a key limitation of LLMs: they do not form genuine semantic connections; instead, they predict which words are likely to come next based on probability. The "stochastic parrot" label underscores that AI tools, despite their apparent fluency, should not be confused with entities possessing intention or insight.
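To make the “parrot” framing concrete, here is a minimal sketch of pure next-word prediction. The tiny corpus and the bigram-counting approach are invented for illustration and are vastly simpler than a real LLM, but they show what “predicting the next word by probability” means:

```python
import random
from collections import Counter, defaultdict

# A toy "training corpus" -- invented for illustration only.
corpus = "the parrot repeats the phrase and the parrot repeats the pattern".split()

# Count which word follows which; this table is the model's entire "knowledge."
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Pick the next word in proportion to how often it followed `word` in training."""
    counts = following[word]
    if not counts:  # dead end: fall back to a random training word
        return random.choice(corpus)
    choices, weights = zip(*counts.items())
    return random.choices(choices, weights=weights)[0]

# Generate text by repeatedly sampling statistically likely continuations.
output = ["the"]
for _ in range(6):
    output.append(predict_next(output[-1]))
print(" ".join(output))
```

Nothing in this sketch encodes meaning; the only “knowledge” is a table of how often one word followed another. The stochastic-parrot argument holds that LLMs, at an enormously larger scale, are doing essentially this.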
Still, the replies of generative AI tools suggest that they are not just using the statistical probability of their training data to respond. In his critique of Bender et al.’s argument, Chalmers stated that while probability prediction training, called “string matching,” certainly is a factor in training, “that doesn’t mean that their post-training processing is just string matching. … All kinds of other processes may be required.”
In other words, some kind of context must be added to all of those probabilities and values; probabilities are not the only factor determining the output. Therefore, the “stochastic parrot”/“octopus” model of thinking about generative AI is not only untrue but misleading.
Chalmers: Domain Models
In contrast, David Chalmers and others propose that advanced AI tools can function as "domain models" or "world models." This framing suggests that these models may internalize abstract representations of the environments or systems they are trained on. As Chalmers (2023) notes, the potential for models to approach understanding lies in their capacity to simulate or replicate intricate domains.
This view posits that AI might possess rudimentary forms of understanding—albeit not comparable to human cognition. It challenges the notion that AI outputs are purely mechanical, suggesting that, with proper fine-tuning, these models can reason and adapt in ways that transcend the "parrot" analogy. This interpretation is especially relevant for specialized applications like medical diagnostics or legal analytics, where nuanced understanding is crucial.
In their 2018 paper “World Models,” David Ha and Jürgen Schmidhuber demonstrated that AI agents could be trained in simulated environments generated by their own internal models, effectively allowing them to “dream” and learn from these experiences. This method reduces the need for extensive real-world interactions, thereby improving training efficiency.
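As a rough, hypothetical illustration of that loop (not Ha and Schmidhuber’s actual architecture, which combines a variational autoencoder, a recurrent dynamics model, and an evolved controller), the sketch below fits a deliberately tiny one-dimensional “world model” from a handful of real transitions and then evaluates controllers entirely inside that learned model:

```python
import random

# The "real" environment: the state drifts by the chosen action plus noise.
def real_step(state, action):
    return state + action + random.gauss(0, 0.05)

# 1. Collect a small set of real transitions (state, action, next_state).
transitions = []
state = 0.0
for _ in range(200):
    action = random.uniform(-1, 1)
    next_state = real_step(state, action)
    transitions.append((state, action, next_state))
    state = next_state

# 2. Fit a minimal world model: assume next_state = state + k * action, and learn k.
k = (sum(a * (s2 - s1) for s1, a, s2 in transitions)
     / sum(a * a for _, a, _ in transitions))

def dream_step(state, action):
    """Predict the next state using only the learned model -- no real environment."""
    return state + k * action

# 3. Evaluate candidate controllers entirely inside the "dream."
def score_in_dream(gain, target=1.0, steps=20):
    s = 0.0
    for _ in range(steps):
        s = dream_step(s, gain * (target - s))  # simple proportional controller
    return -abs(target - s)  # closer to the target is better

best_gain = max((g / 10 for g in range(1, 11)), key=score_in_dream)
print(f"learned dynamics k = {k:.2f}; best gain found in the dream: {best_gain}")
```

The point of the sketch is the division of labor: the real environment is touched only while gathering transitions; all controller evaluation happens inside the learned model.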
These ideas were expanded upon in 2020 and 2021. Ammanabrolu et al. explored the ability of genAI text generators to use their “world models” to create contextual connections and new ideas and items in fictional worlds. They found that AI tools were able to build thematic knowledge graphs and guide themselves through creating a complex world. Humans created the initial baselines, but generative AI tools were able to moderate their own actions according to those baselines and to rules they created or implied themselves.
Ammanabrolu and Riedl expanded on the 2020 project by examining a text generator’s ability to respond productively to zero-shot prompts about new worlds. They found that using a text generator “significantly outperformed” other techniques for creating world models.
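A minimal sketch of the underlying idea appears below, assuming the common (subject, relation, object) triple format for knowledge graphs; the facts themselves are invented and are not drawn from either paper:

```python
from collections import defaultdict

# A tiny knowledge-graph world model: facts are (subject, relation, object) triples.
graph = defaultdict(set)

def add_fact(subject, relation, obj):
    graph[subject].add((relation, obj))

# Baseline facts a human author might seed the fictional world with.
add_fact("castle", "contains", "throne room")
add_fact("throne room", "contains", "iron key")
add_fact("iron key", "opens", "dungeon gate")

# A newly generated detail is folded back into the graph, extending the world model.
add_fact("dungeon gate", "leads to", "underground lake")

def describe(entity):
    """Turn the graph's facts about an entity into grounded statements."""
    return [f"The {entity} {relation} the {obj}." for relation, obj in sorted(graph[entity])]

for entity in ("castle", "iron key", "dungeon gate"):
    print(" ".join(describe(entity)))
```

Because new facts are folded back into the graph, later outputs can stay consistent with the world that has been built so far, which is the self-moderation described above.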
Implications of Stochastic Parrots
The implications of treating AI as stochastic parrots are far-reaching. First, it cautions against overreliance on AI for critical decision-making, emphasizing human oversight. Misunderstanding the limitations of LLMs could lead to errors in sensitive fields such as healthcare or governance. Second, this view amplifies ethical concerns about training data origins, as AI systems reproduce biases inherent in their inputs.
Bender’s analogy also shapes how we approach transparency in AI. If models merely echo their training data, developers have a responsibility to disclose how these datasets are curated. Calls for more transparency regarding AI datasets and for open AI ecosystems resonate with this need for accountability.
However, the need for transparency does not automatically make AI tool creators liable for copyright infringement. After all, the tools are only repeating part of what their value-and-probability structure tells them to generate. Only through malicious and deliberate prompting can AI tools be coaxed into reproducing enough of a copyrighted work to be held accountable. The “stochastic parrot” structure does not lend itself to copyright infringement.
Implications of Domain Models
Viewing AI as domain models broadens their scope of application. For instance, in scientific research, AI tools can hypothesize and simulate phenomena beyond human capacity, contributing to breakthroughs in climate modeling or drug discovery. This perspective encourages deeper exploration into whether AI can achieve higher-order understanding through advanced architectures and training methods such as reinforcement learning.
Stochastic parrots only repeat words without understanding connections, so they cannot, on their own, create works that infringe copyright. If there are outputs that do infringe, that is the user’s fault, not the machine’s.
Machines following the “domain model” work according to contextual connections they make on their own. They create new outputs that combine new user data with old concepts, connections, and context. They form new connections that reinforce and revise preexisting “clouds,” which one can think of as something like word clouds. Outputs are shaped by the form of these cloud connections. Again, as discussed in a previous post, the text of the training data is replaced by assigned values; therefore, the connections in the cloud are connections between data values, NOT the original works used in training. Thus, if one follows the “domain model,” AI-generated materials may not be eligible for copyright protection of their own, but they are probably NOT infringing.
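As a minimal sketch of what “connections between data values” might look like, the example below compares invented three-dimensional value vectors with cosine similarity; real models learn far larger vectors during training, but the principle is the same:

```python
import math

# Toy "value" representations. The numbers are invented for illustration;
# the original training text is not stored alongside them.
values = {
    "castle":   [0.9, 0.1, 0.3],
    "fortress": [0.8, 0.2, 0.35],
    "banana":   [0.1, 0.9, 0.2],
}

def cosine(a, b):
    """Connection strength between two value vectors in the 'cloud.'"""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

print(f"castle ~ fortress: {cosine(values['castle'], values['fortress']):.2f}")
print(f"castle ~ banana:   {cosine(values['castle'], values['banana']):.2f}")
```

The original sentences that produced these numbers are nowhere in the structure; only the values and their relative positions remain, which is why outputs built from them are arguably not copies of the training works.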
What Does This Mean for Education?
In education, both paradigms suggest transformative yet cautious adoption. As stochastic parrots, AI can serve as powerful tools for generating drafts, summaries, or personalized learning paths, but educators must teach critical engagement with outputs to avoid uncritical acceptance. Domain models, on the other hand, promise adaptive learning technologies capable of responding to complex queries and simulating real-world problem-solving scenarios.
Pedagogical frameworks like the Technology Consumer or Producer (TCoP) Model encourage balanced AI integration, emphasizing collaboration over automation. This approach prepares students to ethically and effectively engage with AI in professional and academic contexts.
Conclusion
The "stochastic parrots" and "domain models" paradigms offer two compelling lenses through which to evaluate AI. While Bender emphasizes linguistic and contextual limitations, Chalmers provides a vision of AI as a transformative ally in specialized domains. Both perspectives underline the necessity of human oversight and the importance of critical frameworks for AI literacy.
References
Ammanabrolu, P., Cheung, W., Tu, D., Broniec, W., & Riedl, M. O. (2020, January 28). Bringing stories alive: Generating interactive fiction worlds. arXiv. https://arxiv.org/abs/2001.10161
Ammanabrolu, P., & Riedl, M. O. (2021, October 20). Learning knowledge graph-based world models of textual environments. arXiv. https://arxiv.org/abs/2106.09608
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922
Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185–5198. https://doi.org/10.18653/v1/2020.acl-main.463
Chalmers, D. (2023, August 11). Could a large language model be conscious? Boston Review. https://www.bostonreview.net/articles/could-a-large-language-model-be-conscious/
Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2024, April 17). Mastering diverse domains through world models. arXiv. https://doi.org/10.48550/arXiv.2301.04104
Hepler, R. (2024). Reed C. Hepler (@bringthehuman.bsky.social). “The recently renewed exemption...” thread. https://bsky.app/profile/bringthehuman.bsky.social/post/3lbao2rppbc2z
Library of Congress Copyright Office. (2024, October 28). Exemption to prohibition on circumvention of copyright protection systems for access control technologies. Federal Register, 89(208). https://www.govinfo.gov/content/pkg/FR-2024-10-28/pdf/2024-24563.pdf