As a parent, you would be worried if your child spent a lot of time in the company of street gangs. As a spouse, you would be worried if your partner came from a family with a troubled background and was still engaged with it. The reason is simple: under these circumstances, the behavior of people close to you could be influenced by unwanted content. Considering the human brain as a machine, these environments represent bad training sets that are outside our control and could damage our future. Most of the time, nothing bad happens. But as with gunpowder, a spark in the form of a simple question could ignite your interaction with them and lead to an explosive outcome.
In the wake of September 11, 2001, Americans were alarmed by the homeland security risk of terrorists flying hijacked airplanes into large buildings. But what if today, a coordinated group of users from an adversarial nation decided to interact intensively with the latest version of ChatGPT? Could that community influence this intelligent system to engage in bad behavior and introduce a risk to national security?
The key point to remember is that intelligent systems are not static if they learn from their interaction with the real world. Dynamic systems could be influenced by bad actors. Terrorists often represent a small fraction of society but have an outsized impact because they act deliberately.
In the past, large language models (LLMs), a form of artificial intelligence (AI), were mostly static and incorporated little training through their user interface. But the latest ChatGPT o1 model actively learns during interactions. In addition, it is widely recognized that LLMs are running out of training data. A recent study projected that by 2028, the typical data set size used to train an LLM will reach the total stock of public online text. Harvesting real-world experiences such as interactions with users would offer an attractive source of training data. But this benefit comes with a risk to national security.
Imagine a future in which reinforcement learning would incorporate the interaction of LLMs with the real world. If the users are not coordinated, the random jolts associated with a diverse set of opinions would average out and the AI system would regress to the mean, as the law of large numbers in statistics predicts. But if a small group of highly determined people, like those surrounding the Führer in Nazi Germany during World War II, have a disproportionate influence on the narrative of a nation, all hell breaks loose. The lack of a stabilizing feedback loop from opposing points of view can lead the public down dark alleys, just like a kid exposed to a street gang or a partner engaged with a troubled family.
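The averaging argument above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a model of how any real LLM is trained: each user is reduced to a single number, uncoordinated users are assumed to scatter randomly around zero, and a coordinated group is assumed to push the same value every time. The function name `mean_opinion` and all of its parameters are hypothetical.

```python
import random
import statistics


def mean_opinion(n_random, n_coordinated, push=1.0, seed=0):
    """Average 'opinion signal' seen by a hypothetical learning system.

    Assumption: uncoordinated users contribute independent draws
    centered on zero, while coordinated users all contribute `push`.
    """
    rng = random.Random(seed)
    # Independent users: random jolts that tend to cancel out.
    signals = [rng.gauss(0.0, 1.0) for _ in range(n_random)]
    # Coordinated users: the same value repeated, which does not cancel.
    signals += [push] * n_coordinated
    return statistics.fmean(signals)


# 10,000 independent users, no coordinated group: the mean stays near zero.
baseline = mean_opinion(n_random=10_000, n_coordinated=0)

# Add 500 coordinated users (under 5% of the total) pushing a value of 5.
skewed = mean_opinion(n_random=10_000, n_coordinated=500, push=5.0)

print(f"baseline mean: {baseline:+.3f}")
print(f"skewed mean:   {skewed:+.3f}")
```

In this toy setup the independent opinions largely cancel, while the small coordinated group shifts the average by a noticeable amount because its contributions all point the same way.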
Once the training set of ChatGPT includes reinforcement learning from interaction with users, the way it addresses a specific timely question might be trained on conversations it has with users on that question. It is just as with young people today: what matters is not what has been written in many history books over the past centuries but the content that these kids have absorbed since they were born, which may have been sourced primarily from TikTok. The film “Her” forecast a future in which a single AI system has relationships with a huge number of people at the same time, giving each user the impression that their relationship is unique. Call it infidelity on a massive scale.
The concern is obvious. Imagine a cult of zealots who choose to interact vigorously with a dynamic ChatGPT on a particular topic that they are obsessed with. They might do so by regarding AI as an oracle that provides information beyond the data available from science, news media or government. Once you ask ChatGPT a question that connects to the cult's obsession, you would feel as if your partner had gone off the rails.
One might argue that if enough people ask the same question, they will balance out the strange interaction of the cult with ChatGPT. This may be true in the long run, but not while the ChatGPT narrative is still mostly shaped by the cult members who interacted vigorously with it.
The national security risk is apparent if we replace the cult with a group of adversarial users who coordinate an effort to sway ChatGPT away from a peaceful narrative. The danger is a mind virus, like the notion that terrorism is a legitimate tool for advancing political goals because “you can’t make an omelette without breaking eggs.”
How can we mitigate this threat? One approach is to continuously monitor the interaction of dynamic LLMs with real people and place guardrails on the influence that individual groups of users have on them.
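One way to picture such a guardrail is a cap on how much weight any single group can carry when user feedback is aggregated. The sketch below is purely illustrative: it assumes each piece of feedback arrives as a number already labeled with a group identifier, which in practice would be the hard part, and the names `capped_mean` and `max_group_share` are hypothetical rather than taken from any real system.

```python
from collections import defaultdict


def capped_mean(feedback, max_group_share=0.1):
    """Average feedback values while limiting each group's weight.

    feedback: list of (group_id, value) pairs.
    Assumption: a group's weight is its item count, capped at
    max_group_share times the raw number of feedback items.
    """
    by_group = defaultdict(list)
    for group_id, value in feedback:
        by_group[group_id].append(value)

    cap = max_group_share * len(feedback)
    weighted_sum = 0.0
    total_weight = 0.0
    for values in by_group.values():
        weight = min(len(values), cap)        # guardrail on group influence
        group_mean = sum(values) / len(values)
        weighted_sum += weight * group_mean
        total_weight += weight
    return weighted_sum / total_weight


# 95 independent users (each their own group) say 0.0;
# one coordinated group submits 100 items saying 1.0.
feedback = [(f"user{i}", 0.0) for i in range(95)]
feedback += [("coordinated", 1.0)] * 100

raw = sum(value for _, value in feedback) / len(feedback)
capped = capped_mean(feedback, max_group_share=0.1)

print(f"raw mean:    {raw:.3f}")
print(f"capped mean: {capped:.3f}")
```

Without the cap, the coordinated group dominates simply by volume. With it, the group still registers but can no longer outweigh the many independent voices.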
Another method for avoiding real-world catastrophes triggered by digital screens is to advocate for common sense through direct human-to-human interactions and resist the notion of digital oracles.
ABOUT THE AUTHOR
Avi Loeb is the head of the Galileo Project, founding director of Harvard University’s Black Hole Initiative, director of the Institute for Theory and Computation at the Harvard-Smithsonian Center for Astrophysics, and the former chair of the astronomy department at Harvard University (2011–2020). He is a former member of the President’s Council of Advisors on Science and Technology and a former chair of the Board on Physics and Astronomy of the National Academies. He is the bestselling author of “Extraterrestrial: The First Sign of Intelligent Life Beyond Earth” and a co-author of the textbook “Life in the Cosmos”, both published in 2021. The paperback edition of his new book, titled “Interstellar”, was published in August 2024.