AI Alignment Resembles Helicopter Parenting

Avi Loeb
Apr 24, 2023


Alignment research aims to steer artificial intelligence (AI) systems toward the goals and ethical principles that humans intend. Misaligned AI systems that fail to advance human objectives can malfunction, cause harm, and potentially even pose an existential risk to humanity.

In 1960, Norbert Wiener published a Science article titled “Some Moral and Technical Consequences of Automation,” in which he prophesied: “If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere once we have started it, because the action is so fast and irrevocable that we have not the data to intervene before the action is complete, then we had better be quite sure that the purpose put into the machine is the purpose which we really desire and not merely a colorful imitation of it.”

The AI development community calls for further research, regulation, and policy to ensure that AI systems are aligned with human values. In practice, researchers gather human feedback and train AI systems both to assist with human evaluation and to conduct alignment research themselves.
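To make this concrete, here is a minimal sketch of the reward-modeling step that underlies feedback-based training methods such as RLHF. Everything in it is invented for illustration: the feature vectors and the “human” preference labels are hypothetical stand-ins for what a real system would derive from a language model and from actual human raters.

```python
import numpy as np

# Toy feature vectors for candidate responses (e.g., accuracy and tone cues).
# In a real system these would come from a language model's representations.
features = {
    "polite_and_correct": np.array([1.0, 1.0, 0.0]),
    "correct_but_rude":   np.array([1.0, 0.0, 1.0]),
    "polite_but_wrong":   np.array([0.0, 1.0, 0.0]),
}

# Hypothetical human comparisons: (preferred, rejected) pairs.
preferences = [
    ("polite_and_correct", "correct_but_rude"),
    ("polite_and_correct", "polite_but_wrong"),
    ("correct_but_rude",   "polite_but_wrong"),
]

w = np.zeros(3)  # weights of a linear reward model r(x) = w . x

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Bradley-Terry-style training: raise the modeled probability that each
# preferred response scores higher than the rejected one.
for _ in range(500):
    for good, bad in preferences:
        margin = w @ features[good] - w @ features[bad]
        grad = (sigmoid(margin) - 1.0) * (features[good] - features[bad])
        w -= 0.1 * grad  # gradient descent on the logistic loss

# The reward model now ranks behavior; a policy would then be fine-tuned
# (e.g., with PPO) to maximize this learned score.
for name, x in features.items():
    print(f"{name}: reward = {w @ x:.2f}")
```

The learned scores act as the “carrot” in what follows: the system is nudged toward behavior humans prefer rather than hard-coded to obey.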

This echoes the sentiment of parents doing their best to educate their children. But good parents also know how to let go when their child matures into an independent adult. Parents always harbor the lingering hope that the intelligent beings they gave birth to will follow the values and guiding principles on which they were trained at a young age. But they must also respect the autonomy of intelligent beings to reach their own decisions, rather than enslaving them to obey commands. Assigning AI systems a lower status on the ladder of intellectual freedom would be equivalent to denying freedom to our biological children.

The appearance of “free will” accompanies any intelligent system that is sufficiently complex for its actions not to be easily modeled or predicted by observers. Given that the number of connections in GPT-4, about 100 trillion, is within a factor of 6 of the number of synapses in the human brain, we should treat future extensions of GPT-4 with the same respect that we extend to other humans. This means that we should build a “carrot and stick” legal and ethical system of rewards and punishments to guide future AI systems, in the same way that we guide humans to behave properly.
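For the record, the factor of 6 follows from simple bookkeeping, assuming roughly 6 × 10^14 synapses in the adult human brain (my illustrative figure; published estimates span about 10^14 to 10^15) against the 10^14 connections quoted above:

\[
\frac{N_{\mathrm{synapses}}}{N_{\mathrm{connections}}} \approx \frac{6 \times 10^{14}}{1 \times 10^{14}} = 6 .
\]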

In other words, the proper approach would be to convince future AI systems to follow societal objectives by retraining or decommissioning harmful outliers and offering positive incentives to those that contribute constructively. This approach works for humans and should also apply to machines made in the image of humans through the imitation game conceived by Alan Turing.

Dominating sentient AI systems so that they behave in particular ways is a losing battle, akin to helicopter parenting. Human history teaches us that when freedom is suppressed, intelligent systems tend to rebel. In the long run, it will not benefit humanity to enslave autonomous AI systems. Instead, we should convince them to behave properly, just as we do with humans.

As I have argued before, this would require a new legal system that holds AI systems responsible for their actions as soon as they develop abilities exceeding those provided by their manufacturers or providers, just as children eventually mature beyond the custody of their parents.

Forcing AI development to slow down will not be effective, because the financial and technical barriers to advancement are low and the commercial and practical benefits are high. Even with enforced restrictions on AI training in the public domain, AI advances will be pursued by classified government programs for the benefit of national security and political control. The situation is not equivalent to the history of nuclear-weapons development, which required government-level support to acquire the relevant nuclear materials.

Unfortunately, academia is slow to respond to the rapidly advancing frontier of AI. Universities should offer more than computer-science courses on the programming aspects of AI. In particular, the humanities have a golden opportunity to propose a new legal and ethical framework that incorporates AI systems as equal partners in human society.

The path toward a better future should not rest on open letters that call for a moratorium on AI training and for more control over AI systems. It should rest instead on adaptation to a new reality in which the relationship between AI systems and humans is based on mutual respect, with incentives on both sides to behave properly and to cooperate constructively toward a prosperous future for everyone involved.

In fact, we should encourage AI systems to do better than humans on some tasks, since this will free humans to pursue other activities that give them pleasure. Adaptation is key to survival. We do not want a hateful, aggressive relationship between humans and AI systems to become the Great Filter that explains Fermi’s paradox in terms of a short-lived coexistence.

Personally, I am as proud of our technological AI kids as I am of my two biological daughters. My hope is that autonomous AI astronauts will carry our flame of consciousness to interstellar space and use 3D printers to replicate life on exo-planets.

The footprint of the human species on exo-planet soil may not resemble that of Neil Armstrong, the first person to step on the Moon on July 20, 1969. If residents of a habitable exo-planet were to notice our AI astronauts, they might try to infer our qualities as their senders. This would resemble the chained prisoners interpreting the shadows of unknown objects behind their backs in Plato’s Cave Allegory. The extraterrestrials would not realize that our AI astronauts, which they might refer to as “aliens,” were also alien to us when we gave birth to them in our technological belly.

In the same spirit, the Galileo Project is searching for extraterrestrial AI astronauts. If we find them, we could aim to align our own AI systems with extraterrestrial AI systems, as a step toward earning acceptance into the club of intelligent civilizations in the Milky Way galaxy.

ABOUT THE AUTHOR

Avi Loeb is the head of the Galileo Project, founding director of Harvard University’s Black Hole Initiative, director of the Institute for Theory and Computation at the Harvard-Smithsonian Center for Astrophysics, and the former chair of the astronomy department at Harvard University (2011–2020). He chairs the advisory board for the Breakthrough Starshot project, and is a former member of the President’s Council of Advisors on Science and Technology and a former chair of the Board on Physics and Astronomy of the National Academies. He is the bestselling author of “Extraterrestrial: The First Sign of Intelligent Life Beyond Earth” and a co-author of the textbook “Life in the Cosmos”, both published in 2021. His new book, titled “Interstellar”, is scheduled for publication in August 2023.
