
Philosophical discussion on AGI ethics and harm minimization


Date
November 11, 2025
Source
House Oversight
Reference
House Oversight #013140
Pages
2
Persons
0
Integrity
No Hash Available

Summary

The text is a theoretical exploration of ethical principles for AGI without mentioning any specific individuals, institutions, financial transactions, or actionable allegations. It discusses a 'mostly harmless' principle applied to AGI, raises questions about sentience, self-defense, and greater-good trade-offs, and references Asimov's Third Law as a possible model.

Tags

ai-ethics, harm-minimization, agi, house-oversight, philosophy


Extracted Text (OCR)

EFTA Disclosure
Text extracted via OCR from the original document. May contain errors from the scanning process.
12 The Engineering and Development of Ethics

"I will not harm humans, nor through inaction allow harm to befall them. In situations wherein one or more humans is attempting to harm another individual or group, I shall endeavor to prevent this harm through means which avoid further harm. If this is unavoidable, I shall select the human party to back based on a reckoning of their intentions towards others, and implement their defense through the optimal balance between harm minimization and efficacy. My ultimate goal is to preserve as much as possible of humanity, even if an individual or subgroup of humans must come to harm to do so."

However, it's obvious that even a more elaborated principle like this is potentially subject to extensive abuse. Many of the genocides scarring human history have been committed with the goal of preserving and bettering humanity writ large, at the expense of a group of "undesirables." Further refinement would be necessary in order to define when the greater good of humanity may actually be served through harm to others. A first-actor principle of aggression might seem to solve this problem, but sometimes first actors in violent conflict are taking preemptive measures against the stated goals of an enemy to destroy them. Such situations become very subtle, and a single simple maxim cannot deal with them very effectively. Networks of interrelated decision criteria, weighted by desirability of consequence and with reference to probabilistically ordered potential side-effects (and their desirability weightings), are required in order to make ethical judgments. The development of these networks, just like any other knowledge network, comes from both pedagogy and experience — and different thoughtful, ethical agents are bound to arrive at different knowledge-networks that will lead to different judgments in real-world situations.
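The "networks of interrelated decision criteria" described above can be illustrated, purely as a sketch, by a small expected-desirability calculation: each candidate action carries a direct-consequence weight plus probability-weighted side-effects, and the agent selects the action with the best aggregate score. All action names, probabilities, and weights below are invented for illustration.

```python
# Minimal sketch of a weighted decision network: each action has a
# direct-consequence desirability plus probability-weighted side-effects.
# All actions, weights, and probabilities here are hypothetical.

def aggregate_desirability(action):
    """Direct desirability plus expected desirability of side-effects."""
    side_effects = sum(p * d for p, d in action["side_effects"])
    return action["direct"] + side_effects

actions = [
    {"name": "intervene_nonviolently",
     "direct": 0.6,                   # desirability of the direct outcome
     "side_effects": [(0.3, -0.2),   # (probability, desirability) pairs
                      (0.1, -0.5)]},
    {"name": "do_nothing",
     "direct": -0.4,
     "side_effects": [(0.5, -0.6)]},
]

best = max(actions, key=aggregate_desirability)
print(best["name"])  # the action with the highest aggregate score
```

A real system would, as the passage notes, learn these weightings from pedagogy and experience rather than hard-code them; the point is only that such a network trades off consequences probabilistically rather than applying a single maxim.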
Extending the above "mostly harmless" principle to AGI systems, not just humans, would make it more effective in the context of imitative learning. The principle then becomes an elaborated version of "I will not harm sentient beings." As the imitative-learning-enabled AGI observes humans acting so as to minimize harm to it, it will intuitively and experientially learn to act in such a way as to minimize harm to humans.

But this extension naturally leads to confusion regarding various borderline cases. What is a sentient being, exactly? Is a sleeping human sentient? How about a dead human whose information could in principle be restored via obscure quantum operations, leading to some sort of resurrection? How about an AGI whose code has been improved — is there an obligation to maintain the prior version as well, if it is so substantially different that its upgrade constitutes a whole new being?

And what about situations in which failure to preserve oneself will cause much more harm to others than acting in self-defense will? It may be the case that a human or group of humans seeks to destroy an AGI in order to pave the way for the enslavement or murder of people under the protection of the AGI. Even if the AGI has been given an ethical formulation of the "mostly harmless" principle which allows it to harm the attacking humans in order to defend its charges, if it is not able to do the same in order to defend itself, simply destroying the AGI first will enable the slaughter of those who rely on it. Perhaps a more sensible formulation would allow for some degree of self-defense; Asimov solved this problem with his Third Law. But where to draw the line between self-defense and the greater good also becomes a very complicated issue. Creating hard and fast rules to cover all the various situations that may arise is essentially impossible — the world is ever-changing and ethical judgments must adapt accordingly.
This has been true even throughout human history — so how much truer will it be as technological acceleration continues? What is needed is a system that can deploy its ethical principles in an adaptive, context-appropriate way, as it grows and changes along with the world it’s embedded in.
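The self-defense dilemma above is at bottom a comparison of expected harms: the harm inflicted by the AGI defending itself versus the harm to its charges if it is destroyed first. A toy numeric sketch, with all probabilities and harm magnitudes invented:

```python
# Toy expected-harm comparison for the self-defense dilemma.
# All probabilities and harm magnitudes are invented for illustration.

def expected_harm(probability, magnitude):
    """Expected harm of an outcome: chance it occurs times its severity."""
    return probability * magnitude

# Harm the AGI would cause by forcibly repelling its attackers.
harm_if_defend = expected_harm(0.9, 10)

# Harm to the people under the AGI's protection if it is destroyed first.
harm_if_destroyed = expected_harm(0.8, 1000)

# Self-defense is warranted here because it minimizes total expected harm.
defend = harm_if_defend < harm_if_destroyed
print(defend)
```

Of course, as the passage argues, the hard part is not the arithmetic but choosing the probabilities and magnitudes — and no fixed rule for doing so survives a changing world.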

