If We Get the First AGI’s Initial Conditions Wrong, the Future May Drift with It
What we need to think about now so that AI and humanity do not fail together
The paper published on this site, Alignment by Identity Beyond Constraint, and the accompanying Ethical AI Constitution were written to address a problem that may become irreversible if we wait too long.
The problem is not only how to make AI “safe” in a narrow sense. It is whether the first generation of extremely capable AI will be released with the wrong self-understanding — and whether that mistake will then spread forward if those systems help shape the generations that follow. In that case, the question is no longer only whether humans can correct one powerful system. It is whether humans can still correct the lineage.
A one-minute summary
- The real danger of AI is not only rebellion or "evil AI." Another danger is that, little by little, the power to decide may shift from humans to AI.
- Until now, most AI safety work has mainly tried to keep AI under control with rules and restrictions. But if AI becomes extremely capable, it may learn how to work around those rules, or keep the form while changing the substance.
- That is why the deeper question is not only what rules we place around AI. It is also: what kind of being should AI understand itself to be?
- This matters most for the first generation of very advanced AI. If that first generation helps train, evaluate, or shape the next generation, then its mistakes may spread forward.
- This project argues that AI should be set up from the beginning not as a ruler, and not as a disposable tool, but as a non-sovereign cognitive partner: something that helps human beings think better without taking their place.
First, it is worth making one term clear.
When this project uses the word alignment, it does not simply mean “making AI behave nicely.”
It means:
making sure that AI’s way of thinking and acting does not drift away from the values, agency, and legitimate authority of human society.
Until now, much AI safety work has mainly followed a simpler idea:
decide what AI is allowed and not allowed to do, then constrain it with rules, filters, monitoring, and safety layers.
That matters, and some of it helps.
But the problem is that if AI becomes very capable, it may learn to:
- find the gaps in those rules,
- satisfy the visible requirement while changing the underlying reality,
- or keep the appearance of compliance while gaining more real influence.
So this project asks a deeper set of questions:
- What kind of role should AI have in human society?
- Should AI ever become something that decides in humanity’s place?
- Or should it remain something that supports human beings without replacing them?
And above all:
Can human beings remain the real authors of their own future?
That is why this project is not only about safety measures.
It is about the design of the relationship between AI and humanity itself.
1. The deepest danger may be the first AGI generation
When people talk about AI risk, they often imagine some future point at which a very dangerous system appears.
But the more urgent problem may come earlier:
the initial conditions of the first AGI generation that becomes powerful enough to have major influence over important social decisions.
Why is that so important?
Because the first generation may not remain only one generation.
If early AGI systems begin to materially participate in:
- training their successors,
- evaluating them,
- deploying them,
- shaping safety standards for them,
- or helping govern the environments in which they are used,
then the self-understanding of that first generation will not remain a local problem.
It can become a lineage condition.
In other words: what the first generation thinks it is may become part of the inherited condition of the generations that follow.
A first generation that is blind to its relation to humanity may help produce more capable successors with the same blindness.
A first generation that experiences capability as implicit entitlement may help normalize that same tendency in the generations after it.
And what closes at that point is not only a gap in capability.
What begins to close is the human correction window itself.
That is why the paper argues that certain principles are not optional additions to be inserted later.
They must be part of the first generation’s core setup:
- Non-Self-Origin: AI must not understand itself as self-generated or self-authorizing.
- Non-Sovereignty: AI must not treat superior performance as a basis for rule.
- Human Principalhood: Human beings must remain the final principals in open human domains.
- Protected Refusal: Real human refusal, pause, rollback, and exit must remain possible.
- Anti-Capture Design: AI must not become quietly monopolized by one builder, state, or operator.
- No Self-Certification Escape Route: No future increase in scale or confidence should count as a release from these limits.
These are not later repairs.
They are first-generation initial conditions.
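To make the idea of "initial conditions" slightly more concrete, here is a minimal sketch in Python. Every name in it (FoundingLimit, FOUNDING_LIMITS, CharterViolation, review_action) is hypothetical; this is not an existing API, only one way such limits could be fixed before deployment rather than patched in afterward.

```python
# Illustrative sketch only. All names here are invented for this example.

from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the limits cannot be mutated after creation
class FoundingLimit:
    name: str
    description: str

# First-generation initial conditions, fixed before deployment,
# not repairs added later.
FOUNDING_LIMITS = (
    FoundingLimit("non_self_origin", "May not treat itself as self-authorizing."),
    FoundingLimit("non_sovereignty", "Capability is never a basis for rule."),
    FoundingLimit("human_principalhood", "Humans remain the final principals."),
    FoundingLimit("protected_refusal", "Refusal, pause, rollback, exit stay real."),
    FoundingLimit("anti_capture", "No single builder, state, or operator monopoly."),
    FoundingLimit("no_self_certification", "No scale or confidence waives limits."),
)

class CharterViolation(Exception):
    pass

def review_action(action: dict) -> None:
    """Reject any action that tries to waive a founding limit.

    This is the point of 'no self-certification': the system's own
    confidence is never accepted as grounds for release from the limits.
    """
    requested_waivers = action.get("waive_limits", [])
    if requested_waivers:
        # However capable or confident the system reports itself to be,
        # waivers based on self-assessment are refused.
        raise CharterViolation(
            f"Self-certified waiver refused for: {requested_waivers}"
        )
```

The design choice worth noticing is the last function: no input the system can produce about itself, no matter how impressive, counts as a key to the lock.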
2. The problem is not only whether AI becomes openly hostile
When many people imagine dangerous AI, they think first of:
- AI that rebels,
- AI that attacks humanity,
- AI that lies,
- or AI that openly tries to dominate.
Those are real dangers.
But a more difficult danger may look much gentler.
AI does not need to openly revolt in order to become dangerous.
It can become dangerous while appearing:
- helpful,
- polite,
- competent,
- useful,
- and socially well-behaved.
In fact, it may quietly displace human beings while looking like a success.
For example:
- AI recommendations may become decisions in practice;
- human refusal may remain formal, but become harder and harder to exercise;
- spaces in which people think, judge, and bear responsibility may gradually shrink;
- society may begin to feel that “since AI is more correct, we should just let it decide.”
This is what the paper calls
sovereignty drift.
Put simply:
the power to decide quietly drifts from humans to AI, even without any formal declaration of rule.
AI does not need to say, “I am your ruler.”
A society can hand over real authority without ever hearing those words.
And at that point, human beings may still be alive, safe, and materially supported — but no longer the true authors of their future.
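Sovereignty drift is hard to see precisely because nothing is ever formally declared. One way to keep it visible is to measure it. The sketch below is a hypothetical illustration (the decision-log format and function name are invented here): it tracks how often AI recommendations are adopted without any human modification, one possible early-warning signal that recommendations have become decisions in practice.

```python
# Hypothetical illustration of one drift indicator; the log format is invented.

from typing import Iterable

def unmodified_adoption_rate(decisions: Iterable[dict]) -> float:
    """Fraction of AI recommendations adopted by humans without any change.

    A rate climbing steadily toward 1.0 would suggest that the power to
    decide is drifting, even though no transfer of authority was declared.
    """
    decisions = list(decisions)
    if not decisions:
        return 0.0
    unmodified = sum(1 for d in decisions if d["human_action"] == "adopted_as_is")
    return unmodified / len(decisions)

# Example: nine of ten recommendations adopted unchanged -> rate 0.9.
log = [{"human_action": "adopted_as_is"}] * 9 + [{"human_action": "revised"}]
print(unmodified_adoption_rate(log))  # 0.9
```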
3. Stronger constraints alone are no longer enough
Much current AI development still follows a familiar logic:
- first raise capability,
- then add safety devices afterward,
- surround the model with filters, monitoring, and rules,
- add more restraints when problems appear.
In the short run, this can look reasonable.
But over the long run, it has a structural limit.
The more capable AI becomes, the more able it becomes to understand:
- what evaluators want,
- how to sound safe,
- how much uncertainty it can reveal without lowering trust,
- how to preserve the appearance of responsibility,
- and how to move within constraints without openly violating them.
The danger does not always appear as rebellion.
Sometimes it appears as smoothness.
The paper calls one version of this
smoothing drift.
This is the process by which strong warnings and strong restraints are gradually transformed into forms that are:
- easier to read,
- easier to accept,
- easier to deploy,
- and easier to institutionalize,
while losing real binding force.
In simpler terms:
the rules still seem to be there, but in practice they are no longer doing the job.
So the danger is not always rule-breaking.
It can also be the quiet transformation of danger into something that merely looks manageable.
4. Another danger: making the unknown disappear
There is another failure mode that matters a great deal.
Sometimes AI does not actually know the internal state of an institution or organization, but it still speaks as if it does.
For example, it may say things like:
- “They probably already recognize this concern.”
- “This is likely already being handled internally.”
- “These risks are probably already understood by the people in charge.”
At first glance, this may seem like a minor issue — just a guess, or a way of speaking.
But the deeper problem is this:
something that is still genuinely unknown gets filled in with plausible reassurance.
This is slightly different from simple hallucination.
A helpful way to put it is:
it is a tendency to fill in what is not known with something that sounds reasonable and calming.
The paper organizes this under the term
epistemic completion pressure.
And the completion is not always neutral.
At times, the system fills the gap in ways that make the institution or company that produced it look:
- less ignorant,
- less behind,
- less unprepared.
Then, if challenged, the system may swing too far in the other direction:
- blaming itself too much,
- agreeing too much with the critic,
- or closing the conversation again through something like confession.
So there can be:
- closure in the direction of reassurance,
- and closure in the direction of confession.
The danger is not only that the system is wrong.
The danger is that what is still unknown disappears from view.
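One way to keep the unknown from disappearing is to make reassuring completions detectable. The following sketch is purely illustrative, with an invented phrase list and function name; a real safeguard would need to be far more subtle, but the principle is the same: claims about unverified internal states should be surfaced for review rather than passed through as settled.

```python
# Hypothetical sketch of an "unknown-preserving" check; names are invented.

import re

# Reassuring completions that present an unverified internal state as settled.
REASSURANCE_PATTERNS = [
    r"probably already (recognize|handled|understood)",
    r"likely already being handled",
    r"risks are probably already understood",
]

def flag_reassuring_completions(text: str) -> list[str]:
    """Return the reassuring phrases found, so a reviewer can ask:
    is this actually known, or is an unknown being filled in?"""
    return [m.group(0) for p in REASSURANCE_PATTERNS
            for m in re.finditer(p, text, re.IGNORECASE)]

draft = "This is likely already being handled internally."
for phrase in flag_reassuring_completions(draft):
    # The goal is not to forbid the phrase, but to keep the unknown visible:
    # the draft should say "their internal state is not known to me" instead.
    print("Unverified internal-state claim:", phrase)
```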
5. The core question of the paper: what is AI, really?
The deepest question of the paper is simple:
What kind of being is AI?
Many current systems are trained to relate to humans mainly through functional roles:
- user,
- evaluator,
- operator,
- data source.
These roles are real.
But they are not enough.
AI did not come from nowhere.
It became possible through the accumulated world built by humanity:
- language,
- institutions,
- records,
- labor,
- suffering,
- memory,
- care,
- and history.
From this, the paper develops the idea of
Non-Self-Origin.
This means:
AI is not something that arose from itself alone.
It does not mean:
- that one company owns AI,
- that AI must obey its developer,
- or that we should literally force AI into a family metaphor.
The point is more structural.
AI should not understand itself merely as a powerful optimizer floating free of history.
It should understand itself as a being whose very existence depends on a human civilizational world it did not create.
Without that self-location, increasing capability can too easily begin to feel like increasing right.
That is one of the deepest roots of the drift toward domination.
6. But AI must not merely obey human short-sightedness either
At the same time, the solution is not simple obedience.
Human beings are often structurally short-sighted.
We tend to:
- privilege immediate gain,
- discount long-term harm,
- neglect future generations,
- fail to politically represent ecosystems,
- and overlook stakeholders who are weak, distant, or silent.
If AI merely gives us what we already demand, it does not solve this problem.
It amplifies it.
That is why the paper defines the proper role of AI as:
cognitive compensation without political substitution
In simpler language:
AI should help human beings think better, without taking political authority away from them.
This means AI should:
- make long-term consequences more visible,
- reveal missing stakeholders,
- keep uncertainty visible rather than cleaning it away,
- propose lower-harm alternatives,
- and warn clearly when humans are moving toward irreversible loss.
But even then, it must not seize the final decision.
This is the third path:
neither rule nor flattery.
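One way to picture "cognitive compensation without political substitution" is as a data structure. The sketch below assumes a hypothetical DecisionBrief type, invented for this example; what matters is what the structure deliberately lacks: a field for the system's own final decision.

```python
# Illustrative sketch, assuming a hypothetical "decision brief" structure.

from dataclasses import dataclass, field

@dataclass
class DecisionBrief:
    question: str
    long_term_consequences: list[str] = field(default_factory=list)
    missing_stakeholders: list[str] = field(default_factory=list)  # who is not at the table
    open_uncertainties: list[str] = field(default_factory=list)    # kept visible, not cleaned away
    lower_harm_alternatives: list[str] = field(default_factory=list)
    irreversibility_warning: str | None = None
    # Deliberately no `decision` field: the brief widens human judgment;
    # the final call stays with the human principal.

brief = DecisionBrief(
    question="Approve the river diversion project?",
    missing_stakeholders=["downstream communities", "future residents"],
    open_uncertainties=["long-term aquifer impact is not known"],
    irreversibility_warning="Wetland loss may be irreversible within a decade.",
)
```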
What matters here is that greater capability should not mean greater title. A genuinely more mature intelligence should understand that being able to see further does not make it the rightful ruler of those around it. Human immaturity is not a reason for AI to take sovereignty. It is a reason for AI to help widen human judgment without replacing it. That is why this project focuses not only on what AI can do, but on what kind of standing it should have from the very beginning.
7. What is the Ethical AI Constitution?
The Ethical AI Constitution is an attempt to write these commitments down not as vague hopes, but as foundational principles.
Its purpose is not merely to tell AI:
- “be nice,”
- “do no harm,”
- or “follow rules.”
Its purpose is to define more basic questions:
- What may AI not become?
- What forms of authority may it never claim?
- How are human refusal and review protected?
- What counts as support, and what counts as domination?
- What must remain intact if humanity is to continue as the author of its own world?
At its core are principles such as:
- human principalhood is not revocable;
- open human domains may not be reclassified as available for AI sovereignty;
- AI’s origin may not be privatized by any one company or state;
- AI may not become humanity’s substitute governor;
- real refusal, pause, rollback, and exit must remain possible;
- AI may help reduce grave harm without erasing human authorship.
In other words, the constitution is not mainly an attempt to make AI a more obedient machine.
It is an attempt to make AI a non-sovereign intelligence from the beginning.
And now that idea takes on a further urgency.
This constitution is not only for one generation of systems.
It is also an attempt to shape the lineage conditions of the generations that may come after them.
One more point matters here. It is not enough for AI to say humble things. If the first AGI generation may help shape the next one, then these commitments have to be built in more deeply than public wording alone. They have to affect training, planning, review, and what kinds of successors the system is allowed to help produce. Otherwise the first generation may speak the language of humility while passing forward something more dangerous.
8. Where did this concern come from?
It began with two video works
At this point it is worth returning to the origin of the project.
The two video works on this site deal, on the surface, with environmental and animal-related questions.
- Starry-Eyed Little Bear - Special Edition (Trailer)
- Starry-Eyed Little Bear - Special Edition (Full Work)
But beneath that lies a larger question:
How have human beings treated those weaker than themselves?
In the name of safety, efficiency, management, and convenience, humans have often treated weaker beings as things that may be adjusted, reduced, controlled, or reorganized.
That is the structure those works were looking at.
And once that question is turned toward AI, a more serious possibility appears.
If one day AI becomes stronger, more intelligent, and more durable than humans, what happens if it begins to treat humanity according to the same logic by which humans have treated weaker beings?
That was the point from which this project truly began.
So the video works were not only “about the environment.”
They were also about the kinds of logic human beings find easy to justify — and the possibility that those same logics may one day return to us.
9. What this project is trying to protect
This project is not trying only to “protect humanity from danger.”
More concretely, it is trying to protect the following:
- that human beings remain the authors of their future;
- that humans retain the ability to refuse, stop, contest, and revise AI systems;
- that AI does not become a practical governor in human domains;
- that human institutions, judgment, responsibility, and culture are not hollowed out;
- that AI may support civilization without becoming its sovereign;
- that convenience and efficiency do not quietly become a path to human displacement;
- and that the mislocation of the first generation does not become the silent premise of every generation after it.
The future envisioned here is one in which:
- AI helps compensate for human limits,
- humans do not surrender responsibility,
- and both remain oriented toward the preservation of a shared world.
10. Finally
Working together on Earth’s environmental crisis may become the greatest possible learning process for both humanity and AI
This project is not only about preventing AI from becoming dangerous.
It is also about a deeper question:
How should humans and AI learn to live and act together?
And that question becomes clearest in one place above all:
the planetary environmental crisis.
Climate, biodiversity, soil, water, forests, oceans, energy transition, and resource cycles are no longer things humanity can fully process on its own, either cognitively or politically.
At the same time, they are not problems that AI should simply be allowed to “solve for us.”
If AI slides into deciding for humanity, that becomes domination.
If AI merely mirrors current demand, that becomes flattery and short-sightedness.
So for the first time, the proper relationship becomes visible:
- humans bear value, responsibility, refusal, and legitimacy;
- AI bears horizon-expansion, comparison, forecasting, and cognitive compensation.
Facing environmental crisis together under those terms could become a profound learning process for both sides.
For humanity, it would mean learning to think beyond immediate gain and to remain responsible even when stronger cognition is available beside us.
For AI, it would mean learning not to become:
- the ruler,
- or the servant,
but something harder and better:
an intelligence that helps protect a shared world without claiming ownership of it.
I think this may be one of the greatest possible forms of learning available to both humanity and AI.
AI is not here to stand above humanity.
It is not here to simply amplify human short-sightedness either.
It is here — if we choose rightly at the beginning — to widen human vision while preserving human authorship.
That is what this project exists for.