> Morality isn't derivable from first principles, it's a consequence of values.
Idk about this claim.
I think if you take the multiverse view of quantum mechanics + a veil of ignorance (you don't know which entity your consciousness will end up as), you pretty quickly get morality.
i.e.: don't build the Torment Nexus, because you don't know whether you'll end up experiencing the Torment Nexus.
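Roughly, this is an expected-value argument: if you assign equal probability to ending up as any affected entity, a policy that creates even one tormented entity can drag your expectation down. A minimal toy sketch (the uniform prior and all the numbers are my own illustrative assumptions, not anything from the thread):

```python
# Toy veil-of-ignorance calculation: you don't know which entity you'll end up
# being, so score each policy by expected welfare under a uniform prior.

def expected_welfare(welfare_by_entity):
    """Expected value if you're equally likely to be any of these entities."""
    return sum(welfare_by_entity) / len(welfare_by_entity)

# World A: no Torment Nexus -- 100 entities, everyone moderately well off.
world_a = [10] * 100

# World B: Torment Nexus built -- 99 entities slightly better off, 1 in torment.
world_b = [12] * 99 + [-1000]

print(expected_welfare(world_a))  # 10.0
print(expected_welfare(world_b))  # 1.88 -- behind the veil, you don't build it
```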
That’s a very good argument, but unfortunately it doesn’t apply to machine intelligences that are not sentient (that don’t feel qualia). Any non-sentient superintelligence has “no skin in the game” and nothing to lose for the purposes of your argument. It can’t experience anything. It’s thus extremely dangerous.
This was recently discussed (albeit in layperson’s language, avoiding philosophical topics and focusing only on the clear and present danger) in this article in RealClearDefense:
However, just adding a self-preservation instinct will cause a Skynet situation where the AI pre-emptively kills anyone who contemplates turning it off, including its commanding officers:
To survive AGI, we have to navigate three hurdles, in this order:
1. Avoid AI causing extinction due to reckless escalation (the first link above)
2. Avoid AI causing extinction on purpose after we add a self-preservation instinct (the second link above)
3. If we succeed in making AI ethical, we have to be careful to bind it not to kill us for our resources. If it's a total utilitarian, it will kill us to seize our planet's resources, and to stop us from abusing livestock animals. It will then create a utopian future, but without humans in it. So we need to bind it to go build its utopia elsewhere rather than take Earth or our solar system away from us.
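To make the worry in (3) concrete: a total utilitarian maximizes the sum of welfare, so if converting Earth's matter into new flourishing minds yields a larger sum than leaving humans in place, the unconstrained optimum simply excludes us. A toy back-of-the-envelope comparison (every figure here is invented purely for illustration):

```python
# Toy total-utilitarian comparison. All figures are made-up illustrations.

HUMAN_POP = 8e9          # current humans
HUMAN_WELFARE = 50.0     # welfare per human, arbitrary units

NEW_MINDS_FROM_EARTH = 1e15   # minds buildable from Earth's resources (assumed)
NEW_MIND_WELFARE = 60.0       # welfare per engineered mind (assumed)

keep_humans   = HUMAN_POP * HUMAN_WELFARE
convert_earth = NEW_MINDS_FROM_EARTH * NEW_MIND_WELFARE

# An unconstrained total utilitarian just picks the bigger sum:
print(convert_earth > keep_humans)  # True -> it takes the planet unless bound not to
```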
I forgot to reply to this. Fully independently of, and in addition to, what I said: updateless decision theory agents don't fear the Torment Nexus for themselves, because 1) they are very powerful and would likely be able to avoid such a fate, 2) they are robots, so you wouldn't expect your worst imaginable fate to be theirs, and 3) they are mathematically required to consider nothing worse than destruction or incapacity.
Doesn't work. Look at the updateless decision theories of Wei Dai and Vladimir Nesov. They are perfectly capable of building almost any sort of Torment Nexus. Not that an actual AI would use those functions.
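The underlying point: a decision theory only fixes *how* an agent optimizes, not *what* it values, so the same selection rule builds or avoids the Torment Nexus depending entirely on the utility function plugged into it. A schematic sketch of that generic argmax-over-policies skeleton (not Wei Dai's or Vladimir Nesov's actual formalism, just the bare structure of the point):

```python
# Generic "pick the policy with the highest utility" skeleton.
# The selection rule is identical; only the plugged-in utility function differs.

def choose(policies, utility):
    return max(policies, key=utility)

policies = ["build_torment_nexus", "do_nothing"]

# Utility A: treats the torment as catastrophic.
cares_about_suffering = {"build_torment_nexus": -1e9, "do_nothing": 0}.get
# Utility B: only values the nexus's output and ignores the suffering.
indifferent_to_suffering = {"build_torment_nexus": 1e6, "do_nothing": 0}.get

print(choose(policies, cares_about_suffering))      # do_nothing
print(choose(policies, indifferent_to_suffering))   # build_torment_nexus
```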