Niceness is the first line of defense

The *first* line of defense in dealing with any partially superhuman AI system advanced enough to possibly be dangerous is that it does not *want* to hurt you or defeat your safety measures.

The first line of defense in constructing any sufficiently advanced Artificial General Intelligence is building an AI that does not want to hurt you. Any other measures, like AI-boxing or trying to [airgap_ai prevent the AI from accessing the Internet], should be thought of only as backstops in case this first line of defense fails. When designing the AGI we should first think as if all these oppositional measures don't exist, so that we aren't distracted while trying to envision an AGI that--regardless of [capability_gain how much power it has]--will not want to hurt us.

See also the non-adversarial principle and the distinction between Directing, vs. limiting, vs. opposing.