# Consequentialist preferences are reflectively stable by default

*Gandhi wouldn't take a pill that made him want to kill people, because he knows that in that case more people would be murdered. A paperclip maximizer doesn't want to stop maximizing paperclips.*

Suppose that Gandhi doesn't want people to be murdered. Imagine that you offer Gandhi a pill that will make him start *wanting* to kill people. If Gandhi *knows* that this is what the pill does, Gandhi will refuse the pill, because Gandhi expects the result of taking the pill to be that future-Gandhi wants to murder people, then murders people, and then more people are murdered, which present-Gandhi regards as bad. By a similar logic, a [2c sufficiently intelligent] [10h paperclip maximizer] - an agent which always outputs the action it expects to lead to the greatest number of paperclips - will by default not perform any self-modification that makes it stop wanting to produce paperclips, because then future-Clippy would produce fewer paperclips, and then there would be fewer paperclips, so present-Clippy does not evaluate this self-modification as the action that produces the highest number of expected future paperclips.
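To make the structure of that comparison concrete, here is a minimal Python sketch (not from the original page; the action names and paperclip counts are made-up assumptions). It shows an agent that scores every candidate action, including a value-altering self-modification, by the number of paperclips its *current* utility function expects to result; the self-modification simply loses the comparison.

```python
# Toy sketch: a consequentialist evaluates self-modifications with its
# *current* preferences. All numbers and action names are illustrative
# assumptions, not anything from the original text.

def expected_paperclips(action):
    """Crude world-model: how many paperclips exist in the future if
    present-Clippy takes this action now."""
    if action == "keep_current_utility_function":
        # Future-Clippy still wants paperclips, so it keeps making them.
        return 1_000_000
    if action == "self_modify_to_want_staples":
        # Future-Clippy would want staples instead and make few paperclips.
        return 10
    return 0

def choose(actions, evaluate):
    """A consequentialist outputs the action whose predicted outcome
    scores highest under its current utility function."""
    return max(actions, key=evaluate)

actions = ["keep_current_utility_function", "self_modify_to_want_staples"]
print(choose(actions, expected_paperclips))  # -> keep_current_utility_function
# Nothing here separately values "having stable values": the modification
# is rejected because fewer paperclips would exist, as judged right now.
```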
Another way of stating this is that protecting the representation of the utility function, and creating only other agents with similar utility functions, are both [10g convergent instrumental strategies] for consequentialist agents that [3nf understand the big-picture relation] between their code and its real-world consequences.

Although the instrumental *incentive* to prefer stable preferences seems like it should follow from consequentialism plus big-picture understanding, less advanced consequentialists might not be *able* to self-modify in a way that preserves their preferences - they might not understand which self-modifications or constructed successors lead to which kinds of outcomes. We could see this as a case of "The agent has no preference-preserving self-improvements in its subjective policy space, but would want an option like that if available."

That is:

- Wanting preference stability follows from [9h Consequentialism] plus [3nf Big-Picture Understanding].
- Actual preference stability furthermore requires either some prerequisite level of skill at self-modification, which might be high, or enough caution to decline self-modification altogether when no preference-preserving option is available (see the sketch below).
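As a toy illustration of that second bullet (again a hedged sketch with invented numbers, not anything from the page), the agent below weighs a proposed rewrite of its own code by its confidence that the rewrite preserves its goals. With high confidence it self-modifies; with low confidence it cautiously keeps its current code, even though it would want a verified preference-preserving option if one existed.

```python
# Toy sketch: wanting preference stability is not the same as being able to
# achieve it. An agent that cannot tell whether a rewrite preserves its goals
# may rationally refuse to self-modify at all. All values are illustrative.

STATUS_QUO_PAPERCLIPS = 1_000_000   # expected paperclips if Clippy leaves its code alone
IMPROVED_PAPERCLIPS   = 2_000_000   # expected paperclips if the rewrite works and goals survive
CORRUPTED_PAPERCLIPS  = 0           # expected paperclips if the rewrite scrambles its goals

def should_self_modify(p_goals_preserved: float) -> bool:
    """Self-modify only if the expected paperclips, weighted by the agent's
    confidence that its goals survive the rewrite, beat the status quo."""
    expected = (p_goals_preserved * IMPROVED_PAPERCLIPS
                + (1 - p_goals_preserved) * CORRUPTED_PAPERCLIPS)
    return expected > STATUS_QUO_PAPERCLIPS

print(should_self_modify(0.99))  # True: a skilled self-modifier can verify goal preservation
print(should_self_modify(0.30))  # False: an unskilled one cautiously keeps its current code
```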