Points of failure: direct specification in AGI alignment
Thesis posted on 28.03.2022, 18:00 by Elias Dokos
Some have critiqued the strategy of explicitly formalising and implementing a value structure in the design of ethical Artificial General Intelligences (AGIs). I build on these critiques by providing a conceptual account of the problems with direct specification, arguing that it is unviable in principle compared with implicit and indirect approaches. I begin by considering the factors involved in AGI risk and the need for risk mitigation. Designing AGIs that are motivated towards ethical behaviour is a key element of risk mitigation. A natural approach to this problem is to directly specify the AGI's values, but doing so entails two fatal consequences: an axiological gap between any such AGI and humans, and the immutability of that gap. Indirect approaches evade both of these consequences. I construct an account of the axiological gap and argue for its inevitability under direct specification.