Points of failure: direct specification in AGI alignment

posted on 28.03.2022, 18:00 by Elias Dokos
Some have critiqued the strategy of explicitly formalising and implementing a value structure in the design of ethical Artificial General Intelligences (AGIs). I build on these critiques by providing a conceptual account of the issues with direct specification, demonstrating its in-principle unviability when compared to implicit and indirect approaches. I begin with a consideration of the factors involved in AGI risk, and the need for risk mitigation. The design of AGIs which are motivated towards ethical behaviour is a key element in risk mitigation. A natural approach to this problem is to directly specify values for the AGI, but this approach necessitates two fatal consequences: an axiological gap between any potential AGIs and humans, and the immutability of this gap. Indirect approaches evade both of these consequences. I construct an account of the axiological gap and argue for its inevitability under direct specification.


Table of Contents

I: AGI risk and alignment -- II: Points of failure -- Conclusion


Theoretical thesis. Bibliography: pages 49-56

Awarding Institution

Macquarie University

Degree Type

Thesis MRes


MRes, Macquarie University, Faculty of Arts, Department of Philosophy

Department, Centre or School

Department of Philosophy

Year of Award


Principal Supervisor

Paul Formosa

Additional Supervisor 1

Richard Menary


Copyright Elias Dokos 2019. Copyright disclaimer: http://mq.edu.au/library/copyright




1 online resource (56 pages)

Former Identifiers

mq:72328 http://hdl.handle.net/1959.14/1283726