Macquarie University

Points of failure: direct specification in AGI alignment

thesis
posted on 2022-03-28, 18:00 authored by Elias Dokos
Some have critiqued the strategy of explicitly formalising and implementing a value structure in the design of ethical Artificial General Intelligences (AGIs). I build on these critiques by providing a conceptual account of the problems with direct specification and by demonstrating its in-principle unviability when compared to implicit and indirect approaches. I begin by considering the factors involved in AGI risk and the need for risk mitigation. Designing AGIs that are motivated towards ethical behaviour is a key element of risk mitigation. A natural approach to this problem is to directly specify values for the AGI, but this approach entails two fatal consequences: an axiological gap between any potential AGI and humans, and the immutability of this gap. Indirect approaches evade both of these consequences. I construct an account of the axiological gap and argue for its inevitability under direct specification.

Table of Contents

I: AGI risk and alignment -- II: Points of failure -- Conclusion

Notes

Theoretical thesis. Bibliography: pages 49-56

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Arts, Department of Philosophy

Department, Centre or School

Department of Philosophy

Year of Award

2019

Principal Supervisor

Paul Formosa

Additional Supervisor 1

Richard Menary

Rights

Copyright Elias Dokos 2019. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (56 pages)

Former Identifiers

mq:72328 http://hdl.handle.net/1959.14/1283726