Deep reinforcement learning as text generator in image captioning

thesis
posted on 28.03.2022, 16:03 by Farhad Amouzgar
Reinforcement learning (RL), one of the oldest AI paradigms, has led to exciting results in recent years. Even though the research frontiers in the field are game playing and robotics, the natural language processing (NLP) community has also found many applications of RL as a solution for optimizing non-differentiable metrics in deep learning, including in text generation, image captioning and chatbots. However, the current literature is mainly focused on the REINFORCE algorithm and its derivatives. REINFORCE is a robust algorithm, but it dates back to the 1990s and suffers from high variance compared to modern RL algorithms. To address this challenge, we study and analyze the recent state of the art in RL. Taking image captioning as our specific NLP use case, we identify Proximal Policy Optimization (PPO) RL algorithms as suitable successors to REINFORCE, and propose methods based on them for optimizing non-differentiable captioning metrics. We experimentally evaluate these methods against the REINFORCE-based standard and find that, while the static clipping mechanism of PPO is key to state-of-the-art results in game playing, it does not improve over REINFORCE in image captioning; rather, the actor-critic aspect of the algorithms has a more significant impact on the convergence of the model.
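
To make the contrast in the abstract concrete, the following is a minimal sketch of the two per-sample losses being compared, not the thesis's implementation. The function names, the scalar interface and the clipping value epsilon=0.2 are assumptions chosen for illustration; in a captioning setting the reward would be some sentence-level metric score and the advantage would typically come from a critic or baseline.

```python
import math


def reinforce_loss(log_prob, reward, baseline=0.0):
    """REINFORCE surrogate loss for one sampled sequence.

    Minimizing -(reward - baseline) * log_prob follows the policy-gradient
    estimate; the baseline (an assumption here) only reduces variance.
    """
    return -(reward - baseline) * log_prob


def ppo_clip_loss(log_prob_new, log_prob_old, advantage, epsilon=0.2):
    """PPO clipped surrogate loss for one sample (negated objective).

    epsilon is the static clipping range; 0.2 is an illustrative value.
    """
    # Probability ratio between the updated policy and the sampling policy.
    ratio = math.exp(log_prob_new - log_prob_old)
    # Keep the ratio inside [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Take the more pessimistic of the clipped and unclipped objectives.
    return -min(ratio * advantage, clipped_ratio * advantage)
```

With a positive advantage, the clipped loss stops rewarding the policy once the ratio exceeds 1 + epsilon, which is the static clipping mechanism the abstract refers to; REINFORCE has no such bound on the size of an update.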

Table of Contents

1. Introduction -- 2. Background and literature review -- 3. OpenAI Gym Experiments -- 4. Proposed method for image captioning -- 5. Experimental results for image captioning -- 6. Conclusion and future work.

Notes

Theoretical thesis. Bibliography: pages 66-72

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing

Department, Centre or School

Department of Computing

Year of Award

2019

Principal Supervisor

Amin Beheshti

Additional Supervisor 1

Mark Dras

Rights

Copyright Farhad Amouzgar 2019. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (xiv, 72 pages) illustrations

Former Identifiers

mq:71677 http://hdl.handle.net/1959.14/1276957