Posted on 2022-03-28, 16:03. Authored by Farhad Amouzgar.
Reinforcement learning (RL), one of the oldest AI paradigms, has produced exciting results in recent years. Although the research frontiers of the field are game playing and robotics, the natural language processing (NLP) community has also found many applications for RL as a way to optimize non-differentiable metrics in deep learning, including text generation, image captioning, and chatbots. However, the current literature focuses mainly on the REINFORCE algorithm and its derivatives. REINFORCE is a robust algorithm, but it dates back to the 1990s and suffers from high variance compared to modern RL algorithms. To address this challenge, we study and analyze the recent state of the art in RL. Taking image captioning as our specific NLP use case, we identify Proximal Policy Optimization (PPO) algorithms as suitable successors to REINFORCE and propose PPO-based methods for optimizing non-differentiable captioning metrics. We evaluate them experimentally against the REINFORCE-based standard and find that, while the static clipping mechanism of PPO is the key to state-of-the-art results in game playing, it does not improve over REINFORCE in image captioning; rather, the actor-critic aspect of the algorithms has a more significant impact on the convergence of the model.
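The contrast between the two objectives discussed in the abstract can be sketched briefly. The following is an illustrative sketch only, not the thesis's implementation: it shows the REINFORCE policy-gradient loss and PPO's clipped surrogate loss in minimal NumPy form, with function names and the default clip range `eps=0.2` chosen here for illustration.

```python
import numpy as np

def reinforce_loss(log_probs, rewards, baseline=0.0):
    """REINFORCE objective: -E[(R - b) * log pi(a|s)].
    Subtracting a baseline b reduces variance without biasing the gradient,
    which is the weakness (high variance) the abstract attributes to REINFORCE."""
    advantages = rewards - baseline
    return -np.mean(advantages * log_probs)

def ppo_clipped_loss(log_probs, old_log_probs, advantages, eps=0.2):
    """PPO clipped surrogate: -E[min(r * A, clip(r, 1-eps, 1+eps) * A)],
    where r = pi_new(a|s) / pi_old(a|s) is the probability ratio.
    Clipping r to [1-eps, 1+eps] caps how far a single update can move
    the policy away from the one that collected the data."""
    ratio = np.exp(log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

When the new and old policies coincide (ratio of 1), the clipped surrogate reduces to the plain advantage-weighted objective; the clip only engages when an update would move the policy too far in one step, which is the "static clipping mechanism" the abstract finds unhelpful for captioning.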
Table of Contents
1. Introduction -- 2. Background and literature review -- 3. OpenAI Gym Experiments -- 4. Proposed method for image captioning -- 5. Experimental results for image captioning -- 6. Conclusion and future work.
Notes
Theoretical thesis.
Bibliography: pages 66-72
Awarding Institution
Macquarie University
Degree Type
Thesis MRes
Degree
MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing