Abstract:
Robots have transformed the manufacturing industry and have been used for scientific
exploration in human-inaccessible environments such as distant planets and oceans.
However, a major barrier to their universal adoption is their fragility and lack of robustness in complex, highly diverse environments. This project constitutes an initial step
towards flexible exploration strategies that can be applied to challenging
problems in the autonomous grasping of rigid and deformable objects. Here, we employ
recent advances in deep reinforcement learning (RL) to generate simple reactive
behaviours, such as approaching, manipulating, and retracting, to pick up an object. Once such
simple behaviours are learnt, they can be sequenced in various combinations to give rise to complex tasks.
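To make the sequencing idea concrete, the sketch below chains learned behaviours into a single pick task. The `Behaviour` wrapper, its termination predicates, and the Gymnasium-style environment interface are illustrative assumptions, not this project's actual API.

```python
# A minimal sketch of sequencing learned reactive behaviours into a pick task.
# Assumes a Gymnasium-style env.step/env.reset interface; all names here are
# illustrative placeholders rather than the project's real code.
from typing import Callable, List

import numpy as np

Policy = Callable[[np.ndarray], np.ndarray]  # maps observation -> action


class Behaviour:
    """Wraps one learned policy together with a termination predicate."""

    def __init__(self, policy: Policy, done: Callable[[np.ndarray], bool]):
        self.policy = policy
        self.done = done

    def run(self, env, obs: np.ndarray) -> np.ndarray:
        # Execute this behaviour until its own success condition holds.
        while not self.done(obs):
            action = self.policy(obs)
            obs, _reward, _terminated, _truncated, _info = env.step(action)
        return obs


def execute_sequence(env, behaviours: List[Behaviour]) -> np.ndarray:
    """Chain simple behaviours (e.g. approach -> grasp -> retract) into one task."""
    obs, _info = env.reset()
    for behaviour in behaviours:
        obs = behaviour.run(env, obs)
    return obs
```

Under these assumptions, a pick task would then simply be `execute_sequence(env, [approach, grasp, retract])`, where each element wraps one independently learned reactive policy.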
RL is a trial-and-error optimisation technique in which an agent takes actions in an
environment to maximise some notion of cumulative reward. Much current research in RL builds on traditional techniques such as deep Q-learning and policy gradient methods. These methods work well when the feedback/reward is dense. However, in real-life scenarios the feedback is often sparse, and these methods tend to fail to find optimal solutions and to explore the environment robustly. In this work, we have implemented two different approaches to such sparse-reward problems: curiosity-driven exploration and a reactive behaviour repertoire for long-horizon tasks. Our results show a substantial reduction in the training steps required to reach the maximum-reward state in a high-dimensional continuous action space, compared to the baselines.