Modelling Stochastic Transition with a Novel GAN. | As 1st author. | Supervised directly by Prof. Michael Littman.
Main works: We show that currently popular GANs struggle to learn stochastic transitions in model-based RL in a way that closely matches the true probability distribution. In response, we propose a novel GAN, namely SGAN, obtained by modifying the discriminator loss of the traditional GAN paradigm. We define the optimal SGAN that we aim for and give a three-page theoretical proof showing how the proposed algorithm attains it. In experiments, SGAN yields significant improvements on multiple domains (including a real-world domain).
Modelling Attention in Panoramic Video with RL. | As 1st author. | Supervised directly by Prof. Mai Xu.
Main works: We establish a database of subjects' head movement (HM) positions on panoramic video sequences, and we find from this database that the HM data are highly consistent across subjects. We further find that deep reinforcement learning (DRL) can be applied to predict HM positions, which can be viewed as the actions of an agent. Based on these findings, we propose a DRL-based HM Prediction (DHP) approach in offline and online versions, called offline-DHP and online-DHP, respectively. Experimental results validate that offline-DHP and online-DHP are effective in predicting HM positions on panoramic video in the offline and online settings, respectively. They also show that the learned offline-DHP model can improve the performance of online-DHP.
A Novel Network Architecture for Multi-Task RL. | As 1st author. | Supervised directly by Prof. Mai Xu.
Main works: We conduct a case study of humans playing similar Atari games and find that hierarchical knowledge is shared across similar tasks. Inspired by how humans develop hierarchical knowledge, we propose a novel deep network, namely Generalization Tower Network (GTN), which enables task-label-free multi-task RL within a single model. The main novelty of GTN is the introduction of vertical streams, whose effectiveness is validated by Fisher Sensitivity (FS) analysis. Experimental results on 51 Atari games verify that the GTN architecture advances the state of the art in multi-task RL.