Transformers are Meta-Reinforcement Learners

PulseRL: Enabling Offline Reinforcement Learning for Digital Marketing Systems via Conservative Q-Learning

MARS-Gym: Offline Reinforcement Learning for Recommender Systems in Marketplaces

Contextual Meta-Bandit for Recommender Systems Selection

Bottom-Up Meta-Policy Search

Learning Humanoid Robot Running Skills through Proximal Policy Optimization