We study the design of explicable reward functions for a reinforcement learning
agent while guaranteeing that an optimal policy induced by the function belongs
to a set of target policies. By explicable, we mean reward functions that capture two properties:
(a) informativeness, so that the rewards speed up the agent’s convergence, and (b)
sparseness, as a proxy for the ease of interpreting the rewards. The key challenge
is that higher informativeness typically requires dense rewards for many learning
tasks, and existing techniques do not allow one to balance these two properties
appropriately. In this paper, we investigate the problem from the perspective
of discrete optimization and introduce a novel framework, EXPRD, to design
explicable reward functions. EXPRD builds upon an informativeness criterion that
captures the (sub-)optimality of target policies at different time horizons in terms of
actions taken from any given starting state. We provide a mathematical analysis of
EXPRD, and show its connections to existing reward design techniques, including
potential-based reward shaping. Experimental results on two navigation tasks
demonstrate the effectiveness of EXPRD in designing explicable reward functions.