Explicit Sequence Proximity Models for Hidden State Identification

Anil Kota 1 | Sharath Chandra 1 | Parag Khanna 1 | Torbjørn S. Dahl 2, 3

1 Visvesvaraya National Institute of Technology, India | 2 University of Plymouth | 3 InstaDeep

Published

Sequence similarity is a critical concept for comparing short- and long-term memory to identify hidden states in partially observable Markov decision processes. While connectionist algorithms can learn a range of ad-hoc proximity functions, they do not reveal insights or generic principles that could improve overall algorithm efficiency.

Our work uses the instance-based Nearest Sequence Memory (NSM) algorithm as a basis for exploring different explicit sequence proximity models, including the original NSM proximity model and two new models: temporally discounted proximity and Laplacian proximity. The models were compared on three benchmark problems: two discrete grid world problems and one continuous space navigation problem. The results show that more forgiving proximity models perform better than stricter models, and that the difference between the models is more pronounced in the continuous navigation problem than in the discrete grid world problems.
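To make the contrast between a strict and a more forgiving proximity model concrete, the following is a minimal sketch, not the definitions used in this work. The function nsm_proximity follows the usual description of NSM matching: it counts identical steps backwards from two points in the history and stops at the first mismatch. The function discounted_proximity is a hypothetical illustration of one way a temporally discounted score could be made more forgiving; its name, the gamma and horizon parameters, and the (action, observation, reward) step layout are all assumptions introduced here.

```python
from typing import List, Tuple

# Assumed step layout for illustration: (action, observation, reward).
Step = Tuple[int, int, float]


def nsm_proximity(history: List[Step], i: int, j: int) -> int:
    """Strict match length: walk backwards from positions i and j and
    count identical steps, stopping at the first mismatch."""
    n = 0
    while i >= 0 and j >= 0 and history[i] == history[j]:
        n += 1
        i -= 1
        j -= 1
    return n


def discounted_proximity(history: List[Step], i: int, j: int,
                         gamma: float = 0.9, horizon: int = 10) -> float:
    """Hypothetical forgiving variant (an assumption, not the paper's model):
    add gamma**k for every matching pair k steps back, and keep going
    past mismatches instead of stopping at the first one."""
    score = 0.0
    for k in range(horizon):
        a, b = i - k, j - k
        if a < 0 or b < 0:
            break
        if history[a] == history[b]:
            score += gamma ** k
    return score


# Toy history: the most recent step differs, but the two steps before it match.
history = [(0, 1, 0.0), (1, 2, 0.0), (0, 1, 0.0), (1, 2, 0.0), (0, 3, 1.0)]
strict = nsm_proximity(history, 4, 2)
forgiving = discounted_proximity(history, 4, 2, gamma=0.5)
```

In this toy history the strict measure gives no credit once the most recent steps differ, while the forgiving variant still rewards the matching steps further back. That is the kind of difference the comparison between stricter and more forgiving models refers to.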