Emil Widham
Scaling up Maximum Entropy Deep Inverse Reinforcement Learning with Transfer Learning
Abstract
In this thesis, an issue with common inverse reinforcement learning algorithms is identified that causes them to be computationally heavy. A solution is proposed that attempts to address this issue and that can be built upon in future work.
The complexity of inverse reinforcement learning algorithms is increased because at each iteration a reinforcement learning step is performed to evaluate the result of the previous iteration and to guide future learning. This step is slow for problems with large state spaces and for problems where many iterations are required. It has been observed that the problem solved in this step is in many cases very similar to that of the previous iteration. The suggested solution is therefore to use transfer learning to retain some of the learned information and thereby improve the speed of subsequent steps. In this thesis, different forms of transfer are evaluated for common reinforcement learning algorithms applied to this problem.
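As an illustration of the transfer idea, the sketch below shows a warm-started value iteration inside a simplified inverse reinforcement learning loop: the value function from the previous iteration is reused as the starting point of the next reinforcement learning step. The function names, the tabular MDP representation, and the use of hard (rather than soft) value iteration are assumptions made for this example and are not taken from the thesis implementation.

```python
import numpy as np

def value_iteration(P, reward, gamma=0.95, tol=1e-6, V_init=None):
    """Solve a tabular MDP by value iteration.

    P      : transition tensor of shape (A, S, S), P[a, s, s'] = p(s' | s, a)
    reward : state reward vector of shape (S,)
    V_init : optional warm start; passing the value function from the
             previous IRL iteration is the "transfer" discussed above.
    """
    V = np.zeros(reward.shape[0]) if V_init is None else V_init.copy()
    while True:
        # Q[a, s] = r(s) + gamma * sum_s' P[a, s, s'] * V[s']
        Q = reward[None, :] + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def irl_outer_loop(P, update_reward, n_iterations=50):
    """Hypothetical outer IRL loop: the reward estimate changes only slightly
    between iterations, so reusing the previous value function as a warm start
    lets the reinforcement learning step converge in far fewer sweeps."""
    reward = np.zeros(P.shape[-1])
    V = None                                        # no transfer on the first iteration
    for _ in range(n_iterations):
        V = value_iteration(P, reward, V_init=V)    # warm-started RL step
        reward = update_reward(V)                   # placeholder for the reward update
    return reward
```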
Experiments are run using value iteration and Q-learning as the algorithms for the reinforcement learning step. The algorithms are applied to two route-planning problems, and in both cases a transfer is found to be useful for reducing computation times. For value iteration the transfer is easy to understand and implement and shows large improvements in speed compared to the basic method. For Q-learning the implementation involves more variables, and while it shows an improvement, it is not as dramatic as that for value iteration. The conclusion drawn is that for inverse reinforcement learning implementations using value iteration a transfer is always recommended, while for implementations using other algorithms for the reinforcement learning step a transfer is most likely recommended, but more experimentation needs to be conducted.
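For comparison, the sketch below shows what the corresponding transfer could look like for tabular Q-learning: the Q-table from the previous inverse reinforcement learning iteration is reused instead of being reset to zeros. The environment interface (env_step), the episode settings, and the hyperparameter values are assumptions made for illustration, not the thesis implementation.

```python
import numpy as np

def q_learning(env_step, reward, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.95, eps=0.1, max_steps=200, Q_init=None):
    """Tabular Q-learning for the current reward estimate.

    Q_init : optional Q-table carried over from the previous IRL iteration;
             this is the Q-learning analogue of the value-iteration transfer,
             although its benefit depends on more tuning choices.
    env_step(s, a) -> (next_state, done) is a hypothetical environment interface.
    """
    rng = np.random.default_rng(0)
    Q = np.zeros((n_states, n_actions)) if Q_init is None else Q_init.copy()
    for _ in range(episodes):
        s = 0                                        # assumed fixed start state
        for _ in range(max_steps):
            # epsilon-greedy action selection
            a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
            s_next, done = env_step(s, a)
            target = reward[s_next] + (0.0 if done else gamma * Q[s_next].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
            if done:
                break
    return Q
```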