dc.contributor.advisor | Downing, Keith | |
dc.contributor.author | Sørnes, Torstein | |
dc.date.accessioned | 2018-09-14T14:00:33Z | |
dc.date.available | 2018-09-14T14:00:33Z | |
dc.date.created | 2018-06-10 | |
dc.date.issued | 2018 | |
dc.identifier | ntnudaim:19523 | |
dc.identifier.uri | http://hdl.handle.net/11250/2562774 | |
dc.description.abstract | We introduce a domain-specific policy improvement operator for reassigning
channels during call hand-offs with the intent of reducing hand-off blocking
probability. We construct an RL agent for maximizing average grid utilization,
which uses a linear neural network as state value-function approximator and
afterstates for action selection. A variant of TD(0) with gradient correction
(TDC) is proposed for average-reward MDPs, which in conjunction with
the policy improvement operator yields a decreased hand-off call blocking
probability in a simulated centralized caller environment, without any penalty to
the previously reported state-of-the-art new-call blocking probability (Singh & Bertsekas 1997).
The policy improvement operator is also applied to the table-lookup-based SARSA
agent of Lilith (2004), where it achieves state-of-the-art performance in terms
of hand-off blocking probability for an all-admission agent.
While this work considers centralized systems, the policy improvement operator
is applicable to distributed agents so long as the channel usages of the
interfering neighbors of the hand-off arrival BS are known to the hand-off
departure BS. | |
dc.language | eng | |
dc.publisher | NTNU | |
dc.subject | Datateknologi, Kunstig intelligens | |
dc.title | Contributions to centralized dynamic channel allocation reinforcement learning agents | |
dc.type | Master thesis | |