
dc.contributor.advisor    Downing, Keith
dc.contributor.author    Sørnes, Torstein
dc.date.accessioned    2018-09-14T14:00:33Z
dc.date.available    2018-09-14T14:00:33Z
dc.date.created    2018-06-10
dc.date.issued    2018
dc.identifier    ntnudaim:19523
dc.identifier.uri    http://hdl.handle.net/11250/2562774
dc.description.abstract    We introduce a domain-specific policy improvement operator that reassigns channels during call hand-offs with the intent of reducing the hand-off blocking probability. We construct an RL agent for maximizing average grid utilization that uses a linear neural network as its state value-function approximator and afterstates for action selection. A variant of TD(0) with gradient correction (TDC) is proposed for average-reward MDPs; in conjunction with the policy improvement operator, it decreases the hand-off call blocking probability in a simulated centralized caller environment without any penalty to the previously reported state-of-the-art new-call blocking probability (Singh & Bertsekas, 1997). The policy improvement operator is also applied to the table-lookup SARSA agent of Lilith (2004), where it achieves state-of-the-art hand-off blocking probability for an all-admission agent. While this work considers centralized systems, the policy improvement operator is applicable to distributed agents as long as the channel usage of the interfering neighbors of the hand-off arrival base station (BS) is known to the hand-off departure BS.
dc.language    eng
dc.publisher    NTNU
dc.subject    Datateknologi, Kunstig intelligens (Computer Technology, Artificial Intelligence)
dc.title    Contributions to centralized dynamic channel allocation reinforcement learning agents
dc.type    Master thesis
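As a rough illustration of the method summarized in the abstract above, the following is a minimal, hypothetical Python sketch of an average-reward TD(0) update with afterstate-based action selection and a linear value-function approximator. The class name, feature representation, and step-size parameters are assumptions made for illustration only; the thesis' actual agent additionally uses a gradient-corrected (TDC) update, which is not reproduced here.

# Hypothetical sketch (not the thesis code): average-reward TD(0) with a
# linear afterstate value function, as outlined in the abstract above.
import numpy as np

class LinearAfterstateAgent:
    def __init__(self, n_features, alpha=0.01, beta=0.001):
        self.w = np.zeros(n_features)  # weights of the linear value function
        self.avg_reward = 0.0          # running estimate of the average reward
        self.alpha = alpha             # step size for the weight update
        self.beta = beta               # step size for the average-reward update

    def value(self, features):
        # Linear state value: v(s) = w . x(s)
        return float(self.w @ features)

    def select_action(self, afterstate_features):
        # Afterstate selection: evaluate the grid state that each candidate
        # channel assignment would produce and pick the highest-valued one.
        values = [self.value(x) for x in afterstate_features]
        return int(np.argmax(values))

    def update(self, features, reward, next_features):
        # Differential (average-reward) TD(0) error: the running average
        # reward takes the place of discounting.
        delta = (reward - self.avg_reward
                 + self.value(next_features) - self.value(features))
        self.avg_reward += self.beta * delta
        self.w += self.alpha * delta * features

# Toy usage with random features, purely to show the call pattern.
agent = LinearAfterstateAgent(n_features=8)
x, x_next = np.random.rand(8), np.random.rand(8)
agent.update(x, reward=1.0, next_features=x_next)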

