Show simple item record

dc.contributor.advisor  Downing, Keith
dc.contributor.author  Sørnes, Torstein
dc.date.accessioned  2018-09-14T14:00:33Z
dc.date.available  2018-09-14T14:00:33Z
dc.date.created  2018-06-10
dc.date.issued  2018
dc.identifier  ntnudaim:19523
dc.identifier.uri  http://hdl.handle.net/11250/2562774
dc.description.abstract  We introduce a domain-specific policy improvement operator that reassigns channels during call hand-offs with the intent of reducing hand-off blocking probability. We construct an RL agent for maximizing average grid utilization that uses a linear neural network as its state value-function approximator and afterstates for action selection. A variant of TD(0) with gradient correction (TDC) is proposed for average-reward MDPs (a minimal sketch follows the record); in conjunction with the policy improvement operator, it decreases hand-off call blocking probability in a simulated centralized caller environment without any penalty to the previously shown state-of-the-art new-call blocking probability (Singh & Bertsekas 1997). The policy improvement operator is also applied to the table-lookup-based SARSA agent of Lilith (2004), where it achieves state-of-the-art hand-off blocking probability for an all-admission agent. While this work considers centralized systems, the policy improvement operator is applicable to distributed agents so long as the channel usages of the interfering neighbors of the hand-off arrival base station (BS) are known to the hand-off departure BS.
dc.language  eng
dc.publisher  NTNU
dc.subject  Computer Science, Artificial Intelligence
dc.title  Contributions to centralized dynamic channel allocation reinforcement learning agents
dc.type  Master thesis
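The abstract names a TDC variant for average-reward MDPs with linear value-function approximation. The following is a minimal sketch of what one such update could look like, assuming the standard TDC correction term (Sutton et al., 2009) carried over to the average-reward setting by replacing discounting with a running reward-rate estimate. The function name, step sizes, and feature map are illustrative assumptions, not the thesis's formulation.

    import numpy as np

    def avg_reward_tdc_step(theta, w, avg_reward, phi, phi_next, reward,
                            alpha=0.01, beta=0.1, eta=0.05):
        """One sketched TDC update for an average-reward MDP with linear features.

        theta      -- value-function weights
        w          -- secondary weights for the gradient correction
        avg_reward -- running estimate of the reward rate (rho)
        phi        -- feature vector of the current (after)state
        phi_next   -- feature vector of the next (after)state
        reward     -- immediate reward
        """
        # Average-reward TD error: no discount factor; subtract the
        # reward-rate estimate instead.
        delta = reward - avg_reward + theta @ phi_next - theta @ phi

        # Main weights: semi-gradient term plus TDC's correction term,
        # which subtracts the component of the estimated TD error
        # projected along the next state's features.
        theta = theta + alpha * (delta * phi - (w @ phi) * phi_next)

        # Secondary weights track the projection of the TD error onto phi.
        w = w + beta * (delta - w @ phi) * phi

        # Update the reward-rate estimate from the TD error.
        avg_reward = avg_reward + eta * delta
        return theta, w, avg_reward

Since the abstract says actions are selected via afterstates, phi and phi_next here would be features of the grid configurations resulting from candidate channel assignments, with the agent choosing the assignment whose afterstate value theta @ phi is highest.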


Associated file(s)


This item appears in the following collection(s)
