
Path Finding using Reinforcement Learning

Enger, Andreas Bull
Master thesis
Files
no.ntnu:inspera:314598747:132119749.pdf (2.279Mb)
no.ntnu:inspera:314598747:132119749.zip (615bytes)
URI
https://hdl.handle.net/11250/3201746
Date
2025
Collections
  • Institutt for IKT og realfag [697]
Abstract
 
 
Path planning plays a vital role in robotics and autonomous vehicles, enabling efficient navigation toward a target. This project uses reinforcement learning, learning through trial and error, to determine an optimal path. A custom environment was developed using Gymnasium (the maintained successor to OpenAI Gym), in which an agent navigates a scene. The reward system encourages movement toward the target and incorporates checkpoints to guide the agent in the desired direction.
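As an illustration of the checkpoint-based reward idea described above, here is a minimal sketch in plain Python. The function name, distance thresholds, and bonus values are hypothetical, not taken from the thesis; a real implementation would sit inside the Gymnasium environment's `step` method.

```python
import math

def step_reward(pos, prev_pos, target, checkpoints, reached, step_penalty=0.01):
    """Reward for one step: progress toward the target, one-time checkpoint
    bonuses, a small per-step penalty, and a terminal bonus at the target.
    `reached` is a set of checkpoint indices already collected."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Dense shaping: positive when the step brings the agent closer to the target.
    reward = dist(prev_pos, target) - dist(pos, target)
    reward -= step_penalty  # discourage wandering in place
    for i, cp in enumerate(checkpoints):
        if i not in reached and dist(pos, cp) < 0.5:
            reached.add(i)   # each checkpoint pays out only once
            reward += 1.0
    done = dist(pos, target) < 0.5
    if done:
        reward += 10.0       # terminal bonus for reaching the target
    return reward, done
```

For example, stepping from (1, 0) to (2, 0) toward a target at (5, 0) earns the distance gained (1.0), minus the step penalty, plus 1.0 if a checkpoint at (2, 0) has not yet been collected.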

The agent is trained using reinforcement learning algorithms, including Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor-Critic (SAC). The results of these algorithms are compared to evaluate their performance. Additionally, Optuna is used for hyperparameter optimization, and the results are analyzed against manually set hyperparameters. Finally, the trained models are tested, and the paths generated by the agents are examined.
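The shape of that train-tune-evaluate workflow can be sketched in a self-contained way. The thesis trains deep RL agents and tunes them with Optuna; to keep the sketch runnable without those libraries, it substitutes tabular Q-learning on a toy corridor for the deep RL algorithms and a plain random search for Optuna. All names and values here are illustrative; in an Optuna version, an `objective(trial)` function would call `trial.suggest_float` instead of the random sampler, and `study.optimize` would replace the search loop.

```python
import random

def train_and_evaluate(alpha, epsilon, n_states=6, episodes=500, seed=0):
    """Train tabular Q-learning on a 1-D corridor (actions: 0 = left,
    1 = right; goal at the right end) and return the greedy policy's
    path length in steps (hitting the cap, 10 * n_states, means failure)."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(10 * n_states):
            a = rng.randrange(2) if rng.random() < epsilon else (0 if q[s][0] >= q[s][1] else 1)
            s2 = min(goal, max(0, s + (1 if a == 1 else -1)))
            r = 1.0 if s2 == goal else -0.01
            target = r + 0.95 * max(q[s2]) * (s2 != goal)  # no bootstrap at the goal
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if s == goal:
                break
    # Evaluate the greedy policy: steps needed to reach the goal.
    s, steps = 0, 0
    while s != goal and steps < 10 * n_states:
        a = 0 if q[s][0] >= q[s][1] else 1
        s = min(goal, max(0, s + (1 if a == 1 else -1)))
        steps += 1
    return steps

def random_search(n_trials=20, seed=1):
    """Random search over (alpha, epsilon); in the thesis, Optuna's
    study.optimize plays this role, with trial.suggest_float drawing
    the samples inside an objective function."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        alpha = 10 ** rng.uniform(-2, 0)   # learning rate, log-uniform in [0.01, 1]
        epsilon = rng.uniform(0.1, 0.9)    # exploration rate
        score = train_and_evaluate(alpha, epsilon)
        if best is None or score < best[0]:
            best = (score, alpha, epsilon)
    return best
```

The best hyperparameters found by the search can then be compared against manually chosen ones by calling `train_and_evaluate` with each, mirroring the comparison performed in the thesis.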

The project successfully met its primary objectives, demonstrating the effectiveness of the reinforcement learning algorithms in training agents to navigate the environment. However, while the agents were able to generate feasible paths toward their targets, there is still room for improvement in terms of path efficiency, smoothness, and overall optimization. Future enhancements could focus on refining the reward function, incorporating more sophisticated exploration strategies, or integrating additional constraints to improve the quality of the generated paths.
Publisher
NTNU

Contact Us | Send Feedback

Privacy policy
DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit
 

 
