The objective of this master's thesis is to use machine learning to optimize strategy in the multi-operator domain, achieving the lowest overall cost while meeting the dependability SLO. To reach this objective, we chose reinforcement learning algorithms that interact with a discrete-event simulated environment to optimize decision making, both for a single operator and for multiple operators working simultaneously to provide a service.

We designed a simulator that combines discrete-event simulation with reinforcement learning algorithms. We created interfaces that let our reinforcement learning agent interact with one or more synchronized discrete-event simulation processes. The agent completes its learning process by obtaining real-time information from the simulator, issuing action orders to the simulator, and receiving feedback from the simulator.

Our results showed that using the Q-learning algorithm with our simulator reduces the overall cost for a single operator by 4.9% on average.

This work includes an analysis of how to use reinforcement learning in the multi-operator scenario, which methods could be used, and why these methods are considered appropriate for the problem. With a chapter devoted to the necessary background in reinforcement learning, the thesis should serve as an introduction to using reinforcement learning for optimal SLA/SLO contract negotiation in 5G for anyone interested in this field.
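
The agent-simulator loop summarized above (observe state, issue an action, receive feedback) can be illustrated with a minimal tabular Q-learning sketch. Everything specific in it is an assumption made for illustration: the `ToySimulator` class, its three-state failure model, the repair and SLO-penalty costs, and the hyperparameters are hypothetical and are not the simulator or settings used in the thesis.

```python
import random


class ToySimulator:
    """Hypothetical stand-in for a discrete-event simulator (not the thesis simulator).

    State: number of failed components (0, 1, or 2).
    Action: 0 = wait, 1 = dispatch a repair.
    """

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        cost = 0.0
        if action == 1:
            cost += 1.0  # assumed cost of dispatching a repair
            if self.state > 0:
                self.state -= 1
        # Assumed failure process: a component fails with probability 0.3.
        if self.state < 2 and self.rng.random() < 0.3:
            self.state += 1
        if self.state == 2:
            cost += 5.0  # assumed penalty for violating the dependability SLO
        return self.state, -cost  # reward is the negative cost


def q_learning(sim, n_states=3, n_actions=2, steps=20000,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=1):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = sim.reset()
    for _ in range(steps):
        # Get information from the simulator and choose an action.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])
        # Deploy the action and collect the simulator's feedback.
        next_state, reward = sim.step(action)
        # Standard Q-learning update toward the one-step bootstrapped target.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


if __name__ == "__main__":
    table = q_learning(ToySimulator(seed=0))
    for s, row in enumerate(table):
        print(f"state {s}: Q(wait)={row[0]:.2f}, Q(repair)={row[1]:.2f}")
```

Because the reward is defined as negative cost, maximizing the learned Q-values corresponds to minimizing the operator's long-run discounted cost in this toy setting. A multi-operator variant would need a joint or per-operator state and action space, which this sketch does not attempt.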