This experiment is part of a master's thesis conducted under the supervision of Dr. Tony A. Wood and Andreas Schlaginhaufen. The maze-escape task is formulated as a constrained Markov decision process (CMDP) with finite state and action spaces. The CMDP is solved by linear programming over the occupancy measure.
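As a rough illustration of the occupancy-measure approach, the sketch below solves a small CMDP as a linear program. It is not the thesis code. It assumes a discounted infinite-horizon setting, a normalized occupancy measure, a single cost constraint, and SciPy's `linprog` as the solver. The function `solve_cmdp`, its arguments, and the two-state toy problem are hypothetical and chosen only for illustration; they do not represent the maze used in the experiment.

```python
import numpy as np
from scipy.optimize import linprog


def solve_cmdp(P, r, c, mu0, gamma, budget):
    """Solve a discounted CMDP as an LP over the occupancy measure x[s, a].

    Assumed (hypothetical) interface:
      P[s, a, s'] transition probabilities, r[s, a] rewards, c[s, a] costs,
      mu0[s] initial state distribution, gamma discount factor in (0, 1),
      budget: upper bound on the expected normalized discounted cost.
    """
    S, A, _ = P.shape

    # Bellman flow constraints, one equality per state s':
    #   sum_a x[s', a] - gamma * sum_{s, a} P[s, a, s'] x[s, a] = (1 - gamma) * mu0[s']
    A_eq = np.zeros((S, S * A))
    for s in range(S):
        for a in range(A):
            col = s * A + a
            A_eq[s, col] += 1.0
            A_eq[:, col] -= gamma * P[s, a, :]
    b_eq = (1.0 - gamma) * mu0

    # linprog minimizes, so the reward is negated to maximize it.
    res = linprog(
        -r.reshape(-1),
        A_ub=c.reshape(1, -1),  # expected cost <= budget
        b_ub=[budget],
        A_eq=A_eq,
        b_eq=b_eq,
        bounds=(0, None),       # occupancy measures are nonnegative
        method="highs",
    )
    if not res.success:
        raise ValueError(f"LP not solved: {res.message}")

    x = res.x.reshape(S, A)
    # Recover a stationary policy pi(a | s) = x[s, a] / sum_a x[s, a];
    # states with (numerically) zero occupancy get a uniform policy.
    mass = x.sum(axis=1, keepdims=True)
    policy = np.where(mass > 1e-12, x / np.maximum(mass, 1e-12), 1.0 / A)
    return x, policy, float((r * x).sum()), float((c * x).sum())


# Hypothetical toy problem: state 0 is the start, state 1 is an absorbing goal.
# In state 0, action 0 is a risky shortcut (cost 1, reaches the goal for sure),
# action 1 is a safe detour (cost 0, reaches the goal with probability 0.5).
P = np.zeros((2, 2, 2))
P[0, 0] = [0.0, 1.0]
P[0, 1] = [0.5, 0.5]
P[1, :, 1] = 1.0
r = np.array([[0.0, 0.0], [1.0, 1.0]])  # reward only while in the goal state
c = np.array([[1.0, 0.0], [0.0, 0.0]])  # cost only for the shortcut
mu0 = np.array([1.0, 0.0])

x, policy, value, cost = solve_cmdp(P, r, c, mu0, gamma=0.9, budget=0.05)
print("occupancy measure:\n", x)
print("policy in start state:", policy[0])
print("value:", value, "cost:", cost)
```

In this toy problem the cost budget is tight enough that the optimal policy has to randomize between the shortcut and the detour in the start state. This is the usual reason for solving a CMDP in the occupancy measure rather than searching over deterministic policies.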