Text this: A multi-depot dynamic vehicle routing problem with stochastic road capacity: an MDP model and dynamic policy for post-decision state rollout algorithm in reinforcement learning