FPMCO decomposes multi-constraint RL into KL-projection sub-problems, achieving higher reward at lower computational cost than second-order rivals on the new SCIG robotics benchmark.
We investigate risk-averse stochastic optimization problems with a risk-shaping constraint in the form of a stochastic-order relation. Both univariate and multivariate orders are considered. We extend ...