Reward API Reference
Reward functions used for the hypergrid environment.
GeneralHypergridRewardModule
Bases: BaseRewardModule[HypergridEnvState, HypergridEnvParams]
Source code in gfnx/reward/hypergrid.py
__init__(R0=0.001, R1=0.5, R2=2.0)
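General reward function for hypergrids, defined as $$ R(s) = R_0 + R_1 \cdot \prod_{d=1}^D \mathbb{1}\left[ \left| \frac{s^d}{H-1} - 0.5 \right| \in (0.25, 0.5) \right] + R_2 \cdot \prod_{d=1}^D \mathbb{1}\left[ \left| \frac{s^d}{H-1} - 0.5 \right| \in (0.3, 0.4) \right] $$ where $s^d$ is the $d$-th coordinate of state $s$, $D$ is the number of dimensions, $H$ is the side length of the grid, and $R_0$, $R_1$, $R_2$ are the constructor arguments R0, R1, R2.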
Source: Madan, Kanika, et al. "Learning GFlowNets from partial episodes for improved convergence and stability." International Conference on Machine Learning. PMLR, 2023.
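To make the formula concrete, here is a minimal pure-Python sketch of the reward for a single state. It is not the gfnx implementation: the function name `hypergrid_reward` and its `side` argument (standing in for $H$) are hypothetical, and the intervals are treated as open exactly as written in the formula, so the library's handling of endpoint values may differ.

```python
def hypergrid_reward(state, side, R0=0.001, R1=0.5, R2=2.0):
    """Illustrative reward for one hypergrid state (hypothetical helper, not gfnx API).

    state: sequence of integer coordinates, each in [0, side - 1]
    side:  grid side length H (must be greater than 1)
    """

    def dist(coord):
        # Distance of the normalized coordinate from the grid center.
        return abs(coord / (side - 1) - 0.5)

    # A product of indicators over dimensions is 1 only if every dimension
    # satisfies the condition. Intervals are open, as written in the formula.
    outer = all(0.25 < dist(c) < 0.5 for c in state)
    inner = all(0.3 < dist(c) < 0.4 for c in state)

    return R0 + R1 * float(outer) + R2 * float(inner)


# Example on a 2-dimensional grid with side length 16.
center_reward = hypergrid_reward((7, 7), side=16)  # only the R0 term applies
outer_reward = hypergrid_reward((1, 1), side=16)   # R0 + R1
peak_reward = hypergrid_reward((2, 2), side=16)    # R0 + R1 + R2
```

With the default arguments, most of the grid receives only the small base reward R0, while the high-reward modes are the narrow bands where every coordinate satisfies the second condition.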