mapl-cirup
MArkov PLanning with CIRcuit bellman UPdates
mapl-cirup (pronounced like “maple syrup”) is a Markov decision process (MDP) solver that learns a symbolic policy function by performing the Bellman update with dynamic decision circuits. The resulting dynamic decision circuit can also be used for gradient-based parameter learning.
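For readers unfamiliar with the Bellman update, here is a minimal sketch of the plain tabular version on a toy two-state MDP. It is not mapl-cirup's circuit-based method and does not use its API: the states, actions, rewards, discount factor, and all function names below are hypothetical and chosen only for illustration.

```python
# Illustrative only: a tabular Bellman update on a hypothetical 2-state MDP.
# mapl-cirup performs this update symbolically with dynamic decision circuits;
# this sketch does not attempt to reproduce that.

GAMMA = 0.9  # discount factor (assumed for the example)

# TRANSITIONS[state][action] = list of (probability, next_state)
TRANSITIONS = {
    0: {"stay": [(1.0, 0)], "move": [(0.8, 1), (0.2, 0)]},
    1: {"stay": [(1.0, 1)], "move": [(1.0, 0)]},
}

# REWARDS[state] = immediate reward received in that state
REWARDS = {0: 0.0, 1: 1.0}


def bellman_update(values):
    """One backup: V'(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) * V(s')."""
    new_values, policy = {}, {}
    for state, actions in TRANSITIONS.items():
        best_action, best_q = None, float("-inf")
        for action, outcomes in actions.items():
            expected_next = sum(p * values[nxt] for p, nxt in outcomes)
            q = REWARDS[state] + GAMMA * expected_next
            if q > best_q:
                best_action, best_q = action, q
        new_values[state] = best_q
        policy[state] = best_action
    return new_values, policy


def value_iteration(tol=1e-8, max_iters=10_000):
    """Repeat the Bellman update until the values stop changing."""
    values = {state: 0.0 for state in TRANSITIONS}
    policy = {}
    for _ in range(max_iters):
        new_values, policy = bellman_update(values)
        delta = max(abs(new_values[s] - values[s]) for s in values)
        values = new_values
        if delta < tol:
            break
    return values, policy


values, policy = value_iteration()
print(policy)
print(values)
```

In this toy problem the greedy policy moves from state 0 toward the rewarding state 1 and then stays there. A symbolic solver represents the value and policy functions compactly instead of enumerating states in a table as done here.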
More details can be found in the AAAI 2024 paper and in the GitHub repository.