mapl-cirup

MArkov PLanning with CIRcuit bellman UPdates

MArkov PLanning with CIRcuit bellman UPdates (pronounced “maple syrup”) is an MDP solver that learns a symbolic policy function by performing the Bellman update with dynamic decision circuits. The resulting dynamic decision circuit can also be used for gradient-based parameter learning.
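For readers unfamiliar with the Bellman update, the following is a minimal sketch of it in plain tabular form on a tiny, made-up two-state MDP. It is not mapl-cirup's implementation and does not use its API: the solver performs this update symbolically on dynamic decision circuits, whereas the sketch enumerates states explicitly. The states, actions, transition probabilities, rewards, and discount factor are all assumptions chosen for illustration.

```python
# Illustrative only: a tabular Bellman update on a hand-made MDP.
# All names and numbers below are hypothetical; none come from mapl-cirup.

GAMMA = 0.9  # discount factor (assumed)

STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]

# TRANSITIONS[state][action] -> list of (probability, next_state)
TRANSITIONS = {
    "s0": {"stay": [(1.0, "s0")], "move": [(0.8, "s1"), (0.2, "s0")]},
    "s1": {"stay": [(1.0, "s1")], "move": [(1.0, "s0")]},
}

# REWARDS[state][action] -> immediate reward
REWARDS = {
    "s0": {"stay": 0.0, "move": 0.0},
    "s1": {"stay": 1.0, "move": 0.0},
}


def q_value(values, state, action):
    """Immediate reward plus discounted expected value of the next state."""
    expected_next = sum(p * values[nxt] for p, nxt in TRANSITIONS[state][action])
    return REWARDS[state][action] + GAMMA * expected_next


def bellman_update(values):
    """One Bellman update: each state takes the value of its best action."""
    return {s: max(q_value(values, s, a) for a in ACTIONS) for s in STATES}


def value_iteration(tolerance=1e-8, max_iterations=10_000):
    """Repeat the Bellman update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_iterations):
        new_values = bellman_update(values)
        if max(abs(new_values[s] - values[s]) for s in STATES) < tolerance:
            return new_values
        values = new_values
    return values


def greedy_policy(values):
    """Pick, in every state, the action with the highest Q-value."""
    return {s: max(ACTIONS, key=lambda a: q_value(values, s, a)) for s in STATES}


VALUES = value_iteration()
POLICY = greedy_policy(VALUES)
print(POLICY)
```

In this toy problem the greedy policy moves from `s0` towards the rewarding state `s1` and then stays there. The explicit dictionaries grow with the number of states, which is the cost a symbolic, circuit-based representation is meant to avoid.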

More details can be found in the AAAI 2024 paper and in the GitHub repository.