IRL and RL from Behavior and Internal States

Using real or artificial behavior data to extract the reward functions of behaving agents via inverse reinforcement learning (IRL), and studying how good or useful those functions are as we increase data size. Is there a limit here? Perhaps we will find that what matters is not the size of the dataset but certain characteristics of it.
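As a concrete starting point, here is a minimal sketch of this scaling question: maximum-entropy IRL (Ziebart et al., 2008) on a toy chain-world MDP, sweeping the number of expert trajectories and measuring how well the recovered reward correlates with the true one. The MDP, hyperparameters, and evaluation metric are illustrative choices, not a proposal for the actual analysis.

```python
# Sketch: does recovered-reward quality saturate with dataset size?
# MaxEnt IRL on a toy chain MDP with one-hot state features; everything
# here (MDP, horizon, learning rate) is an illustrative assumption.
import numpy as np

N_STATES, N_ACTIONS, HORIZON = 16, 2, 20   # chain world: move left / right
GAMMA = 0.95

# Deterministic transitions: action 0 = left, action 1 = right.
P = np.zeros((N_STATES, N_ACTIONS, N_STATES))
for s in range(N_STATES):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, N_STATES - 1)] = 1.0

true_reward = np.zeros(N_STATES)
true_reward[-1] = 1.0                       # goal at the right end

def soft_value_iteration(reward):
    """Softmax ('MaxEnt') policy for a given reward vector."""
    v = np.zeros(N_STATES)
    for _ in range(200):
        q = reward[:, None] + GAMMA * (P @ v)   # (S, A)
        v = np.logaddexp.reduce(q, axis=1)
    return np.exp(q - v[:, None])               # policy, rows sum to 1

def sample_trajectories(policy, n, rng):
    trajs = []
    for _ in range(n):
        s, traj = 0, []
        for _ in range(HORIZON):
            a = rng.choice(N_ACTIONS, p=policy[s])
            traj.append(s)
            s = rng.choice(N_STATES, p=P[s, a])
        trajs.append(traj)
    return trajs

def state_visitation(policy):
    """Average state-visitation frequency over the horizon, start state 0."""
    d = np.zeros(N_STATES); d[0] = 1.0
    total = np.zeros(N_STATES)
    for _ in range(HORIZON):
        total += d
        d = np.einsum("s,sa,san->n", d, policy, P)
    return total / HORIZON

def maxent_irl(trajs, lr=0.1, iters=300):
    """Gradient ascent on the MaxEnt likelihood with one-hot state features."""
    expert_freq = np.bincount(np.concatenate(trajs), minlength=N_STATES)
    expert_freq = expert_freq / expert_freq.sum()
    reward = np.zeros(N_STATES)
    for _ in range(iters):
        policy = soft_value_iteration(reward)
        reward += lr * (expert_freq - state_visitation(policy))
    return reward

rng = np.random.default_rng(0)
expert_policy = soft_value_iteration(true_reward * 5)  # sharpened expert
for n in (5, 50, 500):
    learned = maxent_irl(sample_trajectories(expert_policy, n, rng))
    corr = np.corrcoef(learned, true_reward)[0, 1]
    print(f"{n:4d} trajectories -> reward correlation {corr:.3f}")
```

Plotting the correlation (or any preferred reward-similarity metric) against dataset size is one way to probe whether recovery quality plateaus, which is the "is there a limit" question above.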

Anticipating the arrival of neural and biometric data and evaluating whether such datasets would improve our representations of values and reward functions. In other words, is there a signal as informative as behavior (e.g., internal thoughts or neural states) that we can train our IRL/RL algorithms on?

IRL from massively large behavior datasets

What if we could capture hundreds of hours (or more) of behavior data from a single organism? Could we use IRL to reliably extract reward functions and policies that are reasonable or plausible?
Note: We are in the process of curating a large dataset with millions of time points of rodent behavior, to be released on eLife in August for use at the hackathon. More updates soon.
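One practical question such a dataset raises is preprocessing: tabular IRL methods like the sketch above expect discrete trajectories, not millions of continuous frames. Below is a hedged sketch of one way to bridge that gap by streaming the recording through a mini-batch clustering step. The file name, feature layout (frames x pose coordinates), cluster count, and episode length are all assumptions, since the dataset has not yet been released.

```python
# Sketch: turning a multi-hour continuous behavior recording into discrete
# trajectories that a tabular IRL method can consume.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

N_BEHAVIOR_STATES = 32      # size of the discrete state space handed to IRL
EPISODE_LEN = 600           # e.g. 20 s episodes at 30 Hz (assumed frame rate)
CHUNK = 100_000             # frames processed per streaming step

# Memory-map so millions of time points never have to fit in RAM at once.
pose = np.load("rodent_pose.npy", mmap_mode="r")   # hypothetical file, (T, D)

# Discretize continuous pose features into behavioral states; partial_fit
# lets us fit on streamed chunks rather than the full array.
kmeans = MiniBatchKMeans(n_clusters=N_BEHAVIOR_STATES, random_state=0)
for start in range(0, len(pose), CHUNK):
    kmeans.partial_fit(np.asarray(pose[start:start + CHUNK]))

# Label every frame, then cut the long recording into fixed-length episodes.
states = np.concatenate([
    kmeans.predict(np.asarray(pose[s:s + CHUNK]))
    for s in range(0, len(pose), CHUNK)
])
n_episodes = len(states) // EPISODE_LEN
trajectories = states[: n_episodes * EPISODE_LEN].reshape(n_episodes, EPISODE_LEN)

print(f"{len(states):,} frames -> {n_episodes:,} trajectories "
      f"over {N_BEHAVIOR_STATES} discrete states")
# `trajectories` could then be fed to an IRL routine such as maxent_irl above.
```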

RL from internal states

What if, in addition to behavior data, we also had real-time access to an organism's internal states (e.g., brain states)? Would this improve the accuracy of our reward functions? We may not have access to the neural states of biological organisms, but another option is to train deep RL networks on behavior generative models of artificial organisms, whose internal states are fully observable.
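To make the artificial-organism variant concrete, one option is to expose the generative model's latent variables to the learner as extra observation channels and compare reward recovery with and without them. The sketch below does this with a gymnasium ObservationWrapper; the environment and the internal_state() accessor are hypothetical stand-ins for a behavior generative model that exposes its latents.

```python
# Sketch: exposing an artificial organism's internal state to a DeepRL learner
# by concatenating it onto the behavioral observation.
import numpy as np
import gymnasium as gym
from gymnasium.spaces import Box

class InternalStateWrapper(gym.ObservationWrapper):
    """Appends a latent 'internal state' vector to each observation."""

    def __init__(self, env, internal_dim):
        super().__init__(env)
        self.internal_dim = internal_dim
        low = np.concatenate([env.observation_space.low,
                              -np.inf * np.ones(internal_dim)])
        high = np.concatenate([env.observation_space.high,
                               np.inf * np.ones(internal_dim)])
        self.observation_space = Box(low=low, high=high, dtype=np.float64)

    def observation(self, obs):
        unwrapped = self.env.unwrapped
        if hasattr(unwrapped, "internal_state"):
            # Hypothetical accessor: latent variables of the generative model
            # driving the behavior (e.g. hunger, arousal, hidden RNN state).
            internal = unwrapped.internal_state()
        else:
            internal = np.zeros(self.internal_dim)  # placeholder when absent
        return np.concatenate([obs, internal])

# Usage: train the same agent with and without the wrapper, run IRL on the
# resulting trajectories, and compare how well the designed reward is recovered.
env = InternalStateWrapper(gym.make("MountainCarContinuous-v0"), internal_dim=4)
obs, _ = env.reset(seed=0)
print(obs.shape)   # original 2-D observation plus 4 internal-state channels
```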