State, actions taken, and reward for one step. Taking in the actions actually taken, in addition to the state, allows you to not just take the max action but to use your own inference strategy, for example when some actions are invalid.
#include <zeroordertrainer.h>
Inherits cpid::ReplayBufferFrame.
inline cpid::OnlineZORBReplayBufferFrame::OnlineZORBReplayBufferFrame (
        std::vector< torch::Tensor >  state,
        std::vector< long >  actions,
        double  reward
    )
std::vector<long> cpid::OnlineZORBReplayBufferFrame::actions |
double cpid::OnlineZORBReplayBufferFrame::reward |
std::vector<torch::Tensor> cpid::OnlineZORBReplayBufferFrame::state |
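A minimal sketch of constructing a frame from one environment step, assuming zeroordertrainer.h and the libtorch headers are on the include path; the tensor shapes, action indices, and reward value are placeholders, not values prescribed by cpid:

    #include <zeroordertrainer.h>
    #include <torch/torch.h>

    #include <vector>

    int main() {
      // Placeholder state features for one step; shapes are illustrative only.
      std::vector<torch::Tensor> state = {torch::rand({10}), torch::rand({4})};
      // Indices of the actions that were actually taken at this step.
      std::vector<long> actions = {2, 0};
      double reward = 1.0;

      // Construct the frame as documented above.
      cpid::OnlineZORBReplayBufferFrame frame(state, actions, reward);

      // The stored actions can later drive a custom inference strategy,
      // e.g. skipping actions that were invalid in this state.
      double r = frame.reward;
      (void)r;
      return 0;
    }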
The documentation for this struct was generated from the following file: zeroordertrainer.h