TorchCraftAI
A bot for machine learning research on StarCraft: Brood War
This sampler expects as input an unordered_map<string, Variant>, containing an entry policyKey, which is a tensor of size [b, n].
#include <sampler.h>
Inherits cpid::BaseSampler.
Public Member Functions
ContinuousGaussianSampler (const std::string &policyKey=kPiKey, const std::string &stdKey=kSigmaKey, const std::string &actionKey=kActionKey, const std::string &pActionKey=kPActionKey)
ag::Variant sample (ag::Variant in) override
ag::Variant computeProba (const ag::Variant &in, const ag::Variant &action) override
Public Member Functions inherited from cpid::BaseSampler
BaseSampler ()
virtual ~BaseSampler ()=default
Protected Attributes
std::string policyKey_
std::string stdKey_
std::string actionKey_
std::string pActionKey_
Detailed Description
The input is an unordered_map<string, Variant> whose policyKey entry is a tensor of size [b, n].
The sampler outputs the same map with a new key actionKey (kActionKey by default), a tensor of size [b] where each entry action[i] is sampled from a normal distribution centered at policy[i]. The stdKey entry must also be set; it is used as the standard deviation of the normal distribution. It can be either a float/double, in which case the same deviation applies to the whole batch, or a tensor of the same shape as the policy, for finer control. The sampler also adds a key pActionKey (kPActionKey by default), which holds the probability of the sampled action.
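The sampling rule described above can be illustrated with a minimal standalone sketch. It is not the cpid implementation: it uses only the C++ standard library instead of tensors and ag::Variant, it treats the policy as a flat vector of means, and the names GaussianSample, normalPdf and sampleGaussian are hypothetical. It also assumes that the value stored under pActionKey is the normal density evaluated at the sampled action, which the description suggests but does not state explicitly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the two entries the sampler adds to the map:
// the drawn actions and the density of each draw. Not part of cpid.
struct GaussianSample {
  std::vector<double> action;
  std::vector<double> pAction;
};

// Density of the normal distribution N(mean, sigma^2) evaluated at x.
double normalPdf(double x, double mean, double sigma) {
  const double kPi = 3.14159265358979323846;
  const double z = (x - mean) / sigma;
  return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * kPi));
}

// Draws one action per policy entry. `sigma` holds either a single value
// (shared by the whole batch) or one value per policy entry, mirroring the
// two accepted forms of the stdKey entry.
GaussianSample sampleGaussian(
    const std::vector<double>& policy,
    const std::vector<double>& sigma,
    std::mt19937& rng) {
  assert(sigma.size() == 1 || sigma.size() == policy.size());
  GaussianSample out;
  out.action.reserve(policy.size());
  out.pAction.reserve(policy.size());
  for (std::size_t i = 0; i < policy.size(); ++i) {
    const double s = sigma.size() == 1 ? sigma[0] : sigma[i];
    assert(s > 0.0);
    // Normal distribution centered at policy[i] with standard deviation s.
    std::normal_distribution<double> dist(policy[i], s);
    const double a = dist(rng);
    out.action.push_back(a);
    out.pAction.push_back(normalPdf(a, policy[i], s));
  }
  return out;
}
```

Calling sampleGaussian(policy, {0.5}, rng) corresponds to passing a scalar standard deviation, while passing a vector of the same length as policy corresponds to the per-entry form.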
cpid::ContinuousGaussianSampler::ContinuousGaussianSampler (const std::string & policyKey = kPiKey, const std::string & stdKey = kSigmaKey, const std::string & actionKey = kActionKey, const std::string & pActionKey = kPActionKey)
sample() and computeProba() are both virtual overrides, reimplemented from cpid::BaseSampler.