TorchCraftAI
A bot for machine learning research on StarCraft: Brood War
This sampler expects as input an unordered_map<string, Variant>, containing an entry policyKey, which is a tensor of size [b, n].
#include <sampler.h>
Inherits cpid::BaseSampler.
Public Member Functions
  ContinuousGaussianSampler(const std::string &policyKey = kPiKey, const std::string &stdKey = kSigmaKey, const std::string &actionKey = kActionKey, const std::string &pActionKey = kPActionKey)
  ag::Variant sample(ag::Variant in) override
  ag::Variant computeProba(const ag::Variant &in, const ag::Variant &action) override
Public Member Functions inherited from cpid::BaseSampler
  BaseSampler()
  virtual ~BaseSampler() = default
Protected Attributes
  std::string policyKey_
  std::string stdKey_
  std::string actionKey_
  std::string pActionKey_
This sampler expects as input an unordered_map<string, Variant> containing an entry policyKey, which is a tensor of size [b, n].
It returns the same map with two added entries. The first, stored under actionKey (kActionKey by default), is a tensor of size [b] in which each entry action[i] is sampled from a normal distribution centered at policy[i]. The second, stored under pActionKey, corresponds to the probability of the sampled action. The input must also contain an entry stdKey, which is used as the standard deviation of the normal distribution. It can be either a float/double, in which case the same deviation applies to the whole batch, or a tensor of the same shape as the policy, for finer control.
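The sampling rule described above can be sketched in plain C++ without the cpid or tensor types. This is only an illustration under stated assumptions: gaussianPdf, SampleResult and sampleGaussian are hypothetical names that do not exist in TorchCraftAI, the policy is flattened to a vector of means, and the "probability of the sampled action" is taken to mean the Gaussian density evaluated at the sample.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of cpid): density of N(mean, sigma^2) at x.
inline double gaussianPdf(double x, double mean, double sigma) {
  const double twoPi = 2.0 * std::acos(-1.0);
  const double z = (x - mean) / sigma;
  return std::exp(-0.5 * z * z) / (sigma * std::sqrt(twoPi));
}

// Hypothetical result type standing in for the two entries added to the map.
struct SampleResult {
  std::vector<double> action;   // role of the actionKey entry
  std::vector<double> pAction;  // role of the pActionKey entry
};

// Draws one action per policy entry. sigma holds either a single value
// (shared by the whole batch) or one value per policy entry.
inline SampleResult sampleGaussian(const std::vector<double>& policy,
                                   const std::vector<double>& sigma,
                                   std::mt19937& rng) {
  assert(sigma.size() == 1 || sigma.size() == policy.size());
  SampleResult out;
  for (std::size_t i = 0; i < policy.size(); ++i) {
    const double s = (sigma.size() == 1) ? sigma[0] : sigma[i];
    std::normal_distribution<double> dist(policy[i], s);
    const double a = dist(rng);
    out.action.push_back(a);
    out.pAction.push_back(gaussianPdf(a, policy[i], s));
  }
  return out;
}
```

Calling sampleGaussian({0.0, 1.0, -2.0}, {0.5}, rng) uses one deviation for every entry, while passing a sigma vector of the same length as the policy gives each entry its own deviation, mirroring the two forms of stdKey described above.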
cpid::ContinuousGaussianSampler::ContinuousGaussianSampler(const std::string &policyKey = kPiKey, const std::string &stdKey = kSigmaKey, const std::string &actionKey = kActionKey, const std::string &pActionKey = kPActionKey)
sample() and computeProba() are virtual overrides, reimplemented from cpid::BaseSampler.