TorchCraftAI
A bot for machine learning research on StarCraft: Brood War
This sampler expects as input an unordered_map<string, Variant>, which contains an entry policyKey, which is a tensor of size [b, n].
#include <sampler.h>
Inherits cpid::BaseSampler.
Public Member Functions
MultinomialSampler (const std::string &policyKey=kPiKey, const std::string &actionKey=kActionKey, const std::string &pActionKey=kPActionKey)
ag::Variant sample (ag::Variant in) override
ag::Variant computeProba (const ag::Variant &in, const ag::Variant &action) override
Public Member Functions inherited from cpid::BaseSampler
BaseSampler ()
virtual ~BaseSampler ()=default
Protected Attributes
std::string policyKey_
std::string actionKey_
std::string pActionKey_
This sampler expects as input an unordered_map<string, Variant> containing an entry policyKey, a tensor of size [b, n] referred to below as pi.
It outputs the same map with two new keys. actionKey holds a tensor of size [b] whose entries lie in [0, n-1]; each entry is the result of multinomial sampling over the corresponding row of pi. pActionKey holds the probability of each sampled action.
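The behavior described above can be sketched in plain standard C++, without the library's ag::Variant or tensor types. This is only an illustration under stated assumptions, not the MultinomialSampler implementation: Policy, SampleResult, and sampleMultinomial are hypothetical names, and each row of pi is assumed to already hold non-negative probabilities.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the [b, n] policy tensor: b rows of n probabilities.
using Policy = std::vector<std::vector<double>>;

// Hypothetical stand-in for the two keys added to the output map.
struct SampleResult {
  std::vector<std::size_t> action; // size b, each entry in [0, n-1]
  std::vector<double> pAction;     // size b, probability of each sampled action
};

// Draws one action per row of pi and records the probability of that action.
// Assumption: every row is non-empty and holds non-negative weights.
SampleResult sampleMultinomial(const Policy& pi, std::mt19937& rng) {
  SampleResult out;
  out.action.reserve(pi.size());
  out.pAction.reserve(pi.size());
  for (const auto& row : pi) {
    assert(!row.empty());
    // discrete_distribution picks index i with probability proportional to row[i].
    std::discrete_distribution<std::size_t> dist(row.begin(), row.end());
    const std::size_t a = dist(rng);
    out.action.push_back(a);
    out.pAction.push_back(row[a]);
  }
  return out;
}
```

Calling sampleMultinomial with a policy of b rows yields b actions and b probabilities, mirroring the shapes that the description gives for actionKey and pActionKey.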
cpid::MultinomialSampler::MultinomialSampler (const std::string &policyKey = kPiKey, const std::string &actionKey = kActionKey, const std::string &pActionKey = kPActionKey)
sample and computeProba are virtual overrides, reimplemented from cpid::BaseSampler.
policyKey_, actionKey_, and pActionKey_ are protected.