TorchCraftAI
A bot for machine learning research on StarCraft: Brood War
Functions
float betaSampling (float x, float a, float b)
Gets a sample from a Beta(a, b) given x, a sample from a uniform distribution on [0, 1]: x^(a-1) * (1-x)^(b-1) / (gamma(a) * gamma(b) / gamma(a+b)), where gamma is the gamma function (see http://www.cplusplus.com/reference/cmath/tgamma/).
float thompsonSamplingScore (cherrypi::model::BuildOrderCount const &count, float thompson_a, float thompson_b)
Computes a score for build order j based on Thompson sampling (stochastic).
float thompsonRollingSamplingScore (cherrypi::model::BuildOrderCount const &count, float thompson_a, float thompson_b, float thompson_gamma)
Computes a Thompson sampling score on a rolling average (with exponential decay).
float ucb1Score (cherrypi::model::BuildOrderCount const &count, int allStrategyGamesCount, float ucb1_c)
Computes the UCB1 score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j). Untested build orders get priority.
float ucb1RollingScore (cherrypi::model::BuildOrderCount const &count, int allStrategyGamesCount, float ucb1_c, float ucb1_gamma)
Computes the UCB1 score on a rolling average (with exponential decay).
float expMooRollingSamplingScore (cherrypi::model::BuildOrderCount const &count, float moo_mult, float moo_gamma)
Computes the Exp Moo score on a rolling average (with exponential decay).
float maxExploitScore (cherrypi::model::BuildOrderCount const &count, int allStrategyGamesCount, float ucb1_c)
Computes a UCB1-style score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j). Build orders with a high win rate get first priority; untested build orders get second priority.
std::string chooseBuildOrder (std::map< std::string, cherrypi::model::BuildOrderCount > const &buildOrderCounts, std::string scoreAlgorithm, float ucb1_c, float bandit_gamma, float thompson_a, float thompson_b, float moo_mult)
Chooses the build order with the maximum score according to the provided scoring algorithm. This assumes the function is called once per game, or at least that the caller acts on the result of the last call.
float cherrypi::model::score::betaSampling(float x, float a, float b)
Gets a sample from a Beta(a, b) given x, a sample from a uniform distribution on [0, 1]: x^(a-1) * (1-x)^(b-1) / (gamma(a) * gamma(b) / gamma(a+b)), where gamma is the gamma function (see http://www.cplusplus.com/reference/cmath/tgamma/). As written, this expression is the Beta(a, b) probability density evaluated at x.
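The formula in the description can be written out directly with std::tgamma. The following is a minimal sketch under stated assumptions, not the library's implementation: the name betaDensity and the precondition checks are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of cherrypi): evaluates
//   x^(a-1) * (1-x)^(b-1) / (gamma(a) * gamma(b) / gamma(a+b))
// which is the Beta(a, b) density at x.
float betaDensity(float x, float a, float b) {
  // Assumed preconditions: x lies in [0, 1] and both shape parameters are positive.
  assert(x >= 0.0f && x <= 1.0f);
  assert(a > 0.0f && b > 0.0f);
  // Normalizing constant B(a, b) = gamma(a) * gamma(b) / gamma(a + b).
  float const norm = std::tgamma(a) * std::tgamma(b) / std::tgamma(a + b);
  return std::pow(x, a - 1.0f) * std::pow(1.0f - x, b - 1.0f) / norm;
}
```

With a = 1 and b = 1 the density is flat, so betaDensity(x, 1, 1) evaluates to 1 for any x in [0, 1].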
std::string cherrypi::model::score::chooseBuildOrder(const std::map< std::string, cherrypi::model::BuildOrderCount > & buildOrderCounts, std::string scoreAlgorithm, float ucb1_c, float bandit_gamma, float thompson_a, float thompson_b, float moo_mult)
Chooses the build order with the maximum score according to the provided scoring algorithm. This assumes the function is called once per game, or at least that the caller acts on the result of the last call.
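The selection step is an argmax over per-build-order scores. The sketch below illustrates that idea only; Count is a hypothetical stand-in for cherrypi::model::BuildOrderCount (whose fields are not shown on this page), and pickBest and its callback parameter are illustrative names rather than the library's API.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <map>
#include <string>

// Hypothetical stand-in for cherrypi::model::BuildOrderCount.
struct Count {
  int wins;
  int total;
};

// Returns the name of the build order with the highest score.
// On ties, the first name in map (lexicographic) order is kept.
std::string pickBest(
    std::map<std::string, Count> const& counts,
    std::function<float(Count const&)> const& score) {
  assert(!counts.empty());
  auto best = counts.begin();
  float bestScore = score(best->second);
  for (auto it = std::next(counts.begin()); it != counts.end(); ++it) {
    float const s = score(it->second);
    if (s > bestScore) {
      bestScore = s;
      best = it;
    }
  }
  return best->first;
}
```

Passing the scoring rule as a callback keeps the argmax independent of which algorithm produces the scores, which mirrors the role of the scoreAlgorithm argument only loosely.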
float cherrypi::model::score::expMooRollingSamplingScore(const cherrypi::model::BuildOrderCount & count, float moo_mult, float moo_gamma)
Computes the Exp Moo score on a rolling average (with exponential decay).
float cherrypi::model::score::maxExploitScore(const cherrypi::model::BuildOrderCount & count, int allStrategyGamesCount, float ucb1_c)
Computes a UCB1-style score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j). Build orders with a high win rate get first priority; untested build orders get second priority.
float cherrypi::model::score::thompsonRollingSamplingScore(cherrypi::model::BuildOrderCount const & count, float thompson_a, float thompson_b, float thompson_gamma)
Computes a Thompson sampling score on a rolling average (with exponential decay).
float cherrypi::model::score::thompsonSamplingScore(cherrypi::model::BuildOrderCount const & count, float thompson_a, float thompson_b)
Computes a score for build order j based on Thompson sampling (stochastic).
For each opening you keep S and F (successes and failures; equivalently S and N, with F = N - S).
a and b are hyperparameters. A good initial guess is a=1, b=1. In a game:
    for each build order i that we can start with:
        sample p_i in Beta(S_i + a, F_i + b)  // here you get your stochasticity
    j = argmax over all p_i
    play with build order j
    update S_j and F_j
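The loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: Record, sampleBeta, and thompsonPick are hypothetical names, and the Beta draw uses the standard ratio of two gamma variates from the C++ <random> header rather than the betaSampling function documented on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-opening record: S successes and F failures.
struct Record {
  int successes;
  int failures;
};

// Draws p ~ Beta(alpha, beta) as X / (X + Y),
// with X ~ Gamma(alpha, 1) and Y ~ Gamma(beta, 1).
float sampleBeta(float alpha, float beta, std::mt19937& rng) {
  assert(alpha > 0.0f && beta > 0.0f);
  std::gamma_distribution<float> gx(alpha, 1.0f);
  std::gamma_distribution<float> gy(beta, 1.0f);
  float const x = gx(rng);
  float const y = gy(rng);
  return x / (x + y);
}

// One Thompson sampling step: sample p_i from Beta(S_i + a, F_i + b)
// for every opening and return the index j with the largest sample.
std::size_t thompsonPick(
    std::vector<Record> const& records, float a, float b, std::mt19937& rng) {
  assert(!records.empty());
  std::size_t best = 0;
  float bestP = -1.0f;
  for (std::size_t i = 0; i < records.size(); ++i) {
    float const p =
        sampleBeta(records[i].successes + a, records[i].failures + b, rng);
    if (p > bestP) {
      bestP = p;
      best = i;
    }
  }
  return best;
}
```

After the game, the caller would increment successes or failures for the chosen index, which is the "update S_j and F_j" step. With a=1 and b=1, an opening with no history is sampled from a uniform prior, so it still gets picked from time to time.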
float cherrypi::model::score::ucb1RollingScore(const cherrypi::model::BuildOrderCount & count, int allStrategyGamesCount, float ucb1_c, float ucb1_gamma)
Computes the UCB1 score on a rolling average (with exponential decay).
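This page does not say how the rolling average is maintained. One common way to apply exponential decay to win and game counts is to multiply the running totals by gamma before adding each new result, so that a game played k games ago carries weight gamma^k. The sketch below shows that generic technique only; Discounted and discountedCounts are hypothetical names, and the library's actual bookkeeping may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for exponentially discounted counts.
struct Discounted {
  float wins;
  float total;
};

// outcomes: game results for one build order, oldest first (true = win).
// gamma in (0, 1]: gamma = 1 reproduces plain counts, smaller values
// forget old games faster.
Discounted discountedCounts(std::vector<bool> const& outcomes, float gamma) {
  assert(gamma > 0.0f && gamma <= 1.0f);
  Discounted d{0.0f, 0.0f};
  for (bool won : outcomes) {
    // Decay everything seen so far, then add the new game at full weight.
    d.wins = gamma * d.wins + (won ? 1.0f : 0.0f);
    d.total = gamma * d.total + 1.0f;
  }
  return d;
}
```

The ratio wins / total is then a win rate that weights recent games more heavily, and it can stand in for win_j / total_j in a score such as UCB1.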
float cherrypi::model::score::ucb1Score(const cherrypi::model::BuildOrderCount & count, int allStrategyGamesCount, float ucb1_c)
Computes the UCB1 score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j). Untested build orders get priority.
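The formula above translates almost directly into code. The following is a minimal sketch, not the library's implementation: the name ucb1 and the plain integer arguments are hypothetical, returning +infinity for an untested build order is an assumption about how "priority" is realized, and the ucb1_c parameter is left out because the formula as documented does not show where it enters.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper:
//   (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j)
// wins, total: results for build order j; allGames: games over all build orders.
float ucb1(int wins, int total, int allGames) {
  assert(wins >= 0 && total >= wins && allGames >= total);
  if (total == 0) {
    // Assumption: an untested build order outranks every tested one.
    return std::numeric_limits<float>::infinity();
  }
  float const mean = static_cast<float>(wins) / total;
  float const bonus =
      std::sqrt(2.0f * std::log(static_cast<float>(allGames)) / total);
  return mean + bonus;
}
```

The second term shrinks as total_j grows, so of two build orders with the same win rate, the one with fewer games gets the higher score.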