TorchCraftAI
A bot for machine learning research on StarCraft: Brood War
cherrypi::model::score Namespace Reference

Functions

float betaSampling (float x, float a, float b)
 Gets a sample from a Beta(a, b) distribution given x, a sample from the uniform distribution on [0, 1]: x^(a-1)*(1-x)^(b-1) / (gamma(a)*gamma(b)/gamma(a+b)), where gamma is the gamma function (std::tgamma, see http://www.cplusplus.com/reference/cmath/tgamma/).
 
float thompsonSamplingScore (cherrypi::model::BuildOrderCount const &count, float thompson_a, float thompson_b)
 Computes a score for build order j based on Thompson sampling (stochastic)
 
float thompsonRollingSamplingScore (cherrypi::model::BuildOrderCount const &count, float thompson_a, float thompson_b, float thompson_gamma)
 Computes a Thompson sampling score on a rolling average (w/ exponential decay)
 
float ucb1Score (cherrypi::model::BuildOrderCount const &count, int allStrategyGamesCount, float ucb1_c)
 Computes the UCB1 score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j). Untested build orders get priority.
 
float ucb1RollingScore (cherrypi::model::BuildOrderCount const &count, int allStrategyGamesCount, float ucb1_c, float ucb1_gamma)
 Computes UCB1 score on a rolling average (w/ exponential decay)
 
float expMooRollingSamplingScore (cherrypi::model::BuildOrderCount const &count, float moo_mult, float moo_gamma)
 Computes Exp Moo score on a rolling average (w/ exponential decay)
 
float maxExploitScore (cherrypi::model::BuildOrderCount const &count, int allStrategyGamesCount, float ucb1_c)
 Computes a UCB1-style score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j), except that builds with a high win rate get first priority and untested build orders get second priority.
 
std::string chooseBuildOrder (std::map< std::string, cherrypi::model::BuildOrderCount > const &buildOrderCounts, std::string scoreAlgorithm, float ucb1_c, float bandit_gamma, float thompson_a, float thompson_b, float moo_mult)
 Chooses the build order with the maximum score according to the provided scoring algorithm. This assumes the function is called once per game, or at least that only the result of the last call is acted upon.
 

Function Documentation

float cherrypi::model::score::betaSampling ( float  x,
float  a,
float  b 
)

Gets a sample from a Beta(a, b) distribution given x, a sample from the uniform distribution on [0, 1]: x^(a-1)*(1-x)^(b-1) / (gamma(a)*gamma(b)/gamma(a+b)), where gamma is the gamma function (std::tgamma, see http://www.cplusplus.com/reference/cmath/tgamma/).
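
The expression in the description has the form of the Beta(a, b) density evaluated at x, normalized by gamma(a)*gamma(b)/gamma(a+b). The following is a minimal sketch of that expression only, not the library's implementation; the name betaExpression and the range checks are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: evaluates
//   x^(a-1) * (1-x)^(b-1) / (gamma(a) * gamma(b) / gamma(a+b))
// exactly as written in the description above.
float betaExpression(float x, float a, float b) {
  // Assumed preconditions: x lies in [0, 1] and both shape parameters are positive.
  assert(x >= 0.0f && x <= 1.0f);
  assert(a > 0.0f && b > 0.0f);
  // Normalizing constant gamma(a) * gamma(b) / gamma(a + b).
  float const norm = std::tgamma(a) * std::tgamma(b) / std::tgamma(a + b);
  return std::pow(x, a - 1.0f) * std::pow(1.0f - x, b - 1.0f) / norm;
}
```

With a = 1 and b = 1 both exponents are zero and the normalizing constant is 1, so the expression evaluates to 1 for every x in [0, 1].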

std::string cherrypi::model::score::chooseBuildOrder ( const std::map< std::string, cherrypi::model::BuildOrderCount > &  buildOrderCounts,
std::string  scoreAlgorithm,
float  ucb1_c,
float  bandit_gamma,
float  thompson_a,
float  thompson_b,
float  moo_mult 
)

Chooses the build order with the maximum score according to the provided scoring algorithm. This assumes the function is called once per game, or at least that only the result of the last call is acted upon.
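
The selection step can be pictured as an argmax over the map of build orders. The sketch below assumes a simplified Counts struct in place of cherrypi::model::BuildOrderCount and a caller-supplied scoring callback instead of the scoreAlgorithm string; both are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for cherrypi::model::BuildOrderCount.
struct Counts {
  int wins = 0;
  int total = 0;
};

// Returns the name of the build order with the largest score.
// Ties keep the first name in map (lexicographic) order.
std::string pickMaxScore(
    std::map<std::string, Counts> const& counts,
    std::function<float(Counts const&)> const& score) {
  assert(!counts.empty());
  auto best = counts.begin();
  float bestScore = score(best->second);
  for (auto it = counts.begin(); it != counts.end(); ++it) {
    float const s = score(it->second);
    if (s > bestScore) {
      bestScore = s;
      best = it;
    }
  }
  return best->first;
}
```

A stochastic scoring callback (for example one based on Thompson sampling) returns a different score on each call, which is one reason to act only on the most recent result.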

float cherrypi::model::score::expMooRollingSamplingScore ( const cherrypi::model::BuildOrderCount count,
float  moo_mult,
float  moo_gamma 
)

Computes Exp Moo score on a rolling average (w/ exponential decay)

float cherrypi::model::score::maxExploitScore ( const cherrypi::model::BuildOrderCount count,
int  allStrategyGamesCount,
float  ucb1_c 
)

Computes a UCB1-style score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j), except that builds with a high win rate get first priority and untested build orders get second priority.

float cherrypi::model::score::thompsonRollingSamplingScore ( cherrypi::model::BuildOrderCount const &  count,
float  thompson_a,
float  thompson_b,
float  thompson_gamma 
)

Computes a Thompson sampling score on a rolling average (w/ exponential decay)

float cherrypi::model::score::thompsonSamplingScore ( cherrypi::model::BuildOrderCount const &  count,
float  thompson_a,
float  thompson_b 
)

Computes a score for build order j based on Thompson sampling (stochastic)

For each opening, keep S and F (successes and failures), or equivalently S and N (games played), with F = N - S.

a and b are hyperparameters. A good initial guess is a=1, b=1.

In a game, for each build order i that we can start with:
    sample p_i in Beta(S_i + a, F_i + b)   // here you get your stochasticity
j = argmax over all p_i
play with build order j
update S_j and F_j
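
The procedure above can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not the library's code: the Outcome struct and the function names are hypothetical, and because the C++ standard library has no Beta distribution, the draw is built from two Gamma draws (if X ~ Gamma(alpha, 1) and Y ~ Gamma(beta, 1), then X / (X + Y) ~ Beta(alpha, beta)).

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-opening record: S successes and F failures.
struct Outcome {
  int successes = 0;
  int failures = 0;
};

// Draws one value from Beta(alpha, beta) using two Gamma draws.
float sampleBeta(double alpha, double beta, std::mt19937& rng) {
  assert(alpha > 0.0 && beta > 0.0);
  std::gamma_distribution<double> gx(alpha, 1.0);
  std::gamma_distribution<double> gy(beta, 1.0);
  double const x = gx(rng);
  double const y = gy(rng);
  return static_cast<float>(x / (x + y));
}

// Samples p_i from Beta(S_i + a, F_i + b) for every opening and
// returns the index j with the largest sample.
std::size_t thompsonPick(
    std::vector<Outcome> const& arms, float a, float b, std::mt19937& rng) {
  assert(!arms.empty());
  std::size_t best = 0;
  float bestP = -1.0f;
  for (std::size_t i = 0; i < arms.size(); ++i) {
    float const p = sampleBeta(arms[i].successes + a, arms[i].failures + b, rng);
    if (p > bestP) {
      bestP = p;
      best = i;
    }
  }
  return best;
}
```

After the game, the caller increments successes or failures for the chosen index, which corresponds to the "update S_j and F_j" step.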

float cherrypi::model::score::ucb1RollingScore ( const cherrypi::model::BuildOrderCount count,
int  allStrategyGamesCount,
float  ucb1_c,
float  ucb1_gamma 
)

Computes UCB1 score on a rolling average (w/ exponential decay)
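
This page does not state how the exponential decay is applied. One common reading, shown here purely as an assumption, is that every past game is down-weighted by a factor gamma each time a newer game is recorded, so that wins and totals become decayed sums. The DecayedCounts struct and the decayedCounts function are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical decayed statistics for one build order.
struct DecayedCounts {
  float wins = 0.0f;
  float total = 0.0f;
};

// outcomes is ordered oldest to newest; true means the game was won.
// Each new game multiplies the previous sums by gamma before adding itself,
// so a game that is k games old carries weight gamma^k.
DecayedCounts decayedCounts(std::vector<bool> const& outcomes, float gamma) {
  assert(gamma > 0.0f && gamma <= 1.0f);
  DecayedCounts c;
  for (bool won : outcomes) {
    c.wins = gamma * c.wins + (won ? 1.0f : 0.0f);
    c.total = gamma * c.total + 1.0f;
  }
  return c;
}
```

With gamma = 1 this reduces to plain win and game counts; smaller values let recent games dominate the estimate.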

float cherrypi::model::score::ucb1Score ( const cherrypi::model::BuildOrderCount count,
int  allStrategyGamesCount,
float  ucb1_c 
)

Computes the UCB1 score for build order j: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j). Untested build orders get priority.
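
A minimal sketch of the formula above, using plain integer counts in place of cherrypi::model::BuildOrderCount. Two points are assumptions: priority for untested build orders is modeled as a score of +infinity, and the ucb1_c parameter is left out because this page does not say how it enters the formula. The name ucb1Sketch is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: (win_j / total_j) + sqrt(2 * log(sum(total)) / total_j).
float ucb1Sketch(int wins, int total, int allGames) {
  assert(wins >= 0 && wins <= total && total <= allGames);
  if (total == 0) {
    // Assumption: an untested build order always outranks tested ones.
    return std::numeric_limits<float>::infinity();
  }
  float const mean = static_cast<float>(wins) / total;
  float const bonus =
      std::sqrt(2.0f * std::log(static_cast<float>(allGames)) / total);
  return mean + bonus;
}
```

The exploration bonus shrinks as total_j grows, so two build orders with the same win rate are ranked in favor of the one that has been played less often.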