EpsilonGreedyRunner(bandit_returns: List[float], epsilon: float = 0.2, batch_size: int = 10000, batches: int = 10, simulations: int = 100)
Class used to run simulations of epsilon-greedy tests.
Attributes:
bandit_returns: List of average returns per bandit.
epsilon: Exploration rate, given as a fraction between 0 and 1 (the default 0.2 corresponds to 20% exploration).
batch_size: Number of examples per batch.
batches: Number of batches.
simulations: Number of simulations.
Methods:
init_bandits: Prepares everything for a new simulation.
run: Runs the simulations and tracks performance.
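
To illustrate how the parameters above interact, here is a minimal sketch of a batched epsilon-greedy simulation. It is not the implementation of EpsilonGreedyRunner. The function name simulate_epsilon_greedy, the seed parameter, the treatment of bandit_returns as Bernoulli success probabilities, and the rule that estimates are refreshed only between batches are all assumptions made for this example.

```python
import random
from typing import List, Tuple


def simulate_epsilon_greedy(
    bandit_returns: List[float],
    epsilon: float = 0.2,
    batch_size: int = 1000,
    batches: int = 10,
    simulations: int = 20,
    seed: int = 0,  # hypothetical parameter, added for reproducibility
) -> Tuple[float, List[float]]:
    """Return (mean reward per example, share of pulls per bandit)."""
    rng = random.Random(seed)
    n_arms = len(bandit_returns)
    total_reward = 0
    total_pulls = [0] * n_arms

    for _ in range(simulations):
        # Fresh statistics for every simulation.
        pulls = [0] * n_arms
        wins = [0] * n_arms

        for _ in range(batches):
            # Assumption: estimates are frozen for the whole batch.
            # Ties (including the first batch) go to the lowest index.
            means = [wins[a] / pulls[a] if pulls[a] else 0.0 for a in range(n_arms)]
            best = max(range(n_arms), key=lambda a: means[a])

            batch_pulls = [0] * n_arms
            batch_wins = [0] * n_arms
            for _ in range(batch_size):
                # Explore with probability epsilon, otherwise exploit.
                if rng.random() < epsilon:
                    arm = rng.randrange(n_arms)
                else:
                    arm = best
                batch_pulls[arm] += 1
                # Assumption: each bandit pays 1 with probability bandit_returns[arm].
                batch_wins[arm] += int(rng.random() < bandit_returns[arm])

            # Fold the batch results into the running statistics.
            for a in range(n_arms):
                pulls[a] += batch_pulls[a]
                wins[a] += batch_wins[a]

        total_reward += sum(wins)
        for a in range(n_arms):
            total_pulls[a] += pulls[a]

    n_examples = simulations * batches * batch_size
    return total_reward / n_examples, [p / n_examples for p in total_pulls]


if __name__ == "__main__":
    mean_reward, shares = simulate_epsilon_greedy([0.1, 0.5, 0.2])
    print("mean reward per example:", mean_reward)
    print("share of pulls per bandit:", shares)
```

In this sketch a larger epsilon spends more examples on random bandits, which speeds up discovery of the best bandit but lowers the reward collected once it has been found.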