
Scala for Machine Learning, Second Edition

Thompson sampling is a simple strategy, introduced 80 years ago, that has received renewed attention in recent years. It is widely used in advertising displays, marketing surveys, and financial analysis. Thompson sampling is also a Bayesian strategy, known as probability matching: the probability of selecting arm n is the posterior probability, given the rewards observed so far, that n is the arm with the maximum reward [14:4].
The strategy can be summarized as:
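For Bernoulli bandits, a common formulation keeps a Beta posterior for each arm, draws one sample from every posterior, plays the arm with the largest draw, and updates that arm's success or failure count with the observed reward. The following is a minimal sketch of that formulation, not the implementation used elsewhere in this book. It assumes a uniform Beta(1, 1) prior, and the names `ThompsonSampling`, `Arm`, `ThompsonSampler`, and `simulate` are hypothetical. To stay within the standard library, the Beta draw is built from sums of exponential variates, which only works because the parameters are integers here.

```scala
import scala.util.Random

// Illustrative sketch only: all names below are hypothetical.
object ThompsonSampling {

  // Observed outcomes for one Bernoulli arm.
  final case class Arm(successes: Int = 0, failures: Int = 0) {
    def pulls: Int = successes + failures
    def update(reward: Boolean): Arm =
      if (reward) copy(successes = successes + 1)
      else copy(failures = failures + 1)
  }

  final class ThompsonSampler(numArms: Int, rng: Random) {
    private var arms: Vector[Arm] = Vector.fill(numArms)(Arm())

    // Gamma(k, 1) for an integer k >= 1, as a sum of k Exp(1) draws.
    // 1.0 - nextDouble() lies in (0, 1], so the logarithm is finite.
    private def gamma(k: Int): Double =
      (1 to k).map(_ => -math.log(1.0 - rng.nextDouble())).sum

    // Beta(a, b) = X / (X + Y) with X ~ Gamma(a, 1) and Y ~ Gamma(b, 1).
    private def beta(a: Int, b: Int): Double = {
      val x = gamma(a)
      val y = gamma(b)
      x / (x + y)
    }

    // One posterior draw per arm (assumed Beta(1, 1) prior); play the arg max.
    def select(): Int = {
      val draws = arms.map(a => beta(a.successes + 1, a.failures + 1))
      draws.indices.reduceLeft((i, j) => if (draws(j) > draws(i)) j else i)
    }

    // Record the reward observed for the arm that was played.
    def update(arm: Int, reward: Boolean): Unit =
      arms = arms.updated(arm, arms(arm).update(reward))

    def counts: Vector[Arm] = arms
  }

  // Plays `rounds` pulls against arms with the given (simulated) success rates.
  def simulate(trueRates: Vector[Double], rounds: Int, seed: Long): Vector[Arm] = {
    val rng = new Random(seed)
    val sampler = new ThompsonSampler(trueRates.length, rng)
    (1 to rounds).foreach { _ =>
      val arm = sampler.select()
      sampler.update(arm, rng.nextDouble() < trueRates(arm))
    }
    sampler.counts
  }
}
```

Calling `ThompsonSampling.simulate(Vector(0.2, 0.8), 2000, 42L)` returns the per-arm counts after 2,000 simulated pulls. Because an arm is selected with the probability that its posterior draw is the largest, the pulls should concentrate on the arm with the higher success rate as evidence accumulates. For non-integer priors or large counts, a library Beta sampler would be a better choice than the sum of exponentials used here.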
So far, we have discussed K-armed bandits that do not maintain a state or context. It is assumed that all the arms are identical and only parameterized by their mean reward (successes and failures in the case of Bernoulli bandits). However, real-world applications, such as product recommendations or advertising targeting, require arms (a product or advertising...