
Thompson Sampling

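Thompson sampling is a heuristic algorithm for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step it draws a random sample from its current belief (the posterior distribution) about each action's reward, then selects the action that maximizes the expected reward under that sampled belief. After observing the reward, it updates the belief for the chosen action and repeats.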
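
As a worked illustration of the definition above, one common special case can be written out explicitly. The Bernoulli rewards, the Beta prior, and the symbols below (K actions, success probabilities \theta_a, counts \alpha_a and \beta_a) are assumptions introduced for this sketch; the definition itself does not fix a reward model.

```latex
% Assumed model: action a yields reward r \in \{0, 1\} with unknown probability \theta_a.
% Belief about each action: \theta_a \sim \mathrm{Beta}(\alpha_a, \beta_a), starting from \alpha_a = \beta_a = 1.
\begin{align*}
  \tilde{\theta}_a &\sim \mathrm{Beta}(\alpha_a, \beta_a), \quad a = 1, \dots, K
      && \text{(draw a random belief)} \\
  a_t &= \arg\max_{a} \tilde{\theta}_a
      && \text{(act greedily on the draw)} \\
  \alpha_{a_t} &\leftarrow \alpha_{a_t} + r_t, \qquad
  \beta_{a_t} \leftarrow \beta_{a_t} + (1 - r_t)
      && \text{(update with observed reward } r_t\text{)}
\end{align*}
```

Under this model the expected reward of action a given the draw is simply \tilde{\theta}_a, so maximizing expected reward with respect to the sampled belief reduces to picking the largest sample.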

Areas of application

Multi-armed bandit problems
Online advertising
Personalized recommendation systems
A/B testing and experimentation

Example

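For instance, on an online advertising platform, Thompson sampling can be used to decide which ad to show to each user. The algorithm maintains a probability distribution over each ad's unknown success rate (for example, its click-through rate), samples a plausible rate for every ad from these distributions, and shows the ad with the highest sampled rate. It then updates the chosen ad's distribution based on the reward obtained from showing it. Ads whose rates are still uncertain occasionally produce high samples and get shown, which provides exploration (trying new ads), while ads with reliably high estimates are shown most often, which provides exploitation (showing ads that are known to be successful).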
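
The ad-selection loop described above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not a production implementation: rewards are assumed to be binary clicks, each ad's click-through rate gets a uniform Beta(1, 1) prior, and the function name `thompson_sampling_ads` and the `true_ctrs` values are hypothetical, chosen only to drive the simulation.

```python
import random


def thompson_sampling_ads(true_ctrs, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over a fixed set of ads.

    true_ctrs: hypothetical click-through rates, used only to simulate users.
    Returns (shows, alpha, beta), where Beta(alpha[i], beta[i]) is the
    final belief about ad i's click-through rate.
    """
    rng = random.Random(seed)
    n_ads = len(true_ctrs)
    alpha = [1] * n_ads  # 1 + number of clicks observed for each ad
    beta = [1] * n_ads   # 1 + number of non-clicks observed for each ad
    shows = [0] * n_ads  # how many times each ad was displayed

    for _ in range(n_rounds):
        # Draw one plausible click-through rate per ad from its current belief.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_ads)]
        # Show the ad whose sampled rate is highest.
        ad = max(range(n_ads), key=lambda i: samples[i])
        # Simulate the user's response (the reward).
        clicked = rng.random() < true_ctrs[ad]
        # Update the belief for the ad that was shown.
        if clicked:
            alpha[ad] += 1
        else:
            beta[ad] += 1
        shows[ad] += 1

    return shows, alpha, beta


# Example run with three hypothetical ads.
shows, alpha, beta = thompson_sampling_ads([0.05, 0.10, 0.30], n_rounds=5000)
print("times shown:", shows)
print("estimated CTRs:", [a / (a + b) for a, b in zip(alpha, beta)])
```

Early on, all three beliefs are wide, so every ad is shown now and then. As clicks accumulate, the belief for the best ad concentrates around a high rate and it wins most of the draws, while the weaker ads are shown less and less often.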

