Freund and Schapire [14] derive an algorithm that achieves
no external regret via multiplicative updating. Their algorithm
is dependent on the cumulative payoffs achieved by all strategies,
including the surmised payoffs of strategies which are not
played. In particular, let
denote the cumulative payoffs
obtained through time t via strategy i, which is computed as
follows:
.
Now the weight assigned to strategy i at time t+1, for
, is given by:
The multiplicative updating rule given in Equation 17 can be modified to become applicable in naive settings, where complete payoff information is not available, but rather the only payoff information known at time t is that which pertains to the strategy which was in fact employed at time t. Such a variant of this multiplicative updating algorithm appears in Auer, Cesa-Bianchi, Freund, and Schapire [1]. It remains to perform simulations of this naive algorithm in our model of shopbots and pricebots.