References

1
R. H. Crites and A. G. Barto, ``Improving elevator performance using reinforcement learning.'' In: D. Touretzky et al., eds., Advances in Neural Information Processing Systems 8, 1017-1023, MIT Press, 1996.

2
D. Fudenberg and J. Tirole, Game Theory. Cambridge, MA: MIT Press, 1991.

3
A. Greenwald and J. O. Kephart, ``Shopbots and pricebots.'' To appear in Proceedings of IJCAI '99 (International Joint Conferences on Artificial Intelligence), July 31-August 6, 1999, Stockholm, Sweden.

4
J. Hu and M. P. Wellman, ``Self-fulfilling bias in multiagent learning.'' Proceedings of ICMAS-96, AAAI Press, 1996.

5
J. Hu and M. P. Wellman, ``Multiagent reinforcement learning: theoretical framework and an algorithm.'' Proceedings of ICML-98, 1998.

6
J. O. Kephart, J. E. Hanson and J. Sairamesh, ``Price-war dynamics in a free-market economy of software agents.'' In: Proceedings of ALIFE-VI, Los Angeles, 1998.

7
D. Kreps, A Course in Microeconomic Theory. Princeton Univ. Press, Princeton, NJ, 1990.

8
M. L. Littman, ``Markov games as a framework for multi-agent reinforcement learning.'' Proceedings of the Eleventh International Conference on Machine Learning, 157-163, Morgan Kaufmann, 1994.

9
J. Sairamesh and J. O. Kephart, ``Dynamics of price and quality differentiation in information and computational markets.'' Proceedings of the First International Conference on Information and Computation Economics (ICE-98), 28-36, ACM Press, 1998.

10
T. W. Sandholm and R. H. Crites, ``On multiagent Q-Learning in a semi-competitive domain.'' 14th International Joint Conference on Artificial Intelligence (IJCAI-95), Workshop on Adaptation and Learning in Multiagent Systems, Montreal, Canada, 71-77, 1995.

11
G. Tesauro, ``Temporal difference learning and TD-Gammon.'' Comm. of the ACM, 38:3, 58-67, 1995.

12
G. J. Tesauro and J. O. Kephart, ``Foresight-based pricing algorithms in an economy of software agents.'' Proceedings of ICE-98, 37-44, 1998.

13
G. J. Tesauro and J. O. Kephart, ``Foresight-based pricing algorithms in agent economies.'' Decision Support Systems, to appear, 1999.

14
J. M. Vidal and E. H. Durfee, ``Learning nested agent models in an information economy.'' J. of Experimental and Theoretical AI, to appear, 1998.

15
C. J. C. H. Watkins, ``Learning from delayed rewards.'' Ph.D. thesis, Cambridge University, 1989.

16
C. J. C. H. Watkins and P. Dayan, ``Q-learning.'' Machine Learning 8, 279-292, 1992.

17
W. Zhang and T. G. Dietterich, ``High-performance job-shop scheduling with a time-delay TD(λ) network.'' In: D. Touretzky et al., eds., Advances in Neural Information Processing Systems 8, 1024-1030, MIT Press, 1996.


kephart
Wed Sep 29 11:51:48 EDT 1999