THESIS
2020
xi, 98 pages : illustrations ; 30 cm
Abstract
The Iterated Prisoner’s Dilemma (IPD) is a well-known benchmark for studying the long-term
behaviors of rational agents. Researchers from diverse disciplines have used the IPD
to study the emergence of cooperation among unrelated agents. In 1981, Robert Axelrod
was the first to run some computer tournaments on the IPD. Remarkably the simple
Tit-For-Tat (TFT) strategy was the winner. The winning of cooperative strategy TFT not
only impressed computer scientists, but also influenced researchers in other fields, such
as economists and biologists. The IPD is frequently used to design experiments, or to
explain the evolution of reciprocity among people and unrelated species. In 2012, Press
and Dyson dramatically changed people’s understanding of this game by deriving what
they calle...[
Read more ]
The Iterated Prisoner’s Dilemma (IPD) is a well-known benchmark for studying the long-term
behaviors of rational agents. Researchers from diverse disciplines have used the IPD
to study the emergence of cooperation among unrelated agents. In 1981, Robert Axelrod
was the first to run some computer tournaments on the IPD. Remarkably the simple
Tit-For-Tat (TFT) strategy was the winner. The winning of cooperative strategy TFT not
only impressed computer scientists, but also influenced researchers in other fields, such
as economists and biologists. The IPD is frequently used to design experiments, or to
explain the evolution of reciprocity among people and unrelated species. In 2012, Press
and Dyson dramatically changed people’s understanding of this game by deriving what
they called zero determinant (ZD) strategies, which forces a linear relationship between
the scores of two players.
Following Press and Dyson, we model the IPD as Markov chains. We come up with
what we call invincible strategies. These are ones that will never lose against any other
strategy in terms of average payoff in the limit. We provide a simple characterization of
this class of strategies and show that invincible strategies can also be nice. We discuss its relationship with some important strategies and generalize our results to other repeated
games.
Based on Markov models, we study the property of best responses to pure strategies
and completely mixed strategies, and put forward a framework to compute such a best
response. The framework applies not only to the IPD with specific numeric payoff matrix,
but the symbolic payoff matrix with constraints as well. We summarize best responses to
some typical strategies with the help of symbolic solvers.
Finally we conduct experiments to study the evolutionary property of invincible strategies.
The results of our experiments, as well as some experiments in related works, can be
explain by our mathematical models.
Post a Comment