Evolution of Zero-Determinant Strategies Based on Replication-Aspiration Dynamic

ZHAO Qian; MAO Ya-jun

doi:10.12178/1001-0548.2021079

Volume 50 Issue 4

Jul. 2021


ZHAO Qian, MAO Ya-jun. Evolution of Zero-Determinant Strategies Based on Replication-Aspiration Dynamic[J]. Journal of University of Electronic Science and Technology of China, 2021, 50(4): 634-640. doi: 10.12178/1001-0548.2021079


1. Web Sciences Center, University of Electronic Science and Technology of China, Chengdu 611731

Received Date: 2020-03-06
Rev Recd Date: 2020-05-09

Available Online: 2021-07-23

Publish Date: 2021-06-28

Abstract

In the iterated prisoner's dilemma game, zero-determinant strategies can unilaterally enforce a linear relationship between the payoffs of the two players, and the extortion strategy always obtains a payoff no less than that of her opponent. We focus on the evolution of the cooperation, defection, and extortion strategies on a grid network when agents update their strategies by a replication-aspiration dynamic. By means of Monte Carlo simulations, we find that the extortion strategy promotes cooperation on the grid network under the mixed updating rule. We explain the results through the microscopic dynamics of the process and find that a "cooperator-extortioner alliance" helps cooperators resist invasion by defectors, and that the strength of the extortion strategy plays a non-trivial role in the evolution of cooperation.
Keywords: complex systems, emergence of cooperation, evolutionary game theory, network reciprocity, zero-determinant strategies

References

[1]	AXELROD R. The evolution of cooperation[M]. New York: Basic Books, 1984.
[2]	NOWAK M A. Evolutionary dynamics: Exploring the equations of life[M]. Cambridge: Harvard University Press, 2006.
[3]	VAN LANGE P A M, BALLIET D, PARKS C D, et al. Social dilemmas: The psychology of human cooperation[M]. Oxford: Oxford University Press, 2015.
[4]	NOWAK M A. Five rules for the evolution of cooperation[J]. Science, 2006, 314(5805): 1560-1563.
[5]	WEIBULL J W. Evolutionary game theory[M]. Cambridge: MIT Press, 1995.
[6]	NOWAK M A, SIGMUND K. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game[J]. Nature, 1993, 364(6432): 56-58. doi: 10.1038/364056a0
[7]	PRESS W H, DYSON F J. Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent[J]. Proceedings of the National Academy of Sciences of the United States of America, 2012, 109(26): 10409-10413. doi: 10.1073/pnas.1206569109
[8]	HILBE C, NOWAK M A, TRAULSEN A. Adaptive dynamics of extortion and compliance[J]. PLoS One, 2013(8): e77886.
[9]	PAN L M, HAO D, RONG Z H, et al. Zero-determinant strategies in iterated public goods game[J]. Scientific Reports, 2015, 5(1): 13096. doi: 10.1038/srep13096
[10]	HILBE C, NOWAK M A, SIGMUND K. Evolution of extortion in iterated prisoner’s dilemma games[J]. Proceedings of the National Academy of Sciences, 2013, 110(17): 6913-6918. doi: 10.1073/pnas.1214834110
[11]	TAYLOR P D, JONKER L. Evolutionary stable strategies and game dynamics[J]. Mathematical Biosciences, 1978(40): 145-156.
[12]	NOWAK M A, MAY R M. Evolutionary games and spatial chaos[J]. Nature, 1992, 359: 826-829. doi: 10.1038/359826a0
[13]	BATTISTON F, CENCETTI G, IACOPINI I, et al. Networks beyond pairwise interactions: Structure and dynamics[J]. Physics Reports, 2020, 874: 1-92.
[14]	SZABÓ G, FATH G. Evolutionary games on graphs[J]. Physics Reports, 2007, 446(4-6): 97-216. doi: 10.1016/j.physrep.2007.04.004
[15]	吴宗柠, 狄增如, 樊瑛. 多层网络的结构与功能研究进展[J]. 电子科技大学学报, 2020, 49(1): 106-119. WU Zong-ning, DI Zeng-ru, FAN Ying. The structure and function of multilayer networks: Progress and prospects[J]. Journal of University of Electronic Science and Technology of China, 2020, 49(1): 106-119.
[16]	SZOLNOKI A, PERC M. Evolution of extortion in structured populations[J]. Physical Review E, 2014, 89(2): 022804. doi: 10.1103/PhysRevE.89.022804
[17]	XU Xiong-rui, RONG Zhi-hai, WU Zhi-xi, et al. Extortion provides alternative routes to the evolution of cooperation in structured populations[J]. Physical Review E, 2017, 95(5): 052302.
[18]	WU Z X, RONG Z H. Boosting cooperation by involving extortion in spatial prisoner’s dilemma games[J]. Physical Review E, 2014, 90(6): 062102.
[19]	RONG Z H, ZHAO Q, WU Z X, et al. Proper aspiration level promotes generous behavior in the spatial prisoner’s dilemma game[J]. The European Physical Journal B, 2016, 89: 166. doi: 10.1140/epjb/e2016-70286-0
[20]	荣智海, 许雄锐, 吴枝喜. 合作演化与网络博弈实验研究进展[J]. 中国科学:物理学·力学·天文学, 2020, 50(1): 118-132. RONG Zhi-hai, XU Xiong-rui, WU Zhi-xi. Experiment research on the evolution of cooperation and network game theory[J]. SCIENCE SINICA: Physica, Mechanica & Astronomica, 2020, 50(1): 118-132.
[21]	SZABÓ G, TŐKE C. Evolutionary prisoner’s dilemma game on a square lattice[J]. Physical Review E, 1998, 58(1): 69-73. doi: 10.1103/PhysRevE.58.69
[22]	NOWAK M A, LIEBERMAN E, OHTSUKI H, et al. A simple rule for the evolution of cooperation on graphs and social networks[J]. Nature, 2006, 441(7092): 502-505. doi: 10.1038/nature04605
[23]	CHEN X, WANG L. Promotion of cooperation induced by appropriate payoff aspirations in a small-world networked game[J]. Physical Review E, 2008, 77(1): 017103. doi: 10.1103/PhysRevE.77.017103
[24]	WANG X, GU C, ZHAO J, et al. Evolutionary game dynamics of combining the imitation and aspiration-driven update rules[J]. Physical Review E, 2019, 100(2): 022411. doi: 10.1103/PhysRevE.100.022411
[25]	ZHANG L, HUANG C, LI H, et al. Cooperation guided by imitation, aspiration and conformity-driven dynamics in evolutionary games[J]. Physica A: Statistical Mechanics and its Applications, 2021, 561: 125260. doi: 10.1016/j.physa.2020.125260
[26]	AREFIN M, TANIMOTO J. Evolution of cooperation in social dilemmas under the coexistence of aspiration and imitation mechanisms[J]. Physical Review E, 2020, 102(3): 032120. doi: 10.1103/PhysRevE.102.032120
[27]	VAN R L, KATZENBACH C. Navigating the grey area: Game production between inspiration and imitation[J]. Convergence: The International Journal of Research into New Media Technologies, 2020, 26(2): 402-420. doi: 10.1177/1354856518786593
[28]	GAO L, PAN Q, HE M. Changeable updating rule promotes cooperation in well-mixed and structured populations[J]. Physica A: Statistical Mechanics and its Applications, 2020, 547: 124446. doi: 10.1016/j.physa.2020.124446
[29]	QUAN J, ZHOU Y, WANG X, et al. Evidential reasoning based on imitation and aspiration information in strategy learning promotes cooperation in optional spatial public goods game[J]. Chaos, Solitons & Fractals, 2020, 133: 109634.
[30]	AHSAN H M, TANAKA M, TANIMOTO J. How does conformity promote the enhancement of cooperation in the network reciprocity in spatial prisoner’s dilemma games?[J]. Chaos, Solitons & Fractals, 2020, 138: 109997.



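Cooperative behavior is widespread in nature and in human society, and it has greatly promoted the evolution of species and the development of human society. Selfish individuals, however, always seek to maximize their own interests, and altruistic behavior appears to contradict Darwin's theory of evolution; how cooperation emerges and persists among selfish individuals has therefore attracted scholars from many fields^[1-3]. Evolutionary game theory provides strong theoretical support for studying and explaining this phenomenon, and the prisoner's dilemma is one of the classic models of interaction between individuals^[4-5]. The donation game is a special case of the prisoner's dilemma. At the start of the game, each participant chooses either to cooperate (donate at a cost c) or to defect (not donate). If a player cooperates, her opponent receives a benefit b; otherwise her opponent receives nothing. When both players cooperate, each obtains R = b − c; when both defect, each obtains P = 0. If one cooperates and the other defects, the cooperator incurs a loss S = −c while the defector obtains T = b. In the donation game b > c > 0, so the payoffs satisfy T > R > P > S and 2R > T + S. Consequently, in the absence of additional mechanisms, rational individuals playing the prisoner's dilemma always choose defection to maximize their own payoff. To study the evolution of cooperation in the iterated prisoner's dilemma, many strategies have been studied and discussed, such as win-stay, lose-shift (WSLS), always cooperate, and always defect^[6]. None of these strategies, however, can unilaterally determine the opponent's payoff. In 2012, Ref. [7] found that the iterated prisoner's dilemma contains a special set of memory-one strategies, called zero-determinant strategies. Such a strategy can unilaterally constrain the expected payoffs of the two players to satisfy a linear relationship, whatever strategy the opponent uses. Similar zero-determinant strategies, which unilaterally control the payoff relationship among the participants, were subsequently found in the snowdrift game and the public goods game^[8-9]. The discovery of zero-determinant strategies revealed the relationship between strategies and expected payoffs and offered a new perspective for studying memory-one strategies. Among them, the extortion strategy has attracted the most attention, because it always obtains a payoff no lower than its opponent's. When an extortioner plays against a defector, however, both receive P, so extortion is easily invaded by defectors. It is therefore evolutionarily unstable in an evolving population, yet it can promote the emergence of cooperation in the population^[10-11].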
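The donation-game payoffs described above can be checked with a short sketch. The function names and the example values b = 1.5, c = 1.0 are illustrative assumptions, not taken from the paper.

```python
def donation_payoffs(b, c):
    """Return (T, R, P, S) for the donation game with benefit b and cost c."""
    if not b > c > 0:
        raise ValueError("the donation game requires b > c > 0")
    R = b - c   # both players cooperate
    P = 0.0     # both players defect
    S = -c      # cooperator facing a defector
    T = b       # defector facing a cooperator
    return T, R, P, S


def is_prisoners_dilemma(T, R, P, S):
    """Check the ordering T > R > P > S and the condition 2R > T + S."""
    return T > R > P > S and 2 * R > T + S


# Illustrative values only (assumed, not from the paper)
T, R, P, S = donation_payoffs(b=1.5, c=1.0)
print(T, R, P, S)
print(is_prisoners_dilemma(T, R, P, S))
```

Both conditions hold for any b > c > 0: the ordering follows directly from b > b − c > 0 > −c, and 2R > T + S reduces to b > c.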

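Classical evolutionary game theory usually assumes that the population is well mixed, i.e., that every individual is connected to every other. In real societies, however, the number of connections per individual is limited, and each individual interacts with only a few others nearby. In 1992, Ref. [12] found that on a two-dimensional square lattice, where each individual plays with its 4 nearest neighbors, cooperators can emerge and persist, and that they resist invasion by defectors by forming cooperative clusters. This finding was the first to point out the important role of network structure in evolutionary games, and network reciprocity has since been regarded as an important class of mechanisms that promote the evolution of cooperation. With the rise of complex network theory, properties such as small-world and scale-free structure have been shown to strongly affect the emergence of cooperation on networks^[13-15]. Moreover, introducing the extortion strategy has been found to produce a "cooperator-extortioner alliance" on regular and scale-free networks that resists invasion by defectors and thereby promotes the emergence of cooperation^[16-20].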
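As a rough illustration of play on a square lattice with 4 nearest neighbors, the sketch below accumulates one site's payoff from its neighbors. Periodic boundaries, the helper names, the tiny 4 x 4 grid, and the values b = 1.5, c = 1.0 are assumptions made for this example only.

```python
def neighbors(i, j, size):
    """Return the 4 nearest neighbours of site (i, j) on a size x size lattice.

    Periodic boundaries are an assumption of this sketch.
    """
    return [
        ((i - 1) % size, j),
        ((i + 1) % size, j),
        (i, (j - 1) % size),
        (i, (j + 1) % size),
    ]


def site_payoff(grid, i, j, payoff):
    """Sum the payoffs that site (i, j) earns against its 4 neighbours."""
    me = grid[i][j]
    return sum(payoff[me][grid[x][y]] for x, y in neighbors(i, j, len(grid)))


# Donation-game payoffs for illustrative values b = 1.5, c = 1.0;
# PAYOFF[row][col] is the payoff of strategy `row` against strategy `col`.
PAYOFF = {"C": {"C": 0.5, "D": -1.0}, "D": {"C": 1.5, "D": 0.0}}

# A 2 x 2 cooperator cluster embedded in defectors
GRID = [
    list("DDDD"),
    list("DCCD"),
    list("DCCD"),
    list("DDDD"),
]

print(site_payoff(GRID, 1, 1, PAYOFF))  # a cooperator at a corner of the cluster
print(site_payoff(GRID, 0, 1, PAYOFF))  # a defector bordering the cluster
```

Comparing such accumulated payoffs between neighboring sites is what the update rules discussed in the text operate on.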

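In evolutionary games on networks, the outcome for cooperation is shaped not only by the network structure but also, crucially, by the update rule. An update rule specifies how individuals revise their strategies in pursuit of higher payoffs. Typically, an individual compares its payoff with those of its neighbors and imitates a neighbor's strategy, thereby continually increasing its own payoff. The update rule is also a key ingredient in characterizing evolutionary dynamics on complex networks. Various update rules have been proposed by mimicking natural or social decision processes; the most basic ones, derived from biological evolution, include replication dynamics^[21], Fermi dynamics^[22], and the Moran process^[9]. Later, an individual update rule akin to "win-stay, lose-shift" was proposed^[23]. Under this rule, an individual no longer compares its payoff with those of its neighbors. Instead, it compares the payoff it actually obtained with its aspiration payoff and chooses its strategy for the next round with a probability given by a Fermi function of the difference. Under all of these rules, every individual updates its strategy according to a single rule. In the real world, however, the way individuals revise their strategies is not fixed. When an individual's current strategy earns less than its neighbors do, it imitates a neighbor's strategy; when its payoff is higher than its neighbors', it adopts a different update rule in order to maximize its payoff. Several mixed update rules have therefore been proposed. For example, some individuals update by replication dynamics while the others update according to aspiration^[24]; or each individual updates by replication dynamics with a fixed probability and according to aspiration otherwise^[25]; there are also mixed rules that combine the two update rules^[26], as well as other mixed rules^[27-30].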
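The Fermi-function form mentioned above can be sketched as follows. This is only an illustration under assumptions: the noise parameter K, its default value, the aspiration argument, and the use of the same functional form for imitation are hypothetical choices, since this excerpt does not give the paper's exact formulas.

```python
import math


def fermi(delta, K=0.1):
    """Fermi function 1 / (1 + exp(-delta / K)); K > 0 is an assumed noise parameter."""
    x = -delta / K
    if x > 700:  # exp(x) would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def aspiration_switch_probability(payoff, aspiration, K=0.1):
    """Probability of changing strategy: close to 1 when payoff is far below aspiration."""
    return fermi(aspiration - payoff, K)


def imitation_probability(own_payoff, neighbor_payoff, K=0.1):
    """Probability of copying a neighbour: close to 1 when the neighbour earns much more."""
    return fermi(neighbor_payoff - own_payoff, K)


# A dissatisfied individual (payoff below aspiration) is likely to switch,
# while a satisfied one is likely to stay.
print(aspiration_switch_probability(payoff=0.0, aspiration=1.0))
print(aspiration_switch_probability(payoff=1.0, aspiration=0.0))
```

A mixed rule would then choose between the two probabilities for each update, for instance according to whether the individual earns less than the neighbor it compares with; how exactly the two are combined is left open here.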

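Based on a mixed replication-aspiration update rule, this paper studies the evolution of the extortion, cooperation, and defection strategies on a square lattice and explores how extortion affects cooperation on the lattice. Using Monte Carlo simulations, we compare the steady-state fractions of the three strategies under the mixed update rule. We then analyze, from a microscopic perspective, the influence of the extortion strategy on cooperation during evolution under the replication rule and under the mixed rule. Finally, we discuss how the extortion coefficient and the temptation to defect affect the steady-state fraction of cooperators on the network.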

3. Conclusion

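The discovery of zero-determinant strategies has enriched the strategy space of games and, through Markov stochastic processes, revealed the relationship between strategies and expected payoffs, providing an important theoretical framework for the study and development of game theory. Among these strategies, the extortion strategy can act as both a catalyst and a barrier that promotes the emergence and maintenance of cooperation, and it has therefore received wide attention.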

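Network reciprocity, an important mechanism for promoting the emergence of cooperation, concerns the relationship among network structure, strategy evolution, and network models. Based on the replication-aspiration update rule, this paper investigates the evolutionary dynamics of the cooperation, defection, and extortion strategies on a square lattice after extortion is introduced. Although extortion strategies with a large extortion coefficient cannot promote the emergence of cooperation under replication dynamics, under the mixed update rule they can form a stable "cooperator-extortioner alliance" that helps cooperation persist on the network. By comparing the microscopic processes under different update rules, we find that the mixed rule, through the intrinsic drive of the aspiration payoff, pushes extortioners to turn into cooperators, which promotes the formation of cooperative clusters and the stable existence of the "cooperator-extortioner alliance". By studying how the values of s_E and b affect the steady-state fraction of cooperators on the lattice, we find that under the mixed update rule the promoting effect of the extortion coefficient on cooperation remains non-monotonic. This work offers new insight into the evolutionary role of the extortion strategy in real scenarios and into how to promote the emergence of cooperation in complex systems.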


Strategy	C	D	E
C	$b - c$	$-c$	$\dfrac{(b^2 - c^2)\,s_{\rm E}}{b + c\,s_{\rm E}}$
D	$b$	0	0
E	$\dfrac{b^2 - c^2}{b + c\,s_{\rm E}}$	0	0
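The payoff table can be turned into a small lookup for experimentation. In the sketch below, the function name and the example values are assumptions; the entries themselves follow the table, with each row giving the payoff of the row strategy against the column strategy.

```python
def payoff_matrix(b, c, s_E):
    """Payoffs of the row strategy against the column strategy (C, D, E).

    b and c are the benefit and cost of the donation game (b > c > 0);
    s_E is the extortion parameter appearing in the table.
    """
    if not b > c > 0:
        raise ValueError("the donation game requires b > c > 0")
    e_vs_c = (b**2 - c**2) / (b + c * s_E)  # extortioner's payoff against a cooperator
    return {
        "C": {"C": b - c, "D": -c, "E": s_E * e_vs_c},
        "D": {"C": b, "D": 0.0, "E": 0.0},
        "E": {"C": e_vs_c, "D": 0.0, "E": 0.0},
    }


# Illustrative values only (assumed, not from the paper)
M = payoff_matrix(b=1.5, c=1.0, s_E=0.5)
print(M["E"]["C"], M["C"]["E"])
```

Two properties can be read off the table directly: in a C-E pair the cooperator's payoff is s_E times the extortioner's, and at s_E = 1 both entries reduce to b − c, the payoff of mutual cooperation.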
