A Theory of International AI Coordination: Strategic implications of perceived benefits, harms, capacities, and distribution in AI development — Honorable Mention


Image: The Intelligence, Surveillance and Reconnaissance Division at the Combined Air Operations Center at Al Udeid Air Base, Qatar.

This essay first appeared in the Acheson Prize 2018 Issue of the Yale Review of International Studies.


Acknowledgements

I thank my advisor, Professor Allan Dafoe, for his time, support, and introduction to this paper’s subject matter in his Global Politics of AI seminar. Additionally, the feedback, discussion, resource recommendations, and inspiring work of friends, colleagues, and mentors in several time zones – especially Amy Fan, Carrick Flynn, Will Hunt, Jade Leung, Matthijs Maas, Peter McIntyre, Professor Nuno Monteiro, Gabe Rissman, Thomas Weng, Baobao Zhang, and Remco Zwetsloot – were vital to this paper and are profoundly appreciated. It truly takes a village, to whom this paper is dedicated.

Introduction

“The first technology revolution caused World War I. The second technology revolution caused World War II. This is the third technology revolution.”

—Jack Ma (June 2017)[1]

“Artificial intelligence is the future, not only for Russia, but for all humankind. It comes with colossal opportunities, but also threats that are difficult to predict. Whoever becomes the leader in this sphere will become the ruler of the world.”

—Vladimir Putin (September 2017)[2]

“China, Russia, soon all countries w strong computer science. Competition for AI superiority at national level most likely cause of WW3 imo.”

—@elonmusk (September 2017)[3]

Media coverage has increasingly acknowledged both the benefits of artificial intelligence (AI) and the negative social implications that can arise from its development. At the same time, a growing literature has illuminated the risk that developing AI could lead to global catastrophe[4] and has pointed out how racing dynamics exacerbate this risk.[5] As a result, it is becoming increasingly vital to understand and develop strategies to manage the human process of developing AI.

The current landscape suggests that AI development is being led by two main international actors: China and the United States.[6] Moreover, speculative accounts of “competition” and “arms races” have begun to increase in prominence[7], while state actors have begun to take steps that seem to support this assessment.[8] If truly present, a racing dynamic[9] between these two actors is a cause for alarm and should inspire strategies to develop an AI Coordination Regime between them.

In order to assess the likelihood of such a Coordination Regime’s success, one would have to take into account the two actors’ expected payoffs from cooperating or defecting from the regime. In this paper, I develop a simple theory to explain whether two international actors are likely to cooperate or compete in developing AI and analyze what variables factor into this assessment. The paper proceeds as follows.

First, I survey the relevant background of AI development and coordination by summarizing the literature on the expected benefits and harms from developing AI and what actors are relevant in an international safety context. Here, I also examine the main agenda of this paper: to better understand and begin outlining strategies to maximize coordination in AI development, despite relevant actors’ varying and uncertain preferences for coordination. I refer to this as the AI Coordination Problem.

Next, I outline my theory to better understand the dynamics of the AI Coordination Problem between two opposing international actors. In short, the theory suggests that the variables that affect the payoff structure of cooperating or defecting from an AI Coordination Regime determine which model of coordination we see arise between the two actors (modeled after normal-form game setups). Depending on which model is present, we can get a better sense of the likelihood of cooperation or defection, which can in turn inform research and policy agendas to address this. The section then defines the payoff variables that drive the theory and simulates the theory for each representative model based on a series of hypothetical scenarios.

Finally, I discuss the relevant policy and strategic implications this theory has on achieving international AI coordination, and assess the strengths and limitations of the theory in practice.

AI Development and the Coordination Problem

In this section, I survey the relevant background of AI development and coordination by summarizing the literature on the expected benefits and harms from developing AI and what actors are relevant in an international safety context. I also examine the main agenda of this paper: to better understand and begin outlining strategies to maximize coordination in AI development, despite relevant actors’ varying and uncertain preferences for coordination. I refer to this as the AI Coordination Problem.

AI Development

In recent years, artificial intelligence has grown notably in its technical capacity and in its prominence in our society. We see this in the media as prominent news sources highlight new developments and social impacts of AI with greater frequency – with some experts heralding it as “the new electricity.”[10] In the business realm, investments in AI companies are soaring.[11] In our everyday lives, we carry AI technology as voice assistants in our pockets[12] and store it as vehicle controllers in our garages.[13] And impressive victories over humans in chess by AI programs[14] are being dwarfed by AI’s ability to compete with and beat humans at far more complex strategic endeavors like the games of Go[15] and StarCraft.[16]

On one hand, these developments outline a bright future. Advanced AI technologies have the potential to provide transformative social and economic benefits like preventing deaths in auto collisions,[17] drastically improving healthcare,[18] reducing poverty through economic bounty,[19] and potentially even finding solutions to some of our most menacing problems like climate change.[20]

At the same time, there are great harms and challenges that arise from AI’s rapid development. Uneven distribution of AI’s benefits could exacerbate inequality, resulting in higher concentrations of wealth within and among nations.[21] Moreover, racist algorithms[22] and lethal autonomous weapons systems[23] force us to grapple with difficult ethical questions as we apply AI to more realms of society.

Perhaps most alarming, however, is the global catastrophic risk that the unchecked development of AI presents. Most prominently addressed in Nick Bostrom’s Superintelligence, the creation of an artificial superintelligence (ASI)[24] requires exceptional care and safety measures to avoid developing an ASI whose misaligned values and capacity can result in existential risks for mankind.[25] In a particularly telling quote, Stephen Hawking, Stuart Russell, Max Tegmark, and Frank Wilczek foreshadow this stark risk: 

“One can imagine such technology outsmarting financial markets, out-inventing human researchers, out-manipulating human leaders, and developing weapons we cannot even understand. Whereas the short-term impact of AI depends on who controls it, the long-term impact depends on whether it can be controlled at all.”[26]

 As new technological developments bring us closer and closer to ASI[27] and the beneficial returns to AI become more tangible and lucrative, a race-like competition between key players to develop advanced AI will become acute with potentially severe consequences regarding safety. Put another way, the development of AI under international racing dynamics could be compared to two countries racing to finish a nuclear bomb – if the actual development of the bomb (and not just its use) could result in unintended, catastrophic consequences.

A first-mover advantage will be decisive in determining the winner of the race, due to the expected exponential growth in the capabilities of an AI system and the resulting difficulty other parties would face in catching up. As a result, concerns have been raised that such a race could create incentives to skimp on safety.[28] Once this Pandora’s Box is opened, it will be difficult to close. But who can we expect to open the Box?

Relevant Actors

In this section, I briefly argue that state governments are likely to eventually control the development of AI (either through direct development or intense monitoring and regulation of state-friendly companies)[29], and that the current landscape suggests two states in particular – China and the United States – are most likely to reach development of an advanced AI system first.

Because of its capacity to radically affect military and intelligence systems, AI research becomes an important consideration in national security and is unlikely to be ignored by political and military leaders. In the US, the military and intelligence communities have a long-standing history of supporting transformative technological advancements such as nuclear weapons, aerospace technology, cyber technology and the Internet, and biotechnology.[30]

Today, government actors have already expressed great interest in AI as a transformative technology. In 2016, the Obama Administration released two reports on the future of AI.[31] Meanwhile, U.S. military and intelligence agencies like the NSA and DARPA continue to fund public AI research. Furthermore, in June 2017, China released a policy strategy document outlining grand ambitions to become the world leader in AI by 2030.[32] Notably, discussions among U.S. policymakers to block Chinese investment in U.S. AI companies also began at this time.[33]

In addition to boasting the world’s largest economies, China and the U.S. also lead the world in AI publications[34] and host the world’s most prominent tech/AI companies (US: Facebook, Amazon, Google, and Tesla; China: Tencent and Baidu). Combining both countries’ economic and technical ecosystems with government pressures to develop AI, it is reasonable to conceive of an AI race primarily dominated by these two international actors.

The AI Coordination Problem

As discussed, there are both great benefits and harms to developing AI, and due to the relevance AI development has to national security, it is likely that governments will take over this development (specifically the US and China). Due to the potential global harms developing AI can cause, it would be reasonable to assume that government actors would try to impose safety measures and regulations on actors developing AI, and perhaps even coordinate on an international scale to ensure that all actors developing AI cooperate under an AI Coordination Regime[35] that sets, monitors, and enforces standards to maximize safety.

Despite this, there still might be cases where the expected benefits of pursuing AI development alone outweigh (in the perception of the actor) the potential harms that might arise. As a result, this tradeoff between costs and benefits has the potential to hinder prospects for cooperation under an AI Coordination Regime. This is what I will refer to as the AI Coordination Problem.

Solving this problem requires more understanding of its dynamics and strategic implications before hacking at it with policy solutions. It is the goal of this paper to shed some light on these dynamics, particularly on how the structure of preferences that results from states’ understandings of the benefits and harms of AI development leads to varying prospects for coordination.

A Theory of International AI Coordination

In this section, I outline my theory to better understand the dynamics of the AI Coordination Problem between two opposing international actors. In short, the theory suggests that the variables that affect the payoff structure of cooperating or defecting from an AI Coordination Regime determine which model of coordination we see arise between the two actors (modeled after normal-form game setups). Depending on which model is present, we can get a better sense of the likelihood of cooperation or defection, which can in turn inform research and policy agendas to address this. The section then defines the payoff variables that drive the theory and simulates the theory for each representative model based on a series of hypothetical scenarios. Before getting to the theory, I will briefly examine the literature on military technology/arms racing and cooperation.

Literature review on international arms races and coordination

Arms Races

Gray[36] defines an arms race as “two or more parties perceiving themselves to be in an adversary relationship, who are increasing or improving their armaments at a rapid rate and structuring their respective military postures with a general attention to the past, current, and anticipated military and political behaviour of the other parties.”

Within the arms race literature, scholars have distinguished between types of arms races depending on the nature of arming. Huntington[37] makes a distinction between qualitative arms races (where technological developments radically transform the nature of a country’s military capabilities) and quantitative arms races (where competition is driven by the sheer size of an actor’s arsenal).

Intriligator and Brito[38] argue that qualitative/technological races can lead to greater instability than quantitative races. They suggest that new weapons (or systems) that derive from radical technological breakthroughs can render a first strike more attractive, whereas basic arms buildups provide deterrence against a first strike.

In the same vein, Sorenson[39] argues that unexpected technological breakthroughs in weaponry raise instability in arms races. This “technological shock” factor leads actors to increase weapons research and development and maximize their overall arms capacity to guard against uncertainty.

Finally, Jervis[40] also highlights the “security dilemma” where increases in an actor’s security can inherently lead to the decreased security of a rival state. As a result of this, security-seeking actions such as increasing technical capacity (even if this is not explicitly offensive — this is particularly relevant to wide-encompassing capacity of AI) can be perceived as threatening and met with exacerbated race dynamics.

So far, the readings discussed have commented on the unique qualities of technological or qualitative arms races. Additional readings provide insight on arms characteristics that impact race dynamics. For example, Jervis highlights the distinguishability of offensive and defensive postures as a factor in stability. If increases in security cannot be distinguished as purely defensive, instability rises.[41] AI, being a dual-use technology, does not lend itself to unambiguously defensive (or otherwise benign) investments. Additionally, Koubi[42] develops a model of military technological races that suggests the level of spending on research and development varies with changes in an actor’s relative position in a race.

Although the development of AI has not yet led to a clear and convincing military arms race (though some have suggested this is already the case[43]), the elements of the arms race literature described above suggest that AI’s broad and wide-encompassing capacity can lead actors to see AI development as a threatening “technological shock” worth responding to with reinforcements or augmentations in one’s own security – perhaps through bolstering one’s own AI development program. As described in the previous section, this “arms” race dynamic is particularly worrisome due to the existential risks that arise from AI’s development, and it calls for appropriate measures to mitigate it. To begin exploring this, I now look to the literature on arms control and coordination.

Coordination

In order to mitigate or prevent the deleterious effects of arms races, international relations scholars have also studied the dynamics that surround arms control agreements and the conditions under which actors might coordinate with one another. Schelling and Halperin[44] offer a broad definition of arms control as “all forms of military cooperation between potential enemies in the interest of reducing the likelihood of war, its scope and violence if it occurs, and the political and economic costs of being prepared for it.”

As an advocate of structural realism, Gray[45] questions the role of arms control, as he views the balance of power as a self-sufficient and self-perpetuating system of international security that he finds preferable. On the other hand, Glaser[46] argues that rational actors under certain conditions might opt for cooperative policies. Under the assumption that actors have a combination of both competing and common interests, those actors may cooperate when those common interests compel such action. As a result, it is conceivable that international actors might agree to certain limitations or cooperative regimes to reduce insecurity and stabilize the balance of power.

Using game theoretical representations of state preferences, Downs et al.[47] look at different policy responses to arms race de-escalation and find that the model or ‘game’ that underlies an arms race can affect the success of policies or strategies to mitigate or end the race.

Finally, in a historical survey of international negotiations, Garcia and Herz[48] propose that international actors might take preventative, multilateral action in scenarios under the “commonly perceived global dimension of future potential harm” (for example the ban on laser weapons or the dedication of Antarctica and outer space solely for peaceful purposes).

Together, these elements in the arms control literature suggest that there may be potential for states as untrusting, rational actors existing in a state of international anarchy to coordinate on AI development in order to reduce future potential global harms. Specifically, it is especially important to understand where preferences of vital actors overlap and how game theory considerations might affect these preferences. The theory outlined in this paper looks at just this and will be expanded upon in the following subsection.

Four models of coordination

This subsection looks at the four predominant models that describe the situation two international actors might find themselves in when considering cooperation in developing AI, where research and development is costly and its outcome is uncertain. Each model is differentiated primarily by the payoffs to cooperating or defecting for each international actor. As such, it will be useful to consider each model using a traditional normal-form game setup as seen in Table 1. In these abstractions, we assume two utility-maximizing actors with perfect information about each other’s preferences and behaviors.

Table 1. This table contains a representation of a payoff matrix. As is customary in game theory, the first number in each cell represents how desirable the outcome is for Row (in this case, Actor A), and the second number represents how desirable the same outcome is for Column (Actor B).

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      C, C             C, D
           Defect         D, C             D, D

 Depending on the payoff structures, we can anticipate different likelihoods of and preferences for cooperation or defection on the part of the actors. These differences create four distinct models of scenarios we can expect to occur: Prisoner’s Dilemma, Deadlock, Chicken, and Stag Hunt. The remainder of this subsection briefly examines each of these models and its relationship with the AI Coordination Problem.

Prisoner’s Dilemma

The familiar Prisoner’s Dilemma is a model that involves two actors who must decide whether to cooperate in an agreement or not. While each actor’s greatest preference is to defect while their opponent cooperates, the prospect of both actors defecting is less desirable than both actors cooperating. This is visually represented in Table 2 with each actor’s preference order explicitly outlined.

Table 2. This table contains an ordinal representation of a payoff matrix for a Prisoner’s Dilemma game. As will hold for the following tables, the most preferred outcome is indicated with a ‘4,’ and the least preferred outcome is indicated with a ‘1.’

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      3, 3             1, 4
           Defect         4, 1             2, 2

Actor A’s preference order: DC > CC > DD > CD

Actor B’s preference order: CD > CC > DD > DC

 In the context of international relations, this model has been used to describe preferences of actors when deciding to enter an arms treaty or not.[49] For example, by defecting from an arms-reduction treaty to develop more weapons, an actor can gain the upper hand on an opponent who decides to uphold the treaty by covertly continuing or increasing arms production. Meanwhile, the escalation of an arms race where neither side halts or slows progress is less desirable to each actor’s safety than both fully entering the agreement.

This same dynamic could hold true in the development of an AI Coordination Regime, where actors can decide whether to abide by the Coordination Regime or find a way to cheat. In this model, each actor’s incentives are not fully aligned to support mutual cooperation and thus should present worry for individuals hoping to reduce the possibility of developing a harmful AI.

Chicken

Similar to the Prisoner’s Dilemma, Chicken occurs when each actor’s greatest preference would be to defect while their opponent cooperates. Additionally, both actors can expect a greater return if they both cooperate rather than both defect. The primary difference between the Prisoner’s Dilemma and Chicken, however, is that both actors failing to cooperate is the least desired outcome of the game. As a result, a rational actor faces strong pressure to cooperate in order to avoid this worst outcome.[50] This is visually represented in Table 3 with each actor’s preference order explicitly outlined.

Table 3. This table contains an ordinal representation of a payoff matrix for a Chicken game.

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      3, 3             2, 4
           Defect         4, 2             1, 1

 Actor A’s preference order: DC > CC > CD > DD

Actor B’s preference order: CD > CC > DC > DD

In international relations, examples of Chicken have included the Cuban Missile Crisis and the concept of Mutually Assured Destruction in nuclear arms development.[51] An analogous scenario in the context of the AI Coordination Problem could be one in which both international actors have developed, but not yet unleashed, an ASI, where knowledge of whether the technology will be beneficial or harmful is still uncertain. Because of the instantaneous nature of this particular game, we can anticipate its occurrence to be rare in the context of technology development, where opportunities to coordinate are continuous. As such, Chicken scenarios are unlikely to greatly affect AI coordination strategies but are still important to consider as a possibility nonetheless.

Deadlock

Deadlock occurs when each actor’s greatest preference would be to defect while their opponent cooperates. However, in Deadlock, the prospect of both actors defecting is more desirable than both actors cooperating. As a result, there is no conflict between self-interest and mutual benefit, and the dominant strategy of both actors would be to defect. This is visually represented in Table 4 with each actor’s preference order explicitly outlined.

Table 4. This table contains an ordinal representation of a payoff matrix for a game in Deadlock.

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      2, 2             1, 4
           Defect         4, 1             3, 3

Actor A’s preference order: DC > DD > CC > CD

Actor B’s preference order: CD > DD > CC > DC

Deadlock is a common – if little studied – occurrence in international relations, although knowledge about how deadlocks are solved can be of practical and theoretical importance.[52] In the context of developing an AI Coordination Regime, recognizing that two competing actors are in a state of Deadlock might drive peace-maximizing individuals to pursue de-escalation strategies that differ from other game models. As a result, it is important to consider deadlock as a potential model that might explain the landscape of AI coordination. 

Stag Hunt

Finally, a Stag Hunt occurs when the returns to both actors are highest if they both cooperate, but each actor would still rather defect than cooperate alone. There is thus no conflict between self-interest and mutual benefit so long as each actor expects the other to cooperate, and mutual cooperation is the payoff-dominant outcome. This is visually represented in Table 5 with each actor’s preference order explicitly outlined.

Table 5. This table contains a sample ordinal representation of a payoff matrix for a Stag Hunt game.

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      4, 4             1, 3
           Defect         3, 1             2, 2

Actor A’s preference order: CC > DC > DD > CD

Actor B’s preference order: CC > CD > DD > DC

An approximation of a Stag Hunt in international relations would be an international treaty such as the Paris Climate Accords, where the protective benefits of coordinated environmental regulation against the harms of climate change (in theory) outweigh the economic gains from defecting. In the context of the AI Coordination Problem, a Stag Hunt is the most desirable outcome, as mutual cooperation carries the lowest risk of racing dynamics and the associated risk of developing a harmful AI.
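For readers who prefer a mechanical check, the four orderings above can be restated as a simple classification rule. The sketch below is illustrative only (the function name and structure are my own); it takes one actor’s payoffs for the four outcomes and returns which of the four models, if any, the ordering implies.

```python
def classify_game(cc, cd, dc, dd):
    """Classify a 2x2 game from one actor's payoffs for the outcomes
    (C,C), (C,D), (D,C), and (D,D), using the orderings in Tables 2-5."""
    if dc > cc > dd > cd:
        return "Prisoner's Dilemma"  # defection tempts, but CC still beats DD
    if dc > cc > cd > dd:
        return "Chicken"             # mutual defection is the worst outcome
    if dc > dd > cc > cd:
        return "Deadlock"            # mutual defection beats mutual cooperation
    if cc > dc > dd > cd:
        return "Stag Hunt"           # mutual cooperation is the best outcome
    return "Other"

# Actor A's ordinal payoffs from Tables 2-5:
print(classify_game(3, 1, 4, 2))  # Prisoner's Dilemma
print(classify_game(3, 2, 4, 1))  # Chicken
print(classify_game(2, 1, 4, 3))  # Deadlock
print(classify_game(4, 1, 3, 2))  # Stag Hunt
```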

Determining payoffs in coordination scenarios

As stated, which model (Prisoner’s Dilemma, Chicken, Deadlock, or Stag Hunt) you think accurately depicts the AI Coordination Problem (and which resulting policies should be pursued) depends on the structure of payoffs to cooperating or defecting. In each of these models, the payoffs can be most simply described as the anticipated benefit from developing AI minus the anticipated harm from developing AI. When looking at these components in detail, however, we see that the anticipated benefits and harms are linked to whether the actors cooperate or defect from an AI Coordination Regime. For example, if the two international actors cooperate with one another, we can expect some reduction in individual payoffs if both sides agree to distribute benefits amongst each other. The remainder of this section looks at these payoffs and the variables that determine them in more detail.[53]

 Both sides cooperate [CC]

If both sides cooperate in an AI Coordination Regime, we can expect their payoffs to be expressed as follows:

Payoff_A [C,C] = P_(b|A)(AB)∙b_A∙d_A - P_(h|A)(AB)∙h_A

Payoff_B [C,C] = P_(b|B)(AB)∙b_B∙d_B - P_(h|B)(AB)∙h_B

The benefit that each actor can expect to receive from an AI Coordination Regime consists of the probability that each actor believes such a regime would achieve a beneficial AI – expressed as P_(b|A) (AB) for Actor A’s belief and  P_(b|B) (AB) for Actor B – times each actor’s perceived benefit of AI – expressed as bA and bB. Additionally, this model accounts for an AI Coordination Regime that might result in variable distribution of benefits for each actor. If the regime allows for multilateral development, for example, the actors might agree that whoever reaches AI first receives 60% of the benefit, while the other actor receives 40% of the benefit. This distribution variable is expressed in the model as d, where differing effects of distribution are expressed for Actors A and B as dA and dB respectively.[54]

Meanwhile, the harm that each actor can expect to receive from an AI Coordination Regime consists of the actor’s perceived likelihood that such a regime would create a harmful AI – expressed as P_(h|A) (AB) for Actor A and P_(h|B) (AB) for Actor B – times each actor’s perceived harm– expressed as hA and hB. Here, we assume that the harm of an AI-related catastrophe would be evenly distributed amongst actors.

One side cooperates and one side defects [CD or DC]

If one side cooperates with and one side defects from the AI Coordination Regime, we can expect their payoffs to be expressed as follows (here we assume Actor A defects while Actor B cooperates):

Payoff_A [D,C] = P_(b|A)(AB)∙b_A∙d_A + P_(b|A)(A)∙b_A - [P_(h|A)(AB)∙h_A + P_(h|A)(A)∙h_A]

Payoff_B [D,C] = P_(b|B)(AB)∙b_B∙d_B - [P_(h|B)(AB)∙h_B + P_(h|B)(A)∙h_B]

For the defector (here, Actor A), the benefit from an AI Coordination Regime consists of the probability that they believe such a regime would achieve a beneficial AI times Actor A’s perceived benefit of receiving AI with distributional considerations [P_(b|A) (AB)∙b_A∙d_A]. Additionally, the defector can expect to receive the additional expected benefit of defecting and covertly pursuing AI development outside of the Coordination Regime. This additional benefit is expressed here as P_(b|A) (A)∙b_A. For the cooperator (here, Actor B), the benefit they can expect to receive from cooperating would be the same as if both actors cooperated [P_(b|B) (AB)∙b_B∙d_B].

Meanwhile, both actors can still expect to receive the anticipated harm that arises from a Coordination Regime [P_(h|A or B) (AB)∙h_(A or B)]. In this scenario, however, both actors can also anticipate receiving the additional harm from the defector pursuing its own AI development outside of the regime. Here, this is expressed as P_(h|A or B) (A)∙h_(A or B).

Both sides defect [DD]

Finally, if both sides defect or effectively choose not to enter an AI Coordination Regime, we can expect their payoffs to be expressed as follows:

Payoff_A [D,D] = P_(b|A)(A)∙b_A - [P_(h|A)(A)∙h_A + P_(h|A)(B)∙h_A]

Payoff_B [D,D] = P_(b|B)(B)∙b_B - [P_(h|B)(B)∙h_B + P_(h|B)(A)∙h_B]

The benefit that each actor can expect to receive from this scenario is solely the probability that they achieve a beneficial AI times each actor’s perceived benefit of receiving AI (without distributional considerations): P_(b|A) (A)∙b_A  for Actor A and P_(b|B) (B)∙b_B for Actor B.

Meanwhile, the harm that each actor can expect to receive in this scenario consists of both the likelihood that the actor itself will develop a harmful AI times that harm, as well as the expected harm of its opponent developing a harmful AI. Together, this is expressed as:

Harm_A [D,D] = P_(h|A)(A)∙h_A + P_(h|A)(B)∙h_A

Harm_B [D,D] = P_(h|B)(B)∙h_B + P_(h|B)(A)∙h_B

Relative risks of developing a harmful AI

One last consideration to take into account is the relationship between the probabilities of developing a harmful AI for each of these scenarios. Namely, the probability of developing a harmful AI is greatest in a scenario where both actors defect, while the probability of developing a harmful AI is lowest in a scenario where both actors cooperate. This is expressed in the following way:

P_h(A) [D,D] > P_h(A) [D,C]* > P_h(AB) [C,C]

*where A is the defecting actor

The intuition behind this is laid out in Armstrong et al.’s “Racing to the precipice: a model of artificial intelligence development.”[55] The authors suggest each actor would be incentivized to skimp on safety precautions in order to attain the transformative and powerful benefits of AI before an opponent. By failing to agree to a Coordination Regime at all [D,D], we can expect the chance of developing a harmful AI to be highest, as both actors are sparing in applying safety precautions to development.

***

Altogether, the considerations discussed are displayed in Table 6 as a payoff matrix. Based on the values that each actor assigns to their payoff variables, we can expect different coordination models (Prisoner’s Dilemma, Chicken, Deadlock, or Stag Hunt) to arise. The following subsection further examines these relationships and simulates scenarios in which each coordination model would be most likely.

Table 6. Payoff matrix for AI Coordination Scenarios

[C,C]
Payoff_A = P_(b|A)(AB)∙b_A∙d_A - P_(h|A)(AB)∙h_A
Payoff_B = P_(b|B)(AB)∙b_B∙d_B - P_(h|B)(AB)∙h_B

[C,D] (Actor B defects)
Payoff_A = P_(b|A)(AB)∙b_A∙d_A - [P_(h|A)(AB)∙h_A + P_(h|A)(B)∙h_A]
Payoff_B = P_(b|B)(AB)∙b_B∙d_B + P_(b|B)(B)∙b_B - [P_(h|B)(AB)∙h_B + P_(h|B)(B)∙h_B]

[D,C] (Actor A defects)
Payoff_A = P_(b|A)(AB)∙b_A∙d_A + P_(b|A)(A)∙b_A - [P_(h|A)(AB)∙h_A + P_(h|A)(A)∙h_A]
Payoff_B = P_(b|B)(AB)∙b_B∙d_B - [P_(h|B)(AB)∙h_B + P_(h|B)(A)∙h_B]

[D,D]
Payoff_A = P_(b|A)(A)∙b_A - [P_(h|A)(A)∙h_A + P_(h|A)(B)∙h_A]
Payoff_B = P_(b|B)(B)∙b_B - [P_(h|B)(B)∙h_B + P_(h|B)(A)∙h_B]

Where P_h(A)∙h [D,D] > P_h(A)∙h [D,C] > P_h(AB)∙h [C,C]
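The payoff expressions above can also be computed mechanically. The following sketch is a minimal illustration that assumes the formulas exactly as written; the function and parameter names, which mirror the notation in Appendix A, are introduced here purely for convenience.

```python
def payoffs(pbA_A, pbA_B, pbA_AB, pbB_A, pbB_B, pbB_AB,
            phA_A, phA_B, phA_AB, phB_A, phB_B, phB_AB,
            dA, dB, bA, bB, hA, hB):
    """Return {outcome: (payoff_A, payoff_B)} implied by the perceived-benefit,
    perceived-harm, and distribution variables (pbA_AB stands for P_b|A(AB),
    phB_A for P_h|B(A), and so on, as listed in Appendix A)."""
    # Both cooperate: regime benefit weighted by the distribution d,
    # minus the perceived harm of the regime itself.
    cc = (pbA_AB * bA * dA - phA_AB * hA,
          pbB_AB * bB * dB - phB_AB * hB)

    # A defects, B cooperates: A adds the benefit of its own covert program,
    # and both sides bear the added harm that program might cause.
    dc = (pbA_AB * bA * dA + pbA_A * bA - (phA_AB * hA + phA_A * hA),
          pbB_AB * bB * dB - (phB_AB * hB + phB_A * hB))

    # A cooperates, B defects: mirror image of the case above.
    cd = (pbA_AB * bA * dA - (phA_AB * hA + phA_B * hA),
          pbB_AB * bB * dB + pbB_B * bB - (phB_AB * hB + phB_B * hB))

    # Both defect: each keeps its own expected benefit but bears the expected
    # harm from both independent development programs.
    dd = (pbA_A * bA - (phA_A * hA + phA_B * hA),
          pbB_B * bB - (phB_B * hB + phB_A * hB))

    return {"CC": cc, "CD": cd, "DC": dc, "DD": dd}
```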

Simulating the theory

Using the payoff matrix in Table 6, we can simulate scenarios for AI coordination by assigning numerical values to the payoff variables. The remainder of this subsection looks at numerical simulations that result in each of the four models and discusses potential real-world hypotheticals these simulations might reflect.

Example payoff structure resulting in Prisoner’s Dilemma

One example payoff structure that results in a Prisoner’s Dilemma is outlined in Table 7. Here, both actors demonstrate varying uncertainty about whether they will develop a beneficial or harmful AI alone, but they both equally perceive the potential benefits of AI to be greater than the potential harms. Moreover, each actor is more confident in their own capability to develop a beneficial AI than their opponent’s. The corresponding payoff matrix is displayed as Table 8.

 

Table 7. Payoff variables for simulated Prisoner’s Dilemma

[Table values omitted.]

Table 8. Payoff matrix for simulated Prisoner’s Dilemma. Here, values are measured in utility.

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      3.2, 3.2         1.2, 9
           Defect         7.6, 0.8         2.4, 2.6

 

Example payoff structure resulting in Deadlock

One example payoff structure that results in a Deadlock is outlined in Table 9. Here, both actors demonstrate a high degree of optimism in both their and their opponent’s ability to develop a beneficial AI, while this likelihood would only be slightly greater under a cooperation regime. Additionally, both actors perceive the potential returns to developing AI to be greater than the potential harms. The corresponding payoff matrix is displayed as Table 10.

 

Table 9. Payoff variables for simulated Deadlock

Perceived Benefits: Pb|A(A) = 0.7, Pb|B(A) = 0.7, Pb|A(B) = 0.7, Pb|B(B) = 0.7, Pb|A(AB) = 0.75, Pb|B(AB) = 0.75, dA = 0.5, dB = 0.5, bA = 13, bB = 13

Perceived Harms: Ph|A(A) = 0.3, Ph|B(A) = 0.3, Ph|A(B) = 0.3, Ph|B(B) = 0.3, Ph|A(AB) = 0.25, Ph|B(AB) = 0.25, hA = 7, hB = 7

 

Table 10. Payoff matrix for simulated Deadlock

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      3.125, 3.125     1.025, 10.125
           Defect         10.125, 1.025    4.9, 4.9
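As a check, feeding the Table 9 values into the illustrative payoff function sketched earlier reproduces the matrix in Table 10, and applying the classification sketch to Actor A’s payoffs confirms the Deadlock ordering:

```python
# Values from Table 9 (simulated Deadlock); the two actors are symmetric.
deadlock = payoffs(
    pbA_A=0.7, pbA_B=0.7, pbA_AB=0.75,
    pbB_A=0.7, pbB_B=0.7, pbB_AB=0.75,
    phA_A=0.3, phA_B=0.3, phA_AB=0.25,
    phB_A=0.3, phB_B=0.3, phB_AB=0.25,
    dA=0.5, dB=0.5, bA=13, bB=13, hA=7, hB=7,
)
# -> {'CC': (3.125, 3.125), 'CD': (1.025, 10.125),
#     'DC': (10.125, 1.025), 'DD': (4.9, 4.9)}  (up to floating-point rounding)

# For Actor A: DC > DD > CC > CD, i.e. Deadlock.
print(classify_game(3.125, 1.025, 10.125, 4.9))  # Deadlock
```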

 

Example payoff structure resulting in Chicken game

One example payoff structure that results in a Chicken game is outlined in Table 11. Here, both actors demonstrate high uncertainty about whether they will develop a beneficial or harmful AI alone (both Actors see the likelihood as a 50/50 split), but they perceive the potential benefits of AI to be slightly greater than the potential harms. The payoff matrix is displayed as Table 12.

 

Table 11. Payoff variables for simulated Chicken game

Perceived Benefits: Pb|A(A) = 0.5, Pb|B(A) = 0.5, Pb|A(B) = 0.5, Pb|B(B) = 0.5, Pb|A(AB) = 0.9, Pb|B(AB) = 0.9, dA = 0.5, dB = 0.5, bA = 10, bB = 10

Perceived Harms: Ph|A(A) = 0.5, Ph|B(A) = 0.5, Ph|A(B) = 0.5, Ph|B(B) = 0.5, Ph|A(AB) = 0.1, Ph|B(AB) = 0.1, hA = 9, hB = 9

 

Table 12. Payoff matrix for simulated Chicken game. Here, values are measured in utility.

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      3.6, 3.6         -0.9, 4.1
           Defect         4.1, -0.9        -4, -4

 

Example payoff structure resulting in Stag Hunt

Finally, Table 13 outlines an example payoff structure that results in a Stag Hunt. Both actors are more optimistic about Actor B’s chances of developing a beneficial AI, but also agree that entering an AI Coordination Regime would result in the highest chances of a beneficial AI. Moreover, the AI Coordination Regime is arranged such that Actor B is more likely to gain a higher distribution of AI’s benefits. Both actors see the potential harms from developing AI as significantly greater than the potential benefits, but expect that cooperating to develop AI could still result in a positive benefit for both parties. The corresponding payoff matrix is displayed as Table 14.

 

Table 13. Payoff variables for simulated Stag Hunt

Perceived Benefits: Pb|A(A) = 0.5, Pb|B(A) = 0.4, Pb|A(B) = 0.7, Pb|B(B) = 0.7, Pb|A(AB) = 0.9, Pb|B(AB) = 0.9, dA = 0.3, dB = 0.7, bA = 7, bB = 7

Perceived Harms: Ph|A(A) = 0.5, Ph|B(A) = 0.6, Ph|A(B) = 0.3, Ph|B(B) = 0.3, Ph|A(AB) = 0.1, Ph|B(AB) = 0.1, hA = 17, hB = 17

 

Table 14. Payoff matrix for simulated Stag Hunt

                          Actor B
                          Cooperate        Defect
Actor A    Cooperate      0.19, 2.71       -4.91, 2.51
           Defect         -4.81, -7.49     -10.1, -10.4
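The same illustrative function reproduces the asymmetric values in Table 14 (up to rounding):

```python
# Values from Table 13 (simulated Stag Hunt); note the asymmetry between actors.
stag_hunt = payoffs(
    pbA_A=0.5, pbA_B=0.7, pbA_AB=0.9,
    pbB_A=0.4, pbB_B=0.7, pbB_AB=0.9,
    phA_A=0.5, phA_B=0.3, phA_AB=0.1,
    phB_A=0.6, phB_B=0.3, phB_AB=0.1,
    dA=0.3, dB=0.7, bA=7, bB=7, hA=17, hB=17,
)
# -> {'CC': (0.19, 2.71), 'CD': (-4.91, 2.51),
#     'DC': (-4.81, -7.49), 'DD': (-10.1, -10.4)}
```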

 

Policy Implications and Discussion

I discuss in this final section the relevant policy and strategic implications this theory has on achieving international AI coordination, and assess the strengths and limitations of the theory outlined above in practice. To reiterate, the primary function of this theory is to lay out a structure for identifying what game models best represent the AI Coordination Problem, and as a result, what strategies should be applied to encourage coordination and stability.

Downs et al.[56] look at three different types of strategies governments can take to reduce the level of arms competition with a rival: (1) a unilateral strategy where an actor’s individual actions impact race dynamics (for example, by focusing on shifting to defensive weapons[57]), (2) a tacit bargaining strategy that ties defensive expenditures to those of a rival, and (3) a negotiation strategy composed of formal arms talks. In their paper, the authors suggest that “Both the ‘game’ that underlies an arms race and the conditions under which it is conducted can dramatically affect the success of any strategy designed to end it.”[58] Moreover, they argue that pursuing all strategies at once would be suboptimal (or even impossible due to mutual exclusivity), making it even more important to know what sort of game one is playing before pursuing a strategy.[59] Using their intuition, the remainder of this paper looks at strategy and policy considerations relevant to some game models in the context of the AI Coordination Problem. These strategies are not meant to be exhaustive by any means, but they hopefully show how the outlined theory might provide practical use and motivate further research and analysis. Finally, the paper will consider some of the practical limitations of the theory.

Strategic considerations in Prisoner’s Dilemma

Continuous coordination through negotiation in a Prisoner’s Dilemma is somewhat promising, although a cooperating actor runs the risk of a rival defecting if there is not an effective way to ensure and enforce cooperation in an AI Coordination Regime. Therefore, if it is likely that both actors perceive themselves to be in a Prisoner’s Dilemma when deciding whether to agree on AI, strategic resources should be allocated especially to addressing this vulnerability. Furthermore, a unilateral strategy could be employed under a Prisoner’s Dilemma in order to effect cooperation. This could be achieved through signaling a lack of effort to increase one’s military capacity (perhaps by domestic bans on AI weapon development, for example). As a result, this could reduce a rival actor’s perceived relative benefits gained from developing AI.

Strategic considerations in Stag Hunt

As stated before, achieving a scenario where both actors perceive themselves to be in a Stag Hunt is the most desirable situation for maximizing safety from an AI catastrophe, since both actors are primed to cooperate and will maximize their benefits from doing so. In the event that both actors are in a Stag Hunt, all efforts should be made to pursue negotiations and persuade rivals of peaceful intent before the window of opportunity closes. This can be facilitated, for example, by a state leader publicly and dramatically expressing understanding of the danger and willingness to negotiate with other states to achieve this.

Strategies to shift game scenarios

One final strategy that a safety-maximizing actor can employ in order to maximize chances for cooperation is to change the type of game that exists by using strategies or policies to affect the payoff variables in play. For example, Stag Hunts are likely to occur when the perceived harm of developing a harmful AI is significantly greater than the perceived benefit that comes from a beneficial AI. A relevant strategy to this insight would be to focus strategic resources on shifting public or elite opinion to recognize the catastrophic risks of AI.

Similar strategic analyses can be done on variables and variable relationships outlined in this model. For example, can the structure of distribution impact an actor’s perception of the game as cooperation or defection dominated (if so, should we focus strategic resources on developing accountability strategies that can effectively enforce distribution)? Does a more optimistic/pessimistic perception of an actor’s own or opponent’s capabilities affect which game model they adopt? Especially as prospects of coordinating are continuous, this can be a promising strategy to pursue with the support of further landscape research to more accurately assess payoff variables and what might cause them to change.
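As a rough illustration of this lever, the earlier sketches can be combined (assuming the same hypothetical helper functions and the Table 9 values): holding the simulated Deadlock’s perceived benefits fixed and raising only the perceived harm h for both actors moves the scenario from Deadlock through Prisoner’s Dilemma to Stag Hunt.

```python
# Re-run the simulated Deadlock (Table 9) with progressively larger perceived
# harm h for both actors, holding every other variable fixed.
for h in (7, 14, 35):
    p = payoffs(
        pbA_A=0.7, pbA_B=0.7, pbA_AB=0.75,
        pbB_A=0.7, pbB_B=0.7, pbB_AB=0.75,
        phA_A=0.3, phA_B=0.3, phA_AB=0.25,
        phB_A=0.3, phB_B=0.3, phB_AB=0.25,
        dA=0.5, dB=0.5, bA=13, bB=13, hA=h, hB=h,
    )
    cc, cd, dc, dd = (p[k][0] for k in ("CC", "CD", "DC", "DD"))
    print(h, classify_game(cc, cd, dc, dd))
# -> 7 Deadlock, 14 Prisoner's Dilemma, 35 Stag Hunt
```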

 

Limitations

One significant limitation of this theory is that it assumes that the AI Coordination Problem will involve two key actors. Although Section 2 argues to some extent that this is a likely scenario for the U.S. and China, it is still conceivable that an additional international actor could move into the fray and complicate coordination efforts.

Moreover, the usefulness of this model requires accurately gauging or forecasting variables that are hard to work with. For example, it is unlikely that even the actor themselves will be able to effectively quantify their perception of capacity, riskiness, magnitude of risk, or magnitude of benefits. Still, predicting these values and forecasting probabilities based on information we do have is valuable and should not be ignored solely because it is not perfect information.

Finally, there are a plethora of other assuredly relevant factors that this theory does not account for or fully consider such as multiple iterations of game playing, degrees of perfect information, or how other diplomacy-affecting spheres (economic policy, ideology, political institutional setup, etc.) might complicate coordination efforts. Despite the large number of variables addressed in this paper, this is at its core a simple theory with the aims of motivating additional analysis and research to branch off. While there is certainly theoretical value in creating a single model that can account for all factors and answer all questions inherent to the AI Coordination Problem, this is likely not tractable or useful to attempt – at least with human hands and minds alone.

 

Appendix A: Theory Variables

Independent Variables

Pb|A(A): Probability Actor A believes it will develop a beneficial AI
Pb|B(A): Probability Actor B believes Actor A will develop a beneficial AI
Pb|A(B): Probability Actor A believes Actor B will develop a beneficial AI
Pb|B(B): Probability Actor B believes it will develop a beneficial AI
Pb|A(AB): Probability Actor A believes the AI Coordination Regime will develop a beneficial AI
Pb|B(AB): Probability Actor B believes the AI Coordination Regime will develop a beneficial AI
dA: Percent of benefits Actor A can expect to receive from an AI Coordination Regime
dB: Percent of benefits Actor B can expect to receive from an AI Coordination Regime
bA: Actor A’s perceived utility from developing beneficial AI
bB: Actor B’s perceived utility from developing beneficial AI
Ph|A(A): Probability Actor A believes it will develop a harmful AI
Ph|B(A): Probability Actor B believes Actor A will develop a harmful AI
Ph|A(B): Probability Actor A believes Actor B will develop a harmful AI
Ph|B(B): Probability Actor B believes it will develop a harmful AI
Ph|A(AB): Probability Actor A believes the AI Coordination Regime will develop a harmful AI
Ph|B(AB): Probability Actor B believes the AI Coordination Regime will develop a harmful AI
hA: Actor A’s perceived harm from developing a harmful AI
hB: Actor B’s perceived harm from developing a harmful AI

 

Dependent Variable

Type of game model and prospect of coordination

 


Endnotes

[1] Kelly Song, “Jack Ma: Artificial intelligence could set off WWIII, but ‘humans will win’”, CNBC, June 21, 2017, https://www.cnbc.com/2017/06/21/jack-ma-artificial-intelligence-could-set-off-a-third-world-war-but-humans-will-win.html.

[2] Tom Simonite, “Artificial Intelligence Fuels New Global Arms Race,” Wired, September 8, 2017, https://www.wired.com/story/for-superpowers-artificial-intelligence-fuels-new-global-arms-race/.

[3] Elon Musk, Twitter Post, September 4, 2017, https://twitter.com/elonmusk/status/904638455761612800.

[4] Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (Oxford University Press, 2014).

[5] Stuart Armstrong, Nick Bostrom, & Carl Shulman, “Racing to the precipice: a model of artificial intelligence development,” AI and Society 31, 2(2016): 201–206.

[6] See infra at Section 2.2 Relevant Actors

[7] E.g. Julian E. Barnes and Josh Chin, “The New Arms Race in AI,” Wall Street Journal, March 2, 2018, https://www.wsj.com/articles/the-new-arms-race-in-ai-1520009261; Cecilia Kang and Alan Rappeport, “The New U.S.-China Rivalry: A Technology Race,” The New York Times, March 6, 2018, https://www.nytimes.com/2018/03/06/business/us-china-trade-technology-deals.html.

[8] Elsa Kania, “Beyond CFIUS: The Strategic Challenge of China’s Rise in Artificial Intelligence,” Lawfare, June 20, 2017, https://www.lawfareblog.com/beyond-cfius-strategic-challenge-chinas-rise-artificial-intelligence (highlighting legislation considered that would limit Chinese investments in U.S. artificial intelligence companies and other emerging technologies considered crucial to U.S. national security interests).

[9] That is, the extent to which competitors prioritize speed of development over safety (Bostrom 2014: 767)

[10] “AI expert Andrew Ng says AI is the new electricity | Disrupt SF 2017”, TechCrunch Disrupt SF 2017, TechCrunch, September 20, 2017, https://www.youtube.com/watch?v=uSCka8vXaJc.

[11] McKinsey Global Institute, “Artificial Intelligence: The Next Digital Frontier,” June 2017, https://www.mckinsey.com/~/media/McKinsey/Industries/Advanced%20Electronics/Our%20Insights/How%20artificial%20intelligence%20can%20deliver%20real%20value%20to%20companies/MGI-Artificial-Intelligence-Discussion-paper.ashx: 5 (estimating major tech companies in 2016 spent $20-30 billion on AI development and acquisitions).

[12] Apple Inc., “Siri,” https://www.apple.com/ios/siri/.

[13] Tesla Inc., “Autopilot,” https://www.tesla.com/autopilot.

[14] IBM, “Deep Blue,” Icons of Progress, http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/deepblue/.

[15] Sam Byford, “AlphaGo beats Lee Se-dol again to take Google DeepMind Challenge series,” The Verge, March 12, 2016, https://www.theverge.com/2016/3/12/11210650/alphago-deepmind-go-match-3-result.

[16] Google DeepMind, “DeepMind and Blizzard open StarCraft II as an AI research environment,” https://deepmind.com/blog/deepmind-and-blizzard-open-starcraft-ii-ai-research-environment/.

[17] Michele Bertoncello and Dominik Wee, “Ten ways autonomous driving could redefine the automotive world,” McKinsey & Company, June 2015, https://www.mckinsey.com/industries/automotive-and-assembly/our-insights/ten-ways-autonomous-driving-could-redefine-the-automotive-world (suggesting that driverless cars could reduce traffic fatalities by up to 90 percent).

[18] Deena Zaidi, “The 3 most valuable applications of AI in health care,” VentureBeat, April 22, 2018, https://venturebeat.com/2018/04/22/the-3-most-valuable-applications-of-ai-in-health-care/.

[19] UN News, “UN artificial intelligence summit aims to tackle poverty, humanity’s ‘grand challenges’,” United Nations, June 7, 2017, https://news.un.org/en/story/2017/06/558962-un-artificial-intelligence-summit-aims-tackle-poverty-humanitys-grand.

[20] Will Knight, “Could AI Solve the World’s Biggest Problems?” MIT Technology Review, January 12, 2016, https://www.technologyreview.com/s/545416/could-ai-solve-the-worlds-biggest-problems/.

[21] Jackie Snow, “Algorithms are making American inequality worse,” MIT Technology Review, January 26, 2018, https://www.technologyreview.com/s/610026/algorithms-are-making-american-inequality-worse/; The Boston Consulting Group & Sutton Trust, “The State of Social mobility in the UK,” July 2017, https://www.suttontrust.com/wp-content/uploads/2017/07/BCGSocial-Mobility-report-full-version_WEB_FINAL-1.pdf.

[22] Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner, “Machine Bias,” ProPublica, May 23, 2016 https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

[23] United Nations Office for Disarmament Affairs, “Pathways to Banning Fully Autonomous Weapons,” United Nations, October 23, 2017, https://www.un.org/disarmament/update/pathways-to-banning-fully-autonomous-weapons/.

[24] Defined by Bostrom as “an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills,” Nick Bostrom, “How long before superintelligence?” Linguistic and Philosophical Investigations 5, 1(2006): 11-30.

[25] For more on the existential risks of Superintelligence, see Bostrom (2014) at Chapters 6 and 8.

[26] Stephen Hawking, Stuart Russell, Max Tegmark, Frank Wilczek, “Transcendence looks at the implications of artificial intelligence – but are we taking AI seriously enough?” The Independent, May 1, 2014, https://www.independent.co.uk/news/science/stephen-hawking-transcendence-looks-at-the-implications-of-artificial-intelligence-but-are-we-taking-9313474.html.

[27] An academic survey conducted showed that AI experts and researchers believe there is a 50% chance of AI outperforming humans in all tasks in 45 years. See Katja Grace, John Salvatier, Allan Dafoe, Baobao Zhang, & Owain Evans, “When Will AI Exceed Human Performance? Evidence from AI Experts” (2017: 11-21), retrieved from http://arxiv.org/abs/1705.08807.

[28] Armstrong et al., “Racing to the precipice: a model of artificial intelligence development.”

[29] There is a scenario where a private actor might develop AI in secret from the government, but this is unlikely to be the case as government surveillance capabilities improve. See Carl Shulman, “Arms Control and Intelligence Explosions,” 7th European Conference on Computing and Philosophy, Bellaterra, Spain, July 2–4, 2009: 6.

[30] Greg Allen and Taniel Chan, “Artificial Intelligence and National Security.” Report for Harvard Kennedy School: Belfer Center for Science and International Affairs, July 2017, https://www.belfercenter.org/sites/default/files/files/publication/AI%20NatSec%20-%20final.pdf: 71-110

[31] Executive Office of the President National Science and Technology Council: Committee on Technology, “Preparing for the Future of Artificial Intelligence,” Executive Office of the President of the United States (October 2016), https://obamawhitehouse.archives.gov/sites/default/files/whitehouse_files/microsites/ostp/NSTC/preparing_for_the_future_of_ai.pdf; “Artificial Intelligence, Automation, and the Economy,” Executive Office of the President of the United States (December 2016), https://obamawhitehouse.archives.gov/sites/whitehouse.gov/files/documents/Artificial-Intelligence-Automation-Economy.PDF.

[32] Paul Mozur, “Beijing Wants A.I. to Be Made in China by 2030,” The New York Times, July 20, 2017, https://www.nytimes.com/2017/07/20/business/china-artificial-intelligence.html

[33] Kania, “Beyond CFIUS: The Strategic Challenge of China’s Rise in Artificial Intelligence.”

[34] McKinsey Global Institute, “Artificial Intelligence: The Next Digital Frontier.”

[35] Outlining what this Coordination Regime might look like could be the topic of future research, although potential desiderata could include legitimacy, neutrality, accountability, and technical capacity; see Allan Dafoe, “Cooperation, Legitimacy, and Governance in AI Development,” Working Paper (2016). Such a Coordination Regime could also exist in either a unilateral scenario – where one ‘team’ consisting of representatives from multiple states develops AI together – or a multilateral scenario – where multiple teams simultaneously develop AI on their own while agreeing to set standards and regulations (and potentially distributive arrangements) in advance.

[36] Colin S. Gray, “The Arms Race Phenomenon,” World Politics, 24, 1(1971): 39-79 at 41

[37] Samuel P. Huntington, “Arms Races: Prerequisites and Results,” Public Policy 8 (1958): 41–86.

[38] Michael D. Intriligator & Dagobert L. Brito, “Formal Models of Arms Races,” Journal of Peace Science 2, 1(1976): 77–88.

[39] D. S. Sorenson, “Modeling the Nuclear Arms Race: A Search for Stability,” Journal of Peace Science 4 (1980): 169–85.

[40] Robert Jervis, “Cooperation Under the Security Dilemma.” World Politics, 30, 2 (1978): 167-214.

[41] Ibid.

[42] Vally Koubi, “Military Technology Races,” International Organization 53, 3(1999): 537–565.

[43] Edward Moore Geist, “It’s already too late to stop the AI arms race – We must manage it instead,” Bulletin of the Atomic Scientists 72, 5(2016): 318–321.

[44] Thomas C. Schelling & Morton H. Halperin, Strategy and Arms Control. (Pergamon Press: 1985).

[45] Colin S. Gray, House of Cards: Why Arms Control Must Fail, (Cornell Univ. Press: 1992).

[46] Charles Glaser, “Realists as Optimists: Cooperation as Self-Help,” International Security 19, 3(1994): 50-90.

[47] George W. Downs, David M. Rocke, & Randolph M. Siverson, “Arms Races and Cooperation,” World Politics, 38(1: 1985): 118–146.

[48] Denise Garcia and Monica Herz, “Preventive Action in World Politics,” Global Policy 7, 3(2016): 370–379.

[49] For example, see Glenn H. Snyder, “‘Prisoner’s Dilemma’ and ‘Chicken’ Models in International Politics,” International Studies Quarterly 15, 1(1971): 66–103, and Downs et al., “Arms Races and Cooperation.”

[50] Snyder, “‘Prisoner’s Dilemma’ and ‘Chicken’ Models in International Politics.”

[51] Snyder, “‘Prisoner’s Dilemma’ and ‘Chicken’ Models in International Politics.”

[52] Stefan Persson, “Deadlocks in International Negotiation,” Cooperation and Conflict 29, 3(1994): 211–244.

[53] A full list of the variables outlined in this theory can be found in Appendix A.

[54] In a bilateral AI development scenario, the distribution variable can be described as an actor’s likelihood of winning * percent of benefits gained by winner (this would be reflected in the terms of the Coordination Regime). Together, the likelihood of winning and the likelihood of lagging = 1.

[55] See also Bostrom, Superintelligence at Chapter 14.

[56] Downs et al., “Arms Races and Cooperation.”

[57] This is additionally explored in Jervis, “Cooperation Under the Security Dilemma.”

[58] Downs et al., “Arms Races and Cooperation”, 143-144.

[59] Ibid., 145-146.
