apparatchiks.exnet.su

The prisoner's dilemma is a paradox in game theory in which each of two players chooses either to cooperate or to defect. If both players defect, the worst collective outcome is realised: each does worse than if both had cooperated.

However, when we work through the payoff matrix, we see that defecting gives you the better payoff whatever your opponent does. The best individual decision is therefore to defect, backstabbing your opponent.

The same reasoning applies to your opponent. So if both you and your opponent make the rational decision, both players will literally be worse off than if both had cooperated.
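The argument so far can be checked with a small sketch. The payoff numbers below are assumptions chosen for illustration (the text gives none); any values with the same ordering would behave the same way. The names `PAYOFFS`, `best_response` and `is_dominant` are hypothetical helpers, not part of any library.

```python
# Illustrative prisoner's dilemma payoffs (assumed values; higher is better).
# PAYOFFS[(my_move, their_move)] = (my_payoff, their_payoff)
C, D = "cooperate", "defect"
PAYOFFS = {
    (C, C): (3, 3),  # both cooperate
    (C, D): (0, 5),  # I cooperate, they defect
    (D, C): (5, 0),  # I defect, they cooperate
    (D, D): (1, 1),  # both defect
}


def best_response(their_move):
    """Return the move that maximises my payoff against a fixed opponent move."""
    return max((C, D), key=lambda mine: PAYOFFS[(mine, their_move)][0])


def is_dominant(move):
    """True if `move` is my best response to every possible opponent move."""
    return all(best_response(theirs) == move for theirs in (C, D))


if __name__ == "__main__":
    print("Best response to cooperation:", best_response(C))
    print("Best response to defection:", best_response(D))
    print("Defect is dominant:", is_dominant(D))
    print("Mutual defection pays", PAYOFFS[(D, D)],
          "versus mutual cooperation", PAYOFFS[(C, C)])
```

Under these assumed numbers, defecting is the best response to either move, yet mutual defection pays each player less than mutual cooperation would have.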

As a side note: this suggests that rationality is not additive across players over the set of all games (the actual outcomes of games are trivially non-additive: simply consider any non-positive-sum game). Suppose that each of N players makes the most rational decision. Then the aggregate rationality of the group is not necessarily the sum of the individual rationalities. In the prisoner's dilemma, the aggregate of two rational decisions is doubly irrational.

This is the paradox: that playing rationally is what leads to the worst outcome, assuming your opponent is also rational.

Meta-rationality

What intrigues me about this paradox is that this forces us to surmise the existence of some higher level of rationality, one that supersedes the self-defeating rationality evident in the prisoner's dilemma. Such a rationality would need to be able to allow itself to make irrational decisions, in the furtherance of a long-term superior strategy.

Of course, this meta-rationality does not, and cannot, supersede the undeniable mathematical result of the prisoner's dilemma: defecting simply is the best response, and mutual defection is the Nash equilibrium. Thus, if you were to find yourself in a prisoner's dilemma, and we assume it is as rigidly set up as the hypothetical one, then you would be fundamentally irrational if you did not defect.

One way of looking at it is as a rationale that aims for the best of all possible outcomes, not just the short-sighted, proximal best. In hill climbing (gradient ascent), one wants the global maximum, but can be duped into settling for a local maximum if, at that point, no move to an adjacent position is an improvement. This notion of proximity is analogous to the prisoner's dilemma, in which only one move can be made.
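The local-versus-global trap can be shown with a toy example. This is a minimal sketch of greedy hill climbing on a one-dimensional landscape. The `LANDSCAPE` values and the `hill_climb` helper are assumptions made up for illustration.

```python
def hill_climb(landscape, start):
    """Greedy ascent: step to the better adjacent position until none improves."""
    pos = start
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(landscape)]
        if not neighbours:
            return pos
        best = max(neighbours, key=lambda i: landscape[i])
        if landscape[best] <= landscape[pos]:
            return pos  # no adjacent move is an improvement: a (local) peak
        pos = best


# Assumed toy landscape: a small peak at index 2 and a taller one at index 7.
LANDSCAPE = [1, 3, 4, 2, 1, 5, 8, 9, 6]

if __name__ == "__main__":
    stuck_at = hill_climb(LANDSCAPE, start=0)
    global_peak = max(range(len(LANDSCAPE)), key=lambda i: LANDSCAPE[i])
    print("Greedy climb from 0 stops at index", stuck_at,
          "with value", LANDSCAPE[stuck_at])
    print("Global maximum is at index", global_peak,
          "with value", LANDSCAPE[global_peak])
```

Starting from the left edge, the climber stops on the small peak because every adjacent move looks worse, even though a taller peak exists further along. Reaching it would require passing through positions that are temporarily worse.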

Local maxima vs. global maxima

However, over a longer period of time, a meta-rational strategy could be one that aims for something higher than immediate short-term success. A meta-rational strategy is one that sacrifices short-term gains to approach a globally optimal outcome over time. If you arrive at a single one-off prisoner's dilemma, you must make the rational decision that moves you towards the local expected maximum payoff. But a meta-rational strategy will aim for a better outcome in the big picture.

Iterated prisoner's dilemma

In international strategy, you don't just make a single maximally rational decision; you have to stick around to experience the consequences of, and the backlash to, your decisions. The iterated prisoner's dilemma is a much closer analogue of international strategy.

When the iterated prisoner's dilemma is simulated between two agents, and each can alter its strategy in response to the results of previous iterations, we see a different picture. In simulated tournaments, one of the most consistently successful strategies is tit-for-tat, particularly with some forgiveness. "Tit-for-tat" reciprocates its opponent's aggression as well as its cooperation. It can be betrayed, but its opponent will lose out on future gains by doing so.

Tit-for-tat can be described as meta-rational. That is, it allows itself to cooperate, so long as doing so appears to move it towards the best of all outcomes.
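A minimal sketch of the iterated game makes the point about lost future gains concrete. The payoff values, the round count and the function names are assumptions for illustration. The sketch implements plain tit-for-tat only, without the forgiveness variant.

```python
# Assumed per-round payoffs: PAYOFF[(move_a, move_b)] = (score_a, score_b)
C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else C


def always_defect(my_history, their_history):
    """Make the one-shot 'rational' move every round."""
    return D


def play(strategy_a, strategy_b, rounds=10):
    """Play the iterated game and return the total scores (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("tit-for-tat vs tit-for-tat:  ", play(tit_for_tat, tit_for_tat))
    print("tit-for-tat vs always-defect:", play(tit_for_tat, always_defect))
```

With these assumed payoffs over ten rounds, two tit-for-tat players earn 30 each. A constant defector facing tit-for-tat wins the first round but finishes with only 14, against tit-for-tat's 9. The betrayal succeeds once and then forfeits the gains from cooperation.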

Commitment as Meta-Rational Leverage

In international politics, commitment is when you constrain yourself to a restricted set of decisions that you can make in the future. This is a brilliant example of a justifiable "irrational" decision that can yield genuine, long-term gain.

Commitment carries a cost, in that it limits your ability to choose certain decisions in the future. This makes it a credible signal to a fellow political entity, informing them in advance of the choices you are likely to make. A potential ally or a wary foe obtains knowledge about your future decisions that can alter the decisions they make now.

West Germany and Cold War Deterrence

Take the example of the United States stationing troops and military infrastructure in Germany. From a purely local and immediate perspective, this is irrational: it incurs costs and risks entanglement in conflict. However, through a strategic, meta-rational lens, this is an ingenious two-pronged strategy. Namely, it achieved the following:

The development of a "tripwire" force: any Soviet attack that triggered it would cause American casualties. Thus, even if unintentional, an incursion could blow up into a military conflict between the US and the USSR. Acquiring West Germany could not justify this cost to the Soviet Union. Thus, it acted to deter Soviet expansion into West Germany.
A signal to West Germany and NATO that the compromises made to secure US tutelage were worthwhile.
1. This commitment in West Germany allowed for a US-led counter-push against German nuclearisation efforts. The US was able to garner commitments towards dual-key NATO systems that would require US authorisation for any deployment of nuclear weapons.
  
  If the US had the option to abstain entirely from conflict in West Germany, then West Germans could reasonably fear an outcome in which the US chose not to respond to Soviet incursions, whether conventionally or with nuclear weapons. In fact, a push towards German nuclearisation had been advanced by the chancellor of West Germany, Konrad Adenauer.
  
  Furthermore, the US's forward deployment on NATO soil can be argued to have aided other substantial agreements, including the 1968 Non-Proliferation Treaty.
2. Maintaining the American zone's compliance with the US's denazification programme.
  
  This was vital in post-WWII Germany, where residual Nazi, nationalist and militant groups persisted. Much of Germany's administrative class still housed and relied upon ex-Nazis. A US-led denazification process was vital in preserving Germany's domestic functioning whilst gradually removing Nazi influence. This required a temporary reduction in German political autonomy. Since Germany considered the US a more than trusted ally, this trust could be granted.
3. West Germany bordered the Soviet-aligned East and was well within the range of Soviet short- to medium-range nuclear missiles, long before the development of ICBMs.
  
  In the 1960s, rising fear among the West German populace culminated in a Social Democratic Party push towards an easing of tensions and diplomatic normalisation with the Eastern Bloc, called Ostpolitik or "new Eastern policy". This was deemed preferable to what seemed to be a more violent alternative.
  
  Subsequently, the NATO flexible response doctrine was an additional avenue for the US to clarify its commitment to answer any Warsaw Pact aggression, whether nuclear or conventional, with proportional escalation.
  
  Knowing that the US's commitment extended to the deployment of nuclear arms in their defence likely blunted Eastern political influence on West Germany.

From Germany’s point of view, this irreversibility purchases credibility. The U.S. has made the price of abandonment higher than the price of assistance. This is meta-rationality as insurance: committing rationally to future irrationality in order to alter the payoff matrix for your allies today.

When can you trust an ally?

Consider the inverse problem: if you commit too much, your allies may exploit your loyalty. Suppose instead that Germany had gone about intentionally provoking the Soviet Union, confident that the Soviets would be too fearful of a US response to engage. By risking a war between the US and the Soviet Union, Germany could have benefited itself. A moral hazard is created when the ally benefits from your commitment without being bound by reciprocal restraint.

Thus, commitment must be bounded. It must place both you and your ally in the same Nash equilibrium region, where unilateral deviation by either party is not worthwhile. The reason Germany didn't provoke the Soviet Union is obvious: it didn't want to be invaded. The fact that the US was committed to defending Germany didn't mean that Germany would come out unscathed. In fact, Germany would almost certainly have been devastated. The only difference is that the US made itself collateral to Germany's destruction.

That is to say, the US's commitment placed Germany into a Nash equilibrium in which it was advantaged by aligning its interests with the US, and disadvantaged by becoming unaligned. This placed Germany in a position where its own rational decision making would keep it in the orbit of the US.
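How a commitment can move the equilibrium can be sketched with a toy two-player game. Everything here is an assumption for illustration: the move names ("defend"/"abandon" for the US, "align"/"hedge" for West Germany), the payoff numbers, and the `abandonment_cost` that stands in for the price of walking away from forward-deployed troops. It is not a model of the historical record.

```python
from itertools import product

US_MOVES = ("defend", "abandon")
DE_MOVES = ("align", "hedge")  # "hedge": pursue an independent course

# Assumed payoffs without any commitment: BASE[(us, de)] = (us_payoff, de_payoff)
BASE = {
    ("defend", "align"): (2, 4),
    ("defend", "hedge"): (0, 2),
    ("abandon", "align"): (3, 0),
    ("abandon", "hedge"): (1, 1),
}


def with_commitment(payoffs, abandonment_cost):
    """Return a new payoff table in which abandoning costs the US extra."""
    return {
        (us, de): (u - abandonment_cost if us == "abandon" else u, d)
        for (us, de), (u, d) in payoffs.items()
    }


def pure_nash(payoffs):
    """List move pairs from which neither side gains by deviating alone."""
    equilibria = []
    for us, de in product(US_MOVES, DE_MOVES):
        u, d = payoffs[(us, de)]
        us_ok = all(payoffs[(alt, de)][0] <= u for alt in US_MOVES)
        de_ok = all(payoffs[(us, alt)][1] <= d for alt in DE_MOVES)
        if us_ok and de_ok:
            equilibria.append((us, de))
    return equilibria


if __name__ == "__main__":
    print("Without commitment:", pure_nash(BASE))
    print("With commitment:   ", pure_nash(with_commitment(BASE, 2)))
```

In this toy table, the only pure equilibrium without commitment is mutual distrust (abandon, hedge), which both sides like less than (defend, align). Once abandonment is made costly enough for the US, the only equilibrium becomes (defend, align), and alignment is West Germany's own best response rather than something imposed on it.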

This is the variant of international strategy that I find so compelling: decisions that influence the board by changing what the optimal decisions are for your fellow players. By making defection costly for Germany, you keep them on a leash.

Paths and Strategic Irreversibility

We see that individually rational decisions do not add up to a globally rational outcome. But this is not true of strategy. Strategy perturbs the strategy of others. But since strategy must be composed of many small decisions, it falls apart when those in your orbit can afford to defect.

Strategy must then concern itself with its own momentum.

This is the core of meta-rational statecraft: making present sacrifices to narrow the field of future outcomes - not just for yourself, but for your allies and adversaries. A nation can shape the strategic landscape by reshaping what others believe it can and will do.

Ideological alliances, such as those between Australia, Canada, and the U.S., are partially underwritten by shared values. But ideology alone is not enough. Trust must be engineered, not merely felt. Real leverage comes from commitments that bind others to your path - not by compulsion, but by making defection irrational.