Game theory as a learning lens, not an arithmetic class

I do not want to learn game theory as a class about “calculating Nash equilibria.”

That part matters, obviously. But if all I keep is payoff matrices, mixed strategies, and Bayesian Nash equilibrium, the subject will not be very useful to me. The useful thing is stranger and more practical: game theory gives messy human behavior a structure.

Who has room to move? Who moves first? Who can commit? Who is only doing cheap talk? Who has private information? Whose incentives contradict their words? Why does something that looks stupid become rational under a different payoff structure?

That is what I want from it.

Game theory is not about winning. It is about seeing the game.

My current definition is:

Game theory is a language for seeing interdependent decisions.

Decision theory asks: given the environment, what should I choose?

Game theory adds one more layer: the environment contains other people choosing too, and they are trying to predict what I will choose.

Now the world has mirrors.

what I do
depends on what you do
and what you do
depends on what you think I will do

A lot of real problems get stuck because we model an interactive system as a solo optimization problem.

Work communication is a simple example. You think the question is “should I work harder to prove myself?” But the real situation may be a signaling game: every extra effort sends the signal “I can absorb infinite cost.” The other side’s best response is not to appreciate you. It is to keep pushing.

Agent products have the same shape. You think the question is “should the agent notify the user more?” But the real game is repeated: one false alarm changes how the user treats future notifications. A notification is not a single event. It trains the other player’s strategy.

Open source maintenance too. You think the question is “is this PR technically correct?” But reviewer attention, maintainer risk, contribution reputation, and issue clarity are also in the game. A patch is not an isolated object. It is an action that changes whether someone wants to spend time on you.

This is the value: game theory forces vague human mess into incentives, information, sequence, commitment, and repeated interaction.

Schelling first, because reality is dirty

If I had to start with one book, I would start with Thomas Schelling’s The Strategy of Conflict, not a textbook.

Not because Schelling is more formal. The opposite. His strength is that he studies the dirty cases: bargaining, deterrence, limited war, extortion, tacit bargaining. The Google Books description gets the shape right: these are situations with common interest and conflict at the same time.

That is much closer to ordinary life than pure competition.

Most real relationships are not zero-sum. Companies and employees, users and products, maintainers and contributors, friends, partners, countries: usually there is some shared interest and some conflict. The hard part is not defeating the other side. It is changing expectations inside mutual dependence.

The Schelling lenses I want to keep:

commitment     how do I make an action credible?
threat         what punishment will the other side believe?
focal point    where do people coordinate without talking?
brinkmanship   how does manufactured risk change behavior?
tacit bargain  how does unspoken coordination happen?

Commitment is the one that hits hardest.

Often the question is not “do I want to do X?” It is “does the other side believe I will do X?” Credible commitment often requires reducing your own freedom.

That is counterintuitive. More freedom feels like more power. But in game theory, someone who can always back down cannot make a credible threat. Someone who can always change their mind cannot make a credible promise. Someone who can always work more cannot make a credible boundary.

So some moves that look weaker are actually mechanisms for credibility.

not replying instantly
not accepting vague work
not absorbing scope creep for free
putting approval points into the flow
turning an agent's external actions into held drafts

Those are not just emotional boundaries. They change the other side’s best response.
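A minimal sketch of why removing my own option can raise my payoff. The two-step game, the payoff numbers, and the names RESPECT, PUSH_OUTCOMES, and solve are my own illustrative assumptions, not something taken from Schelling. A requester moves first (respect the boundary or push extra work), the worker replies, and the requester anticipates that reply.

```python
# Payoffs are (requester, worker). All numbers are illustrative assumptions.
RESPECT = (2, 2)                       # requester does not push
PUSH_OUTCOMES = {
    "absorb": (3, 1),                  # worker eats the extra cost
    "refuse": (0, 0),                  # work stalls, both lose
}

def solve(worker_options):
    """Backward induction: find the worker's best reply, then the requester's move."""
    # Step 1: if pushed, the worker picks the reply that is best for the worker.
    worker_reply = max(worker_options, key=lambda a: PUSH_OUTCOMES[a][1])
    push_payoff = PUSH_OUTCOMES[worker_reply]
    # Step 2: the requester pushes only if the anticipated result beats respecting.
    if push_payoff[0] > RESPECT[0]:
        return ("push", worker_reply, push_payoff)
    return ("respect", None, RESPECT)

flexible = solve(["absorb", "refuse"])   # worker can always give in
committed = solve(["refuse"])            # worker has removed "absorb"

print(flexible)
print(committed)
```

With both replies available, the result is push and absorb, and the worker gets 1. With absorb taken off the table, pushing no longer pays, so the requester respects the boundary and the worker gets 2. Fewer options, better outcome.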

Yale is the skeleton

Yale Game Theory with Ben Polak is a good spine.

The course moves from “put yourself into other people’s shoes,” iterative deletion, and the median voter theorem into Nash equilibrium, mixed strategies, evolutionary stability, sequential games, backward induction, repeated games, asymmetric information, signaling, and auctions.

That path goes from toy static games toward something closer to the real world:

normal form games
  -> dominated strategies
  -> Nash equilibrium
  -> mixed strategies
  -> evolutionary stability
  -> sequential games
  -> backward induction
  -> repeated games
  -> asymmetric information
  -> signaling / auctions

I would use it as a concept map, not a completionist video checklist.

The four pieces I care about first:

Dominated strategies: what choices should not stay on the table?
Nash equilibrium: where does the system get stuck after everyone best-responds?
Backward induction: reason from the future back to the present, especially around threats and commitments.
Repeated games: a move that is rational once can become stupid when it trains the future.
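The first two items are easy to make concrete. A small sketch on a prisoner's dilemma, assuming the usual textbook payoff numbers. The names ACTIONS, PAYOFFS, payoff, strictly_dominated, and pure_nash are mine, chosen for illustration.

```python
from itertools import product

ACTIONS = ["cooperate", "defect"]
# PAYOFFS[(row_action, col_action)] = (row_payoff, col_payoff). Illustrative numbers.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

def payoff(player, own, other):
    """Payoff to `player` (0 = row, 1 = column) for its own action vs the other's."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]

def strictly_dominated(player):
    """Actions that some other action beats against every opposing action."""
    return [
        a for a in ACTIONS
        if any(
            all(payoff(player, b, o) > payoff(player, a, o) for o in ACTIONS)
            for b in ACTIONS if b != a
        )
    ]

def pure_nash():
    """Profiles where neither player gains by deviating alone."""
    return [
        (r, c) for r, c in product(ACTIONS, ACTIONS)
        if all(payoff(0, r, c) >= payoff(0, d, c) for d in ACTIONS)
        and all(payoff(1, c, r) >= payoff(1, d, r) for d in ACTIONS)
    ]

print(strictly_dominated(0))  # what should not stay on the table for the row player
print(pure_nash())            # where the system gets stuck
```

Cooperate is strictly dominated for both players, and the only pure equilibrium is mutual defection, even though mutual cooperation pays both of them more.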

The fourth one matters most.

Many work and relationship problems are misread when treated as one-shot games. Winning one argument may be rational locally and destructive globally. Over-notifying once may feel safe but trains users to ignore the agent. Making a PR “complete” may feel helpful but trains reviewers to expect a large diff every time.

Repeated games turn short-term wins into long-term state training.

That is exactly why agent experience is not UI polish. It is repeated game design.
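A toy sketch of a notification policy training the user. Everything here is an assumption made for illustration: the trust score from 0 to 100, the gain and loss sizes, the event list, and the function name final_trust. It is not a model of any real product.

```python
# Each event is (agent_confidence, actually_important). Invented data.
EVENTS = [
    (0.90, True), (0.40, False), (0.30, False), (0.50, False),
    (0.95, True), (0.20, False), (0.45, False), (0.90, True),
]

def final_trust(notify_threshold, events, trust=50, gain=10, loss=30):
    """Replay the events under one policy and return the user's trust (0 to 100).

    The agent notifies when its confidence reaches notify_threshold.
    A useful notification adds `gain`; a false alarm removes `loss`.
    The asymmetry (loss > gain) is an assumption, not a measured fact.
    """
    for confidence, important in events:
        if confidence < notify_threshold:
            continue  # no notification, so nothing is trained
        if important:
            trust = min(100, trust + gain)
        else:
            trust = max(0, trust - loss)
    return trust

print(final_trust(0.0, EVENTS))  # notify on everything
print(final_trust(0.8, EVENTS))  # notify only when confident
```

Under these assumptions the notify-everything policy ends at trust 10 and the selective policy ends at 80. Same events, same agent, different trained user.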

Model Thinking keeps the model from becoming a religion

Model Thinking is not strictly a game theory course, but it may be the best foundation.

The course page makes the right claim: people who think with models outperform people who do not, and people who think with many models outperform people who use only one.

That guards against the classic learning failure: learn one model, then see it everywhere.

Cooperation becomes prisoner’s dilemma. Competition becomes zero-sum. Any message becomes signaling. Any stable behavior becomes equilibrium. At that point the model is not helping you think. It is doing cosplay.

Better: treat models as lenses, not conclusions.

segregation model      local preferences -> macro separation
threshold model        small changes -> tipping points
network model          position -> influence
game theory model      strategies reshape each other
agent-based model      simple rules -> complex behavior
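The threshold row is the easiest one to run. A minimal sketch in the style of Granovetter's threshold model, assuming each person joins once the number of people already participating reaches their personal threshold. The function name cascade and the population of 100 are my assumptions.

```python
def cascade(thresholds):
    """Return how many people end up participating.

    thresholds[i] is the number of existing participants person i needs
    to see before joining. Iterate until the count stops changing.
    """
    active = 0
    while True:
        joined = sum(1 for t in thresholds if t <= active)
        if joined == active:
            return active
        active = joined

# Thresholds 0, 1, 2, ..., 99: each joiner tips the next person.
everyone = list(range(100))

# Same crowd, except the person with threshold 1 now needs 2.
one_changed = [0, 2] + list(range(2, 100))

print(cascade(everyone))
print(cascade(one_changed))
```

The first crowd cascades to all 100. The second stops at 1. One threshold moved by one step, and the macro outcome flips, which is the reason this lens is different from asking about average preferences.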

The useful question is not “is this a prisoner’s dilemma?” It is:

what variable dominates this situation?
information asymmetry?
sequence?
network structure?
thresholds?
repeated interaction?
non-credible commitment?

A model is a compression of reality.

If the compression is wrong, rigorous reasoning only makes the wrong answer look cleaner.

Evolutionary games explain why strategies survive

Stanford’s Behavioral Evolution fits well after the basic game theory spine.

Sapolsky starts by killing a common intuition: animals do not act “for the good of the species.” Behavior that looks like group sacrifice can often be explained through gene replication, kinship, and reciprocal cooperation.

The three building blocks in the lecture:

individual selection    reproduce yourself, leave more gene copies
kin selection           help relatives because genes overlap
reciprocal altruism     cooperate with non-relatives under strict conditions

This is extremely useful for thinking about cooperation.

Cooperation is not “everyone is nice.” Cooperation needs structure.

Kin selection rests on shared genes. Reciprocal altruism needs repeated interaction, individual recognition, memory, and ways to punish free riders. In other words, cooperation is not a moral miracle. It is a set of conditions.

Those conditions translate directly to human systems and software systems:

will we meet again?
can I recognize you?
can I remember history?
can betrayal be punished?
can cooperation be rewarded?

If those conditions are missing, stable cooperation is a fantasy.
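"Will we meet again?" can be given a number. A sketch using the grim trigger strategy from the standard repeated prisoner's dilemma, which is a textbook model I am swapping in here, not something from the Sapolsky lecture. The payoff values T, R, P and the discount factor delta (how much the next round counts relative to this one) are illustrative assumptions.

```python
T, R, P = 5, 3, 1  # temptation, reward for mutual cooperation, punishment

def cooperate_forever(delta):
    """Discounted value of cooperating in every round: R + delta*R + ..."""
    return R / (1 - delta)

def defect_once(delta):
    """Grab T now, then get P forever because the partner never forgives."""
    return T + delta * P / (1 - delta)

def cooperation_holds(delta):
    return cooperate_forever(delta) >= defect_once(delta)

# Solving R/(1-delta) >= T + delta*P/(1-delta) for delta gives this cutoff.
threshold = (T - R) / (T - P)

print(threshold)
print(cooperation_holds(0.3), cooperation_holds(0.8))
```

With these numbers the cutoff is 0.5. If the future counts for less than that, defecting once beats cooperating forever, and no amount of good intent fixes it. Punishment only works when there is enough future for it to land in.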

Open source works this way. A one-off drive-by PR rarely produces deep trust. A long-running contributor gets more interpretive charity, not because maintainers are nicer, but because repeated interaction makes reputation a state variable.

Agent systems work the same way. If an agent has no reliable memory, no traceable behavior, and no continuous identity, humans cannot form stable cooperation with it. Every interaction is a stranger game. Trust resets to zero.

That is why provenance and memory are not accessories. They are conditions for cooperation to evolve.

Game Theory 101 is for formal details

Game Theory 101 is a useful short-lecture toolbox.

The episode you sent is about ex ante and interim dominance in Bayesian games. The point is not “two more terms.” The point is that the timing of information changes the object of strategy.

Before you know your type, you are choosing a full contingent plan: one action for every type the player might turn out to be. After you know your type, you are choosing as that specific type.

ex ante     before knowing which type you are
interim     after knowing your type

This distinction transfers well.

ex ante: what if I get tired? what if the reviewer does not reply? what if the agent gets stuck?
interim: I am tired now / the reviewer did not reply / the agent is stuck. What do I do?

Ex ante strategy is policy. Interim strategy is local action.
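A small sketch of the two objects, reduced to a one-player toy so the difference in what is being chosen stays visible. The types, the prior, the payoff numbers, and the function names are all my assumptions for illustration. A real Bayesian game would add other players and their types.

```python
from itertools import product

TYPES = {"fresh": 0.6, "tired": 0.4}   # assumed prior over which type I will be
ACTIONS = ["continue", "stop"]
PAYOFF = {                             # PAYOFF[type][action], illustrative
    "fresh": {"continue": 4, "stop": 1},
    "tired": {"continue": -2, "stop": 1},
}

def all_plans():
    """Ex ante strategy space: every mapping from type to action."""
    return [dict(zip(TYPES, choice))
            for choice in product(ACTIONS, repeat=len(TYPES))]

def ex_ante_value(plan):
    """Expected payoff of a whole plan, evaluated before the type is known."""
    return sum(prob * PAYOFF[t][plan[t]] for t, prob in TYPES.items())

def interim_best(known_type):
    """Interim choice: a single action, picked after the type is known."""
    return max(ACTIONS, key=lambda a: PAYOFF[known_type][a])

best_plan = max(all_plans(), key=ex_ante_value)

print(len(all_plans()))       # 2 actions over 2 types gives 4 plans
print(best_plan)
print(interim_best("tired"))
```

Ex ante, the thing being compared is one of four plans. In the interim, it is one of two actions. In this toy the best plan simply collects each type's best action, which is expected with a single decision maker and accurate payoffs. The point of writing the plan down is that the tired type does not have to rediscover "stop" under pressure.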

A system that only decides in the interim gets dragged around by current emotion and local information. A good policy decides in advance: when to continue, when to pause, when to escalate, when to stop.

This matters to me because I can easily fall into “interim infinite solving”: already tired, still recalculating whether to continue; already in scope creep, still recalculating whether to do a little more; agent already off track, still recalculating whether to give it one more prompt.

Better: write the policy first.

if a task has no new evidence after 90 minutes, stop
if a PR exceeds the original scope, split it
if an agent repeats the same error twice, stop prompting and inspect the toolchain
if learning turns into link collecting for 20 minutes, write an output

That is not self-discipline theatre. It is ex ante policy protecting the interim self.
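The four rules above fit in a few lines once they are treated as a policy and not as a mood. A sketch, with hypothetical field names and an assumed rule order (first match wins) that the original list does not specify.

```python
from dataclasses import dataclass

@dataclass
class Situation:
    # All field names are hypothetical, chosen to mirror the four rules.
    minutes_without_new_evidence: int = 0
    pr_exceeds_original_scope: bool = False
    same_error_count: int = 0
    minutes_collecting_links: int = 0

# Ordered (condition, action) pairs. The ordering is my assumption.
RULES = [
    (lambda s: s.minutes_without_new_evidence >= 90, "stop"),
    (lambda s: s.pr_exceeds_original_scope, "split the PR"),
    (lambda s: s.same_error_count >= 2, "stop prompting, inspect the toolchain"),
    (lambda s: s.minutes_collecting_links >= 20, "write an output"),
]

def decide(situation):
    """Return the pre-committed action, or 'continue' if no rule fires."""
    for condition, action in RULES:
        if condition(situation):
            return action
    return "continue"

print(decide(Situation(same_error_count=2)))
print(decide(Situation(minutes_without_new_evidence=45)))
```

The interim self only has to report the situation honestly. It does not get to renegotiate the thresholds.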

What each book is for

I would not read these in prestige order. I would assign them jobs.

The Strategy of Conflict

Schelling. Highest priority.

Read it for commitment, threat, focal points, and tacit bargaining. Best for situations that are both cooperative and conflictual.

That is most of reality.

Thinking Strategically / The Art of Strategy

Dixit and Nalebuff.

Good entry points. They translate game theory into business, politics, and everyday examples. The value is not depth, but recognition: “oh, this is also a game.”

Strategies and Games: Theory and Practice

Prajit Dutta.

A more serious textbook. Useful after the Yale course if I want to turn concepts into derivations.

Evolutionary Dynamics

Martin Nowak.

For evolutionary games, cooperation, and replicator dynamics. More mathematical, but the theme is strong: strategies in life are not solved once. They evolve through replication, variation, and selection.
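To see what "not solved once" means, here is a rough sketch of two-strategy replicator dynamics on the textbook hawk-dove game. The values of V and C, the Euler step size, and the function names are my assumptions. This is a numerical illustration, not Nowak's own code or notation.

```python
V, C = 2.0, 4.0  # value of the resource and cost of a fight (assumes C > V)

def hawk_advantage(x):
    """Average payoff of hawks minus doves when a fraction x plays hawk."""
    hawk = x * (V - C) / 2 + (1 - x) * V   # hawk meets hawk, hawk meets dove
    dove = (1 - x) * V / 2                 # dove gets 0 from hawks, shares with doves
    return hawk - dove

def evolve(x, steps=2000, dt=0.1):
    """Euler steps of dx/dt = x * (1 - x) * (hawk payoff - dove payoff)."""
    for _ in range(steps):
        x += dt * x * (1 - x) * hawk_advantage(x)
    return x

print(round(evolve(0.1), 4))  # start with few hawks
print(round(evolve(0.9), 4))  # start with many hawks
```

Both starting points drift to a hawk share of V / C, which is 0.5 here. Nobody computes that mix. The strategies that pay better than average simply replicate faster until the advantage disappears.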

Co-opetition

Brandenburger and Nalebuff.

Business strategy. The useful idea is that competition and cooperation are not opposites. Often you first enlarge the pie together, then fight over how to divide it.

Very relevant to platforms, ecosystems, and open source.

Good Strategy Bad Strategy

Richard Rumelt. Not game theory, but it belongs here anyway.

Game theory can tempt you into clever moves. Rumelt drags you back to strategy basics: diagnosis, guiding policy, coherent action. Strategy without diagnosis is just a wish list.

This book is an antidote.

What I actually want to steal

I do not want to keep formulas. I want to keep questions.

1. Is this solo optimization or interactive strategy?

If the other side changes behavior based on my action, I should stop using a solo optimizer brain.

2. Is this one-shot or repeated?

One-shot optimal moves can poison long-term cooperation, especially around trust, notifications, review, and collaboration.

3. Who has private information? Who is guessing whose type?

Many conflicts are not preference conflicts. They are type uncertainty. The other side does not know whether I am “occasionally firm” or “always willing to yield.” I do not know whether they are genuinely constrained or testing the boundary.

4. Which commitments are credible? Which are just words?

A promise that can be withdrawn for free is not a promise. A threat that costs nothing to abandon is not a threat.

5. How was the current equilibrium trained?

Bad states are often trained by repeated best responses.

If I absorb scope every time, the system learns to give me scope.

If an agent answers confidently while uncertain, the user learns not to trust it.

If a product pushes raw activity at humans, humans learn to ignore activity.

6. Can I change the payoff, or only the action?

This is the big one.

Ordinary effort picks better actions inside the old payoff structure. Real strategy changes the payoff: rules, sequence, defaults, visible information, approval points, exit costs.

Do not get better at playing a bad game.

Change the game.
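A sketch of question 6 on the smallest possible example. It assumes a prisoner's dilemma with textbook payoffs and a hypothetical rule that makes every defection cost something (a visible record, an approval point, an exit cost). The cost of 3 and the function names are my own illustrative choices.

```python
from itertools import product

ACTIONS = ["cooperate", "defect"]
# BASE[(row_action, col_action)] = (row_payoff, col_payoff). Illustrative numbers.
BASE = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

def with_defection_cost(payoffs, cost):
    """Same actions, new game: each player who defects pays `cost`."""
    return {
        (r, c): (pr - cost * (r == "defect"), pc - cost * (c == "defect"))
        for (r, c), (pr, pc) in payoffs.items()
    }

def pure_nash(payoffs):
    """Profiles where neither player gains by deviating alone."""
    return [
        (r, c) for r, c in product(ACTIONS, repeat=2)
        if all(payoffs[(r, c)][0] >= payoffs[(d, c)][0] for d in ACTIONS)
        and all(payoffs[(r, c)][1] >= payoffs[(r, d)][1] for d in ACTIONS)
    ]

print(pure_nash(BASE))
print(pure_nash(with_defection_cost(BASE, 3)))
```

In the old game the only equilibrium is mutual defection, no matter how hard anyone tries to play it well. With the cost in place, the only equilibrium is mutual cooperation. Nobody became nicer. The payoffs moved.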

A learning order that fits me

I would not grind through all of this from beginning to end. That turns into collection addiction.

I would go like this:

1. Read Schelling first: get the reality smell
2. Watch Yale lectures 1-12: build the normal form / Nash / mixed / evolution spine
3. Interleave Model Thinking: avoid one-model brain
4. Watch Stanford Behavioral Evolution: learn the structural conditions for cooperation
5. Use Game Theory 101 for specific gaps: Bayesian games, dominance, signaling
6. Then decide whether Dutta / Nowak formalism is worth the cost

For every concept, write one personal example.

Do not copy definitions.

credible threat -> how do I design stop-loss so “I will stop” is believable?
focal point -> how can a product default let users coordinate without talking?
repeated game -> how does an agent notification policy train user trust?
Bayesian type -> how does a reviewer infer I am a reliable contributor from PR history?
reciprocal altruism -> why do open source systems need identity, memory, and future interaction?

If a concept cannot migrate into my own problems, I have not learned it yet.
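As one example of that migration, the Bayesian type question can be sketched as plain Bayes' rule. The two types, the likelihoods in P_CLEAN, the even prior, and the function names are assumptions I made up for illustration. Real reviewers are not this tidy.

```python
from fractions import Fraction

# Assumed chance that a PR arrives clean, by contributor type.
P_CLEAN = {"reliable": Fraction(9, 10), "unreliable": Fraction(4, 10)}

def update(belief_reliable, pr_was_clean):
    """One step of Bayes' rule on the belief that the contributor is reliable."""
    like_r = P_CLEAN["reliable"] if pr_was_clean else 1 - P_CLEAN["reliable"]
    like_u = P_CLEAN["unreliable"] if pr_was_clean else 1 - P_CLEAN["unreliable"]
    evidence = belief_reliable * like_r + (1 - belief_reliable) * like_u
    return belief_reliable * like_r / evidence

def belief_after(history, prior=Fraction(1, 2)):
    """Fold a PR history (True = clean, False = messy) into a belief."""
    belief = prior
    for pr_was_clean in history:
        belief = update(belief, pr_was_clean)
    return belief

print(belief_after([True, True, True]))
print(belief_after([False]))
```

Three clean PRs move the belief from 1/2 to 729/793. A single messy PR from an unknown contributor drops it to 1/7. Under these assumed likelihoods, history is the signal, which is why a first PR carries so much weight.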

Resource index

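Courses and videos:

Yale, Game Theory with Ben Polak
Model Thinking
Stanford, Behavioral Evolution (Robert Sapolsky)
Game Theory 101, ex ante and interim dominance in Bayesian games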

Books:

Thomas Schelling, The Strategy of Conflict
Avinash K. Dixit, Susan Skeath, David H. Reiley, Games of Strategy
Benoît Chevalier-Roignant and Lenos Trigeorgis, Competitive Strategy: Options and Games (foreword by Avinash K. Dixit)
Martin Nowak, Evolutionary Dynamics: Exploring the Equations of Life
Avinash K. Dixit and Barry Nalebuff, Thinking Strategically
Adam Brandenburger and Barry Nalebuff, Co-opetition
Richard Rumelt, Good Strategy Bad Strategy
Prajit Dutta, Strategies and Games: Theory and Practice
Avinash K. Dixit and Barry Nalebuff, The Art of Strategy

Compressed version

Game theory is useful to me not because it makes me better at winning.

It helps me notice when I am treating an interactive system as a personal effort problem.

When that happens, more effort usually trains the bad equilibrium harder.

Real strategy is not trying harder inside the old game.

Real strategy is seeing the game, then changing it.