Algorithms · Game AI · JavaScript

    Implementing Minimax with Alpha-Beta Pruning for Game AI

    A deep dive into adversarial search algorithms — how I built an unbeatable Tic-Tac-Toe AI and scaled it to Super Tic-Tac-Toe's 9-board complexity.

    Abdul Rahman Azam
    March 18, 2026 · 10 min read

    Why Game AI?

    Game AI sits at the intersection of algorithms, strategy, and real-time decision-making. It's one of the purest tests of algorithmic thinking — your agent must evaluate millions of potential futures and choose optimally, all within milliseconds.

    The Minimax Foundation

    Minimax is a recursive algorithm that assumes both players play optimally. The maximizing player picks the move that leads to the highest score; the minimizing player picks the lowest. For standard 3×3 Tic-Tac-Toe, the game tree has roughly 255,168 possible games — small enough for exhaustive search.

    function minimax(board, depth, isMaximizing) {
        if (terminal(board)) return evaluate(board);

        if (isMaximizing) {
            let bestScore = -Infinity;
            for (const move of availableMoves(board)) {
                const score = minimax(applyMove(board, move), depth + 1, false);
                bestScore = Math.max(bestScore, score);
            }
            return bestScore;
        } else {
            let bestScore = Infinity;
            for (const move of availableMoves(board)) {
                const score = minimax(applyMove(board, move), depth + 1, true);
                bestScore = Math.min(bestScore, score);
            }
            return bestScore;
        }
    }

    The result: a provably unbeatable AI. Against optimal play, the best outcome for a human is a draw.
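    To make the skeleton above concrete, here is a complete, runnable 3×3 version. The helper names (winner, availableMoves, bestMove) and the +10/-10 scoring are my own choices for this sketch, not taken from the original write-up:

```javascript
// 3x3 board as an array of 9 cells: 'X', 'O', or null.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],   // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8],   // columns
  [0, 4, 8], [2, 4, 6],              // diagonals
];

function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] && board[a] === board[b] && board[a] === board[c]) return board[a];
  }
  return null;
}

function availableMoves(board) {
  return board.flatMap((cell, i) => (cell === null ? [i] : []));
}

// X maximizes (+10 for an X win), O minimizes (-10), draw is 0.
function minimax(board, isMaximizing) {
  const w = winner(board);
  if (w === 'X') return +10;
  if (w === 'O') return -10;
  const moves = availableMoves(board);
  if (moves.length === 0) return 0;          // board full: draw

  let best = isMaximizing ? -Infinity : +Infinity;
  for (const move of moves) {
    board[move] = isMaximizing ? 'X' : 'O';  // play the move in place
    const score = minimax(board, !isMaximizing);
    board[move] = null;                      // undo (cheaper than copying)
    best = isMaximizing ? Math.max(best, score) : Math.min(best, score);
  }
  return best;
}

function bestMove(board) {                   // X to move
  let best = -Infinity, choice = -1;
  for (const move of availableMoves(board)) {
    board[move] = 'X';
    const score = minimax(board, false);
    board[move] = null;
    if (score > best) { best = score; choice = move; }
  }
  return choice;
}
```

    Playing moves in place and undoing them avoids allocating a fresh board at every node, which matters once you are evaluating thousands of positions per move.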

    Scaling to Super Tic-Tac-Toe

    Super Tic-Tac-Toe amplifies the complexity dramatically. The board consists of 9 sub-boards arranged in a 3×3 grid, and the cell you play within a sub-board determines which sub-board your opponent must play in next. The branching factor jumps from ~5 (standard) to ~20-40 per move.
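    The move-routing rule can be expressed directly in code. This sketch assumes a representation where boards[b][c] is cell c of sub-board b and boardWon[b] marks sub-boards that are already decided; those names, and legalMoves itself, are illustrative rather than from the original project:

```javascript
// Returns every legal {board, cell} move given the previous move.
function legalMoves(boards, boardWon, lastMove) {
  // The cell index of the previous move selects the next sub-board.
  const target = lastMove === null ? null : lastMove.cell;
  // If the target sub-board is finished or full (or it's the first move),
  // the player may play in any undecided sub-board.
  const openBoards =
    target === null || boardWon[target] || boards[target].every(c => c !== null)
      ? boards.map((_, b) => b).filter(b => !boardWon[b])
      : [target];

  const moves = [];
  for (const b of openBoards) {
    boards[b].forEach((cell, c) => {
      if (cell === null) moves.push({ board: b, cell: c });
    });
  }
  return moves;
}
```

    The "play anywhere" fallback when the target sub-board is finished is what pushes the branching factor toward the upper end of the ~20-40 range.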

    Pure Minimax can't handle this — the search tree explodes. Alpha-Beta Pruning cuts it down by eliminating branches that can't possibly influence the final decision:

    • Alpha: the best score the maximizer is already guaranteed along the current path
    • Beta: the best (lowest) score the minimizer is already guaranteed
    • When alpha >= beta, the remaining branches cannot change the outcome, so prune them

    In practice, with good move ordering, Alpha-Beta Pruning reduces the effective branching factor from b to approximately √b, making Super Tic-Tac-Toe tractable with a depth limit of 6-8 plies.
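    Here is one way the cutoff slots into the search, shown on a plain 3×3 board so the sketch stays self-contained. The node counter is my addition, purely to make the savings observable:

```javascript
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];
const winner = b => {
  for (const [x, y, z] of LINES)
    if (b[x] && b[x] === b[y] && b[x] === b[z]) return b[x];
  return null;
};

let nodes = 0; // counts positions visited, to observe the pruning

function alphabeta(board, isMaximizing, alpha, beta) {
  nodes++;
  const w = winner(board);
  if (w === 'X') return +10;
  if (w === 'O') return -10;
  if (!board.includes(null)) return 0;      // draw

  for (let i = 0; i < 9; i++) {
    if (board[i] !== null) continue;
    board[i] = isMaximizing ? 'X' : 'O';
    const score = alphabeta(board, !isMaximizing, alpha, beta);
    board[i] = null;
    if (isMaximizing) alpha = Math.max(alpha, score);
    else beta = Math.min(beta, score);
    if (alpha >= beta) break;               // opponent would never allow this line
  }
  return isMaximizing ? alpha : beta;
}

// Root call with the widest possible window:
// alphabeta(board, true, -Infinity, Infinity)
```

    Plain Minimax visits every node of the roughly half-million-node 3×3 game tree; the same search with cutoffs visits a small fraction of that, and the gap widens as the branching factor grows.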

    Multi-Board Evaluation Heuristic

    Since we can't search to terminal states in Super Tic-Tac-Toe, I designed a weighted evaluation function:

    • Won sub-boards: +100 points (strategic center board: +150)
    • Two-in-a-row on main grid: +50 points
    • Sub-board control (two-in-a-row within a sub-board): +10 points
    • Forced moves (sending opponent to a won/drawn board): +30 points

    The heuristic captures both local tactics (winning sub-boards) and global strategy (controlling the main grid).
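    A sketch of how those weights might combine into a score, from the AI's perspective. The representation (mainGrid of sub-board owners, boards of cells) and the twoInARow helper are my assumptions; the per-move forced-move bonus is applied when scoring moves rather than positions, so it is omitted here:

```javascript
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Count lines where `player` holds two cells and the third is still open.
function twoInARow(cells, player) {
  let count = 0;
  for (const line of LINES) {
    const marks = line.map(i => cells[i]);
    if (marks.filter(m => m === player).length === 2 && marks.includes(null)) count++;
  }
  return count;
}

// mainGrid[b] is the owner of sub-board b ('X', 'O', or null);
// boards[b] holds the 9 cells of sub-board b.
function evaluate(mainGrid, boards, player, opponent) {
  let score = 0;
  mainGrid.forEach((owner, b) => {
    const weight = b === 4 ? 150 : 100;     // center sub-board is worth more
    if (owner === player) score += weight;
    else if (owner === opponent) score -= weight;
  });
  // Two-in-a-row on the main grid: global strategy.
  score += 50 * (twoInARow(mainGrid, player) - twoInARow(mainGrid, opponent));
  // Two-in-a-row inside still-live sub-boards: local tactics.
  boards.forEach((cells, b) => {
    if (mainGrid[b] === null)
      score += 10 * (twoInARow(cells, player) - twoInARow(cells, opponent));
  });
  return score;
}
```

    Scoring each term symmetrically (player minus opponent) keeps the function zero-sum, which is what Minimax assumes.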

    Performance Results

    Metric                  Standard Tic-Tac-Toe    Super Tic-Tac-Toe
    Search depth            Exhaustive              6-8 plies
    Nodes evaluated/move    ~9,000                  ~50,000
    Response time           <1ms                    ~200ms
    Win rate vs. random     100%                    97.3%

    What I Learned

    The biggest lesson: evaluation functions are where the real intelligence lives. The search algorithm is mechanical — it explores and prunes. But the heuristic that scores non-terminal positions is where domain knowledge, creativity, and engineering judgment converge.

    Source code: GitHub - Super Tic-Tac-Toe | GitHub - AI Tic-Tac-Toe


    Abdul Rahman Azam

    Full Stack AI Engineer — building AI-powered products from model to deployment. Open to AI/ML opportunities.