Peg Solitaire RL
Output
Training Data + Round 1
I ran a depth-first search (DFS) on every puzzle to extract solutions for training:
(kits) z5362216@k201:~/minesweeper $ python3 train_peg_katana.py \
> --boards english european 6x6 triangle5 triangle6 star \
> --diamond41-solutions diamond41_solutions.txt \
> --epochs 400 \
> --batch-size 128 \
> --lr 0.001 \
> --channels 128 \
> --res-blocks 8 \
> --attention-layers 3 \
> --drop-path 0.2 \
> --label-smoothing 0.1 \
> --mixup-alpha 0.2 \
> --ema-decay 0.9995 \
> --temperature 0.5 \
> --eval-games 200 \
> --solutions-per-start 5 \
> --out models
============================================================
Universal Peg Solitaire Training
============================================================
Started: 2026-01-15 13:27:09
GPU: NVIDIA H200
============================================================
Step 1: Collecting Training Data
============================================================
Boards: ['english', 'european', '6x6', 'triangle5', 'triangle6', 'star']
Solutions per start: 5
Processing english...
Solving english from (3, 3) (32 pegs)...
Solved in 0.37s, 31 moves
Solving english from (0, 3) (32 pegs)...
Solved in 0.38s, 31 moves
Solving english from (2, 0) (32 pegs)...
Solved in 0.15s, 31 moves
Solving english from (1, 3) (32 pegs)...
Solved in 0.47s, 31 moves
Total: 2480 training samples from 4 starting position(s)
Processing european...
Solving european from (1, 3) (36 pegs)...
Solved in 66.04s, 35 moves
Total: 700 training samples from 1 starting position(s)
Processing 6x6...
Solving 6x6 from (2, 2) (35 pegs)...
Solved in 4.91s, 34 moves
Solving 6x6 from (0, 0) (35 pegs)...
Solved in 10.63s, 34 moves
Solving 6x6 from (0, 2) (35 pegs)...
Solved in 0.02s, 34 moves
Solving 6x6 from (1, 1) (35 pegs)...
Solved in 54.10s, 34 moves
Total: 2720 training samples from 4 starting position(s)
Processing triangle5...
Solving triangle5 from (0,) (14 pegs)...
Solved in 0.00s, 13 moves
Solving triangle5 from (3,) (14 pegs)...
Solved in 0.00s, 13 moves
Solving triangle5 from (10,) (14 pegs)...
Solved in 0.00s, 13 moves
Solving triangle5 from (12,) (14 pegs)...
Solved in 0.00s, 13 moves
Total: 260 training samples from 4 starting position(s)
Processing triangle6...
Solving triangle6 from (0,) (20 pegs)...
Solved in 0.00s, 19 moves
Solving triangle6 from (3,) (20 pegs)...
Solved in 0.02s, 19 moves
Solving triangle6 from (6,) (20 pegs)...
Solved in 0.00s, 19 moves
Solving triangle6 from (15,) (20 pegs)...
Solved in 0.02s, 19 moves
Total: 380 training samples from 4 starting position(s)
Processing star...
Solving star from (0,) (9 pegs)...
Solved in 0.00s, 8 moves
Solving star from (5,) (9 pegs)...
Solved in 0.00s, 8 moves
Total: 80 training samples from 2 starting position(s)
Loading diamond41 solutions from diamond41_solutions.txt...
Loaded 248 solutions for diamond41
Added 1560 diamond41 samples
Saved to models/training_data.json
Total samples: 8180
6x6: 2720
diamond41: 1560
english: 2480
european: 700
star: 80
triangle5: 260
triangle6: 380
============================================================
Step 2: Training
============================================================
Parameters: 2,653,620
Training for 400 epochs...
Epoch 1/400 | Loss: 4.3549 | Acc: 18.5% | LR: 0.000994
Epoch 10/400 | Loss: 3.1665 | Acc: 35.0% | LR: 0.000501
Epoch 20/400 | Loss: 2.4748 | Acc: 57.4% | LR: 0.001000
Epoch 30/400 | Loss: 2.7241 | Acc: 49.0% | LR: 0.000854
Epoch 40/400 | Loss: 2.6360 | Acc: 52.6% | LR: 0.000501
Epoch 50/400 | Loss: 2.5210 | Acc: 56.0% | LR: 0.000147
Epoch 60/400 | Loss: 2.4930 | Acc: 57.6% | LR: 0.001000
Epoch 70/400 | Loss: 2.3036 | Acc: 61.8% | LR: 0.000962
Epoch 80/400 | Loss: 2.6344 | Acc: 53.1% | LR: 0.000854
Epoch 90/400 | Loss: 2.5161 | Acc: 55.6% | LR: 0.000692
Epoch 100/400 | Loss: 2.1946 | Acc: 64.6% | LR: 0.000501
Epoch 110/400 | Loss: 2.1105 | Acc: 65.4% | LR: 0.000309
Epoch 120/400 | Loss: 2.4915 | Acc: 55.9% | LR: 0.000147
Epoch 130/400 | Loss: 2.2627 | Acc: 62.9% | LR: 0.000039
Epoch 140/400 | Loss: 2.2844 | Acc: 60.1% | LR: 0.001000
Epoch 150/400 | Loss: 2.4334 | Acc: 56.1% | LR: 0.000990
Epoch 160/400 | Loss: 2.4256 | Acc: 58.2% | LR: 0.000962
Epoch 170/400 | Loss: 2.2687 | Acc: 62.9% | LR: 0.000916
Epoch 180/400 | Loss: 2.3684 | Acc: 60.5% | LR: 0.000854
Epoch 190/400 | Loss: 2.2158 | Acc: 61.9% | LR: 0.000778
Epoch 200/400 | Loss: 2.4369 | Acc: 58.2% | LR: 0.000692
Epoch 210/400 | Loss: 2.2957 | Acc: 58.6% | LR: 0.000598
Epoch 220/400 | Loss: 2.1747 | Acc: 65.2% | LR: 0.000501
Epoch 230/400 | Loss: 2.1394 | Acc: 66.1% | LR: 0.000403
Epoch 240/400 | Loss: 2.2170 | Acc: 62.1% | LR: 0.000309
Epoch 250/400 | Loss: 2.5308 | Acc: 54.3% | LR: 0.000223
Epoch 260/400 | Loss: 1.9152 | Acc: 69.7% | LR: 0.000147
Epoch 270/400 | Loss: 2.3018 | Acc: 62.2% | LR: 0.000085
Epoch 280/400 | Loss: 2.1810 | Acc: 62.9% | LR: 0.000039
Epoch 290/400 | Loss: 2.3095 | Acc: 60.3% | LR: 0.000011
Epoch 300/400 | Loss: 2.2366 | Acc: 61.0% | LR: 0.001000
Epoch 310/400 | Loss: 2.5112 | Acc: 55.8% | LR: 0.000998
Epoch 320/400 | Loss: 2.2704 | Acc: 62.5% | LR: 0.000990
Epoch 330/400 | Loss: 2.3118 | Acc: 60.1% | LR: 0.000978
Epoch 340/400 | Loss: 2.2709 | Acc: 61.0% | LR: 0.000962
Epoch 350/400 | Loss: 2.3365 | Acc: 61.4% | LR: 0.000941
Epoch 360/400 | Loss: 2.2527 | Acc: 62.7% | LR: 0.000916
Epoch 370/400 | Loss: 2.2394 | Acc: 57.2% | LR: 0.000887
Epoch 380/400 | Loss: 2.3419 | Acc: 59.7% | LR: 0.000854
Epoch 390/400 | Loss: 2.2350 | Acc: 63.5% | LR: 0.000817
Epoch 400/400 | Loss: 2.3246 | Acc: 55.9% | LR: 0.000778
Training completed in 7.0 minutes
Best accuracy: 74.6%
============================================================
Step 3: Evaluation
============================================================
Temperature = 0.5 (200 games each):
6x6 : 18.5%
english : 0.0%
european : 14.5%
star : 11.5%
triangle5 : 0.0%
triangle6 : 0.0%
Average : 7.4%
Temperature = 0 (greedy):
6x6 : 0.0%
english : 0.0%
european : 0.0%
star : 17.5%
triangle5 : 0.0%
triangle6 : 0.0%
Average : 2.9%
============================================================
Step 4: Saving
============================================================
Saved PyTorch model to models/peg_universal.pth
Saved ONNX model to models/peg_universal.onnx (10.19 MB)
============================================================
Done!
============================================================
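Step 1 of the log relies on depth-first search to find solutions. The idea can be sketched as follows, assuming a bitmask board state and a triangular layout with holes numbered row by row. The names `triangle_moves`, `solve` and `solve_triangle` are hypothetical, and the hole numbering may differ from the one `train_peg_katana.py` uses. This is an illustration, not the script's actual solver.

```python
from typing import List, Optional, Set, Tuple

Move = Tuple[int, int, int]  # (from, over, to) hole indices


def triangle_moves(rows: int) -> List[Move]:
    """List every possible jump on a triangular board (assumed row-by-row numbering)."""
    idx = {}
    for r in range(rows):
        for c in range(r + 1):
            idx[(r, c)] = len(idx)
    moves = []
    # The six jump directions on a triangular grid in (row, col) coordinates.
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1)]
    for (r, c), src in idx.items():
        for dr, dc in directions:
            over = (r + dr, c + dc)
            dst = (r + 2 * dr, c + 2 * dc)
            if over in idx and dst in idx:
                moves.append((src, idx[over], idx[dst]))
    return moves


def solve(state: int, moves: List[Move], dead: Set[int]) -> Optional[List[Move]]:
    """DFS over bitmask states; returns moves that leave one peg, or None."""
    if bin(state).count("1") == 1:
        return []
    if state in dead:  # already proven unsolvable, prune
        return None
    for src, over, dst in moves:
        has_src = (state >> src) & 1
        has_over = (state >> over) & 1
        dst_empty = not ((state >> dst) & 1)
        if has_src and has_over and dst_empty:
            # Jump: clear src and over, fill dst.
            nxt = state ^ (1 << src) ^ (1 << over) ^ (1 << dst)
            rest = solve(nxt, moves, dead)
            if rest is not None:
                return [(src, over, dst)] + rest
    dead.add(state)
    return None


def solve_triangle(rows: int, empty: int) -> Optional[List[Move]]:
    """Solve a full triangular board with a single empty hole at index `empty`."""
    holes = rows * (rows + 1) // 2
    start = ((1 << holes) - 1) & ~(1 << empty)
    return solve(start, triangle_moves(rows), set())
```

For example, `solve_triangle(5, 0)` searches the 15-hole triangle with the apex empty. Every move removes exactly one peg, so any solution from 14 pegs has 13 moves, which matches the move counts in the log. Caching dead states keeps the search fast on small boards. The much longer solve times logged for the european and 6x6 boards suggest larger boards need more pruning than this sketch provides.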
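This first run-through was underwhelming, though: the best accuracy reached 74.6%, but the model solved only 7.4% of games on average at temperature 0.5 and 2.9% with greedy play.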
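The evaluation step reports results at temperature 0.5 and at temperature 0 (greedy). As a rough illustration of the difference, here is a minimal sketch of temperature-scaled move selection. It assumes the policy produces one logit per move and that illegal moves are filtered out before sampling. The function names are hypothetical and the script's own evaluation code is not shown in the log.

```python
import math
import random
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_move(logits: Sequence[float], legal: Sequence[int],
              temperature: float, rng: random.Random) -> int:
    """Choose a move index among `legal`; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(legal, key=lambda i: logits[i])
    probs = softmax_with_temperature([logits[i] for i in legal], temperature)
    return rng.choices(list(legal), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 1.0, 0.5, -1.0]
    legal = [1, 2, 3]  # move 0 is illegal in this made-up position
    print(pick_move(logits, legal, 0, rng))    # greedy: always the best legal move
    print(pick_move(logits, legal, 0.5, rng))  # sampled: usually, not always, the best
```

Lower temperatures concentrate probability on the highest-scoring move, while greedy play is fully deterministic given the same position. In peg solitaire a single wrong move can make a position unsolvable, so a policy that is right about 60 to 75% of the time per move can still fail most full games.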
Round 2
So we ran it all again, this time with the data flag set on the Python script: