ConnectZero-Nakalipithecus

An AlphaZero-based reinforcement learning agent for the game Connect 4.

Architecture: ResNet (5 Residual Blocks) + Dual Head (Policy & Value).
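The architecture line above can be illustrated with a minimal PyTorch sketch of a residual tower with separate policy and value heads. This is not the author's code: the input encoding (`in_planes=3`), the filter count (`channels=64`), the head sizes, and the class names `ResidualBlock` and `DualHeadNet` are all assumptions chosen for illustration, so the parameter count of this sketch will not necessarily match the figure reported in the training log.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

ROWS, COLS = 6, 7  # standard Connect 4 board


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection (hypothetical layout)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # add the skip connection, then activate


class DualHeadNet(nn.Module):
    """Residual tower followed by a policy head and a value head."""

    def __init__(self, in_planes: int = 3, channels: int = 64, num_blocks: int = 5):
        super().__init__()
        # in_planes and channels are assumed values, not taken from the model card.
        self.stem = nn.Sequential(
            nn.Conv2d(in_planes, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
        )
        self.blocks = nn.Sequential(
            *[ResidualBlock(channels) for _ in range(num_blocks)]
        )
        # Policy head: one logit per column.
        self.policy_conv = nn.Sequential(
            nn.Conv2d(channels, 2, 1, bias=False), nn.BatchNorm2d(2), nn.ReLU()
        )
        self.policy_fc = nn.Linear(2 * ROWS * COLS, COLS)
        # Value head: scalar in [-1, 1] via tanh.
        self.value_conv = nn.Sequential(
            nn.Conv2d(channels, 1, 1, bias=False), nn.BatchNorm2d(1), nn.ReLU()
        )
        self.value_fc = nn.Sequential(
            nn.Linear(ROWS * COLS, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh()
        )

    def forward(self, x: torch.Tensor):
        x = self.blocks(self.stem(x))
        policy_logits = self.policy_fc(self.policy_conv(x).flatten(1))
        value = self.value_fc(self.value_conv(x).flatten(1))
        return F.log_softmax(policy_logits, dim=1), value


if __name__ == "__main__":
    model = DualHeadNet().eval()
    boards = torch.zeros(4, 3, ROWS, COLS)  # batch of 4 empty boards
    with torch.no_grad():
        log_policy, value = model(boards)
    print(log_policy.shape, value.shape)
```

The policy head returns log-probabilities over the seven columns and the value head returns one scalar per board; masking of full columns is left out of this sketch and would have to be applied by the caller.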

Framework: PyTorch.

Training Platform: Kaggle T4 GPU.

Author: Chakrabhuana Vishnu Deva.

Training results

Total Parameter of the Model: 1,497,742
Starting Training for 5 Iterations...

--- Iteration 1 ---
Self-Playing 100 games...
Data Collected: 1359 samples
Avg Loss: 2.9339

--- Iteration 2 ---
Self-Playing 100 games...
Data Collected: 1644 samples
Avg Loss: 2.6747

--- Iteration 3 ---
Self-Playing 100 games...
Data Collected: 1739 samples
Avg Loss: 2.4139

--- Iteration 4 ---
Self-Playing 100 games...
Data Collected: 1678 samples
Avg Loss: 2.3377

--- Iteration 5 ---
Self-Playing 100 games...
Data Collected: 2370 samples
Avg Loss: 2.1712
Model Saved!
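To make the trend in the log easier to read, the following sketch parses the per-iteration lines and summarizes them. The sample counts and loss values are copied from the log above; the helper name `parse_log` and the regular expressions are hypothetical and assume the log keeps exactly this wording.

```python
import re

# Per-iteration lines copied from the training log above.
LOG = """\
--- Iteration 1 ---
Data Collected: 1359 samples
Avg Loss: 2.9339
--- Iteration 2 ---
Data Collected: 1644 samples
Avg Loss: 2.6747
--- Iteration 3 ---
Data Collected: 1739 samples
Avg Loss: 2.4139
--- Iteration 4 ---
Data Collected: 1678 samples
Avg Loss: 2.3377
--- Iteration 5 ---
Data Collected: 2370 samples
Avg Loss: 2.1712
"""

GAMES_PER_ITERATION = 100  # from "Self-Playing 100 games..."


def parse_log(text):
    """Return (sample counts, average losses), one entry per iteration."""
    samples = [int(n) for n in re.findall(r"Data Collected: (\d+) samples", text)]
    losses = [float(x) for x in re.findall(r"Avg Loss: ([\d.]+)", text)]
    return samples, losses


samples, losses = parse_log(LOG)
total_samples = sum(samples)
samples_per_game = total_samples / (GAMES_PER_ITERATION * len(samples))
drop = losses[0] - losses[-1]
relative_drop = drop / losses[0]

print(f"Total samples: {total_samples}")
print(f"Average samples per game: {samples_per_game:.2f}")
print(f"Loss drop: {drop:.4f} ({relative_drop:.1%})")
```

Over the five iterations the run collected 8,790 samples in total, and the average loss fell from 2.9339 to 2.1712, a drop of about 26%. The log does not say how the loss is composed or whether samples include augmented positions, so these figures describe only what is printed.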