---
language: en
tags:
- code
- algorithms
- competitive-programming
- multi-label-classification
- codebert
datasets:
- xCodeEval
metrics:
- f1
- precision
- recall
library_name: transformers
pipeline_tag: text-classification
---

# CodeBERT Algorithm Tagger

A fine-tuned CodeBERT model for multi-label classification of algorithmic problems from competitive programming platforms such as Codeforces.

## Model Description

This model predicts algorithmic tags/categories for competitive programming problems based on their problem descriptions and solution code.

**Supported Tags:**
- math
- graphs
- strings
- number theory
- trees
- geometry
- games
- probabilities

## Training Data

- **Dataset**: xCodeEval (Codeforces problems)
- **Training examples**: 2,147 problems (filtered to the focus tags)
- **Test examples**: 531 problems
- **Source**: Problems and solutions from the Codeforces platform

## Model Architecture

- **Input**: Concatenated problem description and solution code
- **Encoder**: CodeBERT (RoBERTa-based architecture)
- **Output**: 8-dimensional binary classification (one per tag)

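The architecture above can be sketched as follows. This is a minimal illustration, not the released model: it uses a tiny randomly initialised RoBERTa config so it runs without downloading weights, and the `[CLS]` pooling plus linear head are assumptions about how the 8-way multi-label output is produced.

```python
# Sketch: RoBERTa-style encoder + 8-way sigmoid head for multi-label tags.
# A small random config stands in for the pretrained CodeBERT encoder.
import torch
from transformers import RobertaConfig, RobertaModel

TAGS = ["math", "graphs", "strings", "number theory",
        "trees", "geometry", "games", "probabilities"]

config = RobertaConfig(vocab_size=1000, hidden_size=64,
                       num_hidden_layers=2, num_attention_heads=2,
                       intermediate_size=128)
encoder = RobertaModel(config)                 # stand-in for CodeBERT
classifier = torch.nn.Linear(config.hidden_size, len(TAGS))

input_ids = torch.randint(0, 1000, (1, 16))    # toy tokenized input
hidden = encoder(input_ids).last_hidden_state[:, 0]  # [CLS] representation
probs = torch.sigmoid(classifier(hidden))      # one probability per tag
print(probs.shape)  # torch.Size([1, 8])
```

Each output dimension is an independent probability, so a single problem can receive several tags at once (e.g. both `math` and `number theory`).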
## Usage

### Installation

```bash
pip install transformers torch
```