FinAI-BERT-IslamicBanks: A Domain-Specific BERT Model for AI Discourse in Islamic Finance

FinAI-BERT-IslamicBanks is a transformer model fine-tuned from bert-base-uncased to detect AI-related disclosures in Islamic banking. It targets financial NLP tasks and was trained on manually annotated sentences from 855 annual reports issued by 106 Islamic banks across 25 countries between 2015 and 2024.

Intended Use

The model is designed to support:

- Academic research on AI adoption in Islamic finance
- Regulatory screening of AI-related disclosures
- Technology and ESG audits of Islamic financial institutions
- Index construction for benchmarking AI readiness in Islamic banking

Performance

Accuracy: 98.67%
F1 Score: 0.9868
ROC AUC: 0.9999
Brier Score: 0.0027

The near-perfect ROC AUC shows that the model separates AI-related from non-AI sentences almost completely, the low Brier score indicates well-calibrated probabilities, and performance holds up across diverse report formats.
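
To reproduce these four metrics on your own held-out sentences, a minimal pure-Python sketch follows. The function name `binary_metrics`, the 0.5 decision threshold, and the input format are assumptions for illustration; this is not the evaluation code used for the figures above.

```python
def binary_metrics(y_true, p_pos, threshold=0.5):
    """Compute accuracy, F1, ROC AUC and Brier score for a binary task.

    y_true: list of 0/1 labels (1 = AI, 0 = Non-AI)
    p_pos:  list of predicted probabilities for class 1
    Both classes must be present in y_true, otherwise ROC AUC is undefined.
    """
    n = len(y_true)
    y_pred = [1 if p >= threshold else 0 for p in p_pos]

    tp = sum(1 for t, y in zip(y_true, y_pred) if t == 1 and y == 1)
    fp = sum(1 for t, y in zip(y_true, y_pred) if t == 0 and y == 1)
    fn = sum(1 for t, y in zip(y_true, y_pred) if t == 1 and y == 0)

    accuracy = sum(1 for t, y in zip(y_true, y_pred) if t == y) / n
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0

    # ROC AUC as the probability that a random positive outranks a random
    # negative, counting ties as one half (quadratic, fine for small sets).
    pos = [p for t, p in zip(y_true, p_pos) if t == 1]
    neg = [p for t, p in zip(y_true, p_pos) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    roc_auc = wins / (len(pos) * len(neg))

    # Brier score: mean squared error between probability and label.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, p_pos)) / n

    return {"accuracy": accuracy, "f1": f1, "roc_auc": roc_auc, "brier": brier}
```

A lower Brier score means the predicted probabilities sit closer to the true labels, which is why it is read as a calibration measure alongside the ranking-based ROC AUC.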

Training Data

Total examples: 2,632 sentence-level instances
- 1,316 AI-related (filtered by seed words, then manually verified)
- 1,316 Non-AI (randomly sampled)
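
The balanced construction described above can be sketched roughly as follows. The seed-word list, the function name `build_balanced_set`, and the random seed are hypothetical; the actual seed words are not listed in this card, and the manual verification step is not automated here.

```python
import random
import re

# Hypothetical seed words for illustration; the real list may differ.
SEED_WORDS = ["artificial intelligence", "machine learning", "chatbot"]
SEED_PATTERN = re.compile(
    "|".join(re.escape(word) for word in SEED_WORDS), re.IGNORECASE
)


def build_balanced_set(sentences, seed=42):
    """Return (sentence, label) pairs with equal AI and Non-AI counts.

    Sentences matching a seed word become AI candidates (label 1). In the
    workflow described in this card they would be manually verified before
    being kept; that human step is outside this sketch.
    """
    candidates = [s for s in sentences if SEED_PATTERN.search(s)]
    others = [s for s in sentences if not SEED_PATTERN.search(s)]

    # Randomly downsample the non-matching sentences to the same size.
    rng = random.Random(seed)
    negatives = rng.sample(others, k=min(len(candidates), len(others)))

    return [(s, 1) for s in candidates] + [(s, 0) for s in negatives]
```

Balancing the two classes keeps accuracy interpretable: with 1,316 examples per class, a trivial majority-class predictor would score only 50%.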

Training Setup

Base model: bert-base-uncased
Tokenizer: WordPiece
Environment: Google Compute Engine (GPU)
Batch size: 8
Epochs: 3
Max sequence length: 128
Loss function: Cross-entropy
Optimizer: AdamW
Precision: FP16 (mixed-precision enabled)
Framework: Hugging Face Transformers (Trainer API)
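
As a rough illustration of how the settings above map onto the Trainer API, consider the sketch below. Only the values listed in this section come from the card; the learning rate and all other options are left at library defaults, which is an assumption, and `build_trainer` and `output_dir` are hypothetical names.

```python
# Hyperparameters stated in this card, keyed by TrainingArguments names.
HYPERPARAMS = {
    "per_device_train_batch_size": 8,
    "num_train_epochs": 3,
    "fp16": True,            # mixed precision; needs a CUDA-capable GPU
    "optim": "adamw_torch",  # AdamW
}
MAX_LENGTH = 128             # applied when tokenizing, not by the Trainer
BASE_MODEL = "bert-base-uncased"


def build_trainer(train_dataset, output_dir="finai-bert-out"):
    """Assemble a Trainer from the settings above.

    train_dataset is assumed to be already tokenized and padded to
    MAX_LENGTH, with an integer `labels` column (1 = AI, 0 = Non-AI).
    """
    # Imported lazily so the constants can be inspected without transformers.
    from transformers import (
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    # With num_labels=2 and integer labels, the classification head
    # is trained with cross-entropy loss.
    model = AutoModelForSequenceClassification.from_pretrained(
        BASE_MODEL, num_labels=2
    )
    args = TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
    return Trainer(model=model, args=args, train_dataset=train_dataset)
```

Calling `build_trainer(dataset).train()` would then run the three epochs; evaluation datasets and metric callbacks are omitted here for brevity.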

Files

config.json, tokenizer.json, vocab.txt, model.safetensors: Model files
tokenizer_config.json, special_tokens_map.json: Tokenizer configuration

Citation

If you use this model in your research or applications, please cite our paper:

Zafar, M. B. (2025). FinAI-BERT-IslamicBanks: A Domain-Specific Model for Detecting AI Disclosures in Islamic Banking. SSRN. https://ssrn.com/abstract=5337214

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "bilalzafar/FinAI-BERT-IslamicBanks"

tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "The bank deployed machine learning models to automate credit risk assessment."
inputs = tok(text, return_tensors="pt", truncation=True, padding=True, max_length=128)

with torch.no_grad():
    probs = torch.softmax(mdl(**inputs).logits, dim=-1).squeeze()

pred = int(torch.argmax(probs).item())          # 1 = AI, 0 = Non-AI
label = "AI" if pred == 1 else "Non-AI"
score = float(probs[pred].item())

print(f"Classification: {label} | Score: {score:.5f}")
# Output: Classification: AI | Score: 0.99961
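
For the index-construction use case, sentence-level predictions have to be aggregated per report. The sketch below batches the inference shown above and computes the share of sentences classified as AI-related. The share measure itself, the function names, and the 0.5 threshold are assumptions for illustration, not a method defined by this card or the paper.

```python
def ai_disclosure_share(sentences, classify, threshold=0.5):
    """Fraction of sentences whose AI probability meets the threshold.

    classify: callable mapping a list of sentences to a list of
    probabilities for class 1 (AI).
    """
    if not sentences:
        return 0.0
    probs = classify(sentences)
    return sum(1 for p in probs if p >= threshold) / len(sentences)


def make_classifier(model_id="bilalzafar/FinAI-BERT-IslamicBanks", batch_size=8):
    """Build a batched classify() function around the published model."""
    # Imported lazily so ai_disclosure_share can be used with any scorer.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    mdl = AutoModelForSequenceClassification.from_pretrained(model_id)
    mdl.eval()

    def classify(sentences):
        scores = []
        for start in range(0, len(sentences), batch_size):
            batch = sentences[start:start + batch_size]
            inputs = tok(batch, return_tensors="pt", truncation=True,
                         padding=True, max_length=128)
            with torch.no_grad():
                probs = torch.softmax(mdl(**inputs).logits, dim=-1)
            scores.extend(probs[:, 1].tolist())  # column 1 = AI
        return scores

    return classify
```

With a list of sentences from one annual report, `ai_disclosure_share(report_sentences, make_classifier())` gives a single number per report that can be compared across banks or years.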

Model size: 0.1B params (Safetensors, tensor type F32)
