When Context Flips, Safety Breaks: Diagnosing Brittle Safety in Aligned Language Models
Paper • 2605.27851
AI Safety & AI Security
XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs