- Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
  Paper • 2502.04322 • Published • 3
- narutatsuri/evaluation-actionable
  Text Classification • 8B • Updated • 7
- narutatsuri/evaluation-informative
  Text Classification • 8B • Updated • 6
- narutatsuri/response_selection_model-actionable
  Text Classification • 8B • Updated
Narutatsu Ri
narutatsuri
AI & ML interests
None yet
Recent Activity
Updated a collection 3 days ago: Speak Easy