arxiv:2511.14366
Zihan Ma
MichaelErchi
AI & ML interests
None yet
Recent Activity
upvoted a paper
about 22 hours ago
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions
new activity
2 months ago
opencompass/CodeForce_SAGA: Update README.md
authored a paper
3 months ago
How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity