arxiv:2601.11868
Orfeas Menis Mastromichalakis
menorf
AI & ML interests
None yet
Recent Activity
authored a paper 15 days ago
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
upvoted a paper 16 days ago
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
new activity 29 days ago
harborframework/parity-experiments:aime