Table-R1

university

https://nlp.cs.yale.edu/




yilunzhao

authored 14 papers 1 day ago

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

Paper • 2507.13300 • Published Jul 17, 2025 • 20

PuzzlePlex: Benchmarking Foundation Models on Reasoning and Planning with Puzzles

Paper • 2510.06475 • Published Oct 7, 2025 • 2

MSRS: Evaluating Multi-Source Retrieval-Augmented Generation

Paper • 2508.20867 • Published Aug 28, 2025

FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering

Paper • 2510.06426 • Published Oct 7, 2025 • 3

SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing

Paper • 2506.04583 • Published Jun 5, 2025

FinDVer: Explainable Claim Verification over Long and Hybrid-Content Financial Documents

Paper • 2411.05764 • Published Nov 8, 2024

MRMR: A Realistic and Expert-Level Multidisciplinary Benchmark for Reasoning-Intensive Multimodal Retrieval

Paper • 2510.09510 • Published Oct 10, 2025 • 8

FinTrust: A Comprehensive Benchmark of Trustworthiness Evaluation in Finance Domain

Paper • 2510.15232 • Published Oct 17, 2025 • 6

Rethinking Composed Image Retrieval Evaluation: A Fine-Grained Benchmark from Image Editing

Paper • 2601.16125 • Published 16 days ago • 13

SAGE: Benchmarking and Improving Retrieval for Deep Research Agents

Paper • 2602.05975 • Published 2 days ago • 10

yilunzhao

authored a paper 22 days ago

Patient-Similarity Cohort Reasoning in Clinical Text-to-SQL

Paper • 2601.09876 • Published 24 days ago • 6

yilunzhao

authored 4 papers 7 months ago

Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers

Paper • 2507.02694 • Published Jul 3, 2025 • 19

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

Paper • 2507.10787 • Published Jul 14, 2025 • 13

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

Paper • 2507.06223 • Published Jul 8, 2025 • 14

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Paper • 2507.01001 • Published Jul 1, 2025 • 46

yilunzhao

authored a paper 8 months ago

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

Paper • 2506.15569 • Published Jun 18, 2025 • 12

Team members 3
