Shaojie Zhang

zhshj0110

https://github.com/Eezekiel

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 1 month ago

REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

Paper • 2511.13026 • Published Nov 17 • 25

upvoted 2 papers about 2 months ago

Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs

Paper • 2506.22139 • Published Jun 27 • 2

HyperClick: Advancing Reliable GUI Grounding via Uncertainty Calibration

Paper • 2510.27266 • Published Oct 31 • 20

New activity in OpenGVLab/ScaleCUA-Data 2 months ago

Missing or Unavailable Dataset Files

#3 opened 2 months ago by yk3701208

upvoted 2 papers 3 months ago

VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators

Paper • 2510.00406 • Published Oct 1 • 65

BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent

Paper • 2509.15566 • Published Sep 19 • 14

upvoted an article 4 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9

New activity in lmms-lab/LLaVA-NeXT-Video-32B-Qwen over 1 year ago

On Video dataset

#2 opened over 1 year ago by HelloJiang