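description [ICML 2026][LLM Agent][GUI agent] Video2GUI uses a four-stage pipeline ("coarse metadata filtering → fine-grained video quality filtering → task/action extraction with Gemini-3-Pro → precise spatial grounding on three high-resolution frames") to distill 500 million YouTube video metadata records into WildGUI (12.7M trajectories, 124.…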
Model × Quantization × Inference Engine × Hardware × OS × Runtime Config. It does not rank agent behavior, cloud APIs, or model intelligence in isolation. It ranks whether a full local stack is fast, ...