ReFAct: Empowering Multimodal Web Agents with Visual and Context Focusing
A focusing framework for multimodal web agents that improves visual grounding and context selection during dynamic web tasks.
MobileFlow: A Multimodal LLM for Mobile GUI Agent
MATEval: A Multi-Agent Discussion Framework for Advancing Open-Ended Text Evaluation
DEE: Dual-stage Explainable Evaluation Method for Text Generation
Customer Complaint Guided Fault Localization Based on Domain Knowledge Graph
Conditional Generation Net for Medication Recommendation
A Two-Phase Approach for Predicting Highway Passenger Volume
LSTM Multi-modal UNet for Brain Tumor Segmentation