Location
Bellaterra, Cataluña, Spain
Posted
June 16, 2026

Job Description

POSTDOC POSITION IN MULTIMODAL FOUNDATION MODELS FOR DOCUMENT UNDERSTANDING

Call reference: _MDU

We are seeking a postdoc to join the Vision, Language and Reading group at the Computer Vision Center (CVC) in Barcelona, Spain.

The position is initially for 1 year and is linked to the project “Multimodal LLMs for Document Understanding” (MuDocU), funded by the Spanish Ministry of Science.

The successful candidate is expected to participate in large‑scale training efforts, conduct research on multimodal pre‑training and fine‑tuning methods, and apply these methods to the specific use case of Document Understanding.

KEY DUTIES

  • Lead and contribute to specific Work Packages (WPs) within the project, ensuring timely delivery of objectives and milestones.
  • Conduct cutting‑edge research on multimodal large language models (LLMs), and manage the publication of research findings in top‑tier conferences and journals.
  • Prepar...