MMFM 2025: The 4th Workshop on What is Next in Multimodal Foundation Models?

October 19-20, 2025, Honolulu, Hawai'i
Topics:


We invite contributions on the topic of multimodal foundation models (MMFMs). Topics of interest include, but are not limited to:


Vision / Sound / Speech / Action / Language tasks using foundation models in any possible combination

Data and model scaling properties of MMFMs

Self/semi/weakly supervised training of MMFMs

Multimodal grounding in foundation models

Generative MMFMs (e.g., text-to-image/video/3D generation)

Ethics, risks, and fairness of MMFMs

Efficient training and inference of MMFMs

Parameter-efficient fine-tuning, prompting, and adapters for MMFMs

Generalization of MMFMs to other domains

Multi-task and continual learning for MMFMs

Compositionality and reasoning approaches for MMFMs



Submission Guidelines:

Full-length papers with proceedings (ICCV format).

Accepted papers in this track will be published in the ICCV workshop proceedings. Submissions should be original work, 4-8 pages long (excluding references), and prepared in the ICCV format. This track follows the ICCV 2025 Dual Submission Policy, so the work must be sufficiently original and not under review at other venues.

Extended abstracts (non-archived).

Submissions must be between two and four pages long (including references). We encourage the submission of work in progress, as well as previously published work on topics relevant to the workshop. Note that these submissions will not be published in the proceedings. Authors who plan to submit work accepted at this workshop to another journal or conference should check that venue's dual submission policy.

Papers accepted to ICCV 2025 (non-archived).

We also welcome full-length papers (up to 8 pages, excluding references) that have been accepted to the main ICCV 2025 conference and address topics relevant to the workshop.