MMFM 2025 : The 4th Workshop on What is Next in Multimodal Foundation Models?
Topics:
We invite contributions on the topic of multimodal foundation models (MMFMs). Topics of interest include, but are not limited to:
Vision / Sound / Speech / Action / Language tasks using Foundation Models in any possible combination
Data and model scaling properties of MMFMs
Self/semi/weakly supervised training of MMFMs
Multimodal grounding in foundation models
Generative MMFMs (e.g., text-to-image/video/3D generation)
Ethics, risks, and fairness of MMFMs
Efficient training and inference of MMFMs
Parameter-efficient fine-tuning, prompting, and adapters for MMFMs
Generalization of MMFMs to other domains
Multi-task and continual learning for MMFMs
Compositionality and reasoning approaches for MMFMs
Submission Guidelines:
Full-length papers with proceedings (ICCV format).
Accepted papers in this track will be published in the ICCV workshop proceedings. The work should be original, 4–8 pages long (excluding references), and submitted using the ICCV format. For this track, we follow the ICCV 2025 Dual Submission Policy; thus, the work should be sufficiently original and not under review at other venues.
Extended abstracts (non-archived).
The minimum page length is two pages and the maximum is four pages (including references). We encourage submissions of work in progress and of previously published work on topics relevant to the workshop. Note that these submissions will not be published in the proceedings. Authors who intend to later submit work accepted at this workshop to a journal or conference should check that venue's double-submission policy.
Papers accepted to ICCV 2025 (non-archived).
Full-length papers (up to 8 pages, excluding references) accepted to the main ICCV 2025 conference on topics relevant to the workshop.