SpeD 2025: Speech Technology and Human-Computer Dialogue Conference
The “SpeD 2025” Organizing Committee warmly invites you to attend the 13th Conference on Speech Technology and Human-Computer Dialogue, in Cluj-Napoca, Romania. The conference will be held in-person at the Technical University of Cluj-Napoca.
The conference will bring together scientists, developers, and professionals to present their work, meet colleagues, discuss new ideas, and build collaboration among research groups from universities, research centers, and the commercial sector. The technical program will include oral sessions, keynotes by renowned speakers, and demonstrations of the latest research on a wide range of topics at the forefront of science and engineering in speech technology and human-computer dialogue.
Past editions of the "SpeD" conference series were technically sponsored by IEEE and EURASIP, with proceedings indexed in the IEEE Xplore® Digital Library, Scopus, and the Web of Science Conference Proceedings Citation Index (the WoS indexing process has not yet been finalized for the 2023 edition). This year, papers accepted and presented at the conference will again be submitted for inclusion in IEEE Xplore, subject to meeting IEEE Xplore's scope and quality requirements, and for indexing in Web of Science.
Joint event
This year, the Language Data Space (LDS) Workshop organised by the Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy, will be co-located with “SpeD”.
Main Topics
Self-Supervised and Generative Models for Speech Representation
Robust Spoken Language Recognition and Understanding
Efficient and Low-Resource Speech Recognition for Edge and Embedded Systems
Neural Text-to-Speech (TTS) and Expressive Speech Synthesis
End-to-End Speech-to-Speech Translation and Multimodal Language Models
Speaker Recognition, Diarization, and Adaptive Speaker Embeddings
Conversational Search, Spoken Document Understanding, and Retrieval-Augmented Generation (RAG)
Paralinguistic Speech Processing and Emotion Recognition in the Wild
Speech Enhancement, Dereverberation, and Noise-Robust Processing
AI-driven Speech Technology: Large-Scale Models and Fine-Tuning Strategies
Conversational AI, Large Language Models, and Multimodal Dialogue Systems
Speech Forensics, Deepfake Detection, and Synthetic Speech Analysis
Clinical Speech Processing for Health, Well-being, and Cognitive Assessment
Multilingual and Low-Resource Speech Data Collection, Annotation, and Benchmarking
Human-Centric Speech Interfaces: UX, Personalization, and Ethical Design
Voice AI for Smart Environments, Assistive Tech, and Wearable Devices
Speech Pathology, Augmentative Communication, and AI-Driven Therapy
Bias, Fairness, and Ethical Considerations in Speech AI Deployment
Next-Gen Speech and Speaker Recognition: Continual Learning and Adaptation
Multimodal and Audio-Visual Speech Processing with Foundation Models
Cross-Modal Information Retrieval and Multisensory AI
Advanced Audio Signal Processing for Spatial and 3D Audio Applications
AI-Powered Generative Models and Conversational Embodied Agents
Efficient, Scalable, and Sustainable Deep Learning for Speech Processing
Additional Topics in NLP and Multimodal Processing
Text Summarization and Abstractive Generation
Language Modeling and Pre-trained Architectures (e.g., Transformers)
Automatic Question Answering and Knowledge Extraction
Cross-lingual and Multilingual Natural Language Processing Applications
NLP Applications in Social Media, Programming, and Virtual Reality
NLP for Low-Resource Languages
Bias Detection and Fairness in NLP Systems
Explainability and Interpretability in NLP Models
Speech-to-Image and Image-to-Text Systems
Event Detection and Narrative Understanding
Spoken and Written Language Alignment Models
Multimodal Emotion Analysis, Recognition, and Generation
Gesture and Gaze Integration in Multimodal Systems
Multimodal Interaction and Dialogue Systems
Multimodal Data Fusion Techniques
Schedule (provisional)
Paper submission (5–6 pages, IEEE format): June 2, 2025.
Notification of acceptance and reviewers’ comments: August 15, 2025.
Submission of final papers: September 5, 2025.
Conference: October 19–22, 2025.