Vietnamese Full-Duplex Conversation Dataset
Native Vietnamese conversations captured in full-duplex stereo, with North-Central-South dialect coverage and natural turn-taking.
Overview
Naturalistic, two-speaker Vietnamese conversations captured at studio quality in full-duplex stereo. Pairs of native Vietnamese speakers from Hanoi, Ho Chi Minh City, and central Vietnam discuss everyday topics for the full duration of the session — no read scripts, no scene cuts. Each recording preserves real overlapping speech, backchannels, hesitations, and code-switching, so downstream models train on the way Vietnamese actually sounds in the wild. Every clip is collected from paid contributors with explicit consent, scene-level provenance, and metadata for speaker demographics, dialect, and acoustic environment.
Key highlights
- Northern (Hanoi), Central (Huế), and Southern (Ho Chi Minh City) dialect pairings with per-speaker dialect tags.
- Six-tone system preserved with per-utterance tonal accuracy, which is essential for Vietnamese ASR/TTS because tone disambiguates lexical meaning.
- English loanwords, French legacy vocabulary, and modern Vietnamese internet slang captured in their natural conversational context.
- Family kinship terms, age-based pronoun shifts (anh/chị/em/cháu), and honorifics tagged in the speaker metadata layer.
Technical specifications
Coverage
Hundreds of paired sessions from native Vietnamese speakers across Vietnam. Custom dialect, age-group, and topic targets are available on request.
Capture specs
Stereo full-duplex audio at 48 kHz / 24-bit per channel from studio-grade microphones, with per-speaker channel isolation, a calibrated noise floor, and continuous capture for the full length of each session, not cherry-picked moments.
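As a rough illustration of working with these capture specs, the sketch below checks a file header against 48 kHz / 24-bit / stereo and splits the two speakers into separate sample lists. It assumes the sessions are delivered as uncompressed PCM WAV files with one speaker per channel; the delivery format and the function names (`matches_capture_specs`, `split_speakers`) are assumptions made for the example, not part of the dataset documentation.

```python
import wave

EXPECTED_RATE_HZ = 48_000   # sample rate stated in the capture specs
EXPECTED_SAMPLE_BYTES = 3   # 24-bit PCM
EXPECTED_CHANNELS = 2       # assumed: one channel per speaker


def matches_capture_specs(path):
    """Return True if the WAV header reports 48 kHz / 24-bit / stereo."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def split_speakers(path):
    """Deinterleave a stereo PCM WAV into two lists of signed integer samples.

    Assumes signed little-endian PCM (16-, 24-, or 32-bit), which is how
    WAV stores samples wider than 8 bits.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a two-channel file")
        width = wav.getsampwidth()
        raw = wav.readframes(wav.getnframes())

    frame_size = 2 * width  # one sample per channel in every frame
    first, second = [], []
    for offset in range(0, len(raw), frame_size):
        first.append(
            int.from_bytes(raw[offset:offset + width], "little", signed=True)
        )
        second.append(
            int.from_bytes(raw[offset + width:offset + frame_size], "little", signed=True)
        )
    return first, second
```

Which channel maps to which speaker is not stated here, so that mapping would need to come from the session metadata.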
Annotations
Speaker metadata ships with every session: age, gender, region, dialect, native language, and acoustic environment. Additional annotations are available on request.
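To show how the listed metadata fields might be used, here is a minimal sketch of a per-speaker record and a filter for sessions with a given dialect pairing. The field names, the lowercase dialect tag values, and the representation of a session as a pair of records are all hypothetical; the actual schema is not described on this page.

```python
from dataclasses import dataclass

# Hypothetical tag values for the three dialect regions named above.
DIALECTS = {"northern", "central", "southern"}


@dataclass(frozen=True)
class SpeakerMetadata:
    """One speaker's metadata, using the six fields listed under Annotations."""
    age: int
    gender: str
    region: str
    dialect: str
    native_language: str
    acoustic_environment: str

    def __post_init__(self):
        # Reject tags outside the assumed vocabulary early.
        if self.dialect not in DIALECTS:
            raise ValueError(f"unknown dialect tag: {self.dialect!r}")


def sessions_with_dialect_pair(sessions, first, second):
    """Return sessions whose two speakers carry the given dialect tags, in either order.

    `sessions` is assumed to be a list of (SpeakerMetadata, SpeakerMetadata) pairs.
    """
    wanted = sorted([first, second])
    return [s for s in sessions if sorted(m.dialect for m in s) == wanted]
```

A call such as `sessions_with_dialect_pair(sessions, "northern", "southern")` would then select cross-dialect Hanoi and Ho Chi Minh City pairings.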
Use cases
- Full-duplex conversational AI training and evaluation
- Speaker diarization and Vietnamese ASR / TTS modelling
- Turn-taking, backchannel, and overlap-handling research
- Voice agent benchmarks for natural, multi-party conversation
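For the overlap-handling use case, one simple measurement is the total time during which both speakers are talking at once. The sketch below assumes speech activity has already been detected separately on each isolated channel and is given as sorted, non-overlapping `(start, end)` intervals in seconds. The interval representation and the function name `overlap_seconds` are assumptions for illustration; the dataset is not stated to ship such segments.

```python
def overlap_seconds(speaker_a, speaker_b):
    """Total duration during which both speakers are active.

    Each argument is a list of (start, end) times in seconds, sorted by
    start time, with no overlaps within a single speaker's list.
    """
    i = j = 0
    total = 0.0
    while i < len(speaker_a) and j < len(speaker_b):
        # Intersection of the two current intervals, if any.
        start = max(speaker_a[i][0], speaker_b[j][0])
        end = min(speaker_a[i][1], speaker_b[j][1])
        if end > start:
            total += end - start
        # Advance whichever interval finishes first.
        if speaker_a[i][1] < speaker_b[j][1]:
            i += 1
        else:
            j += 1
    return total
```

Dividing the result by the session length gives an overlap ratio that can be compared across sessions or dialect pairings.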
Request samples
Share your use case and we'll send sample clips, pricing, and recommended next steps for your pipeline.
More datasets
Full-Duplex Conversational Audio
- American English Full-Duplex Two-Speaker Conversational Dataset: Two-speaker American English conversations captured in full-duplex stereo, covering everyday topics with overlapping speech, backchannels, and natural disfluencies preserved.
- French Full-Duplex Conversation Dataset: Naturalistic French conversations between native speakers, captured in full-duplex stereo with overlapping speech and authentic turn-taking.
- Mandarin Full-Duplex Conversation Dataset: Native-speaker Mandarin Chinese conversations recorded in full-duplex stereo across mainland and overseas dialect regions.
- Spanish Full-Duplex Conversation Dataset: Two-speaker Spanish conversations spanning Latin American and European dialects, captured in stereo full-duplex with natural overlap.