Vietnamese Full-Duplex Conversation Dataset

Native Vietnamese conversations captured in full-duplex stereo, with North-Central-South dialect coverage and natural turn-taking.

Overview

Naturalistic, two-speaker Vietnamese conversations captured at studio quality in full-duplex stereo. Pairs of native Vietnamese speakers from Hanoi, Ho Chi Minh City, and central Vietnam discuss everyday topics for the full duration of the session, with no read scripts and no edited cuts. Each recording preserves real overlapping speech, backchannels, hesitations, and code-switching, so downstream models train on the way Vietnamese actually sounds in conversation. Every clip is collected from paid contributors with explicit consent and session-level provenance, and ships with metadata for speaker demographics, dialect, and acoustic environment.

Key highlights

  • Northern (Hanoi), Central (Huế), and Southern (Ho Chi Minh City) dialect pairings with per-speaker dialect tags.
  • The six-tone system is preserved at the utterance level, which is essential for Vietnamese ASR/TTS, where tone distinguishes lexical meaning.
  • English loanwords, French legacy vocabulary, and modern Vietnamese internet slang captured in their natural conversational context.
  • Family kinship terms, age-based pronoun shifts (anh/chị/em/cháu), and honorifics tagged in the speaker metadata layer.

Technical specifications

Coverage

Hundreds of paired sessions from native Vietnamese speakers across Vietnam. Coverage can be extended to specific dialects, age groups, and topics on request.

Capture specs

Stereo full-duplex audio at 48 kHz / 24-bit per channel from studio-grade microphones, with per-speaker channel isolation, a calibrated noise floor, and continuous capture for the full duration of each session rather than cherry-picked moments.
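
As a rough illustration of working with these capture specs, the sketch below checks that a file matches 48 kHz / 24-bit stereo and separates the two channels using only Python's standard library. It assumes sessions are delivered as PCM WAV files with one speaker per channel; the delivery format, the channel-to-speaker mapping, and the function name `split_stereo_session` are assumptions for illustration, not confirmed details of the dataset.

```python
import wave

# Values taken from the capture specs above.
EXPECTED_CHANNELS = 2
EXPECTED_SAMPLE_WIDTH = 3   # bytes per sample, i.e. 24-bit
EXPECTED_RATE = 48_000      # Hz


def split_stereo_session(path):
    """Return (channel_0, channel_1) as lists of signed integer samples.

    Hypothetical helper: assumes a PCM WAV file and that each channel
    carries one speaker. Raises ValueError if the format differs from
    the expected capture specs.
    """
    with wave.open(path, "rb") as wav:
        actual = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        expected = (EXPECTED_CHANNELS, EXPECTED_SAMPLE_WIDTH, EXPECTED_RATE)
        if actual != expected:
            raise ValueError(f"unexpected format {actual}, wanted {expected}")
        raw = wav.readframes(wav.getnframes())

    channels = ([], [])
    frame_size = EXPECTED_SAMPLE_WIDTH * EXPECTED_CHANNELS
    # WAV PCM data is interleaved: one little-endian sample per channel per frame.
    for offset in range(0, len(raw), frame_size):
        for ch in range(EXPECTED_CHANNELS):
            start = offset + ch * EXPECTED_SAMPLE_WIDTH
            sample = raw[start:start + EXPECTED_SAMPLE_WIDTH]
            channels[ch].append(int.from_bytes(sample, "little", signed=True))
    return channels
```

The pure-Python loop keeps the example dependency-free; for full-length sessions an array library would be a faster way to de-interleave the samples.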

Annotations

Speaker metadata ships with every session: age, gender, region, dialect, native language, and acoustic environment. Annotations are available on request.

Use cases

  • Full-duplex conversational AI training and evaluation
  • Speaker diarization and Vietnamese ASR / TTS modelling
  • Turn-taking, backchannel, and overlap-handling research
  • Voice agent benchmarks for natural two-speaker conversation
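
For the turn-taking and overlap use case, one simple starting point is to measure how long both speakers talk at once. The sketch below assumes you already have speech intervals per speaker, for example from a voice activity detector run on each isolated channel or from requested annotations. The interval format and the function name `overlap_seconds` are hypothetical and not part of the dataset.

```python
def overlap_seconds(speaker_a, speaker_b):
    """Total time during which both speakers are active.

    Each argument is a list of (start, end) times in seconds, sorted by
    start time and non-overlapping within the same speaker.
    """
    total = 0.0
    i = j = 0
    while i < len(speaker_a) and j < len(speaker_b):
        # The shared region of the two current intervals, if any.
        start = max(speaker_a[i][0], speaker_b[j][0])
        end = min(speaker_a[i][1], speaker_b[j][1])
        if end > start:
            total += end - start
        # Advance whichever interval finishes first.
        if speaker_a[i][1] < speaker_b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Speaker B starts before A finishes, then backchannels during A's second turn.
a_turns = [(0.0, 2.0), (5.0, 7.0)]
b_turns = [(1.5, 3.0), (6.0, 6.5)]
print(overlap_seconds(a_turns, b_turns))
```

Dividing the result by the session length gives an overlap ratio that can be compared across sessions or dialect pairings.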

Request samples

Share your use case and we'll send sample clips, pricing, and recommended next steps for your pipeline.