Mandarin Full-Duplex Conversation Dataset
Native-speaker Mandarin Chinese conversations recorded in full-duplex stereo across mainland and overseas dialect regions.
Overview
Naturalistic, two-speaker Mandarin Chinese conversations captured at studio quality in full-duplex stereo. Pairs of native Mandarin Chinese speakers from mainland China, Taiwan, and the Chinese diaspora discuss everyday topics for the full duration of the session — no read scripts, no scene cuts. Each recording preserves real overlapping speech, backchannels, hesitations, and code-switching, so downstream models train on the way Mandarin Chinese actually sounds in the wild. Every clip is collected from paid contributors with explicit consent, scene-level provenance, and metadata for speaker demographics, dialect, and acoustic environment.
Key highlights
- Mainland Putonghua and Taiwanese Mandarin pairings with per-utterance tone variation tagged (critical for Mandarin TTS training).
- Chinglish code-switching, English loanwords, and modern Chinese internet slang ("yyds", "emo", "绝绝子") preserved verbatim.
- Family-style group conversation rhythm with overlapping turn-taking, rapid topic shifts, and culturally specific honorifics.
- Regional accent coverage spanning Beijing, Shanghai, Guangdong (Mandarin-speaking), Taipei, and the overseas Chinese diaspora.
Technical specifications
Coverage
Hundreds of paired sessions from native Mandarin Chinese speakers across mainland China and Taiwan. Coverage can be extended to additional dialects, age groups, and topics on request.
Capture specs
Stereo full-duplex audio at 48 kHz / 24-bit per channel from studio-grade microphones, with per-speaker channel isolation, a calibrated noise floor, and continuous capture for the full duration of each session, not cherry-picked moments.
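For teams planning ingestion, here is a minimal sketch of how a recording with these specs could be validated and split into one track per speaker. It assumes sessions are delivered as uncompressed PCM WAV files, which this page does not state, and the function names `split_stereo` and `check_and_split` are illustrative only. Which channel maps to which speaker is also an assumption to confirm against the session metadata.

```python
import wave

EXPECTED_RATE = 48_000   # 48 kHz, per the capture specs
EXPECTED_WIDTH = 3       # 24-bit samples are 3 bytes each


def split_stereo(frames: bytes, sample_width: int) -> tuple[bytes, bytes]:
    """Deinterleave stereo PCM frames into (left, right) channel bytes."""
    frame_size = 2 * sample_width
    if len(frames) % frame_size:
        raise ValueError("data is not a whole number of stereo frames")
    left, right = bytearray(), bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + sample_width]                 # first sample in the frame
        right += frames[i + sample_width:i + frame_size]   # second sample in the frame
    return bytes(left), bytes(right)


def check_and_split(wav_file) -> tuple[bytes, bytes]:
    """Verify the assumed format, then return one byte string per channel.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a stereo file")
        if wav.getframerate() != EXPECTED_RATE:
            raise ValueError("expected a 48 kHz sample rate")
        if wav.getsampwidth() != EXPECTED_WIDTH:
            raise ValueError("expected 24-bit samples")
        frames = wav.readframes(wav.getnframes())
    return split_stereo(frames, EXPECTED_WIDTH)
```

Keeping the two channels separate after loading preserves the per-speaker isolation, so overlapping speech never has to be untangled from a mixed signal.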
Annotations
Speaker / expert metadata shipped with every session: age, gender, region, dialect, native language, and acoustic environment. Annotations are available on request.
Use cases
- Full-duplex conversational AI training and evaluation
- Speaker diarization and Mandarin Chinese ASR / TTS modelling
- Turn-taking, backchannel, and overlap-handling research
- Voice agent benchmarks for natural, multi-party conversation
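As an illustration of the overlap-handling use case, the sketch below measures how long both speakers talk at once. It assumes you have already derived speech-activity intervals for each channel, for example from a voice activity detector run per speaker. Those intervals are not something this page says ships with the dataset, and `overlap_seconds` is a hypothetical helper name.

```python
Interval = tuple[float, float]  # (start_seconds, end_seconds)


def overlap_seconds(a: list[Interval], b: list[Interval]) -> float:
    """Total time during which both speakers are active.

    Each list must be sorted by start time and contain no
    self-overlapping intervals.
    """
    total = 0.0
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if end > start:
            total += end - start   # the two intervals intersect
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


speaker_a = [(0.0, 2.0), (3.0, 5.0)]
speaker_b = [(1.5, 3.5), (4.0, 4.5)]
print(overlap_seconds(speaker_a, speaker_b))
```

Because each speaker sits on an isolated channel, activity can be estimated per speaker and then intersected this way, rather than inferred from a single mixed track.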