French Full-Duplex Conversation Dataset
Naturalistic French conversations between native speakers, captured in full-duplex stereo with overlapping speech and authentic turn-taking.
Overview
Naturalistic, two-speaker French conversations captured at studio quality in full-duplex stereo. Pairs of native French speakers from metropolitan France, Belgium, and French-speaking parts of Canada discuss everyday topics for the full duration of the session, with no read scripts and no scene cuts. Each recording preserves real overlapping speech, backchannels, hesitations, and code-switching, so downstream models train on French as it is actually spoken. Every clip is collected from paid contributors with explicit consent and scene-level provenance, and carries metadata for speaker demographics, dialect, and acoustic environment.
Key highlights
- Hexagonal French, Belgian, and Québécois pairings, with per-speaker dialect tags so models can learn regional pronunciation drift.
- Verlan, register switches between formal vous and informal tu, and Parisian colloquialisms ("chui", "ouais") preserved as spoken.
- Café-style conversational cadence: interruptions, agreement particles ("voilà", "bah"), and rhetorical questions captured intact.
- Regional vocabulary and pronunciation differences across Lyon, Marseille, Brussels, and Montréal, annotated in the speaker metadata.
Technical specifications
Coverage
Hundreds of paired sessions from native French speakers across France, Belgium, and Canada. Coverage can be extended to specific dialects, age groups, and topics on request.
Capture specs
Stereo full-duplex audio at 48 kHz / 24-bit per channel from studio-grade microphones, with per-speaker channel isolation, a calibrated noise floor, and continuous capture for the full duration of each session rather than cherry-picked moments.
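As a rough illustration of what these capture specs mean for an ingestion pipeline, the sketch below checks a file against the stated format and splits it into one PCM stream per speaker. It assumes sessions are delivered as plain PCM WAV files and that each stereo channel carries one speaker; neither the delivery format nor the channel order is stated on this page, and the function names are hypothetical.

```python
import wave

# Values taken from the capture specs above.
EXPECTED_CHANNELS = 2        # stereo, one channel per speaker (assumed layout)
EXPECTED_RATE_HZ = 48_000    # 48 kHz
EXPECTED_SAMPLE_BYTES = 3    # 24-bit samples


def check_capture_specs(path):
    """Return a list of spec mismatches; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channels: {wav.getnchannels()}")
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate: {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_BYTES:
            problems.append(f"sample width: {wav.getsampwidth()} bytes")
    return problems


def split_speakers(path):
    """Split an interleaved stereo WAV into (first_channel, second_channel) raw PCM bytes."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a stereo file")
        width = wav.getsampwidth()
        frames = wav.readframes(wav.getnframes())

    frame_size = 2 * width  # one sample per channel in every frame
    first, second = bytearray(), bytearray()
    for offset in range(0, len(frames), frame_size):
        first += frames[offset:offset + width]
        second += frames[offset + width:offset + frame_size]
    return bytes(first), bytes(second)
```

Keeping the two channels separate is what lets overlap and backchannels be studied per speaker; which channel maps to which speaker would need to be confirmed against the delivered metadata.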
Annotations
Speaker metadata ships with every session: age, gender, region, dialect, native language, and acoustic environment. Additional annotations are available on request.
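To make the metadata fields above concrete, here is a minimal sketch of how a consumer might model them and select sessions by dialect. The class names, field names, and value types (for example, age as an integer and the acoustic environment stored once per session) are assumptions for illustration; the actual delivery schema is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerMetadata:
    """Per-speaker fields listed above (names and types are hypothetical)."""
    age: int
    gender: str
    region: str
    dialect: str
    native_language: str


@dataclass(frozen=True)
class SessionMetadata:
    """One two-speaker session; acoustic environment is assumed to be per session."""
    session_id: str
    acoustic_environment: str
    speakers: tuple[SpeakerMetadata, SpeakerMetadata]


def sessions_with_dialect(sessions, dialect):
    """Return the sessions in which at least one speaker carries the given dialect tag."""
    return [
        session
        for session in sessions
        if any(speaker.dialect == dialect for speaker in session.speakers)
    ]
```

A filter like this is one way to build dialect-balanced training or evaluation splits from the per-speaker tags.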
Use cases
- Full-duplex conversational AI training and evaluation
- Speaker diarization and French ASR / TTS modelling
- Turn-taking, backchannel, and overlap-handling research
- Voice agent benchmarks for natural, two-party conversation
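For the turn-taking and overlap use cases above, per-speaker channels make it straightforward to measure how much two speakers talk at the same time. The sketch below assumes each channel has already been converted into sorted, non-overlapping speech intervals in seconds (for example by a voice activity detector, which is not shown and not part of this dataset); the function name is hypothetical.

```python
def overlap_seconds(speaker_a, speaker_b):
    """Total time during which both speakers are talking.

    Each argument is a list of (start, end) speech intervals in seconds,
    sorted by start time and non-overlapping within the same speaker.
    """
    i = j = 0
    total = 0.0
    while i < len(speaker_a) and j < len(speaker_b):
        # Intersection of the two current intervals, if any.
        start = max(speaker_a[i][0], speaker_b[j][0])
        end = min(speaker_a[i][1], speaker_b[j][1])
        if end > start:
            total += end - start
        # Advance whichever interval finishes first.
        if speaker_a[i][1] <= speaker_b[j][1]:
            i += 1
        else:
            j += 1
    return total
```

Dividing the result by the session length gives an overlap ratio, one simple statistic for comparing a model's turn-taking behaviour against human conversation.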
Request samples
Share your use case and we'll send sample clips, pricing, and recommended next steps for your pipeline.
More datasets
Full-Duplex Conversational Audio
American English Full-Duplex Two-Speaker Conversational Dataset
Two-speaker American English conversations captured in full-duplex stereo, covering everyday topics with overlapping speech, backchannels, and natural disfluencies preserved.
Mandarin Full-Duplex Conversation Dataset
Native-speaker Mandarin Chinese conversations recorded in full-duplex stereo across mainland and overseas dialect regions.
Spanish Full-Duplex Conversation Dataset
Two-speaker Spanish conversations spanning Latin American and European dialects, captured in stereo full-duplex with natural overlap.
Vietnamese Full-Duplex Conversation Dataset
Native Vietnamese conversations captured in full-duplex stereo, with North-Central-South dialect coverage and natural turn-taking.