We are exploring a potential voice data collection project in Japan, focusing on family pairs (parent-child and sibling pairs). The data will be used for speech recognition or voice model development.
Target Participants
A total of 100 family pairs (200 individuals) consisting of:
Parent-child pairs:
Father + son (male child, age 14+)
Mother + daughter (female child, age 14+)
Sibling pairs:
Brothers (male + male, 14+)
Sisters (female + female, 14+)
Mixed siblings (male + female, 14+)
Each category includes 20 pairs, totaling 100 pairs (200 participants).
At least half of the child participants must be aged 14 or above.
Geographic Coverage
Participants must be from across Japan, distributed as follows:
Kanto 30 pairs
Kansai 30 pairs
Hokkaido/Tohoku 10 pairs
Chubu 10 pairs
Chugoku/Shikoku 10 pairs
Kyushu 10 pairs
Total = 100 pairs
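The category and regional quotas above can be cross-checked with a short script. The sketch below is illustrative only: the figures are copied from the lists above, while the dictionary names and the `remaining` helper are assumptions introduced here, not part of the client's brief.

```python
# Pair quotas copied from the brief; key names are illustrative only.
CATEGORY_QUOTAS = {
    "father_son": 20,
    "mother_daughter": 20,
    "brothers": 20,
    "sisters": 20,
    "mixed_siblings": 20,
}

REGION_QUOTAS = {
    "Kanto": 30,
    "Kansai": 30,
    "Hokkaido/Tohoku": 10,
    "Chubu": 10,
    "Chugoku/Shikoku": 10,
    "Kyushu": 10,
}


def remaining(quotas, recruited):
    """Return the pairs still needed per key, never below zero."""
    return {key: max(quota - recruited.get(key, 0), 0)
            for key, quota in quotas.items()}


# Both breakdowns must describe the same 100 pairs (200 participants).
total_pairs = sum(CATEGORY_QUOTAS.values())
assert total_pairs == sum(REGION_QUOTAS.values()) == 100
total_participants = total_pairs * 2
```

For example, `remaining(REGION_QUOTAS, {"Kanto": 12})` reports 18 pairs still needed in Kanto and the full quota for every other region. The brief does not say how the five categories should be spread across regions, so the sketch tracks the two breakdowns independently.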
Recording Requirements
Each participant must record:
1 trigger word pronunciation
10 free-form utterances (spontaneous speech)
Reading aloud (~2 minutes of text)
Technical specifications:
Environment: Quiet indoor space or soundproof room (< 40 dB(A) background noise)
Equipment: High-quality lavalier microphone placed near the mouth
Format: WAV, 48 kHz, 16-bit, uncompressed
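The file format requirement lends itself to an automated check at delivery time. The following is a minimal sketch using Python's standard `wave` module. The function name `check_wav` and the constant names are assumptions made for illustration. The brief does not specify a channel count, so none is checked here, and background noise level cannot be verified from the file header.

```python
import wave

# Values taken from the brief: 48 kHz, 16-bit, uncompressed WAV.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def check_wav(source):
    """Return a list of spec violations; an empty list means the file passes.

    `source` may be a file path or a binary file-like object.
    """
    try:
        with wave.open(source, "rb") as wav:
            rate = wav.getframerate()
            width = wav.getsampwidth()
            compression = wav.getcomptype()
    except (wave.Error, EOFError) as exc:
        # The wave module raises when it cannot parse the data as WAV.
        return [f"not a readable WAV file: {exc}"]

    problems = []
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {width * 8}-bit, expected 16-bit")
    if compression != "NONE":
        problems.append(f"compression type is {compression!r}, expected uncompressed PCM")
    return problems
```

A vendor could run such a check over every delivered file and report any non-empty result before handover.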
Project Nature
Status: Request for Information (RFI); the project is not yet confirmed
Objective: Assess feasibility, potential resources, and anticipated challenges
Deliverable: Feedback on availability, capacity, and risks
Potential Challenges (to assess)
Recruitment difficulty: Finding genuine parent-child or sibling pairs in the required age ranges across all Japanese regions.
Recording logistics: Ensuring controlled acoustic environments nationwide.
Equipment consistency: Participants' access to, or provision of, quality lavalier microphones.
Compliance: Managing consent and privacy for minors (ages 14 to 17).
Cost and coordination: Travel, scheduling, and regional distribution management.
In short:
This is an RFI for a Japanese family voice recording project involving 100 family pairs (parent-child and sibling pairs) from regions across Japan, with each participant recording both read and free-form speech to high-quality audio standards. The client is seeking to confirm whether vendors can recruit, record, and deliver this dataset and to identify any feasibility concerns before proceeding.
