
Closed
Posted
Paid on delivery
I’m assembling a 250-hour corpus of Saudi Arabic narration to train a Generative AI model, and I want every minute captured by native speakers of the three key dialects in the Kingdom: Najdi, Khaleeji, and Hijazi. The recordings must sound clear, natural, and studio-quality. Please deliver clean 48 kHz / 16-bit WAV files with consistent volume, no background noise, and no processing other than gentle normalization.

Simple, conversational narration is all that’s required; no dialogues, interviews, or advertising reads. To keep the dataset balanced, I’d like roughly equal coverage of each dialect. You may record alone or coordinate a small team, as long as every speaker is a genuine native of the dialect they read in.

Deliverables
• 250 total hours of narrated audio, evenly split across Najdi, Khaleeji, and Hijazi
• A spreadsheet listing file names, length, speaker dialect, and a one-line description of the content for quick reference
• All raw takes as separate files plus a “final” trimmed version for each segment

I will review a five-minute sample from each dialect before we move on to full production to be sure the audio chain and accent are spot-on. Once approved, we can break the work into manageable milestones and keep progressing until the full 250 hours are complete.
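
For bidders who want to pre-check files against the brief, the format requirement and the spreadsheet columns could be sketched roughly as follows. This is only an illustration using Python's standard `wave` and `csv` modules; the function names, the column names, the sample file name, and the `meets_spec` flag are assumptions made for this sketch, not part of the client's requirements.

```python
import csv
import io
import wave

EXPECTED_RATE = 48_000          # 48 kHz, as stated in the brief
EXPECTED_SAMPLE_WIDTH = 2       # 16-bit PCM = 2 bytes per sample
TARGET_HOURS_PER_DIALECT = 250 / 3  # even split across three dialects, about 83.3 h

def inspect_wav(source):
    """Return (duration_seconds, meets_spec) for a WAV path or binary file object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
    meets_spec = rate == EXPECTED_RATE and width == EXPECTED_SAMPLE_WIDTH
    return frames / rate, meets_spec

def manifest_row(file_name, source, dialect, description):
    """Build one spreadsheet row; column names are hypothetical."""
    duration, ok = inspect_wav(source)
    return {
        "file_name": file_name,
        "length_seconds": round(duration, 2),
        "dialect": dialect,
        "description": description,
        "meets_spec": ok,
    }

def manifest_csv(rows):
    """Serialize manifest rows to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

def make_silent_wav(seconds, rate=EXPECTED_RATE, width=EXPECTED_SAMPLE_WIDTH):
    """Create an in-memory mono WAV of silence, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (int(seconds * rate) * width))
    buf.seek(0)
    return buf

if __name__ == "__main__":
    demo = make_silent_wav(2)
    row = manifest_row("najdi_0001_final.wav", demo, "Najdi", "Demo silence clip")
    print(manifest_csv([row]))
```

A check like this only covers sample rate and bit depth; noise floor, loudness consistency, and dialect authenticity would still need listening review or separate tooling.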
Project ID: 40265702
13 proposals
Remote project
Active 4 days ago
13 freelancers are bidding an average of $479 USD for this job

Hello, I will begin by organizing a team of native speakers fluent in Najdi, Khaleeji, and Hijazi dialects to record 250 hours of clear, natural, and studio-quality Arabic narration. The recordings will be delivered as clean 48 kHz / 16-bit WAV files with consistent volume and no background noise, following your requirements for simple, conversational narration. To ensure a balanced dataset, we will aim for roughly equal coverage of each dialect. I propose starting with a small initial milestone where we provide a five-minute sample from each dialect for your review before moving on to full production. This will allow us to ensure that the audio chain and accents are spot-on before completing the full 250 hours. Best regards,
$250 USD in 7 days
7.7

السلام عليكم ورحمة الله وبركاته I am an Arab who lived in Saudi Arabia, so I speak Saudi Arabic fluently and can single-handedly record the full 250 hours across the Najdi, Khaleeji, and Hijazi dialects without needing to rely on a team. I use a professional studio microphone to ensure clean 48 kHz / 16-bit WAV files with zero background noise, and my extensive background in editing means I can easily manage the precise file trimming and gentle normalization you require. I will also keep the tracking spreadsheet perfectly organized as I deliver the raw and final segments. I am ready to record and send over the initial 5-minute samples for your review right now so we can get started.
$750 USD in 7 days
3.1

As an experienced audio engineer and native speaker of Arabic (including the Najdi, Khaleeji, and Hijazi dialects), I am the perfect fit for your project. I understand the importance of clear, high-quality audio recordings with balanced coverage among the dialects. With a strong technical background in delivering clean, noise-free 48 kHz / 16-bit WAV files, I can ensure professional-grade recordings that meet your specifications to assist in training your Generative AI model. Having worked on numerous projects requiring precise sound engineering, I guarantee consistent volume levels across all files through gentle normalization. Moreover, I not only possess the skills to record alone but can also coordinate a small team for efficient work delivery. My proficiency as a voice talent and fluency in these specific dialects will enable me to provide the natural, conversational narration needed for your corpus. As evidenced by my high rating for on-time delivery and budget management, I consistently achieve my clients' goals without compromising quality. I’ll approach this project the same way and provide milestone-based deliverables according to your specifications. Ultimately, my aim is to ensure that each file, from the raw takes to the final trimmed version, exceeds your expectations in terms of clarity and matches its one-line content description.
$500 USD in 7 days
2.7

Send me 1 minute of your track — I’ll mix it FREE so you can hear the quality. Hi, I’m Munna Mhm — Professional Audio Mixing Engineer & Music Producer with 13+ years of experience. I deliver clean, powerful, radio-ready mixes across Hip-Hop, Pop, EDM, Rock, R&B, cinematic music, podcasts, and voiceovers. Using industry-standard EQ, compression, saturation, stereo imaging, reverb, delay & more — I turn your track into a polished, industry-level sound.
$300 USD in 5 days
1.7

I’ve successfully delivered high-fidelity speech datasets for regional Arabic dialects, and I understand the specific phonetic nuances required to make a Saudi Generative AI model sound truly authentic. For a 250-hour corpus, the priority isn't just raw volume; it’s capturing the distinct prosody, glottal stops, and lexical variety of Najdi and Hejazi regions to ensure the model generalizes effectively across the Kingdom's linguistic landscape. My experience managing large-scale data collection pipelines for TTS and ASR ensures that every hour of your requirement meets the rigorous standards needed for high-performance machine learning. To execute this, I will implement a multi-stage validation pipeline using professional-grade recording environments and high-quality condensers sampling at 48kHz for maximum spectral clarity. I’ll curate a diverse roster of native Saudi narrators, ensuring a balanced gender distribution while monitoring for acoustic consistency and precise silence-to-speech ratios via automated LUFS normalization. Each segment will undergo Signal-to-Noise Ratio (SNR) checks followed by manual linguistic auditing to prevent dialect drift or phonetic inaccuracies. The final dataset will be delivered with structured metadata in JSON format, mapping each file to its transcript for seamless integration into your training architecture. Are you looking for a specific distribution across the major Saudi provinces, or should the corpus lean toward a more standardized Modern Standard Arabic? Additionally, do you have specific preferences for the narration content—such as conversational or literary—to better align with the model's intended application? I’m available to discuss how we can scale this collection efficiently while maintaining the highest linguistic integrity. Let’s chat to align on your technical specs or jump on a brief call to finalize the roadmap.
$625 USD in 21 days
0.0

Hello,

I’ve worked on large-scale voice recording and dialect-specific projects before, and I can help you execute this corpus professionally and efficiently. I understand the technical and linguistic requirements clearly:
• 48 kHz / 16-bit WAV format
• Clean studio-quality audio
• No background noise
• Minimal processing (gentle normalization only)
• Natural, conversational narration style
• Balanced coverage across Najdi, Khaleeji, and Hijazi dialects

I have experience coordinating native speakers and managing structured recording workflows to ensure consistency in tone, pacing, and audio levels across large datasets. I can either record directly (where appropriate) or coordinate a small team of verified native speakers for proper dialect authenticity. I also understand dataset discipline: organized file naming, raw + final trimmed versions, and a clean tracking spreadsheet with metadata (file name, length, dialect, description).

I’m fully comfortable starting with the required 5-minute sample per dialect to confirm accent and audio chain before scaling production. I can manage this project with professionalism, clear milestones, and consistent delivery. Looking forward to collaborating.

Best Regards
Beishoy
$250 USD in 5 days
0.0

GenAI is only as intelligent as its training data. A Najdi recording with a hint of an outside accent, or a Hijazi clip with a -50 dB noise floor, isn't just "bad audio"; it’s a data pollutant that compromises your entire model. I will provide a surgically clean 250-hour corpus where the linguistic boundaries between Najdi, Khaleeji, and Hijazi are strictly preserved. I understand the technical weight of 48 kHz / 16-bit WAV deliverables. My process ensures 100% dialect purity through a "Native-Only" vetting phase for every speaker. For a previous LLM project, I managed high-volume vocal datasets where we maintained a zero-rejection rate by enforcing strict acoustic controls (no processing, just -1.0 dB normalization). You will receive a structured, error-free CSV mapping every file to its specific dialect and content description, ready for immediate model ingestion. I have three native speakers, one for each specific Saudi region, ready to record the 5-minute vetting samples to prove our audio chain and accent accuracy. Should I start with the Najdi or Hijazi sample first to establish the conversational baseline you need? Send a message and let's get started. Patiently waiting, Peculiar
$500 USD in 7 days
0.0

Hello there, We bring 8 years of AI/ML data pipeline engineering and large-scale dataset curation for generative model training — building a balanced 250-hour Saudi Arabic corpus across Najdi, Khaleeji, and Hijazi is fundamentally a data engineering problem as much as an audio one. I want to be upfront: the listed budget won't cover full voice talent compensation for 250 hours of native-speaker narration. My bid covers the technical pipeline and project management layer — automated ingestion that validates incoming audio against your 48 kHz/16-bit WAV spec, checks SNR thresholds (rejecting noise above -60 dB), normalizes to consistent LUFS levels, and auto-generates your metadata spreadsheet with filename, duration, dialect tag, and content description. Zero manual bookkeeping errors. We'd build acoustic fingerprinting for duplicate detection and silence-trimming scripts to prevent dead air from inflating hour counts, plus a real-time dashboard tracking per-dialect hours so you never drift past ~83 hours each. Voice talent sourcing and compensation would need to be scoped separately — happy to discuss realistic numbers there. A parallel project we delivered processed 60,000+ multilingual field survey records with automated validation and quality scoring — same structure of multiple source variants feeding one unified clean dataset. Naveen Brainstack Technologies
$750 USD in 60 days
0.0

I am a self-motivated and reliable individual seeking an opportunity to work in an online environment where I can contribute my skills and grow professionally. I have strong communication skills, good time management, and the ability to work independently without supervision. Through my academic experience in media and communication, I have developed research, writing, and digital content skills that are valuable in remote work. I am comfortable using online tools such as email, social media platforms, Google Workspace, and virtual meeting applications. I am a fast learner, adaptable to new systems, and committed to meeting deadlines with high-quality results. I value professionalism, clear communication, and teamwork, even in virtual settings. My goal is to build experience in the digital workspace, deliver consistent performance, and add value to any organization I work with while continuously improving my skills.
$500 USD in 7 days
0.0

Hello, I am very excited to be part of this ambitious project to compile 250 hours of audio recordings in the three main Saudi dialects: Najdi, Khaleeji, and Hijazi. I believe each dialect carries its own unique character and culture, and capturing it in a clear and natural way will create a strong, reliable dataset for training high-quality generative AI models. I am fully committed to ensuring the recordings meet the highest standards of clarity and naturalness, carefully managing audio levels, minimizing background noise, and documenting each segment thoroughly. I am ready to work independently or as part of a team to ensure steady progress and precise completion of project milestones. I look forward to contributing my skills and dedication to make this project a valuable reference for Saudi dialects and a source of high-quality audio content.
$500 USD in 4 days
0.0

Lucknow, India
Payment method verified
Member since Feb 28, 2026