
Closed
Posted
Paid on delivery
We are building a Centralized AI Voice & Chat Agent System.

Architecture Philosophy
- Machine B → Central AI Brain (existing chatbot, KB, CRM, order APIs)
- Machine A → Media Processing Unit (GPU server for STT + TTS + SIP + WebRTC)

Voice and chat must share the same AI brain. We require a developer who can build a low-latency (<1 second), GPU-optimized, production-ready system. This is NOT an API wrapper project. This requires real-time streaming AI experience.

Infrastructure (Already Available)

Machine A
- RTX 5060 Ti 16GB
- Proxmox 8.4
- Docker running directly on host (NO GPU passthrough via VM)
- NVIDIA Container Toolkit access

Machine B
- Existing chatbot backend
- Knowledge base (site-wise)
- CRM integration
- Order status APIs
- Existing React frontend (MUST NOT be modified)

Project Scope

Media Processing Layer (Machine A)
You will build:

Audio Orchestrator
- Handle SIP calls
- Handle WebRTC / WebSocket browser audio
- Route audio to STT
- Send text to Machine B
- Receive AI response
- Route to TTS
- Stream audio back
Must support:
- 10–15 concurrent calls
- Session management
- Site ID tagging
- Fault isolation per session

STT (Speech-to-Text)
Requirements:
- Open-source only (Faster-Whisper / NeMo / equivalent)
- GPU accelerated
- Streaming mode (NOT batch)
- Hindi + English support
- Optimized chunk processing
- Latency target: <300ms chunk processing

TTS (Text-to-Speech)
- Open-source only (Coqui XTTS / VITS / Piper / similar)
- Must be fine-tuned for: natural Indian conversational tone, Hinglish switching, professional assistant voice
- Latency target: audio generation start <400ms
- Model weights must be delivered

Web Voice Backend
- WebRTC or WebSocket
- Secure connection (WSS)
- Embeddable JS mic widget

AI Brain Enhancements (Machine B)
You will:
- Modify chatbot API to accept: source (webchat | voice_call) and a site_id parameter
- Optimize response formatting for voice
- Expose Knowledge Base CRUD APIs (site-wise)
- Enable CRM & order status through voice channel
Existing chat functionality must remain untouched.
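As an illustration of the chatbot API change described above, here is a minimal sketch of a request shape that carries `source` and `site_id`. Only those two names and the values `webchat` and `voice_call` come from the brief; the `ChatRequest` class, the `message` field, the `webchat` default, and the `format_for_channel` helper are hypothetical, and the real Machine B API may look quite different.

```python
from dataclasses import dataclass

# The two channel values named in the project description.
ALLOWED_SOURCES = {"webchat", "voice_call"}


@dataclass(frozen=True)
class ChatRequest:
    """Hypothetical request body for the Machine B chatbot API."""
    message: str               # user text (for voice, the STT transcript)
    site_id: str               # site whose knowledge base should answer
    source: str = "webchat"    # assumed default so existing chat callers keep working

    def __post_init__(self):
        if self.source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if not self.site_id:
            raise ValueError("site_id is required")


def format_for_channel(text: str, source: str) -> str:
    """Strip simple markup that reads badly aloud when the reply goes to TTS."""
    if source != "voice_call":
        return text  # chat replies pass through untouched
    return " ".join(text.replace("*", "").replace("#", "").split())
```

Defaulting `source` to `webchat` is one way to honor the "existing chat functionality must remain untouched" constraint: callers that never send the new field would behave as before.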
Inter-Server Communication
- gRPC preferred (low latency)
- HTTPS required
- Token authentication
- Retry & timeout logic

Analytics
Log:
- Call ID
- Site ID
- Transcript
- STT latency
- AI latency
- TTS latency
- Total latency
- Call duration
- Call outcome
Data must be stored for dashboard usage.

Performance Requirements (Critical)
- End-to-end latency: < 1 second
- 15 concurrent calls stable for 45 minutes
- No GPU OOM
- Natural-sounding TTS
If latency consistently exceeds 1.2 seconds → not acceptable.

Deliverables
- Full source code (Git)
- Docker Compose files
- NVIDIA GPU configuration
- Fine-tuned TTS weights
- API documentation (Swagger)
- Deployment guide
- Load testing report
- Architecture diagram

Required Experience
Must have:
- Real-time streaming STT experience
- Experience deploying AI models on GPU
- Experience with Docker + NVIDIA toolkit
- Experience handling SIP or VoIP systems
- Low-latency system design experience
Nice to have:
- WebRTC experience
- Hindi NLP experience

Do NOT Apply If
- You only have OpenAI API integration experience
- You have never deployed open-source models on GPU
- You have never handled streaming audio
- You cannot demonstrate latency optimization work

Project Type
- Fixed price preferred
- Milestone-based
- Code ownership transferred on completion
- NDA required

Proposal Requirements
In your proposal, please answer:
- Which STT model will you use and why?
- Which TTS model will you use and how will you fine-tune?
- How will you achieve <1 second latency?
- What is your experience with concurrent audio sessions?
- Provide examples of similar projects.
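The analytics fields and latency thresholds listed in the brief could be captured in a record like the sketch below. The field list and the 1.0 s / 1.2 s thresholds come from the brief. The `CallLog` class, the `_s` unit suffixes, the verdict labels, and the assumption that total latency is simply the sum of the three stages (network hops ignored) are illustrative only.

```python
from dataclasses import dataclass

TARGET_S = 1.0   # end-to-end target stated in the description
REJECT_S = 1.2   # beyond this, latency is "not acceptable"


@dataclass
class CallLog:
    """Hypothetical per-call analytics record with the fields the brief lists."""
    call_id: str
    site_id: str
    transcript: str
    stt_latency_s: float
    ai_latency_s: float
    tts_latency_s: float
    call_duration_s: float
    call_outcome: str

    @property
    def total_latency_s(self) -> float:
        # Assumption: total is the sum of the three stages; network time is ignored.
        return self.stt_latency_s + self.ai_latency_s + self.tts_latency_s

    def latency_verdict(self) -> str:
        """Classify the total against the two thresholds from the brief."""
        total = self.total_latency_s
        if total < TARGET_S:
            return "ok"
        if total <= REJECT_S:
            return "warn"
        return "fail"
```

For example, a call with 0.25 s STT, 0.4 s AI, and 0.3 s TTS latency falls under the 1-second target, while one with 0.3 s, 0.7 s, and 0.4 s exceeds the 1.2-second limit.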
Project ID: 40247524
26 proposals
Remote project
Active 18 days ago
26 freelancers are bidding an average of ₹28,732 INR for this job

Hello, I trust you're doing well. I am well versed in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using MATLAB, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. Please check my portfolio link: https://www.freelancer.com/u/sajjadtaghvaeifr
₹35,000 INR in 7 days
7.1

Your system will fail under load if you don't implement proper audio buffer management and GPU memory pooling. Most developers treat this as a simple STT→TTS pipeline, but at 15 concurrent sessions with streaming audio, you'll hit CUDA OOM errors within 20 minutes unless you architect batch inference queues correctly.

Before architecting the solution, I need clarity on two things: What's your expected peak traffic pattern? Are those 15 calls spread evenly, or do you anticipate burst scenarios where 10 calls connect within 30 seconds? And does your existing chatbot API on Machine B already handle async responses, or will I need to implement a callback mechanism to prevent blocking during knowledge base lookups?

Here's the architectural approach:
- FASTER-WHISPER + BATCHED INFERENCE: Deploy Whisper Large v3 with CTranslate2 quantization (INT8) to fit 4 concurrent models in 16GB VRAM. Use a queue system that batches audio chunks every 200ms to maximize GPU utilization while staying under 300ms processing time.
- COQUI XTTS V2 FINE-TUNING: Fine-tune on 2-3 hours of Indian English conversational data using your target voice profile. Implement streaming TTS with sentence-level chunking so the first audio packet streams back in under 400ms while the rest generates in parallel.
- WEBRTC + SIP ORCHESTRATOR: Build a Go-based media server using Pion WebRTC and PJSIP bindings. Route audio through a Redis pub/sub layer for session isolation, so that if one call crashes, others remain unaffected. Each session gets its own Docker container with CPU limits to prevent resource starvation.
- GRPC + CIRCUIT BREAKER: Implement gRPC streaming between machines with Envoy proxy for load balancing. Add circuit breaker logic so if Machine B's chatbot API exceeds 800ms response time, the voice system returns a fallback message instead of hanging.
- GPU MEMORY MANAGEMENT: Pre-allocate CUDA memory pools and implement model swapping: keep 2 STT models warm, 2 TTS models warm, and dynamically load the 3rd instance only when concurrent calls exceed 10.

I've built 3 similar real-time voice AI systems, including a Hindi customer support bot that handled 50 concurrent calls with 780ms average latency. I don't take on projects where GPU infrastructure isn't properly scoped. Let's schedule a 20-minute technical call to walk through your existing Machine B API response times and discuss failure scenarios before I commit to a fixed-price milestone structure.
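The circuit-breaker idea in this proposal (return a fallback when Machine B takes longer than 800ms) can be sketched roughly as follows. This is not the bidder's implementation: the `CircuitBreaker` class, the `max_failures` and `cooldown_s` parameters, and the simplified recovery (the breaker fully closes once the cooldown elapses) are assumptions for illustration. Only the 800ms threshold and the fallback behavior come from the proposal.

```python
import asyncio
import time


class CircuitBreaker:
    """Simplified breaker: time out slow calls, stop calling after repeated failures."""

    def __init__(self, timeout_s=0.8, max_failures=3, cooldown_s=10.0):
        self.timeout_s = timeout_s        # 800 ms, the threshold named in the proposal
        self.max_failures = max_failures  # assumed: consecutive timeouts before opening
        self.cooldown_s = cooldown_s      # assumed: how long the breaker stays open
        self.failures = 0
        self.opened_at = None

    def _is_open(self) -> bool:
        if self.opened_at is None:
            return False
        if time.monotonic() - self.opened_at >= self.cooldown_s:
            # Cooldown elapsed: close the breaker and let calls through again.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    async def call(self, make_call, fallback):
        """Await make_call() unless the breaker is open; return fallback on timeout."""
        if self._is_open():
            return fallback
        try:
            result = await asyncio.wait_for(make_call(), self.timeout_s)
        except asyncio.TimeoutError:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback
        self.failures = 0
        return result
```

Here `make_call` is any zero-argument function returning a coroutine (for instance, a wrapper around the request to Machine B), and `fallback` would be a canned reply that can be sent to TTS immediately.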
₹22,500 INR in 7 days
6.4

With over a decade of experience in full-stack development, I have successfully delivered projects that involved intricate AI and ML tasks. I understand that your project needs real-time streaming voice capabilities that share the same AI brain as chat, with low latency and GPU optimization for media processing. Working with Python, JavaScript, PHP, C, and C++, I possess a deep understanding of frameworks like Faster-Whisper, NeMo, and Coqui XTTS. To achieve the sub-second latency you require for streaming STT and TTS tasks, I will implement your project using Docker containers directly on the host machine. This will leverage NVIDIA's Container Toolkit to optimize GPU performance. Drawing from my experience with SIP and VoIP systems, I can expertly handle the media aspect of your project using WebRTC or WebSocket technologies. Furthermore, my proficiency in cloud platforms such as AWS, Google Cloud, and Azure will be vital in ensuring efficient inter-server communication via gRPC while securing the WSS-based connection for audio streaming. Overall, my extensive experience in deploying projects on GPUs while maintaining robust response times assures me that I'm well-qualified to meet and exceed your unique project needs. Let me transform your centralized AI voice & chat agent system vision into a robust reality!
₹25,000 INR in 7 days
6.1

Hello, I’m a mobile app developer with 7+ years of experience building real-time streaming platforms, VoIP systems, and GPU-optimized AI infrastructures. I specialize in low-latency voice + chat architectures and scalable backend systems.

Your Requirement
You need a centralized AI Voice & Chat Agent System where Machine A handles real-time media processing (SIP/WebRTC, STT, TTS) and Machine B acts as the AI brain. The system must support 10–15 concurrent calls, maintain <1 second latency, and integrate with your existing chatbot, KB, CRM, and order APIs without affecting your frontend.

What I Will Deliver
- Real-time Audio Orchestrator for SIP + WebRTC/WebSocket audio
- GPU streaming STT (Faster-Whisper) with Hindi + English support
- Natural Indian TTS using Coqui XTTS with Hinglish voice tuning
- <1s latency pipeline using streaming chunks + async gRPC communication
- Session routing with site_id tagging and fault-isolated concurrent calls
- Secure WSS/WebRTC backend with mic widget support
- Voice-enabled chatbot API enhancements (KB, CRM, order access)
- Analytics logging (transcripts, latency, call duration, outcomes)
- Docker + NVIDIA GPU optimized deployment with full documentation

I can assure you of 100% satisfaction with the job. If you have any questions, we can discuss them in detail. Let's have a QUICK CHAT to proceed further. Thanks, neha, Mobile App Development
₹25,000 INR in 15 days
5.7

Hello mate, greetings and good afternoon! I’ve carefully checked your requirements and am really interested in this job. I’m a full-stack Node.js developer working on large-scale apps as a lead developer with U.S. and European teams. I offer the best quality and highest performance at the lowest price. I can complete your project on time, and you will be fully satisfied with my work. I’m well versed in React/Redux, AngularJS, Node.js, Ruby on Rails, HTML/CSS, JavaScript, and jQuery. I have rich experience in WebRTC, Mobile App Development, VoIP, PHP, AI Development, Docker, Java and Machine Learning (ML). For more information about me, please refer to my portfolio. I am checking your attachment and will update you shortly. I’m ready to discuss your project and start immediately. I look forward to hearing back from you and discussing all the details. Please respond at your earliest convenience.
₹27,750 INR in 3 days
4.5

Hello, I will build the Media Processing Unit on Machine A using a high-performance streaming framework to manage SIP and WebRTC audio. I will deploy a GPU-accelerated, open-source STT model like Faster-Whisper to handle real-time Hindi and English transcription with chunk-based processing for minimal latency. This unit will bridge audio streams to your Central AI Brain on Machine B, ensuring a unified response across all channels. I will implement a low-latency TTS engine and optimize the entire pipeline on your RTX 5060 Ti to support 10-15 concurrent calls with strict fault isolation. The architecture will focus on real-time streaming to meet your sub-second response target. 1) Which open-source TTS engine do you prefer to use for this deployment? 2) What is the data format for the communication between Machine A and your central chatbot backend? 3) Are you using a specific PBX like Asterisk or FreeSWITCH for the SIP call routing? Thanks, Bharat
₹30,000 INR in 14 days
4.6

Drawing from my 8+ years of experience as a full-stack developer with a particular focus on backend service development, your project needs are very much within my wheelhouse. I have intensive knowledge of Java, Mobile App Development, and PHP alongside robust experience with databases, infrastructure and AI integrations, all of which align well with the demands of the project. One question your project raised was regarding the Speech-to-Text (STT) model. For this inference-oriented project, my choice would be Neptune as it provides state-of-the-art real-time speech recognition in streaming mode, ideal for low-latency applications. Similarly, for the Text-to-Speech (TTS) model I would use Coqui XTTS combined with some fine-tuning, which will deliver the natural-sounding, comprehensive Hindi + English support your requirement calls for. Another emphasized requirement of the project is ultra-low latency. My experience across multiple projects implementing concurrent audio sessions with minimal delay and without GPU OOM errors is a testament to my capabilities in streamlining such operations. Overall, having successful hands-on experience in deploying open-source models on GPUs integrated with Docker, along with device knowledge, I can ensure smooth and optimized execution for the high-performing system you seek to build.
₹25,000 INR in 7 days
4.0

Hello,

This is a true real-time AI voice infrastructure build, not an API integration project, and your architecture is technically sound. We specialize in low-latency, GPU-optimized, production-grade AI voice systems.

STT: Faster-Whisper (large-v3 / distil) in true streaming mode → GPU accelerated, Hindi + English, <300ms chunk processing, proven real-time performance.
TTS: Coqui XTTS / VITS → Fine-tuned for natural Indian conversational tone, Hinglish switching, and professional assistant voice. Model weights delivered.

Latency (<1s) Strategy
• gRPC streaming
• Chunked audio pipelines (no batching)
• GPU FP16 inference
• Async orchestration (STT → AI → TTS parallel)
• Preloaded models (no cold starts)
• Zero-copy buffers
• Session isolation
• Persistent secure connections

Concurrency
Designed for 15+ concurrent sessions, GPU memory control, fault isolation, and sustained stability testing.

Experience
• Real-time streaming STT/TTS
• GPU model deployment (Docker + NVIDIA Toolkit)
• SIP / WebRTC systems
• Low-latency AI orchestration
• Multi-session voice pipelines
• Production-grade AI infrastructure

We deliver full source code, Docker stack, fine-tuned models, analytics, documentation, load testing, and architecture diagrams. Fixed-price, milestone-based delivery supported. This is exactly the class of system we build.

Best regards, Amaan Khan P. CUBEMOONS PVT LTD.
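The "async orchestration (STT → AI → TTS parallel)" item in this proposal can be pictured as three stages connected by queues, so that one stage works on the next item while another is still busy. The sketch below uses stub coroutines in place of real models. The `stage` and `run_pipeline` helpers and the `fake_*` functions are hypothetical, and how the bidder would actually structure the pipeline is not stated.

```python
import asyncio


async def stage(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result; None ends the stream."""
    while True:
        item = await inbox.get()
        if item is None:
            await outbox.put(None)  # pass the end-of-stream marker downstream
            return
        await outbox.put(await fn(item))


# Stubs standing in for the real STT model, Machine B call, and TTS model.
async def fake_stt(chunk):
    return f"text({chunk})"


async def fake_ai(text):
    return f"reply({text})"


async def fake_tts(reply):
    return f"audio({reply})"


async def run_pipeline(utterances):
    """Push utterances through STT → AI → TTS, with each stage running as its own task."""
    q_in, q_text, q_reply, q_audio = (asyncio.Queue() for _ in range(4))
    workers = [
        asyncio.create_task(stage(fake_stt, q_in, q_text)),
        asyncio.create_task(stage(fake_ai, q_text, q_reply)),
        asyncio.create_task(stage(fake_tts, q_reply, q_audio)),
    ]
    for utterance in utterances:
        await q_in.put(utterance)
    await q_in.put(None)

    results = []
    while (item := await q_audio.get()) is not None:
        results.append(item)
    await asyncio.gather(*workers)
    return results
```

Calling `asyncio.run(run_pipeline(["a", "b"]))` returns the items in order, each wrapped by all three stages. In a real system each session would likely get its own set of queues so that a failure in one call does not stall the others.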
₹25,000 INR in 7 days
3.9

Hi there, I’m an experienced developer in real-time AI voice systems with GPU deployment and can build your low-latency (<1s), multi-session voice & chat agent as described.

STT: I propose Faster-Whisper (streaming mode) for Hindi+English, GPU-optimized, <300ms per audio chunk.
TTS: Coqui XTTS or VITS, fine-tuned on Indian conversational/Hinglish speech, delivering natural assistant voices with <400ms audio start.

Latency & Concurrency:
- Async GPU pipelines for STT/TTS.
- gRPC streams between Machine A & B for <50ms overhead.
- Preloaded models to prevent OOM.
- Session isolation supports 10–15 concurrent calls stable for 45+ mins.

Previous Experience:
- Real-time multilingual AI assistant with SIP/WebRTC.
- Fine-tuned VITS models for low-latency, regional accents.
- Multi-session streaming with GPU optimization.

Deliverables:
- Full source code, Docker Compose, NVIDIA configs
- Fine-tuned TTS weights
- API docs, Swagger, deployment guide, architecture diagram
- Load testing report confirming <1s latency

Clarification Questions:
- Preferred SIP gateway or PBX?
- Should the React frontend visualize live audio?
- Is the existing chatbot AI sufficient or do you want enhancements?

I can start immediately, work milestone-based, transfer full code ownership, and comply with NDA.
₹25,000 INR in 7 days
3.7

With my extensive experience in building practical software solutions focusing on performance and scalability, I believe I am the perfect fit for your project. Not only have I worked on various web and app development projects, but I also specialize in real-time systems, making me well equipped to deliver your centralized AI Voice & Chat Agent System with a GPU-optimized and low-latency architecture. I understand the criticality of having a production-ready system that can handle concurrent calls and deliver under 1-second latency. Utilizing my knowledge of Docker and the NVIDIA toolkit along with my previous experience deploying AI models on GPUs, I will ensure that the end-to-end latency of your system stays within that target. Moreover, I am well-versed in SIP and VoIP systems and have excellent problem-solving skills for fault isolation, so you can trust me to implement a high-performance audio orchestrator. As for STT and TTS models, based on the project requirements, Faster-Whisper or NeMo seems to be an ideal choice for STT. For TTS, Coqui XTTS or Piper can be fine-tuned for a natural Indian conversational tone with Hinglish switching capability. Combining these models with my proficiency in WebRTC or WebSocket connections will produce reliable streams of data with minimal latency. Remember, I'm not just about coding; it's about business outcomes. Let me help you turn this idea into an efficient solution!
₹18,000 INR in 5 days
3.2

Do you already have the audio signaling setup (SIP accounts, WebRTC endpoints, TLS certificates) fully provisioned for Machine A, or should the solution include provisioning and configuration of the media transport layer as well?
₹47,000 INR in 5 days
2.8

As an experienced full stack developer with a solid 8-year background in web and app development, I believe I have the necessary skills to successfully complete your Centralized AI Voice & Chat Agent System project. While my current pitch lacks direct overlap with your project's specifics, my expertise lies within the overall architecture matters you're requiring, such as streamlining backend processes and low-latency entity design. My track record of leading diverse teams to deliver high-quality websites equipped with custom themes & plugins that perfectly aligned with clients' visions signifies my attention to detail and commitment to quality. Though I haven't worked on open-source models on GPU as mentioned in the nice-to-have criterion, my extensive experience in handling big projects allows me to think critically, adapt quickly and learn effectively. I'm confident I can efficiently apply my skills to this project. Finally, while I understand that the specific requirements of this project necessitate selecting a developer who has directly tackled similar projects beforehand, I assure you that with my demonstrated commitment, adaptability and proactive attitude towards learning new technologies; I'll not only ensure a smooth project but will also add value by introducing fresh perspectives and innovative solutions throughout the process. Thank you for considering my application and let's connect for further discussion.
₹25,000 INR in 7 days
2.3

Thank you for sharing the details of your Centralized AI Voice & Chat Agent System project. The specific architecture philosophy you outlined caught my attention, especially the need for a low-latency, GPU-optimized system that can handle real-time streaming AI. With over 7 years of experience in software development, I have worked on numerous projects involving real-time streaming AI and GPU optimization. Here is how I plan to approach your project:

1. Media Processing Layer:
- Build Audio Orchestrator to handle SIP calls, WebRTC, STT, TTS, and more
- Utilize GPU-accelerated, open-source STT models like Faster-Whisper for optimized chunk processing
- Implement fine-tuned TTS models like Coqui XTTS for natural Indian conversational tone and low latency

2. Web Voice Backend:
- Establish secure WebRTC or WebSocket connections with embeddable JS mic widget
- Ensure seamless inter-server communication using gRPC with token authentication for low latency

3. AI Brain Enhancements:
- Modify chatbot API for voice channel integration while maintaining existing chat functionality
- Expose Knowledge Base CRUD APIs and enable CRM & order status functionalities through voice channel

In a recent project, I developed a similar real-time streaming AI system for a client in the healthcare industry, achieving <1 second latency with concurrent audio sessions. I used a co
₹13,750 INR in 7 days
1.9

Hi there. I am interested in your project. Because it falls within my area of specialization, I believe I am the right person for it. I have hands-on experience building real-time, low-latency voice systems using GPU-accelerated open-source STT (Faster-Whisper / NeMo) and TTS (Coqui XTTS, VITS) with streaming audio pipelines. For this architecture, I would use Faster-Whisper streaming for Hindi + English due to its proven GPU efficiency and sub-300ms chunk latency, and Coqui XTTS fine-tuned for Indian conversational tone and Hinglish switching. I have designed audio orchestrators handling SIP, WebRTC, and WebSocket streams with concurrent session isolation (15+ parallel calls) using Docker and NVIDIA Container Toolkit without GPU passthrough. Sub-1s latency is achieved through audio chunking, async pipelines, gRPC inter-service communication, warm GPU models, and zero-copy streaming between STT, AI brain, and TTS. I am comfortable modifying existing chatbot backends without breaking current chat flows, adding voice-optimized formatting, analytics, and secure inter-server communication. I hope to hear from you. Thank you.
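The "audio chunking" mentioned in this proposal usually means cutting the incoming PCM stream into short fixed-duration pieces before handing them to STT. A minimal sketch follows, assuming 16 kHz, 16-bit mono audio and a 200 ms chunk. None of these values come from this proposal, and the `chunk_pcm` helper is hypothetical.

```python
from typing import Iterator


def chunk_pcm(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed: 16 kHz mono input
    chunk_ms: int = 200,        # assumed chunk duration
    bytes_per_sample: int = 2,  # assumed: 16-bit samples
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of raw PCM; the last may be shorter."""
    samples_per_chunk = sample_rate * chunk_ms // 1000
    size = samples_per_chunk * bytes_per_sample
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

With these defaults each chunk is 3,200 samples (6,400 bytes), so one second of audio yields five chunks. Smaller chunks lower the wait before STT can start but increase per-chunk overhead, which is the trade-off behind the brief's <300ms chunk-processing target.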
₹27,000 INR in 4 days
1.6

Project Title: Centralized AI Voice & Chat System Developer

Freelancer Profile: To embark on this challenging project, you need a team like ours that not only excels in both frontend and backend technologies but also possesses extensive knowledge and experience with real-time streaming, AI, and low-latency systems. We have the skills you need, from deploying and fine-tuning open-source models on GPU to handling SIP or VoIP systems. Our experience with Docker and the NVIDIA toolkit ensures reliable system configuration while mitigating GPU OOM issues. With proven success in concurrent audio sessions and our comprehensive understanding of WebRTC, Hindi NLP, and gRPC, we are well-positioned to meet your specific project requirements. In conclusion, our team combines technical skills, hands-on experience, innovation, and a commitment to quality-driven delivery: everything you need to turn your groundbreaking AI voice and chat system vision into reality. Let's collaborate and bring forth an advanced, seamless, and efficient Conversational AI system for your organization!
₹25,000 INR in 7 days
1.0

Hello! I understand you need an Android app that automatically replies to Facebook Marketplace Messenger inquiries while running quietly in the background. The goal is to ensure fast first responses, a simple editable message, one-tap enable/disable control, and reliable operation across both standard Messenger and cloned app installations, while minimizing the risk of blocks. I have 6 years of experience developing Android apps, including similar automation and background service projects. My focus will be on delivering a stable solution that listens for Marketplace-specific notifications, triggers an Accessibility Service reply with a random 8–12 second delay, and safely stores buyer IDs locally so no duplicate responses are sent. The app will use minimal permissions, run efficiently in the background, and remain resilient to common Messenger quirks, with safeguards for rate limits and timing. I specialize in Android background services and automation apps, where reliability, discretion, and simplicity are critical. My approach prioritizes clean architecture, careful handling of edge cases, and a minimal interface that performs exactly as needed without unnecessary features. Let’s connect to discuss the project in detail so this auto-reply app is delivered as a safe, reliable, and low-maintenance tool that works exactly when you need it. Best regards, nikita gupta
₹75,000 INR in 15 days
0.2

Hi Dear, I am Khushboo, an experienced AI developer with expertise in GPU-optimized real-time streaming, STT/TTS integration, VoIP systems, and Docker with NVIDIA toolkit. I can build a production-ready media processing layer that supports SIP/WebRTC, low-latency STT in Hindi/English, fine-tuned TTS with natural Hinglish tone, and seamless integration with your existing AI brain. My deliverables will include full source code, Docker Compose setup, fine-tuned model weights, API documentation, deployment guide, and load testing reports. I have prior experience in concurrent audio session handling and latency optimization, ensuring end-to-end response under 1 second with stable performance. Best regards, Khushboo
₹25,000 INR in 7 days
0.0

Thank you for the opportunity to submit this proposal. We understand that you require a modern, scalable, and user-friendly website and mobile application that reflects your brand identity and supports business growth. Our goal is to deliver a secure, high-performance digital platform that enhances user engagement and streamlines your operations.

2. Scope of Work

Website Development
- Custom UI/UX design
- Responsive design (mobile, tablet, desktop)
- CMS integration (WordPress or custom admin panel)
- SEO-friendly architecture
- Contact forms & integrations
- Payment gateway integration (if required)
₹25,000 INR in 7 days
0.0

Patiala, India
Payment method verified
Member since Aug 27, 2025