
Open • Ends in 4 hours • Paid on delivery
I want to launch a web application that lets the user hold a natural voice conversation with an artificial intelligence.
The flow I need:
1. The user speaks; the application captures the audio and converts it to text using STT.
2. That text is sent to a language model (for example, OpenAI GPT-4) to generate the response.
3. The response is converted back to speech using TTS and played to the user immediately.
I am flexible on the choice of libraries or services (Web Speech API, Whisper, Amazon Polly, Google Cloud Text-to-Speech, etc.), as long as the result is stable and low-latency. The code must be well documented and ready to deploy on standard hosting (Vercel, Render, or similar), with step-by-step instructions.
Deliverables I am looking for:
• A clean front end with a “talk” button and a display of the recognized and generated text.
• A back end or serverless functions that handle the calls to the LLM and the voice service.
• A README file explaining installation, environment variables, and how to switch STT/TTS providers if needed.
• A brief guide on how to extend to mobile in the future.
If you have already built something similar or know good practices for reducing latency, mention it in your proposal.
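The three-step flow in the description, together with the requirement to switch STT/TTS providers, can be sketched as a provider-agnostic turn handler. This is only an illustration and not part of the posting: the interface names (`SpeechToText`, `LanguageModel`, `TextToSpeech`), the `runTurn` function, and the stub providers are all hypothetical, and a real build would replace the stubs with calls to whichever services are chosen.

```typescript
// Hypothetical provider contracts. Swapping a provider means writing
// another object that satisfies the same interface.
interface SpeechToText {
  transcribe(audio: Uint8Array): Promise<string>;
}
interface LanguageModel {
  reply(prompt: string): Promise<string>;
}
interface TextToSpeech {
  synthesize(text: string): Promise<Uint8Array>;
}

// What the front end needs to display and play for one turn.
interface TurnResult {
  recognized: string; // text produced by STT
  generated: string;  // text produced by the LLM
  audio: Uint8Array;  // speech produced by TTS
}

// One conversational turn: STT, then LLM, then TTS, in that order.
async function runTurn(
  audio: Uint8Array,
  stt: SpeechToText,
  llm: LanguageModel,
  tts: TextToSpeech,
): Promise<TurnResult> {
  const recognized = await stt.transcribe(audio);
  const generated = await llm.reply(recognized);
  const spoken = await tts.synthesize(generated);
  return { recognized, generated, audio: spoken };
}

// Stub providers (assumptions for illustration; no network calls).
const fakeStt: SpeechToText = {
  transcribe: async () => "hello",
};
const fakeLlm: LanguageModel = {
  reply: async (prompt) => `You said: ${prompt}`,
};
const fakeTts: TextToSpeech = {
  // Returns a zero-filled buffer with one byte per character of text.
  synthesize: async (text) => new Uint8Array(text.length),
};
```

Because `runTurn` depends only on the three interfaces, changing provider would not touch the turn logic. Which concrete provider is wired in could be selected from an environment variable, which would fit the README deliverable.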
Project ID: 40465311
76 proposals
Open for bidding
Remote project
Active 24 hours ago
76 freelancers are bidding on average $151 USD for this job

Hello, Your project description cuts off mid-sentence, so I'm guessing the scope isn't fully locked yet — that's actually the first thing we need to nail. Are you building a chatbot that handles customer support, lead qualification, or something else entirely? The architecture changes completely depending on what conversations need to happen and whether you need voice integration. We've built conversational AI platforms and web apps for 12+ years across SaaS, marketplaces, and custom platforms. Most of our chatbot work involves integrating with existing backends, handling multi-language support, and making sure the AI doesn't sound robotic. That's the part most agencies skip. The budget range you've listed is wide, which makes sense since we're still figuring out what "full scope" means here. Once we talk through whether you need voice, what data the bot accesses, and your timeline, I'll give you a real number. Send me a quick message with the rest of your description and let's jump on a 15-minute call. That's usually enough to scope this properly. Regards, Nurul Hasan
$200 USD in 7 days
8.7

Hi, With over 15 years of diverse experience in Full-Stack development and a specialty in AI integration, I am confident in my ability to deliver on your vision for a web application with conversational AI. My portfolio showcases my expertise and the successful delivery of complex projects similar to yours. This includes the most important elements you specified: low-latency audio-to-text conversion using STT (Speech to Text), natural language processing utilizing leading language models like OpenAI GPT-X, and real-time text-to-speech rendering with proven services such as Google's TTS or Amazon Polly. When it comes to web hosting, I'm comfortable with all mainstream platforms including, but not limited to, Vercel and Render. The resultant code will be thoroughly documented, ensuring seamless handover and easy future modifications, whether you intend to scale onto mobile or add more providers for STT/TTS. In addition to this technical prowess, I take pride in my meticulous approach to implementation, problem-solving skills, and delivering clean and scalable solutions exactly tailored to clients' needs. By choosing me for the job, you can be assured of an application that is not just functional but also reliable. Moreover, my 30-day guarantee period after project completion assures you of continued support even after we have wrapped up this particular project. Thank you for considering me for this project - let's bring your we Thanks!
$75 USD in 3 days
8.3

Hi! My name is Marjan and I'm here to offer you my services as a skilled applicant with over a decade of experience working on Freelancer.com. I believe I am the best-fit candidate for this project due to my extensive experience. I would like to have a discussion to make sure we are both on the same page. Once the scope is locked, I will start working on it right away.
$140 USD in 7 days
7.0

Hello, With 4 years of experience in Website Design, Web Development, and AI Development, I am well-equipped to handle your project for creating a web conversational assistant with AI. I understand the requirements outlined in the project description and am confident in providing a professional solution that meets your needs. I have expertise in PHP, Website Design, Graphic Design, HTML, Web Development, Voice Assistance Devices, AI Chatbot, and AI Development. I have carefully reviewed the project requirements and believe I can deliver this project with precision. I am open to discussing further details in chat to ensure the project's success. Looking forward to discussing the project details with you further. Best regards, Taimoor from Pixels Soft
$199 USD in 7 days
6.8

Hello! With over 13 years of experience designing and implementing custom web solutions, I am the perfect fit for your "Asistente conversacional web con IA" project. My extensive skill set includes expertise in Python, which is ideally suited to the task at hand given my previous AI development projects and proficiency with Flask and Django frameworks. To further assure you of my capabilities, I recently completed a project involving AI chatbots that would enhance user engagement on your application. In line with your specific requirements, it is also pertinent to mention that I have successfully incorporated numerous speech-to-text (STT) and text-to-speech (TTS) APIs including Whisper, Amazon Polly, Google Cloud Text-to-Speech into previous applications. One of the key aspects that sets me apart in client discussions is my ability to deliver swift yet low latency solutions while ensuring that all code is thoroughly documented for smooth deployment onto standard hosting platforms. Having recently worked on the integration of OpenAI GPT-4 into an application similar to yours, I am well accustomed to managing LLM calls effectively and creating reliable serverless functions as desired. Keeping in view future scalability needs, I would also provide a brief guideline on expanding the solution to mobile use.
$40 USD in 1 day
6.3

Hello! ★★★★ ( AI-based web voice assistant with a real-time conversational flow: STT → LLM → TTS ) ★★★★
Project understanding: You need a web application where users can speak, have the speech converted to text (STT), send it to an AI model such as GPT-4, and receive a spoken reply via TTS. The system must be low-latency, stable, deployable on Vercel/Render, and well documented for future expansion, including support for mobile devices.
⚜ Real-time voice capture with Web Speech API / Whisper
⚜ Integration with GPT-4 / LLM for conversational replies
⚜ Text-to-Speech integration (Google / Polly / similar)
⚜ Clean front end with a voice-recording button + real-time transcript display
⚜ Serverless or Node.js back end for API orchestration
⚜ Low-latency optimization for a smooth conversational flow
⚜ Complete deployment guide + README documentation
My approach is to design a minimalist front end first, then build the STT → LLM → TTS pipeline with optimized asynchronous handling and a deployment-ready architecture. Let's get in touch to discuss the technology stack you prefer. Best regards, Farhin B.
$110 USD in 10 days
6.6

Dear Client, Hello There! I’m Md Toriqul Islam, an experienced web designer and full-stack developer with 10+ years of expertise in elegant event websites, responsive UI/UX, and modern custom-themed designs. I understand you need a tropical disco-inspired wedding website for your Koh Samui destination event featuring schedules, venue details, travel guidance, RSVP links, accommodation information, countdown timer, and immersive visuals matching your White Lotus-meets-disco aesthetic. My skills include React, WordPress, custom web design, animation effects, responsive development, and creative branding-focused UI design. Feel free to share your references and imagery. I’m ready to start immediately and create a visually stunning experience for your guests. Best regards, Md Toriqul Islam
$80 USD in 2 days
6.0

Hey there, I'm Vishal Maharaj, a web developer with over 25 years of experience in PHP, HTML, AI Development, and Website Design based in Perth, Australia. I'm passionate about taking on your project to create a web-based conversational AI assistant. I understand the flow you require: audio to text via STT, language model processing, and text to speech via TTS for instant user response. I would approach the project by carefully selecting and integrating suitable libraries or services for stable, low-latency results. The code will be well-documented and deploy-ready on standard hosting platforms like Vercel or Render. Let's discuss further details and kickstart this project together. Cheers, Vishal Maharaj
$250 USD in 5 days
5.1

Having been in the industry for close to a decade, think of me as your one-stop destination for AI development, HTML/CSS proficiency, PHP know-how, and robust Web Development skills. Over the years, I’ve successfully completed numerous projects much like yours within limited budgets. My approachability and honesty have been much appreciated by my clients. I understand the importance of promptness and stability for a project of this nature. In line with the project description, my team and I will deliver a clean-cut front-end design with an easily understandable layout, guaranteeing smooth flow right through to your back-end dealings and serverless functions. You'll also be given a comprehensive README file that will keep things simple for any prospective changes or upgrades, including the addition of a mobile application layer in the near future. Furthermore, my experience gives me deep insight into leveraging top-tier libraries (like OpenAI GPT-4) or services (such as Amazon Polly or Google Cloud Text-to-Speech) for effective results without sacrificing low latency. With us on board, you get maximum flexibility without compromising on stability.
$140 USD in 7 days
5.4

Hello, I can help you develop an AI voice-conversation web application focused on low latency, stability, and a natural user experience. I have experience integrating STT, LLM, and TTS in real-time flows using technologies such as Whisper, Web Speech API, OpenAI, Google Cloud TTS, and serverless architectures ready for deployment on Vercel or Render. My approach would be to build a clean, responsive front end with live voice capture and a display of the recognized text and generated response, together with a lightweight back end that handles the model calls and speech synthesis efficiently. I can also leave the project fully documented, including environment variables, deployment instructions, and a flexible structure for easily switching STT/TTS providers in the future. In addition, I can share good practices for reducing latency, such as partial streaming of responses, audio prefetching, and optimized session handling. Once we review your technical preferences and the exact conversation flow, I can propose a clear architecture and a realistic implementation schedule.
$250 USD in 15 days
5.0

Hello, I have read your request in detail and fully understand the challenge: the success of a conversational voice assistant depends on low latency. As a senior developer (based in the UK), I have built similar voice-to-voice AI architectures and know exactly how to reduce the wait between the user and the AI. To keep the conversation fluid and natural, I propose the following technology stack, optimized for deployment on Vercel (Serverless/Edge functions):
• STT (Speech-to-Text): Use the browser's native Web Speech API on the front end. This provides instant transcription with zero latency. As a high-accuracy alternative, we can integrate the OpenAI Whisper API.
• LLM: OpenAI GPT-4o (Omni) or GPT-3.5-turbo, configured with stream: true on the back end to start receiving the response token by token without waiting for the AI to finish thinking.
• TTS (Text-to-Speech): ElevenLabs (via WebSocket for audio streaming) or Google Cloud TTS. We will transmit the audio in chunks so it starts playing almost instantly.
My deliverables match your list 100%:
• A clean, responsive front end (HTML/JS/React) with an interactive push-to-talk button and on-screen transcription.
• A secure serverless back end to hide the API keys and handle the calls.
• A detailed README with step-by-step instructions for deploying, configuring environment variables, and switching providers.
• A brief technical guide on how to migrate this logic to React Native or Flutter for future mobile apps.
I would love to show you how I structure the code to reduce “Time-to-First-Byte” (TTFB) in this kind of application. Are you available for a quick chat to discuss which voice or tone you prefer for the assistant? Regards, Ross C.
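The idea in this proposal of streaming LLM tokens and sending audio in fragments can be illustrated with a small sentence chunker: tokens are buffered as they arrive, and each completed sentence is released so TTS can start on it before the full reply exists. This is a sketch under stated assumptions, not the bidder's code: `SentenceChunker` and `chunkAll` are hypothetical names, and the boundary rule (a `.`, `!`, or `?` followed by whitespace) is deliberately naive, so abbreviations such as "e.g. " would split early.

```typescript
// Buffers streamed text and releases whole sentences as soon as they end.
class SentenceChunker {
  private buffer = "";

  // Add one streamed token; returns any sentences completed by it.
  push(token: string): string[] {
    this.buffer += token;
    const sentences: string[] = [];
    // Shortest prefix ending in . ! or ? that is followed by whitespace.
    const boundary = /^([\s\S]*?[.!?])\s+/;
    let match = boundary.exec(this.buffer);
    while (match !== null) {
      sentences.push(match[1].trim());
      this.buffer = this.buffer.slice(match[0].length);
      match = boundary.exec(this.buffer);
    }
    return sentences;
  }

  // Call when the stream ends; returns leftover text, or null if none.
  flush(): string | null {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? rest : null;
  }
}

// Helper for illustration: runs a full token list through the chunker.
function chunkAll(tokens: string[]): string[] {
  const chunker = new SentenceChunker();
  const out: string[] = [];
  for (const token of tokens) {
    out.push(...chunker.push(token));
  }
  const rest = chunker.flush();
  if (rest !== null) {
    out.push(rest);
  }
  return out;
}
```

In a real pipeline each string returned by `push` would be handed to the TTS provider immediately, so playback of the first sentence could overlap with generation of the rest.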
$200 USD in 4 days
5.2

I'm Juan Pablo. I understand exactly what you need: a web application that offers a fluid, natural, low-latency voice conversation between the user and an AI. I have built conversational assistants with real-time STT, LLM, and TTS, so I can deliver a clean front end with a “talk” button, immediate transcription, the generated response, and instant playback, along with a back end or serverless functions that handle the calls to the model and to the voice provider you choose. I can leave everything documented, with clear environment variables, a README for deploying on Vercel or Render, and a guide for extending it to mobile later. If you like, before we start I can explain how I reduce latency in voice assistants, how I structure STT-LLM-TTS pipelines, or how I prepare conversational front ends. I can start immediately.
$350 USD in 3 days
5.1

Hello, I am available now. I have read your project description carefully and I understand what you want. 300% Confidence!!! I have 7+ years of experience in Web Development, PHP. I have completed similar projects. Please contact me. Best regards, Steven
$140 USD in 7 days
4.6

Hello, do you have a list of specific features you would like me to include in the application? I am excited about developing a web application that enables fluid interactions between users and artificial intelligence. I propose implementing a flow in which the user speaks, the audio is converted to text using STT, and the text is then sent to a language model such as GPT-4 to generate responses. These responses will be converted to speech using TTS, ensuring an instant, natural experience. I am flexible about libraries and will choose those that offer stability and low latency. To make sure the result meets your expectations, can you clarify which specific LLM you prefer? Do you have a preference for STT/TTS libraries? Also, which hosting do you prefer, Vercel or Render? The cost and timeline are estimates until we confirm the details. I am ready to start and look forward to your reply. Regards, [Your Name] Relevant Portfolio: • https://www.freelancer.com/u/amjad2 Best Regards, Amjad Iqbal
$150 USD in 3 days
4.8

Dear Client, Are you looking for a full-stack engineer who can design and deploy a low-latency, voice-to-voice AI web application with an optimized audio-streaming pipeline? I am a senior full-stack developer and AI integration specialist with extensive experience building real-time communication systems, managing asynchronous webhooks, and optimizing cloud architectures for speed. I specialize in handling complex browser-side media capture, configuring serverless endpoints or WebSockets to handle concurrent streams, and minimizing the perceived latency of Text-to-Speech (TTS) and Speech-to-Text (STT) integrations. I will provide a comprehensive README file detailing step-by-step instructions for hosting on Vercel or Render, configuration options for switching vendors, and a detailed architect's blueprint for porting the audio layer into a cross-platform mobile framework like React Native or Flutter down the road. I am available to start immediately and look forward to delivering a fast, beautifully engineered voice AI application for your brand. Best regards, Oleksandr
$140 USD in 1 day
4.8

✋ Hi there! ✋ THE GOAL OF THE PROJECT: TO BUILD A LOW-LATENCY WEB VOICE ASSISTANT WITH STT, LLM, AND TTS FOR NATURAL REAL-TIME CONVERSATION. I have carefully read the requirements for a voice-based conversational web app with speech-to-text, LLM processing, and text-to-speech output, including deployment and documentation. I am the best fit due to my strong experience in real-time AI voice systems and API integrations.
1. STT integration using Web Speech API or Whisper for accurate transcription
2. LLM response generation using the GPT API with an optimized low-latency flow
3. TTS integration with Google or AWS Polly for real-time voice output
I will provide UI design, database management, testing, and full source code delivery at project completion, plus a deployment guide and README documentation. I have 9+ years of experience as a full-stack developer. I have built similar AI chatbot and voice assistant systems with optimized response pipelines. Looking forward to chatting with you to make a deal. Best regards, Elisha Mariam
$111 USD in 11 days
4.9

Good day! We are a full-stack development team with experience in conversational AI applications, STT/TTS integration, and low-latency optimization. We can build a fluid web application where the user speaks naturally with the AI in real time.
Our team can deliver:
✔ Clean, responsive front end with a “Talk” button
✔ Voice capture + real-time transcription
✔ Integration with GPT-4 or another LLM
✔ Response-to-speech conversion with low-latency TTS
✔ Back end/serverless ready for Vercel, Render, or similar
✔ Documented code and a scalable structure
✔ Complete README with installation and environment variables
✔ Guide for future mobile expansion
Possible technologies:
• Whisper / Web Speech API for STT
• OpenAI / Claude / Gemini for AI
• Amazon Polly / Google TTS / ElevenLabs for voice
We can also optimize:
• Audio and response streaming
• Caching and session handling
• Response time and stability
• Flexible switching of STT/TTS providers
Why us?
• Experience with voice apps and conversational AI
• Good practices for reducing latency
• Modern front end and fluid UX
• Clean, easy-to-maintain code
• Professional deployment and documentation
We can start immediately and deliver a functional MVP quickly.
$140 USD in 7 days
4.3

Hi, I've built voice-based applications that integrate speech-to-text and text-to-speech functionalities using Web Speech API and Google Cloud services. My experience includes handling real-time audio processing and generating responses with AI models like GPT-4. I can start with a small test project to ensure we align before moving to a full solution, including a clean front-end, robust back-end, and detailed documentation. Let’s get started! Best Regards, Ivica
$140 USD in 7 days
4.1

Hello, We will capture the user's audio, use the Web Speech API to convert it to text, and then send that text to GPT-4 to generate the response. The response will be converted back to speech using Google Cloud Text-to-Speech. I have implemented similar applications using Whisper for STT and Amazon Polly for TTS, achieving processing latency below 200 ms. For this project, I propose a modular approach using serverless functions for the back end, which eases scaling and keeps the documentation clear for deployment on Vercel. For the front-end display, do you prefer the recognized text to appear in real time or after the response has been generated?
$150 USD in 3 days
4.2

I JUST COMPLETED A SIMILAR PROJECT. I have just built a voice-controlled chatbot using Whisper for STT and Google Cloud Text-to-Speech for a smooth user experience. You want a natural, low-latency conversation with clear speech recognition and responsive AI answers. I will ensure seamless audio capture, quick API calls to GPT-4, and efficient TTS playback, with thorough documentation covering deployment and environment adjustments. REACH OUT FOR A FREE CONSULTATION, I PROMISE NOT TO MAKE THIS MORE COMPLICATED THAN IT NEEDS TO BE. Regards, Stefan.
$150 USD in 14 days
4.1

Madrid, Spain
Payment method verified
Member since Feb 13, 2021