
Open
Posted
•
Ends in 3 hours
Paid on delivery
I need a production-ready scraper that keeps my Indeed dataset fresh every hour. The crawler should pull every field I have identified (job title, company name, official company website, full description, salary, skills, experience, location, apply URL and posted date) while de-duplicating anything already stored. Of those, job title and company name must never be missed because they anchor the rest of my pipeline.

I would like the core built in Python and I am comfortable if you reach for Playwright, Puppeteer, Scrapy or Selenium so long as the codebase stays clean and readable. Results should flow straight into either PostgreSQL or MongoDB; future scale is important, so please structure tables/collections with growth in mind.

Indeed can be unforgiving, so the scraper has to rotate proxies by default. Feel free to layer on further anti-bot tactics (stealth headless settings, adaptive delays, etc.), but proxy rotation is the non-negotiable starting point.

When a listing does not mention the company’s website, the scraper should still track it down, whether through an extra search pass or by parsing the apply URL, so every record ends with an accurate domain.

Deliverables
• Fully functional scraper with hourly auto-refresh
• Duplicate-detection logic and company-domain lookup
• Database integration script (PostgreSQL or MongoDB)
• Configuration for proxy rotation and other anti-blocking measures
• Clear, step-by-step deployment guide with environment requirements

In your proposal, show me similar scraping projects you have tackled, outline your preferred tech stack, describe the anti-blocking flow you plan to implement, and give me an honest timeline from first commit to live deployment.
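The brief's "parsing the apply URL" fallback for missing company websites could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `company_domain_from_apply_url` and the `AGGREGATOR_HOSTS` set are hypothetical, and the listed hosts are examples rather than a complete list. A `None` result is the signal to fall back to the extra search pass.

```python
from urllib.parse import urlparse

# Hypothetical, incomplete list of hosts that serve apply links
# but are not the employer's own website (job boards, ATS vendors).
AGGREGATOR_HOSTS = {"indeed.com", "linkedin.com", "greenhouse.io", "lever.co"}


def company_domain_from_apply_url(apply_url):
    """Return the host of an apply URL if it looks like the employer's own
    site, or None when the URL is empty or points at a job board / ATS."""
    host = (urlparse(apply_url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not host:
        return None
    for aggregator in AGGREGATOR_HOSTS:
        # Match the aggregator itself and any of its subdomains.
        if host == aggregator or host.endswith("." + aggregator):
            return None
    return host
```

Note that this returns the full host (for example a `careers.` subdomain), not the registrable domain; reducing it further would need a public-suffix list, which is left out here.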
Project ID: 40465159
58 proposals
Open for bidding
Remote project
Active 3 days ago
58 freelancers are bidding on average $171 USD for this job

⭐⭐⭐⭐⭐ Build a Reliable Scraper for Fresh Indeed Data Every Hour ❇️ Hi My Friend, I hope you are doing well. I checked your project requirements and see you are looking for a reliable web scraper for Indeed data. Look no further; Zohaib is here to help you! My team has completed over 50 similar projects for web scraping. I will build a Python-based scraper that pulls all necessary fields while ensuring clean and readable code. The results will flow into PostgreSQL or MongoDB, structured for future scalability. ➡️ Why Me? I can easily create your production-ready scraper as I have 5 years of experience in web scraping, focusing on data extraction, proxy management, and database integration. My expertise includes Python, Selenium, and Scrapy, ensuring a comprehensive solution for your needs. ➡️ Let's have a quick chat to discuss your project in detail. I can show you samples of my previous work and explain how I will implement proxy rotation and other anti-bot measures. Looking forward to discussing this with you in chat. ➡️ Skills & Experience: ✅ Python ✅ Web Scraping ✅ Selenium ✅ Scrapy ✅ Playwright ✅ Puppeteer ✅ PostgreSQL ✅ MongoDB ✅ Proxy Rotation ✅ Data Deduplication ✅ API Integration ✅ Database Design Waiting for your response! Best Regards, Zohaib
$150 USD in 2 days
8.0

As the leader of a talented team at BN-Droids Digital Services, I can confidently say we are equipped to deliver on every aspect of your job scraping project. We specialize in web data scraping—with over 1 million data entries extracted daily—so we understand the challenges of working with demanding platforms like Indeed. Our in-depth knowledge of Python and Scrapy, backed by robust experience with Playwright, Puppeteer, and Selenium, ensures clean and readable code that meets your requirements
$30 USD in 7 days
6.9

Hi, We’ve built several production-ready scrapers that extract data from job portals like Indeed, LinkedIn, and Glassdoor. We understand the importance of job titles and company names as anchors for downstream processes, and we’ve implemented robust de-duplication mechanisms to ensure data accuracy. For this project, I recommend using Python with Playwright, as it’s fast, reliable, and well-suited for scraping dynamic content. We can also integrate a dedicated proxy manager to handle multiple proxies and automatically switch them based on success rates, ensuring maximum uptime. Let’s schedule a 10-minute call to discuss your project in more detail and see if I’m the right fit. I usually respond within 10 minutes. Best regards, Adil
$154 USD in 7 days
6.0
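The proxy manager described in the proposal above, one that switches proxies based on success rates, might be sketched as follows. This is not the bidder's implementation: the `ProxyPool` class, its thresholds, and the proxy URLs used with it are all hypothetical.

```python
import random


class ProxyPool:
    """Tracks per-proxy outcomes and avoids proxies with a poor success rate."""

    def __init__(self, proxies, min_success_rate=0.5, min_attempts=5):
        self.stats = {p: {"ok": 0, "fail": 0} for p in proxies}
        self.min_success_rate = min_success_rate  # assumed threshold
        self.min_attempts = min_attempts          # grace period before judging

    def success_rate(self, proxy):
        s = self.stats[proxy]
        total = s["ok"] + s["fail"]
        return 1.0 if total == 0 else s["ok"] / total

    def healthy(self):
        """Proxies that are still untested or above the success threshold."""
        result = []
        for proxy, s in self.stats.items():
            total = s["ok"] + s["fail"]
            if total < self.min_attempts or self.success_rate(proxy) >= self.min_success_rate:
                result.append(proxy)
        return result

    def pick(self, rng=random):
        # Fall back to the full list if every proxy currently looks unhealthy.
        candidates = self.healthy() or list(self.stats)
        return rng.choice(candidates)

    def report(self, proxy, ok):
        """Record the outcome of one request made through `proxy`."""
        self.stats[proxy]["ok" if ok else "fail"] += 1
```

With Playwright, the chosen value would typically be passed as the `server` entry of the `proxy` option when launching the browser, and `report` would be called after each page load succeeds or is blocked.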

Hi, I can build a robust, production-ready Indeed scraper using Python and Playwright, designed for hourly automated execution. The system will extract all specified fields (job title, company, website, description, salary, etc.) with strict deduplication logic to ensure data freshness. I will implement mandatory proxy rotation and advanced anti-bot measures (stealth mode, adaptive delays) to maintain reliability. For records missing the company website, the script will perform a secondary lookup via search or apply URL parsing to ensure every entry is complete. Data will be stored in a scalable PostgreSQL or MongoDB schema, optimized for future growth. You will receive the complete source code, database integration scripts, configuration for proxy rotation, and a step-by-step deployment guide. I have extensive experience building high-volume scrapers for complex sites like Indeed, ensuring clean code and reliable data pipelines. I also offer FREE post-delivery support to monitor initial scraping cycles, adjust selectors if Indeed updates its layout, and assist with optimizing proxy performance during the first month. Let's discuss the project in more detail.
$150 USD in 5 days
5.9

I can build a production-ready Indeed scraper in Python with hourly automated crawling, proxy rotation, stealth anti-bot handling, duplicate detection, and reliable extraction of all required fields including guaranteed job title/company capture and company-domain enrichment. My preferred stack is Playwright + async Python + PostgreSQL/MongoDB with rotating proxies, adaptive delays, fingerprint randomization, retry queues, and scalable database architecture designed for high-volume job ingestion pipelines and future expansion
$140 USD in 1 day
5.4

Hello, Understanding your need for a production-ready Indeed scraper, I propose a Python solution using Playwright for reliable rendering and a robust proxy rotation layer. The workflow:
1) Crawl job listings hourly, extracting title, company, location, salary, skills, experience, description, apply URL, and posted date.
2) Deduplicate by composite key of title and company, ensuring no repeats.
3) Resolve missing company domains via a secondary search on the apply URL or a quick Google lookup.
4) Persist clean data into PostgreSQL with normalized tables designed for scale, or MongoDB collections if preferred.
5) Configure stealth headless settings, adaptive delays, and a rotating proxy pool to avoid blocks.
6) Deliver a Dockerised service with Celery workers, a CI pipeline, and a step-by-step deployment guide.
I’ve built similar hourly scrapers for tech-job aggregators and can deliver a stable, maintainable solution within 12 days for $180. Best Regards Naveen Thakur
$30 USD in 1 day
5.1

I understand you require a production-ready Python scraper to keep your Indeed dataset updated hourly, capturing all specified fields including job title, company name, official website, description, salary, skills, experience, location, apply URL, and posted date. My experience includes building and maintaining similar high-volume scraping systems that successfully handled over a million records daily without data loss. The scraper will be developed using Python with Playwright for reliable browser automation, ensuring it can navigate Indeed’s dynamic content and handle modern web technologies effectively. Data will be stored in a PostgreSQL database, with custom logic implemented to de-duplicate entries based on a combination of job title, company name, and apply URL to ensure data integrity. The codebase will be structured with clear modularity for readability and maintainability. How should the de-duplication logic prioritize job title and company name if an apply URL is missing or inconsistent across different postings for the same role? Ready to start as soon as you confirm scope.
$208 USD in 21 days
5.2

With my extensive experience in web development and web scraping, specializing in languages such as Python and Java, I'm confident in my ability to deliver a production-ready scraper that meets your exact specifications. My previous work includes large-scale data extraction projects like yours, where clean and scalable code is of paramount importance. I have used Playwright, Puppeteer, Scrapy and Selenium for crawling tasks with great success and can assure you of a clean and readable codebase that will grow seamlessly with your data. To address the anti-blocking measures you require for this project, I'll prioritize proxy rotation as the foundation while also utilizing settings for stealth headless browsing and adaptive delays where necessary. Additionally, my experience with Google App Engine will be valuable in crafting an efficient architecture for hourly refreshing with appropriate duplicate-detection logic. My data analysis skills go beyond just scraping - I will ensure that the job titles and company names have utmost accuracy to enable your smooth extraction pipeline flow. Moreover, my proficiency with PostgreSQL or MongoDB ensures smooth data integration into your chosen database. Lastly, I'll provide an in-depth deployment guide for future reference. Choose me for a full-stack-approach to your web scraping needs that guarantees not just quality code delivery but a solution developed intentionally to help you scale. Let's get started today!
$140 USD in 2 days
4.8

Hello dear, Greetings from Md. Toriqul Islam! We are a dedicated Web Design & Development team with 10+ years of industry experience. I’m Engineer Toriqul Islam, an experienced Computer Science & Engineering graduate from RUET. We specialize in building modern, scalable, and user-friendly digital solutions tailored to business needs. What I Offer We help businesses grow online by delivering: • Clean, modern, and responsive website designs • High-performance and scalable web applications • User-focused UI/UX for better engagement and conversion My Technical Expertise We work across a wide range of technologies, including: • Frontend: HTML5, CSS3, Bootstrap, JavaScript, jQuery, Angular, React • Backend: Node.js, PHP, Laravel, .NET, CodeIgniter, Ruby on Rails, Python • CMS & Platforms: WordPress • Database: MySQL, MongoDB • Mobile Development: React Native, Flutter, and more Why choose me? ✔️ Clean, optimized, and well-documented code ✔️ Reusable and scalable components ✔️ On-time delivery with complete requirement fulfillment We are confident in our ability to turn your ideas into a powerful digital product. Let’s discuss your project and make it a success. Looking forward to working with you! Best Regards, Md. Toriqul Islam
$90 USD in 5 days
5.0

Hi, I do have some questions, but here's what I can do for you:
- Build a robust Python scraper using Playwright with stealth mode, rotating proxies, and adaptive delays so it stays live on Indeed without getting blocked
- Design a smart deduplication system that anchors on job title and company name, ensuring those fields are never missed while keeping your dataset clean across hourly refreshes
- Implement an automatic company domain lookup that extracts the website from the apply URL or runs a secondary search pass when it's missing, so every record is complete
- Structure your PostgreSQL or MongoDB schema with indexing and partitioning ready for scale, so millions of listings stay fast and queryable
- Deliver a clean, well-commented codebase with a step-by-step deployment guide and a cron setup that handles the hourly refresh reliably
Note: full source code will be delivered. Send me a message, let's discuss.
$99 USD in 3 days
4.9

I understand that keeping your Indeed dataset fresh every hour is crucial for your operations. The challenge of not only scraping the necessary job details but also ensuring data integrity through de-duplication and accurate company domain tracking can be complex. With over 12 years of experience in developing scalable scrapers, I have successfully completed similar projects using Python with frameworks like Scrapy and Selenium. My approach would involve creating a robust scraper that employs an effective anti-blocking strategy, including proxy rotation and stealth headless settings to navigate Indeed's defenses efficiently. I recommend using PostgreSQL for structured data storage, ensuring future scalability through thoughtful schema design. The timeline from first commit to deployment can typically range from 4 to 6 weeks depending on the complexity and required adjustments during testing. Could you please clarify if there are specific fields or formats you prefer for the company website lookup?
$250 USD in 7 days
4.6

Having successfully delivered various web scraping and data automation projects over a decade, I'm confident in my ability to provide you with a highly functional and production-ready Indeed job scraper tailored exactly to your needs. My proficiency in tech stacks including Python, MongoDB, and PostgreSQL combined with extensive experience in dealing with challenges posted by platforms like Indeed make me the right fit for this project. Understanding the importance of consistent scalability for long-term usage, I'm adept at developing clean code that ensures no-data loss and smooth operation even under high growth rates. In addition to your requirements, I'll bring anti-blocking measures like reliable proxy rotation for undisturbed runs, as well as the ability to track down accurate domain information even when it's not explicitly provided on the listing.
$140 USD in 7 days
4.6

Hi there, I see you need a production ready scraper to keep your Indeed dataset fresh every hour, pulling job title, company name, official company website, full description, salary, skills, experience, location, apply URL, and posted date, with deduplication, proxy rotation, and company website resolution. You want Python based with Playwright or Scrapy, and PostgreSQL or MongoDB storage. I have built 8 job board scrapers including a LinkedIn jobs crawler that ran for 6 months without bans using rotating proxies and stealth Playwright. I will build your Indeed scraper using Playwright with stealth options, a pool of rotating proxies, adaptive delays, and browser fingerprint randomization. For missing company websites, I will perform a Google search via a proxy to find the official domain. Data will be stored in PostgreSQL with unique constraints on job ID or title plus company plus posted date to prevent duplicates. The scraper will run hourly via cron or a scheduler. I will provide a deployment guide with environment variables for proxy list and database connection. Best regards, Mobasher Reza
$140 USD in 3 days
3.9
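The unique-constraint approach to duplicates mentioned in the proposal above (title plus company plus posted date) can be illustrated with a small sketch. To stay runnable without a server it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL; the table layout and the `insert_job` helper are hypothetical. PostgreSQL accepts the same `ON CONFLICT ... DO NOTHING` clause, though column types and parameter placeholders would differ.

```python
import sqlite3

# Hypothetical minimal schema: title and company are NOT NULL because the
# brief says those two fields must never be missed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    company TEXT NOT NULL,
    posted_date TEXT NOT NULL,
    apply_url TEXT,
    UNIQUE (title, company, posted_date)
)
"""


def insert_job(conn, job):
    """Insert one job dict; return True if stored, False if it was a duplicate."""
    cur = conn.execute(
        "INSERT INTO jobs (title, company, posted_date, apply_url) "
        "VALUES (:title, :company, :posted_date, :apply_url) "
        "ON CONFLICT (title, company, posted_date) DO NOTHING",
        job,
    )
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    job = {"title": "Data Engineer", "company": "Acme",
           "posted_date": "2024-01-15", "apply_url": "https://example.com/apply"}
    print(insert_job(conn, job))  # first hourly run stores the row
    print(insert_job(conn, job))  # a later run sees the same listing again
```

Letting the database enforce uniqueness keeps the hourly job idempotent: re-running a crawl cannot create duplicate rows even if the scraper itself has no memory of earlier runs.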

Hi, I am a Python developer with 8 years of experience in web scraping and data pipelines. I am familiar with Python, Scrapy, Playwright, PostgreSQL, MongoDB, etc. For this project, the most important part is building a scalable Indeed scraper that refreshes hourly, avoids duplicates, and reliably collects all key fields while handling anti-bot measures. I can implement proxy rotation, stealth headless browsing, domain lookup, and clean database integration to keep your dataset accurate and up-to-date. I'm an individual freelancer and can work in any time zone you want. Please contact me with the best time for you to have a quick chat. Looking forward to discussing more details. Thanks. Emile.
$250 USD in 7 days
4.0

Hi, I have carefully reviewed your project requirements and am confident I can deliver a high-quality, scalable, and performance-driven solution that aligns perfectly with your goals. With over 7 years of experience in full-stack web development, I specialize in building robust, secure, and conversion-focused digital ecosystems rather than just standalone websites. Core Expertise & Technical Skills ✔ Custom Web Development: PHP, CodeIgniter, Python, MySQL ✔ Advanced WordPress: Custom Themes, Plugins, WooCommerce, Elementor ✔ Frontend & Design: HTML5, CSS3, JavaScript, jQuery, Figma/PSD to Pixel-Perfect HTML ✔ Optimizations: API Integration, Speed/Performance Enhancement, and Mobile-First Responsive Design ✔ Maintenance: Advanced Bug Fixing, Debugging, and Security Audits What I Bring to Your Project ✔ Clean, Scalable Architecture: Well-structured code built for long-term growth and SEO optimization. ✔ Performance-Driven Focus: Fast-loading, secure, and fully responsive across all devices. ✔ Business-Centric Approach: Designed to enhance user experience and deliver measurable business value. ✔ Seamless Collaboration: Clear communication, regular progress updates, and reliable post-launch support. Let’s connect and discuss how we can turn your idea into a powerful digital product that supports your business growth. I would be happy to get started immediately. Thank you.
$111 USD in 3 days
3.7

Hello, I can build a production-ready Python scraper that keeps your Indeed dataset updated every hour, using Playwright or Scrapy with a clean, modular architecture. It will reliably extract job title, company name, website, description, salary, skills, experience, location, apply URL, and posted date, with strict validation to ensure no critical fields (especially title and company) are ever missed. I will integrate proxy rotation as a core layer along with adaptive delays, session handling, and request throttling to maintain stability under anti-bot protection. The pipeline will include duplicate detection, company-domain enrichment (via apply URL parsing or secondary lookup), and direct streaming into PostgreSQL or MongoDB with a scalable schema designed for high-volume growth. I can also deliver a full deployment setup with hourly scheduling (cron/systemd), environment configuration, and a clear step-by-step guide so it runs reliably in production. I’m available to discuss stack preferences and can begin immediately with a structured delivery plan and timeline from first commit to live deployment. Best regards, Khuda Bux
$140 USD in 2 days
3.5

Hello, Indeed hourly scraping with proxy rotation + dedup. Python Scrapy+Playwright. Proxy rotation+retry. Postgres dedup via job_id+company. Edge case: Indeed returns the same job under different URLs after redirects. Day 1: crawler + DB pipeline for 1 location, hourly run. Should proxy rotation use residential or datacenter proxies?
$140 USD in 3 days
3.6
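The edge case raised in the proposal above, the same job appearing under different URLs, is commonly handled by keying on a stable identifier taken from the URL instead of the URL itself. The sketch below assumes that identifier travels in a `jk` query parameter; that parameter name is an assumption and should be checked against live listing URLs. The helper name `job_key` is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def job_key(url, param="jk"):
    """Return the value of the job-key query parameter, or None if absent.

    `param` defaults to "jk", which is an assumption about how the site
    encodes its job identifier, not something confirmed by the listing.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get(param)
    return values[0] if values else None
```

Two URLs that differ only in path or tracking parameters then map to the same key, which can serve as the `job_id` half of the `job_id+company` dedup key.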

Hi, we are a team of 20+ AI/ML Engineers based in Delhi - have completed 300+ projects with 100% client satisfaction & long term association. As a seasoned professional and the leader of an esteemed AI-driven software development team, I possess precisely the skill set that your project requires. My proficiency in Python is well-known, but crucially so is my experience in successful web scraping initiatives, including scraping job data from various platforms like Indeed. Regarding tech stack preference, we are adept at working with Scrapy and Selenium, modules well-suited for this undertaking. Our core belief is that clean, readable code fosters efficiency and flexibility for future scale — an aspect you emphasized. We also have ample experience with PostgreSQL and MongoDB, ensuring seamless integration with your databases. Given the challenges associated with scraping from Indeed effectively, I completely understand your need for anti-bot measures, including proxy rotation. We plan on implementing adaptive delays and stealth headless settings to further strengthen these tactics to minimize any risks of blocking or IP banning.
$100 USD in 4 days
4.1

GOOGLE_API_KEY=<redacted> GEMINI_MODEL=gemini-3-flash-preview # Telegram: SEC 8-K Alerts channel SEC_TELEGRAM_BOT_TOKEN=<redacted> SEC_TELEGRAM_CHAT_ID=<redacted> # Telegram: SEC 8-K Stats channel (separate channel for daily reports) SEC_STATS_TELEGRAM_BOT_TOKEN=<redacted> SEC_STATS_TELEGRAM_CHAT_ID=<redacted> # Telegram: SEC Financial Results channel (10-Q / 10-K notifications) SEC_FINANCIAL_TELEGRAM_BOT_TOKEN=<redacted> SEC_FINANCIAL_TELEGRAM_CHAT_ID=<redacted>
$70 USD in 3 days
2.9

As an experienced freelancer with a sharp focus on web development, particularly in Python, I am confident that my skillset aligns perfectly with your Indeed Job Scraper project. At Paper Perfect, we have an extensive history of delivering tailored and scalable solutions across various industries and technical verticals. Our portfolio includes many successful scraping projects similar to your requirements. For client satisfaction, we prioritize clean and readable codebases that are flexible for future growth. This approach ensured our previous projects' longevity and efficacy;__( provide example protocol)__ which should translate well for the data throughput you need for maintaining your dataset's freshness. Anti-blocking strategies are paramount for efficient scrapes, and I intend to implement proxy rotation and other headless browsing techniques to minimize encounters with Indeed's stringent policies. In addition, our commitment to comprehensive testing will iron out any bottlenecks or complexities early on - rest assured your scraper will rotate through job listings smoothly and without downtime. Completing the clone of a company's website even when it isn't obviously mentioned is also part of our scraping expertise. Utilizing a depth search parser method enhanced by an effective use of one-time search pass remains reliable in drawing out accurate data giving you an unwavering thread of proprietary information which I will diligently
$140 USD in 7 days
2.6

Bangalore, India
Payment method verified
Member since Apr 19, 2016