Create a Python script that standardizes scraped data from existing scripts before they're saved to a database

I have 10 different scraping scripts running on my VPS. Each one captures data from a website and stores it in a database.

The problem is that the captured data is inconsistent, and I want it displayed in a consistent format in my database and on my website. You must create a Python script that will 'standardize' the data.

FOR EXAMPLE: One of the fields that is captured on each website is 'Manufacturer'.

Website 1: 'Manufacturer' = {GE, TOSH, WST}

---> {General Electric, Toshiba, Westinghouse}

Website 2: 'Manufacturer' = {Westing., Toshiba, General elec.}

---> {Westinghouse, Toshiba, General Electric}

I want to insert a script within each of the scraping scripts that accomplishes this. Some of the filters will require Regular Expressions, so your script should be set up to be able to handle that.
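
A minimal sketch of the kind of lookup the example above describes, assuming exact aliases are tried first and regular expressions second. The names `MANUFACTURER_ALIASES`, `MANUFACTURER_PATTERNS`, and `standardize_manufacturer` are hypothetical, and the entries only cover the sample values listed above; the real lists would be filled in per field.

```python
import re

# Exact aliases, compared case-insensitively after trimming whitespace.
# (Hypothetical template; extend with the real abbreviations.)
MANUFACTURER_ALIASES = {
    "ge": "General Electric",
    "tosh": "Toshiba",
    "wst": "Westinghouse",
}

# Regex rules for looser variants; the first pattern that matches wins.
MANUFACTURER_PATTERNS = [
    (re.compile(r"^westing", re.IGNORECASE), "Westinghouse"),
    (re.compile(r"^general\s+elec", re.IGNORECASE), "General Electric"),
    (re.compile(r"^toshiba$", re.IGNORECASE), "Toshiba"),
]


def standardize_manufacturer(raw):
    """Return the canonical name, or the trimmed input if no rule applies."""
    value = raw.strip()
    alias = MANUFACTURER_ALIASES.get(value.lower())
    if alias is not None:
        return alias
    for pattern, canonical in MANUFACTURER_PATTERNS:
        if pattern.search(value):
            return canonical
    return value


# Sample values from Website 1 and Website 2 above.
print([standardize_manufacturer(v) for v in ["GE", "TOSH", "WST"]])
print([standardize_manufacturer(v) for v in ["Westing.", "Toshiba", "General elec."]])
```

Values that match no rule are passed through unchanged (apart from trimming), so unknown manufacturers are not silently rewritten.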

** I can fill out the specifics of the arrays myself, for which words should be substituted for the terms. I just need someone with Python knowledge to construct the script and the 'template arrays' and tell me where to place it within my scripts. **

I will provide you with a sample of one of the scripts. They run Scrapy, and they are all similar enough that you will probably be able to create just one script and it will work for all of my scrapers.
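
Since the scrapers run Scrapy, one place such a step could live is an item pipeline, which Scrapy calls for every scraped item before it is stored. The sketch below assumes items behave like dicts with a `Manufacturer` field; the class name `StandardizePipeline`, the `FIELD_RULES` table, and the module path in the comment are hypothetical, since the sample script is not shown here.

```python
import re

# Hypothetical rule table: field name -> list of (compiled regex, canonical value).
# Each pattern must match the whole trimmed value (see fullmatch below).
FIELD_RULES = {
    "Manufacturer": [
        (re.compile(r"ge|general\s+elec.*", re.IGNORECASE), "General Electric"),
        (re.compile(r"tosh(iba)?", re.IGNORECASE), "Toshiba"),
        (re.compile(r"wst|westing.*", re.IGNORECASE), "Westinghouse"),
    ],
}


class StandardizePipeline:
    """Item pipeline sketch: rewrites known fields to their canonical values."""

    def process_item(self, item, spider):
        for field, rules in FIELD_RULES.items():
            raw = item.get(field)
            if not isinstance(raw, str):
                continue  # field missing or not text; leave it alone
            value = raw.strip()
            for pattern, canonical in rules:
                if pattern.fullmatch(value):
                    value = canonical
                    break
            item[field] = value
        return item


# To enable it in a Scrapy project, the pipeline would be listed in settings.py
# with a lower order number than the pipeline that writes to the database, e.g.
# (assumed module path):
# ITEM_PIPELINES = {"myproject.pipelines.StandardizePipeline": 300}

# Standalone check with a plain dict standing in for a scraped item.
print(StandardizePipeline().process_item({"Manufacturer": "General elec."}, spider=None))
```

Keeping the rules in one shared module would let all 10 scrapers reuse the same pipeline, provided their items use the same field names.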

The budget for this project is $50.

Skills: Python, Web Scraping


About the Employer:
(23 reviews) Sheridan, United States

Project ID: #13768792

Awarded to:


I've written some advanced scraping scripts in Python and Node.js, and I can pretty well envision what you are looking for. I'm thinking the config file for your scrape can be JSON or YAML to define your mapping. your ...

$50 USD in 3 days
(3 reviews)

10 freelancers are bidding an average of $109 for this job


Hi sir, this is kimi and I am a scraping expert. I have done many scraping projects; please check my profile page and you will see: https://www.freelancer.com/u/mantislin.html Can you tell me ...

$250 USD in 5 days
(231 reviews)
$54 USD in 1 day
(90 reviews)

Hello, I have a lot of experience with web scraping and a lot of scripting experience in Python. I would love to help you with this. Please contact me for more information.

$50 USD in 1 day
(41 reviews)

Hello, I have read your requirements and can help you finish this work. Can you provide more information about this project? Thank you.

$50 USD in 3 days
(20 reviews)

Hello Sir, how are you? I read your description and I see that you have the arrays ready. If that's the case, then let's just start working on the project! Please contact me, and thank you!

$55 USD in 3 days
(6 reviews)

Hey, I have a few questions and suggestions about the project. It would be good if we could talk in detail and make this work!

$77 USD in 1 day
(5 reviews)

ILM Techno Solution Pvt. Ltd., Noida, India. About ILM: Welcome to ILM Techno Solutions Pvt. Ltd. clients ranging from Indian business giants to growing enterp ...

$250 USD in 3 days
(2 reviews)

A Python and web scraping developer here, ready to discuss this further and create this script (regex) to standardize the scraped data. Could you send me the sample script so I can better understand what exactly you w ...

$100 USD in 1 day
(7 reviews)
$155 USD in 3 days
(1 review)