Make a function that produces a regex pattern to identify URLs of interest

Suppose we intend to scrape a job portal, [login to view URL], which contains a large number of sublinks, such as:

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

The idea is to apply a difference-checking algorithm that yields a generic regex matching the above routes, treating the parts of the URLs that vary as variables and taking into account whether each URL yielded jobs or not.

Build a function, generatePattern(routes), where routes is an array of objects, each having:

URL: str

hasYieldedJob: bool
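The input shape described above could be modelled as follows. This is only a sketch: the `Route` type name, the `split_routes` helper, and the example.com URLs are hypothetical stand-ins, since the real URLs are hidden in the posting.

```python
from typing import List, Tuple, TypedDict


class Route(TypedDict):
    """One scraped route, using the two field names given in the posting."""
    URL: str
    hasYieldedJob: bool


def split_routes(routes: List[Route]) -> Tuple[List[str], List[str]]:
    """Separate the URLs that yielded jobs from those that did not."""
    yielded = [r["URL"] for r in routes if r["hasYieldedJob"]]
    skipped = [r["URL"] for r in routes if not r["hasYieldedJob"]]
    return yielded, skipped


# Hypothetical sample data (not the URLs from the posting).
sample_routes: List[Route] = [
    {"URL": "https://example.com/job/101/backend-developer", "hasYieldedJob": True},
    {"URL": "https://example.com/job/202/data-analyst?ref=home", "hasYieldedJob": True},
    {"URL": "https://example.com/about", "hasYieldedJob": False},
]

yielded, skipped = split_routes(sample_routes)
```

Splitting the routes first keeps the later pattern-building step focused on the URLs that yielded jobs, while the remaining URLs stay available as negative examples for checking the result.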

In the above example, all the links except the last three yielded jobs, so the ideal (fictive) regex pattern would be:

/job/{any number}/{any string}/?{any string}
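One possible reading of this fictive pattern as a concrete Python regex is sketched below. It assumes that "/?" means an optional trailing slash, that the final "{any string}" is an optional query string, and that the pattern is matched against the path and query only, not the full URL. The names `JOB_PATH` and `is_job_path` are hypothetical.

```python
import re

# /job/{any number}/{any string}/?{any string}, under the assumptions above:
#   \d+        -> {any number}
#   [^/?#]+    -> {any string} limited to a single path segment
#   /?         -> optional trailing slash
#   (?:\?.*)?  -> optional query string
JOB_PATH = re.compile(r"^/job/\d+/[^/?#]+/?(?:\?.*)?$")


def is_job_path(path: str) -> bool:
    """Return True if a path (with optional query) fits the job pattern."""
    return JOB_PATH.match(path) is not None
```

For example, `is_job_path("/job/101/backend-developer/?ref=home")` holds, while `is_job_path("/job/abc/backend-developer")` does not, because the second segment is not a number.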

Case scenarios

Query parameters should be considered as variables due to their complexity.

We do not want to apply a constant rule to them, even if they are identical across the given dataset of URLs. So if we have “/job/foo?parameter=true”, the pattern will be “/job/foo{any string}”. Additional brainstorming is welcome.
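The query-parameter rule could be sketched like this: drop the query with the standard library's `urllib.parse.urlsplit`, keep the path as a literal, and leave the rest open. The function name `open_suffix_pattern` is hypothetical, and `.*` is used here as a stand-in for "{any string}".

```python
import re
from urllib.parse import urlsplit


def open_suffix_pattern(url: str) -> str:
    """Build a regex that fixes the path and leaves everything after it open."""
    path = urlsplit(url).path          # "/job/foo?parameter=true" -> "/job/foo"
    return "^" + re.escape(path) + ".*$"


pattern = open_suffix_pattern("/job/foo?parameter=true")
```

With this pattern, "/job/foo?parameter=true" and "/job/foo?other=1" both match, so the query is never treated as a constant even when every sample URL carries the same one.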

- If a route segment contains hyphens, say ".../foo-bar/...", it is treated as ".../{any string}/..." even when that segment is invariant across the supplied URLs.
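Putting the rules above together, a minimal sketch of generatePattern(routes) might compare the job-yielding URLs segment by segment. It rests on several assumptions that are not in the posting: only routes with hasYieldedJob set are used to build the pattern, all of them share the same path depth, and the result matches the path plus optional query rather than the full URL. The helper names and the example.com URLs are hypothetical.

```python
import re
from typing import Dict, List, Union
from urllib.parse import urlsplit

ANY_NUMBER = r"\d+"
ANY_STRING = r"[^/?#]+"


def _segments(url: str) -> List[str]:
    """Return the non-empty path segments of a URL, ignoring the query string."""
    return [s for s in urlsplit(url).path.split("/") if s]


def _generalize(column: List[str]) -> str:
    """Turn the segments found at one path position into a regex fragment."""
    if any("-" in s for s in column):
        return ANY_STRING                  # hyphen rule: variable even if invariant
    if len(set(column)) == 1:
        return re.escape(column[0])        # invariant segment stays literal
    if all(re.fullmatch(r"\d+", s) for s in column):
        return ANY_NUMBER                  # varying, purely numeric segment
    return ANY_STRING                      # any other varying segment


def generatePattern(routes: List[Dict[str, Union[str, bool]]]) -> str:
    """Build a path regex from the routes that yielded jobs."""
    paths = [_segments(str(r["URL"])) for r in routes if r["hasYieldedJob"]]
    if not paths:
        raise ValueError("no route yielded a job")
    if len({len(p) for p in paths}) != 1:
        raise ValueError("this sketch assumes job URLs share the same depth")
    fragments = [_generalize(list(col)) for col in zip(*paths)]
    # The query string is always left open, per the query-parameter rule.
    return "^/" + "/".join(fragments) + r"/?(?:\?.*)?$"


# Hypothetical sample data (not the URLs from the posting).
sample_routes = [
    {"URL": "https://example.com/job/101/backend-developer", "hasYieldedJob": True},
    {"URL": "https://example.com/job/202/data-analyst?ref=home", "hasYieldedJob": True},
    {"URL": "https://example.com/job/303/designer/", "hasYieldedJob": True},
    {"URL": "https://example.com/about", "hasYieldedJob": False},
]

job_pattern = generatePattern(sample_routes)
# job_pattern == r"^/job/\d+/[^/?#]+/?(?:\?.*)?$"
```

The routes that did not yield jobs are not used to build the pattern here; they could serve as a check afterwards, for instance by confirming that "/about" does not match the generated pattern.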

Skills: Python, Regular Expressions


About the Employer:
(9 reviews) Piazza Armerina, Italy

Project ID: #29045421

10 freelancers are bidding an average of $102 for this job


I have experience in Python with a regex generator and checker for a license plate checker. Links to some previous projects: https://www.freelancer.com/projects/html/Project-for-Shadab https://www.freelancer.com/projects/pytho…

$140 USD in 7 days
(29 reviews)

Dear Client, warm greetings. I have been a Python developer for 3+ years and have experience building management, distributed, and database applications, with Machine Learning, Ensemble Learning, Deep Learning implementat…

$111 USD in 1 day
(6 reviews)

Dear employer, hi. I can develop the code to find the URLs which have yielded jobs. I read the description carefully and understand exactly what you want. I am a computer programmer with more than 10 years of working experienc…

$100 USD in 7 days
(9 reviews)

Hello Sir, I have previous knowledge and experience with regex. I think I can meet your requirements. Inbox me please so I can help. Thanks

$70 USD in 7 days
(6 reviews)

NOTE: I HAVE EXPERTISE IN WEB SCRAPING. With respect to this project, I would like to present myself as a candidate for your consideration. I have more than 12 years of IT experience. I have successfully completed pro…

$140 USD in 4 days
(1 review)

Hello, Python expert here. I have read your description and I am very interested in your project. I am well experienced and skillful in Python, with 15+ years of experience in software development. Confident in your project and I…

$140 USD in 7 days
(5 reviews)

Hi, I can build this function using python and will give you the script of course. Ready to start right NOW. I could make a sample script for the presented details here if you wanted.

$60 USD in 1 day
(4 reviews)

Hello, this is Rahaman. I will build you a Python function that uses regex to identify whether a link on the given website has a job or not. This job seems interesting to me. I have extensive experience in crawling websit…

$75 USD in 2 days
(1 review)

Hello, I am an individual freelancer. I have good experience with regular expressions and the re library of Python. I am available for this task and will try to deliver the script today. Waiting for your kind respon…

$100 USD in 2 days
(1 review)

Hi, I can get you a working version of the function you need straight away. You will probably want to supply some additional test data, to see if you need it to account for some additional factors not present in the s…

$80 USD in 1 day
(0 reviews)