I have a one-file PHP article generator/scraper script, a slight variant of Tomas Vacilando's Wikipedia reflection, that produces a set of article pages to the first link depth only. (It fetches roughly 10 to 200 web pages in total when applied to Wikipedia, depending on the nature of the selected start page.)
I am trialing this script behind an "information about" link on one of my live but still in-development websites, and it seems to work very well. The result is a handy, up-to-date, automatically generated little GNU FDL information booklet about the subject of my website, thanks to Wikipedia contributors.
Before I enable this same script on my few other websites, which have a little more traffic, I would like to cache all web-fetched content from the reflection script locally on my server (LAMP/cPanel), including HTML, images and all other fetched files, for a period of X days or weeks. The aim is to prevent excessive and unfair load on Wikipedia, which would mainly come from spiders and bots scanning my local content.
So what it requires is a wrapper around my reflection script that:
1. detects if the content does not exist in the cache or has expired, and in that case allows my script to run, saving the result buffer in the cache;
2. if the content is in the cache and fresh, delivers the HTML and files from my cache with zero web fetching.
As a result, content will be delivered from the cache, and my reflection script should only touch Wikipedia for updated pages about once a month.
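The two wrapper steps described above could be sketched roughly as follows. This is only a minimal illustration, not a finished implementation: the function name `cached_output`, the `.cache` file suffix and the idea of passing the reflection script in as a callable are all assumptions made for the example.

```php
<?php
// Hypothetical helper: the name, file layout and parameters are assumptions
// for illustration only, not part of the existing reflection script.

/**
 * Return cached content for $key if it is fresh; otherwise run $generate,
 * capture everything it prints, store it in the cache, and return it.
 */
function cached_output(string $cacheDir, string $key, int $ttlSeconds, callable $generate): string
{
    $file = $cacheDir . '/' . sha1($key) . '.cache';

    // Step 2: content is in the cache and still fresh, so nothing is fetched.
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        return file_get_contents($file);
    }

    // Step 1: missing or expired, so let the script run and buffer its output.
    ob_start();
    $generate();
    $content = ob_get_clean();

    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0755, true);
    }

    // Write to a temporary file, then rename, so readers never see a partial file.
    $tmp = $file . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, $content);
    rename($tmp, $file);

    return $content;
}
```

Under these assumptions, a page would call something like `echo cached_output($dir, $_SERVER['REQUEST_URI'], 30 * 24 * 3600, function () { include 'reflection.php'; });` where `reflection.php` stands in for the actual script name. The third argument is the cache lifetime in seconds, here about one month.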
I would like the cache to be stored as a common directory of files rather than in MySQL; however, I am happy to go with a MySQL cache if you think this is a better way to go. The reason is that there is an important feature I want from this cache:
My websites are on the same server, each has similar subject matter, and each has a dedicated Wikipedia page. The second-level Wikipedia links that I fetch for my different websites are therefore often common, so to conserve server space I should cache this common content only once.
Therefore I believe a shared cache of files and images would be best.
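One way to get that "store common content once" behaviour is to derive the cache file name from the remote URL alone, never from the local website that requested it. The sketch below is an assumption about how that could look; the function name `shared_cache_path`, the normalisation rules and the two-character subdirectory layout are hypothetical choices.

```php
<?php
// Hypothetical helper; directory layout and normalisation rules are assumptions.

/**
 * Map a remote URL to a file path inside one cache directory shared by all sites.
 * The key depends only on the remote URL, not on the local site requesting it,
 * so a page or image linked from several sites maps to a single file.
 */
function shared_cache_path(string $sharedDir, string $remoteUrl): string
{
    $parts = parse_url($remoteUrl);
    $host  = strtolower($parts['host'] ?? '');
    $path  = $parts['path'] ?? '/';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    // Assumption: scheme and #fragment do not change the fetched content,
    // so they are left out of the key.
    $hash = sha1($host . $path . $query);

    // Two-character subdirectories keep any one directory from growing too large.
    return $sharedDir . '/' . substr($hash, 0, 2) . '/' . $hash;
}
```

With this scheme, two of the websites that both link to the same second-level Wikipedia article would resolve to the same path under the shared directory, so the content is stored only once.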
This would require a talented PHP programmer; however, I don't believe this would be a large or complex PHP script. You may be able to implement it in just a few lines of code.
I am also open to using an existing open source solution if you are familiar with one already available.
I would like to include some simple method for monitoring the cache. A dedicated BBClone invoker inserted at some strategic point in the cache or reflection script may be one way to go. I just want some simple monitoring, perhaps a logger of cache hits, which website, which page, cache updates, and perhaps total cache size. Something like that.
Again, perhaps there is an existing (lite) open source solution you know of for this part of the function. I will leave that to your recommendation, which we can then agree on.
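If no existing tool fits, the kind of monitoring described above could be as small as a log line per cache event plus a function that totals the cache size. The following is only a sketch; the function names, the tab-separated log format and the event labels (HIT, MISS, UPDATE) are assumptions.

```php
<?php
// Hypothetical monitoring helpers; log format and names are assumptions.

/** Append one line per cache event: time, website, page, and an event label. */
function log_cache_event(string $logFile, string $site, string $page, string $event): void
{
    $line = implode("\t", [date('c'), $site, $page, $event]) . "\n";
    // LOCK_EX keeps concurrent requests from interleaving their lines.
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}

/** Total size in bytes of all files under the cache directory. */
function cache_size_bytes(string $cacheDir): int
{
    $total = 0;
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($cacheDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()) {
            $total += $file->getSize();
        }
    }
    return $total;
}
```

A call such as `log_cache_event($logFile, $_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI'], 'HIT')` at the point where the cache decides between serving and regenerating would record which website and page were involved.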
Documentation of the script: nothing more than a few in-script comments about what it is doing and where to adjust the cache frequency is required. I can follow most PHP code.
Time Frame: this is not urgent, I am flexible.
PM me with your proposal, or if you would like to take a look at my development URL. Vacilando's script is at [url removed, login to view]; my version of this script is similar, but nobbled so that it is not as far-reaching or unfair on Wikipedia.
I will pay via GAF Escrow in order to leave you feedback. I leave detailed and very high praise feedback for good work.
Thank you so much for considering my task.
6 freelancers are bidding an average of $172 for this job
Will cache pages locally, allow you to set and define cache lifetime, provide administration page where you can clear cached pages or regenerate them, etc. Please see PM.