I want to create a difficulty list of words, worked out a priori rather than by testing students.
My theory is that the difficulty of English words can be approximately predicted from 3 criteria:
1) Frequency
2) Word structure - the easiest to learn being a simple compound like 'teacup' and the hardest a root like 'weird'
3) Etymological origin, in descending order of difficulty: English, foreign, Greek, Latin, French (this is for French students)
A friend devised a rough formula in Excel (in the Difficulty Ratio column of the attached Excel file), based on the notion that Frequency is the primary parameter and is multiplied by a coefficient: Word Structure x Origin / (30 x 5). But he suggests there is a problem: because the coefficient is linear, the gap between the easy end and the hard end is not pronounced enough. Perhaps a logarithm or square root would work better, e.g. (log Word Structure + log Origin) / (log 30 + log 5).
Also, the influence of origin might prove to be more important than that of word structure, so origin may need an extra multiplier.
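To make the two coefficient forms easier to compare, here is a minimal Python sketch. It rests on several assumptions that are not confirmed by the spreadsheet: that Word Structure is a rank from 1 to 30 and Origin a rank from 1 to 5 (read off the "30 x 5" in the formula), that the linear form is grouped as (structure x origin) / (30 x 5), and that the frequency term is already expressed so that a larger value means a harder word. The function names and the `origin_weight` exponent are hypothetical, introduced only for illustration.

```python
import math

# Assumed scale maxima, inferred from the "30 x 5" in the formula.
MAX_STRUCTURE = 30
MAX_ORIGIN = 5


def linear_coefficient(structure, origin):
    """Linear form: (structure * origin) / (30 * 5)."""
    return (structure * origin) / (MAX_STRUCTURE * MAX_ORIGIN)


def log_coefficient(structure, origin):
    """Log form: (log structure + log origin) / (log 30 + log 5)."""
    numerator = math.log(structure) + math.log(origin)
    denominator = math.log(MAX_STRUCTURE) + math.log(MAX_ORIGIN)
    return numerator / denominator


def weighted_coefficient(structure, origin, origin_weight=2.0):
    """Hypothetical variant: raise origin to a power so it counts for more.

    origin_weight is an assumed tuning knob, not part of the original formula.
    """
    numerator = structure * origin ** origin_weight
    denominator = MAX_STRUCTURE * MAX_ORIGIN ** origin_weight
    return numerator / denominator


def difficulty_ratio(frequency_term, coefficient):
    """Frequency is the primary parameter, multiplied by the coefficient."""
    return frequency_term * coefficient
```

All three coefficients equal 1 at the hardest end (structure 30, origin 5), so they can be swapped in and out of `difficulty_ratio` and compared on the reduced database before choosing one.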
I have attached a much-reduced DB (far fewer items) that includes the 3 parameters.
The Difficulty Score should be on a scale of 1-100 from least to most difficult, and spaced roughly evenly over the complete DB of 50k words.
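One possible way to get a 1-100 score that is spread roughly evenly over the whole list is to rank the raw difficulty values and cut the ranking into 100 equal-sized bands. The Python sketch below assumes the raw values are already computed (by whatever formula is chosen) and that a larger raw value means a harder word; the function name `scale_to_1_100` is hypothetical.

```python
def scale_to_1_100(raw_scores):
    """Map raw difficulty values to integer scores 1..100 by rank.

    Words are sorted by raw value and split into 100 bands of (nearly)
    equal size, so each score is used about equally often.
    Note: words with identical raw values can land in adjacent bands,
    because ties are broken by position in the input list.
    """
    n = len(raw_scores)
    # Indices of the words, ordered from easiest to hardest raw value.
    order = sorted(range(n), key=lambda i: raw_scores[i])
    scaled = [0] * n
    for rank, i in enumerate(order):
        # rank runs 0..n-1, so (rank * 100) // n runs 0..99.
        scaled[i] = 1 + (rank * 100) // n
    return scaled
```

With 50k words this puts 500 words in each band. Because the score depends only on rank, it stays evenly spread whichever coefficient formula produces the raw values.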
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
## Platform
Windows XP