This is a test "project". More like a quiz. The price is low because (1) it's an easy task that I might give a student on a test, and (2) if there is interest, I'll probably have a few people actually try to solve the problem.
I'm looking for someone to whip up an efficient CUDA and/or OpenCL algorithm for pre-computing standard scores (see attached document for definition). This is a small portion of a larger project, and I'm testing whether there is interest and ability.
This algorithm is preparatory work to allow us to take a very large number of time series and calculate the Pearson and/or Spearman correlations for every pair as quickly, precisely, and efficiently as possible. Since the Spearman correlation, once rank order is established, reduces to the Pearson equation applied to the ranks, this algorithm is useful for both cases. Since the Pearson correlation can be calculated as the sum of the products of the standard scores of two series, divided by n-1 (assuming the standard scores use the sample standard deviation), pre-computing the standard scores saves a lot of FLOPs.
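A minimal CPU reference sketch of that identity, written in C++ rather than CUDA or OpenCL so that a GPU version could be checked against it. The function names standard_scores and pearson_from_scores are hypothetical, and the sketch assumes the sample standard deviation (n-1 in the denominator), at least two points per series, and no constant series.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: z_i = (x_i - mean) / s, where s is the sample
// standard deviation. Assumes n >= 2 and a non-constant series (s != 0).
std::vector<double> standard_scores(const double* x, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += x[i];
    const double mean = sum / static_cast<double>(n);

    double ss = 0.0;  // sum of squared deviations from the mean
    for (std::size_t i = 0; i < n; ++i) {
        const double d = x[i] - mean;
        ss += d * d;
    }
    const double s = std::sqrt(ss / static_cast<double>(n - 1));

    std::vector<double> z(n);
    for (std::size_t i = 0; i < n; ++i) z[i] = (x[i] - mean) / s;
    return z;
}

// Pearson correlation from two pre-computed score vectors of equal length:
// the sum of the products of the scores, divided by n - 1.
double pearson_from_scores(const std::vector<double>& zx,
                           const std::vector<double>& zy) {
    double dot = 0.0;
    for (std::size_t i = 0; i < zx.size(); ++i) dot += zx[i] * zy[i];
    return dot / static_cast<double>(zx.size() - 1);
}
```

With m series of length n, the scores are computed m times, while each of the m(m-1)/2 pairs then costs only one dot product, which is where the saving comes from.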
Check my math, as it's late at night after a long trip, but it should be (mostly) right.
Try to make the algorithm maintain as much precision as possible, so factoring out the 1/(n-1) and applying it later in the pearson() function may be a good idea - I don't know, since I have not thought it through. We're using finite-precision math after all, and small errors propagate. We want to compensate for rounding errors at the end rather than in the middle wherever possible.
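One way the 1/(n-1) question could play out, sketched on the CPU in C++ under stated assumptions: if each deviation is divided by the square root of the sum of squared deviations, rather than by the sample standard deviation, the n-1 factors cancel and the correlation becomes a plain dot product. The names unit_scores and pearson_from_unit_scores are hypothetical, as is the choice of float storage with double accumulators; whether this scaling is acceptable depends on the definition in the attached document.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical variant: u_i = (x_i - mean) / sqrt(sum_j (x_j - mean)^2).
// Sums are accumulated in double even though the data is float, so rounding
// happens once, when the result is stored.
// Assumes n >= 2 and a non-constant series.
void unit_scores(const float* x, std::size_t n, float* out) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += x[i];
    const double mean = sum / static_cast<double>(n);

    double ss = 0.0;  // sum of squared deviations from the mean
    for (std::size_t i = 0; i < n; ++i) {
        const double d = x[i] - mean;
        ss += d * d;
    }
    const double inv_norm = 1.0 / std::sqrt(ss);

    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<float>((x[i] - mean) * inv_norm);
}

// With unit-norm scores, the Pearson correlation is just the dot product;
// no 1/(n-1) factor is left to apply.
double pearson_from_unit_scores(const float* ux, const float* uy,
                                std::size_t n) {
    double dot = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        dot += static_cast<double>(ux[i]) * static_cast<double>(uy[i]);
    return dot;
}
```

The two-pass form (mean first, then deviations) is used here because the one-pass shortcut based on the sum of squares minus the squared sum can lose many digits to cancellation when the mean is large relative to the spread.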
Make any assumptions you'd like about the data structures and function signatures; just assume that you'll get an array (or pointer to an array) and an array length. Assume the array is already in GPU memory. This algorithm will be part of pre-processing and will not require as much memory management as subsequent ones, so whatever works well will do.
Whether you pick CUDA or OpenCL, explain why. Treat this as a "homework problem" or an "exam problem".
If you're interested in authorship on any resulting articles, let me know. If I feel this freelancer approach works out and takes less cost and work than a postdoc, I'll use it for a scientifically relevant side project (i.e., not my current top priority, but worth seeing if there is interest). The code will likely be released to the public.
I'd like to work on this project for you, but I'll need up to 3 days because I'm busy at the moment (I have another important project to complete in the next 1-2 days).
Regards,
Tam.
Hi! I'm a Duke University graduate with degrees in Computer Science and Neuroscience. I can do this in CUDA (because I don't know much about OpenCL). PM me so we can talk. Thanks for your consideration!
Prototyping a CUDA algorithm is not easy, especially for research purposes, because we need to understand the original algorithm deeply and then design a new parallel algorithm.
After that, we still need to understand the performance limits and do further optimization.
So I think there is a lot of work in your project if we want to deliver a great implementation.
Since it's not a high priority for you, I have set a long delivery time.
Contact me for further discussion.