Help with SAS - Non-parametric bootstrap to estimate the 'coefficient of variation' of a dataset

Below is the outline of what I need to do for a mock coursework assignment at university. If you accept the assignment, I will send a few more notes to help, plus a dataset.

This is the scenario:

While you are working on a foot ulcer trial application as a medical statistician, a colleague asks you to provide assistance in designing a surgical trial in colorectal cancer. A surgeon has approached your department for help with an application for a trial looking at changes in patient weight before and after a surgical procedure.

In the consultancy session, the discussion turned to some datasets from a trial of surgical techniques for colorectal cancer that you have previously worked on, so this data could help inform the design of your colleague’s trial. However, the discussion concluded that the key parameter is the COEFFICIENT OF VARIATION (CV) of the patients’ Body Mass Index (BMI). To complicate matters, your colleague needs you to provide the CV for your data, but also a plausible range of values that the CV could take.

The CV is defined as the standard deviation divided by the mean. An expression for the distribution of a CV is not straightforward. While SAS can provide a standard error and confidence interval for the mean of observations, a variance or standard error for the CV is not available by default.
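
For illustration only, here is a minimal sketch of how the point estimate of the CV might be obtained. The dataset name BASELINE and the variables WEIGHT_KG and HEIGHT_M are assumptions, since the real variable names come with the workshop dataset. Note that the CV statistic in PROC MEANS is reported as a percentage (100 x SD / mean).

```sas
/* Sketch only: BASELINE, WEIGHT_KG and HEIGHT_M are hypothetical names */
data bmi_data;
    set baseline;
    /* BMI = weight (kg) divided by height (m) squared;
       skip rows with a missing or non-positive height */
    if height_m > 0 then bmi = weight_kg / (height_m ** 2);
run;

/* Point estimate: mean, SD and CV (as a percentage) of BMI */
proc means data=bmi_data n mean std cv maxdec=2;
    var bmi;
run;
```

This gives the CV itself but no standard error or interval for it, which is the gap the bootstrap is meant to fill.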

In situations where you need to know the distribution of a parameter, and cannot easily obtain this analytically, you can use computationally intensive methods to simulate a possible distribution based on the underlying data. One such approach is the non-parametric bootstrap using PROC SURVEYSELECT. A call to PROC SURVEYSELECT would look something like this:

proc surveyselect data=<DATASET> out=<DATASET>
    <sampsize=...> <method=...>
    <other options>;
run;


You need to implement a non-parametric bootstrap to estimate the CV of BMI for patients in your workshop dataset.

You need to choose the PROC SURVEYSELECT options to ensure that this procedure correctly performs the non-parametric bootstrap, and then analyse the results in each sample.
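
As a rough sketch of one possible set of options (not necessarily what the assignment expects): METHOD=URS samples with replacement, SAMPRATE=1 makes each replicate the same size as the input data, and OUTHITS writes one row per selection. The dataset BMI_DATA with a derived BMI variable, the 1000 replicates, and the seed value are all assumptions made for this example.

```sas
/* Sketch only: BMI_DATA (with variable BMI), REPS and SEED are assumed */
proc surveyselect data=bmi_data out=boot_samples
    method=urs      /* unrestricted random sampling = with replacement */
    samprate=1      /* each replicate is as large as the input data */
    outhits         /* one output row per selection, not per unique unit */
    reps=1000       /* number of bootstrap replicates */
    seed=20240101   /* arbitrary fixed seed so results are reproducible */
    noprint;
run;

/* CV (as a percentage) of BMI within each bootstrap replicate */
proc means data=boot_samples noprint;
    by replicate;
    var bmi;
    output out=boot_cv cv=cv_pct;
run;

/* Summarise the bootstrap distribution: mean, SE and a 95%
   percentile interval (variables cv_p2_5 and cv_p97_5) */
proc univariate data=boot_cv noprint;
    var cv_pct;
    output out=boot_ci mean=boot_mean std=boot_se
           pctlpts=2.5 97.5 pctlpre=cv_p;
run;

proc print data=boot_ci noobs;
run;
```

The percentile interval is only one way to express a plausible range. A histogram of CV_PCT from PROC UNIVARIATE would let the colleague see the whole bootstrap distribution.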

You must:

1) Estimate the Coefficient of Variation of the Body Mass Index of the patients in the baseline dataset. You will need to first derive the BMI for the patients in your baseline dataset;

2) Using SAS, implement the non-parametric bootstrap using PROC SURVEYSELECT as described above to draw a suitable number of bootstrap samples from the baseline dataset;

3) Include one or more PROC steps to summarise the results of your simulations, so that your colleague can investigate the impact if the true value is in a plausible range of values.

4) Provide your SAS code and the full SAS Output to show what was produced.

- 1-2 sides of A4 incl. comments - each step must have a comment explaining what you have done

- Submit your SAS code as a SAS code file, or paste the code into Word. You should also save your plain text output OR your HTML output and include that.

- Use size 10pt Courier typeface, and single line spacing, as the text appears in SAS.

- Readable code is important, as well as my being able to reproduce your output from the code provided.

Must be done by Friday 16:00 GMT! Please only bid if you have access to SAS and can code it. Thanks :)

