The work has to be original and not copied from the internet or anywhere else.
1) choose a data set
2) identify a data mining problem that you might be able to solve
3) identify a technique that might be useful for mining the data
4) clean up and prepare the data for analysis
5) run the technique on your data
6) interpret your results and, if necessary, revise steps 3-5 above
7) draw your conclusions from your data analysis
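As a rough illustration of how steps 1-7 fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example: the tiny data set is invented, the function names are hypothetical, and k-nearest-neighbour classification is just one possible technique, not a required one. A real project would load a proper data set and would probably use a dedicated tool.

```python
import math
from collections import Counter

# Step 1: a tiny made-up data set (hypothetical values, for illustration only).
# Each row is (feature_1, feature_2, label); None marks a missing value.
RAW_ROWS = [
    (1.0, 1.2, "a"), (0.9, 1.0, "a"), (1.1, None, "a"), (1.2, 0.8, "a"),
    (3.0, 3.1, "b"), (3.2, 2.9, "b"), (None, 3.0, "b"), (2.8, 3.3, "b"),
]


def clean(rows):
    """Step 4: drop rows that contain a missing value."""
    return [r for r in rows if None not in r]


def knn_predict(train, point, k=1):
    """Steps 3 and 5: classify a point by majority vote of its k nearest rows."""
    nearest = sorted(train, key=lambda r: math.dist(r[:2], point))[:k]
    votes = Counter(r[2] for r in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(rows, k=1):
    """Step 6: estimate accuracy by holding out each row in turn."""
    hits = 0
    for i, row in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += knn_predict(train, row[:2], k) == row[2]
    return hits / len(rows)


rows = clean(RAW_ROWS)
accuracy = leave_one_out_accuracy(rows, k=1)
print(f"{len(rows)} clean rows, leave-one-out accuracy = {accuracy:.2f}")
```

If the accuracy looked poor, step 6 would send you back to try a different k, a different technique, or a different cleaning strategy (for example, filling in missing values instead of dropping rows) before drawing conclusions in step 7.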
Suggestions: one very useful source of data is:
[url removed, login to view]
Another source of data is: [url removed, login to view]
There is also [url removed, login to view]
There are countless other possibilities. For data analysis, use Clementine, SPSS, TANAGRA or whatever you want.
What is required
1) Description of Problem
This can be broken down into two parts:
1- What is the data set? (Where did it come from (reference)? What else do you know about it (background reading)?)
2- What do you propose to do with it? Is it a matter of classification, or something more exploratory, e.g. clustering?
Please note that your problem does not have to be highly original. For example, you could choose to analyse hurricane data as a classification problem. The classification system is well known, but what matters is how you go about it, not what you find out.
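To make the hurricane example concrete, here is a small sketch of how class labels could be derived. It assumes that the "well known" classification system is the Saffir-Simpson hurricane wind scale, which the brief does not actually name. The function name and the sample wind speeds are hypothetical and are not real storm records.

```python
# Lower bounds (sustained wind speed in mph) for Saffir-Simpson
# categories, listed from category 5 down to category 1.
CATEGORY_LOWER_BOUNDS_MPH = [(157, 5), (130, 4), (111, 3), (96, 2), (74, 1)]


def saffir_simpson_category(wind_mph):
    """Return the hurricane category 1-5, or 0 if below hurricane strength."""
    for lower_bound, category in CATEGORY_LOWER_BOUNDS_MPH:
        if wind_mph >= lower_bound:
            return category
    return 0


# Hypothetical wind speeds used only to exercise the function.
sample_winds = [60, 80, 100, 120, 140, 160]
labels = [saffir_simpson_category(w) for w in sample_winds]
print(labels)
```

Labels produced this way would be the target of the classification problem. The data mining work is then to see how well other attributes of each storm predict the category, and to explain how you went about it.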
2) Analysis of the Data
1- What do you propose to do with the data?
2- What technique do you propose to use and why?
3- How do you propose to carry out your analysis?
Ok, your data mining software has spat out the results, but what do they mean?