I have a large data set (a 7500-by-5 matrix). The following methods should be applied to the data set using MATLAB:
2) Minimum spanning tree
3) Any other method or algorithm from the literature
The goal is to find the optimal number of clusters in the data set; use a clustering index to justify the chosen number.
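
To make the first part concrete, here is a minimal sketch of one way it could be done, not a definitive solution. It builds a minimum spanning tree, cuts the k-1 longest edges to get k clusters, and scores each k with the Calinski-Harabasz index. The random `X` is only a stand-in for the real 7500-by-5 matrix, and the range `kList`, the z-scoring, and the choice of index are my assumptions. It needs the Statistics and Machine Learning Toolbox. For 7500 points the full distance matrix alone is roughly 450 MB, so this may be slow or memory-hungry.

```matlab
% Stand-in data (assumption): replace X with the real 7500-by-5 matrix.
rng(1);
X = [randn(100,5); randn(100,5) + 8; randn(100,5) - 8];
Z = zscore(X);                          % put all 5 variables on a common scale

% Minimum spanning tree of the complete Euclidean distance graph.
D = squareform(pdist(Z));               % n-by-n symmetric distance matrix
T = minspantree(graph(D));              % tree with n-1 weighted edges
[~, order] = sort(T.Edges.Weight, 'descend');

% Cutting the k-1 longest edges of a tree leaves exactly k components.
kList  = 2:8;                           % candidate cluster counts (assumption)
labels = zeros(size(Z,1), numel(kList));
for i = 1:numel(kList)
    Tk = rmedge(T, order(1:kList(i)-1));
    labels(:, i) = conncomp(Tk)';       % component id = cluster label
end

% Score every proposed partition with a cluster validity index.
ev    = evalclusters(Z, labels, 'CalinskiHarabasz');
bestK = ev.OptimalK;
fprintf('Suggested number of clusters: %d\n', bestK);
```

Plotting `ev.CriterionValues` against `ev.InspectedK` gives a curve that can be used to justify the choice. Swapping `'CalinskiHarabasz'` for `'DaviesBouldin'` or `'silhouette'` is a quick way to check whether different indices agree.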
Use PCA to do dimension reduction:
1) Find the minimum number of principal components needed to partition the data into the best number of clusters.
2) Explain how you obtained this optimal number of principal components.
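
For the PCA part, one possible reading is to look for the smallest number of leading components on which the clustering index still recommends the same number of clusters as on the full data. The sketch below does that with k-means and the silhouette index. The stand-in data, `kList`, and this stopping rule are all assumptions on my part, not something the task statement specifies.

```matlab
% Stand-in data (assumption): replace X with the real 7500-by-5 matrix.
rng(1);
X = [randn(200,5); randn(200,5) + 6; randn(200,5) - 6];
Z = zscore(X);

% PCA: score holds the data in principal-component coordinates,
% explained holds the percentage of variance carried by each component.
[~, score, ~, ~, explained] = pca(Z);
cumExplained = cumsum(explained);

% Reference answer: number of clusters suggested on all 5 variables.
kList  = 2:8;                           % candidate cluster counts (assumption)
evFull = evalclusters(Z, 'kmeans', 'silhouette', 'KList', kList);

% Smallest m whose first m components reproduce that number of clusters.
bestM = size(score, 2);
for m = 1:size(score, 2)
    ev = evalclusters(score(:, 1:m), 'kmeans', 'silhouette', 'KList', kList);
    fprintf('m = %d: %.1f%% variance, suggested k = %d\n', ...
            m, cumExplained(m), ev.OptimalK);
    if ev.OptimalK == evFull.OptimalK
        bestM = m;
        break;
    end
end
fprintf('Minimum number of principal components: %d\n', bestM);
```

The printed table of cumulative explained variance and suggested k per `m` can double as the explanation asked for in item 2. A fixed variance threshold (for example, the first `m` with `cumExplained(m) >= 95`) is a common alternative rule.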