Advantages and disadvantages of each algorithm:
Artificial Neural Network

Advantages:
- Good approximation of nonlinear functions
- Easily parallelized
- Good predictive power
- Extensively used in astronomy
- Robust to irrelevant or redundant attributes

Disadvantages:
- Black-box model
- Susceptible to local minima
- Non-trivial architecture: many adjustable parameters
- Sensitive to noise
- Can overfit
- Training can be slow
- Cannot handle missing values
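To make the approximation and training points concrete, here is a minimal one-hidden-layer network fitted by plain gradient descent. All choices (layer size, learning rate, the target function sin x) are illustrative, not taken from the table:

```python
import numpy as np

# Minimal one-hidden-layer network (illustrative sketch, not a production ANN):
# fits y = sin(x) with batch gradient descent to show nonlinear approximation.
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(X)

n_hidden = 16
W1 = rng.normal(0, 0.5, (1, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)           # hidden-layer activations
    return h, h @ W2 + b2              # network prediction

_, pred0 = forward(X)
initial_mse = np.mean((pred0 - y) ** 2)

lr = 0.05
for _ in range(2000):
    h, pred = forward(X)
    err = pred - y                                  # d(loss)/d(pred), up to 1/N
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)                # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
final_mse = np.mean((pred - y) ** 2)
```

The same loop also illustrates the listed weaknesses: the result depends on the random initialization (local minima) and on hand-tuned parameters (`n_hidden`, `lr`, iteration count).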
Decision Tree

Advantages:
- Popular real-world data mining algorithm
- Can input and output numerical or categorical variables
- Interpretable model, easily converted to rules
- Robust to outliers, noisy or redundant attributes, missing values
- Good computational scalability

Disadvantages:
- Can generate large trees that require pruning
- Generally poorer predictive power than ANN, SVM or kNN
- Can overfit
- Many adjustable parameters
- Building the tree can be slow (data sorting)
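The core step a tree builder repeats (and the reason building involves sorting) is choosing one split. A sketch of a single Gini-impurity split on a toy one-feature dataset; `gini` and `best_split` are hypothetical names for illustration:

```python
import numpy as np

# Decision stump: the single best split of one feature, chosen by Gini impurity.
def gini(labels):
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Return the threshold on x minimizing the weighted child Gini impurity."""
    order = np.argsort(x)              # the sorting pass noted in the table
    x, y = x[order], y[order]
    best_t, best_score = None, np.inf
    for i in range(1, len(x)):
        t = (x[i - 1] + x[i]) / 2      # candidate threshold between samples
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0, 0, 0, 1, 1, 1])
threshold, impurity = best_split(x, y)
```

A full tree applies this recursively to each child; the interpretability advantage comes from each node being readable as a rule like "x <= 6.5".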
Genetic algorithm

Advantages:
- Able to avoid getting trapped in local minima
- Easily parallelized

Disadvantages:
- Not guaranteed to find the global minimum
- Non-trivial choice of fitness function and solution representation
- Can be slow to converge
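A minimal genetic algorithm on the classic "one-max" problem (maximize the number of 1 bits) shows the selection/crossover/mutation loop and the fitness-function choice the table flags. Population size, mutation rate, and generation count are arbitrary illustrative settings:

```python
import random

# Toy genetic algorithm for one-max: fitness = number of 1 bits in the string.
random.seed(0)
N_BITS, POP, GENS, MUT = 20, 30, 60, 0.02

def fitness(bits):
    return sum(bits)

pop = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP)]
initial_best = max(fitness(ind) for ind in pop)

for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    next_pop = pop[:2]                      # elitism: keep the two best unchanged
    while len(next_pop) < POP:
        a, b = random.sample(pop[:10], 2)   # parents drawn from the fittest ten
        cut = random.randrange(1, N_BITS)
        child = a[:cut] + b[cut:]           # one-point crossover
        child = [bit ^ (random.random() < MUT) for bit in child]  # bit-flip mutation
        next_pop.append(child)
    pop = next_pop

final_best = max(fitness(ind) for ind in pop)
```

Elitism makes the best fitness non-decreasing, but nothing guarantees the global optimum is reached in a fixed number of generations, matching the disadvantage above.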
Support Vector Machine

Advantages:
- Copes with noise
- Gives expected error rate
- Good predictive power
- Popular algorithm in astronomy
- Can approximate nonlinear functions
- Good scalability to high-dimensional data
- Gives unique solution (no local minima)

Disadvantages:
- Harder to apply to more than two classes
- No model is created
- Long training time
- Poor interpretability
- Poor at handling missing or irrelevant attributes
- Can overfit
- Some adjustable parameters
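The two-class restriction and the "adjustable parameters" entry can be seen in even the simplest SVM: a linear classifier trained by sub-gradient descent on the hinge loss. This is a sketch of the optimization only (no kernels), with an arbitrary learning rate and regularization strength:

```python
import numpy as np

# Linear SVM via sub-gradient descent on the regularized hinge loss
# (a minimal sketch, not a full kernel SVM).
rng = np.random.default_rng(1)
# Two clearly separable 2-D classes with labels in {-1, +1}
X = np.vstack([rng.normal(-3, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])

w = np.zeros(2); b = 0.0
lr, lam = 0.01, 0.01            # learning rate and regularization: tunable choices
for _ in range(500):
    margins = y * (X @ w + b)
    viol = margins < 1           # samples inside or beyond the margin
    gw = lam * w - (y[viol, None] * X[viol]).mean(0) if viol.any() else lam * w
    gb = -y[viol].mean() if viol.any() else 0.0
    w -= lr * gw; b -= lr * gb

accuracy = np.mean(np.sign(X @ w + b) == y)
```

Because labels are ±1, more than two classes need a reduction such as one-vs-rest, which is the extra work the first disadvantage refers to.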
k-Nearest Neighbor

Advantages:
- Uses all available information
- Does not require training
- Easily parallelized
- Few or no adjustable parameters
- Good predictive power
- Uses numerical data
- Conceptually simple and easy to code
- Good at handling missing values and outliers

Disadvantages:
- Computationally intensive (but can be mitigated, e.g., k-d tree)
- No model is created
- Susceptible to noise and irrelevant attributes
- Performs poorly with high-dimensional data
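The "conceptually simple" and "no model" points are easy to demonstrate: the entire method fits in one function that scans the stored data at prediction time. A sketch with a toy dataset (the brute-force scan is what a k-d tree would accelerate):

```python
import numpy as np

# k-nearest-neighbor classification: no training phase, no stored model;
# every prediction compares the query against all retained samples.
def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every sample
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    values, counts = np.unique(y_train[nearest], return_counts=True)
    return values[np.argmax(counts)]              # majority vote

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
label = knn_predict(X_train, y_train, np.array([4.8, 5.1]), k=3)
```

The single parameter k is the "few or no adjustable parameters" advantage; the full scan per query is the computational cost listed opposite it.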
k-means clustering

Advantages:
- Well-known, simple, popular algorithm
- Many extensions, e.g., probabilistic cluster membership, or constrained k-means guided by training data

Disadvantages:
- Susceptible to noise
- Biased towards finding spherical clusters
- Has difficulty with clusters of different densities
- Affected by outliers
- Requires numerical inputs if using Euclidean distance
- Not guaranteed to find the global minimum
- Basic algorithm requires no overlap between classes
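The basic algorithm is two alternating steps, sketched below on a deterministic toy setup (real runs typically restart from several random initializations precisely because only a local optimum is guaranteed):

```python
import numpy as np

# Lloyd's k-means: assign each point to its nearest centroid, then move each
# centroid to the mean of its assigned points; repeat until stable.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])

centroids = np.array([[1.0, 1.0], [4.0, 4.0]])   # deliberate rough initial guess
for _ in range(10):
    # assignment step: index of the nearest centroid for every point
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # update step: centroid = mean of its cluster (Euclidean distance assumed,
    # hence the table's note about requiring numerical inputs)
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(2)])
```

The Euclidean distance in the assignment step is also the source of the spherical-cluster bias noted above.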
Mixture Models & Expectation Maximization

Advantages:
- Gives number of clusters in the data
- Suitable for clusters of different density, size and shape
- Copes with missing data
- Can give class labels for semi-supervised learning

Disadvantages:
- Can be biased toward Gaussians
- Local minima
- Doesn't work well with a large number of components
- Can be slow to converge
- Requires numerical data, not categorical
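The EM loop itself is short; what varies is the mixture model plugged into it. A sketch for a two-component 1-D Gaussian mixture on synthetic data (component count and initial guesses are illustrative):

```python
import numpy as np

# Expectation-Maximization for a two-component 1-D Gaussian mixture.
rng = np.random.default_rng(3)
data = np.hstack([rng.normal(0, 1, 300), rng.normal(10, 1, 300)])

w = np.array([0.5, 0.5])          # mixing weights
mu = np.array([1.0, 9.0])         # deliberately imperfect initial means
var = np.array([1.0, 1.0])

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

for _ in range(50):
    # E-step: responsibility of each component for each point
    r = w[:, None] * gauss(data[None, :], mu[:, None], var[:, None])
    r /= r.sum(axis=0, keepdims=True)
    # M-step: reestimate weights, means, and variances from weighted data
    n = r.sum(axis=1)
    w = n / len(data)
    mu = (r * data).sum(axis=1) / n
    var = (r * (data - mu[:, None]) ** 2).sum(axis=1) / n
```

The responsibilities `r` are the probabilistic (soft) cluster memberships; thresholding them is one way to produce the class labels mentioned under semi-supervised learning.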
Kohonen Self-Organizing Maps

Advantages:
- Well-known, popular algorithm

Disadvantages:
- Susceptible to noise
- Requires normalized data
- Susceptible to outliers
- Often requires a large number of iterations through the data
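The training rule behind the iteration cost is simple: find the best-matching node, then pull it and its map neighbors toward the input. A minimal sketch with a 1-D chain of nodes and illustrative learning-rate and neighborhood settings:

```python
import numpy as np

# Minimal Kohonen self-organizing map: a 1-D chain of nodes mapped onto 2-D
# data; the winner and its map neighbors move toward each presented input.
rng = np.random.default_rng(4)
data = rng.uniform(0, 1, (500, 2))          # SOM inputs are usually normalized

n_nodes = 10
weights = rng.uniform(0, 1, (n_nodes, 2))   # one weight vector per map node
node_idx = np.arange(n_nodes)

lr, sigma = 0.5, 2.0
for t, x in enumerate(data):
    winner = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
    # Gaussian neighborhood on the map: nearby nodes are dragged along
    h = np.exp(-((node_idx - winner) ** 2) / (2 * sigma ** 2))
    decay = 1.0 - t / len(data)             # shrink the updates over time
    weights += (lr * decay) * h[:, None] * (x - weights)
```

Each update uses one input, so good maps typically need many passes over the data, which is the last disadvantage above; the normalization requirement exists because the winner is chosen by raw Euclidean distance.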
Decompositions (SVD, PCA, ICA)

Advantages:
- PCA is extensively used in astronomy
- Can be extended to the nonlinear case and to noisy data

Disadvantages:
- Applicable to numerical data only
- Requires whitening of the data as a preprocessing step
- Sensitive to outliers if maximum likelihood is used
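PCA is the simplest of these decompositions to show end to end: center the data, take an SVD, and read off the principal directions. A sketch on toy data stretched along a known axis (the data layout is illustrative):

```python
import numpy as np

# PCA via SVD of the centered data matrix: rows of Vt are the principal
# directions, singular values give the variance captured along each one.
rng = np.random.default_rng(5)
direction = np.array([3.0, 1.0]) / np.sqrt(10)        # known dominant axis
data = rng.normal(0, 1, (400, 1)) * direction * 5      # strong variance along it
data += rng.normal(0, 0.2, (400, 2))                   # small isotropic noise

centered = data - data.mean(axis=0)                    # PCA needs centered data
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
components = Vt                                        # principal directions
explained_var = S ** 2 / (len(data) - 1)               # variance per component

alignment = abs(components[0] @ direction)             # ~1 if the axis was found
```

Everything here operates on a numerical matrix, which is the first disadvantage listed; replacing the centering/variance with robust estimators is one response to the outlier sensitivity.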