| Algorithm | Advantages | Disadvantages |
| --- | --- | --- |
| Artificial Neural Network | Good approximation of nonlinear functions | Black-box model | 
|  | Easily parallelized | Susceptible to local minima | 
|  | Good predictive power | Non-trivial architecture: many adjustable parameters | 
|  | Extensively used in astronomy | Sensitive to noise | 
|  | Robust to irrelevant or redundant attributes | Can overfit | 
|  |  | Training can be slow |
|  |  | Cannot handle missing values |
| Decision Tree | Popular real-world data mining algorithm | Can generate large trees that require pruning | 
|  | Can input and output numerical or categorical variables | Generally poorer predictive power than ANN, SVM or kNN | 
|  | Interpretable model, easily converted to rules | Can overfit | 
|  | Robust to outliers, noisy or redundant attributes, missing values | Many adjustable parameters | 
|  | Good computational scalability | Building the tree can be slow (data sorting) | 
| Genetic algorithm | Able to avoid getting trapped in local minima | Not guaranteed to find the global minimum |
|  | Easily parallelized | Non-trivial choice of fitness function and solution representation | 
|  |  | Can be slow to converge |
| Support Vector Machine | Copes with noise | Binary by design: classifying more than two classes requires combining multiple SVMs |
|  | Gives expected error rate | No model is created | 
|  | Good predictive power | Long training time | 
|  | Popular algorithm in astronomy | Poor interpretability | 
|  | Can approximate nonlinear functions | Poor at handling missing or irrelevant attributes | 
|  | Good scalability to high-dimensional data | Can overfit | 
|  | Gives unique solution (no local minima) | Some adjustable parameters | 
| k-Nearest Neighbor | Uses all available information | Computationally intensive (but can be mitigated, e.g., with a k-d tree; see the sketch after the table) |
|  | Does not require training | No model is created | 
|  | Easily parallelized | Susceptible to noise and irrelevant attributes | 
|  | Few or no adjustable parameters | Performs poorly with high-dimensional data | 
|  | Good predictive power |  |
|  | Uses numerical data |  |
|  | Conceptually simple, and easy to code |  |
|  | Good at handling missing values and outliers |  |
| k-means clustering | Well-known, simple, popular algorithm | Susceptible to noise | 
|  | Many extensions, e.g., | Biased towards finding spherical clusters | 
|  | probabilistic cluster membership | Has difficulties with different densities in the clusters | 
|  | constrained k-means, guided by training data | Affected by outliers | 
|  |  | Requires numerical inputs if using Euclidean distance |
|  |  | Not guaranteed to find the global minimum |
|  |  | Basic algorithm assumes classes do not overlap |
| Mixture Models & Expectation Maximization | Gives number of clusters in the data | Can be biased toward Gaussians | 
|  | Suitable for clusters of different density, size and shape | Susceptible to local minima |
|  | Copes with missing data | Doesn't work well with a large number of components | 
|  | Can give class labels for semi-supervised learning | Can be slow to converge | 
|  |  | Requires numerical data, not categorical |
| Kohonen Self-Organizing Maps | Well-known, popular algorithm | Susceptible to noise |
|  |  | Requires normalized data |
|  |  | Susceptible to outliers |
|  |  | Often requires a large number of iterations through the data |
| Decompositions (SVD, PCA, ICA) | PCA is extensively used in astronomy | Applicable to numerical data only | 
|  | Can be extended to the nonlinear case and noisy data | Requires whitening of the data as preprocessing step | 
|  |  | Sensitive to outliers if maximum likelihood is used |
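
As noted in the k-Nearest Neighbor row, the cost of a brute-force neighbor search can be mitigated with a k-d tree index. Below is a minimal sketch of that idea, assuming scikit-learn is available; the `make_blobs` toy data and all parameter values are illustrative assumptions rather than part of the table above.

```python
# Minimal sketch: k-nearest-neighbor classification backed by a k-d tree index.
# Assumes scikit-learn is installed; data set and parameters are illustrative only.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Toy two-dimensional numerical data drawn from three groups.
X, y = make_blobs(n_samples=5000, centers=3, n_features=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# algorithm="kd_tree" builds a k-d tree instead of doing a brute-force search;
# fit() only indexes the training points -- no explicit model is learned.
knn = KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree")
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```

The k-d tree speeds up neighbor queries mainly in low-dimensional feature spaces; in high dimensions its benefit fades, which is consistent with the "performs poorly with high-dimensional data" entry in the table.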