Machine learning algorithm cheat sheet for Microsoft Azure Machine Learning Studio. A module might contain a particular algorithm, or perform a task that is important in machine learning, such as missing value replacement or statistical analysis. For help with choosing algorithms, see "How to select algorithms" and the "Azure Machine Learning Algorithm Cheat Sheet".
Figure 3: Microsoft’s Machine Learning Algorithm Cheat Sheet
A second challenge in implementing a machine-learning model is coding the algorithm. Azure Machine Learning helps out in this regard by providing canned implementations of 25 of the most commonly used algorithms in machine learning.
This is part of the Machine Learning series.
Azure Machine Learning documentation: learn how to train, deploy, and manage machine learning models, use AutoML, and run pipelines at scale with Azure Machine Learning. Tutorials, code examples, API references, and more show you how. This cheat sheet helps you choose the best Azure Machine Learning Studio algorithm for your predictive analytics solution. Your decision is driven by both the nature of your data and the question you’re trying to answer.
[Flowchart: the cheat sheet's decision path, beginning at START with yes/no questions such as "Predict future data points?" and "Predict categories or values?", and leading to algorithms such as K-means.]
One question always pops up in any machine learning problem: Which algorithm should I use? And what do the algorithms do anyway?
To find out, I got to interview the Channel 9 guru Seth Juarez (@sethjuarez), who happens to be as passionate about machine learning as I am:
After briefly going over a typical machine learning process, we take a closer look at the third step, i.e. building the model:
What algorithms are out there? Which one should we use? What do they do? We covered the most common algorithms in machine learning problems, and on top of that we used some fancy visualisations to explain how they work:
- Perceptron (in AzureML: Two-Class Averaged Perceptron)
- Kernel perceptron (closely related to Support Vector Machines); in AzureML: Two-Class Support Vector Machine and Two-Class Locally-Deep Support Vector Machine
- Decision Trees
- Neural Networks
- Deep Learning
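As a rough illustration of the first entry in the list above, here is a minimal sketch of an averaged perceptron in plain Python. It is not the code behind AzureML's Two-Class Averaged Perceptron module; the function names `train_averaged_perceptron` and `predict`, the label convention (+1 / -1), and the default of 10 epochs are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def train_averaged_perceptron(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 10,
) -> Tuple[List[float], float]:
    """Train on labels in {+1, -1}; return the averaged weights and bias."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    # Running sums of the weights and bias after every training step.
    weight_sum = [0.0] * n_features
    bias_sum = 0.0
    steps = 0

    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            # A misclassified (or undecided) sample nudges the boundary toward it.
            if y * activation <= 0:
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
            weight_sum = [ws + w for ws, w in zip(weight_sum, weights)]
            bias_sum += bias
            steps += 1

    # Averaging over all steps smooths out late, noisy updates.
    return [ws / steps for ws in weight_sum], bias_sum / steps


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on which side of the boundary x falls."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


if __name__ == "__main__":
    # Hypothetical, linearly separable toy data.
    points = [(2, 2), (3, 3), (3, 1), (-1, -1), (-2, -1), (-1, -3)]
    targets = [1, 1, 1, -1, -1, -1]
    w, b = train_averaged_perceptron(points, targets)
    print([predict(w, b, p) for p in points])
```

The plain perceptron only keeps the last weight vector; the averaged variant returns the mean of all intermediate weight vectors, which tends to be more stable on data that is not perfectly separable.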
One of Microsoft's data scientists, Brandon Rohrer, has written a nice three-part blog series introducing data science with no jargon:
- What Can Data Science Do For Me? Brandon explains which prerequisites are needed to get a machine learning project off to a good start.
- What Types of Questions Can Data Science Answer? Here, Brandon goes through typical questions that can be covered by the three extended algorithm families:
  - Supervised learning (e.g. classification, anomaly detection, regression),
  - Unsupervised learning (e.g. clustering and dimensionality reduction), and
  - Reinforcement learning.
- Which Algorithm Family Can Answer My Question? Brandon gives a good overview of typical questions asked in the following areas, and which algorithm to use for each:
  - Predictive Maintenance
  - Marketing
  - Finance
  - Operational Efficiency
  - Energy Forecasting
  - Internet of Things
  - Text and Speech Processing
  - Image Processing and Computer Vision
Furthermore, there is one really neat cheat sheet created by Microsoft's Data Science team on when to use which algorithm:
Finally, one last resource that I highly recommend: Top 10 data mining algorithms in plain English. This article explains the 10 most influential algorithms (as voted by 3 separate panels):
- C4.5 (decision tree)
- k-means (clustering)
- Support vector machines (next to C4.5, a classifier to try out first)
- Apriori (association rule learning --> recommendation engine)
- EM (i.e. expectation-maximization for clustering)
- PageRank (network analysis; think of the PageRank in Google's search engine)
- AdaBoost (boosting, and thus an ensemble learning algorithm that takes in and combines multiple learning algorithms)
- kNN (aka k-Nearest Neighbors, thus classification)
- Naive Bayes (family of classification algorithms assuming that all features are independent of each other)
- CART (aka classification and regression trees, thus decision trees for both classification and regression)
This list contains algorithms from various algorithm families, including association rule learning (relevant for building recommenders).
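To make one entry of the list concrete, here is a minimal sketch of kNN classification in plain Python. It is an illustration only, not taken from the linked article; the function name `knn_predict`, the use of Euclidean distance, and the default `k=3` are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def knn_predict(
    train_points: Sequence[Sequence[float]],
    train_labels: Sequence[Hashable],
    query: Sequence[float],
    k: int = 3,
) -> Hashable:
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training label with its Euclidean distance to the query.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the k closest labels.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups.
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (0.5, 0.5)))
    print(knn_predict(points, labels, (5.5, 5.5)))
```

There is no training step here: kNN simply stores the labelled examples and defers all the work to prediction time, which is why it is often described as a lazy learner.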
