General machine learning techniques
The two most widely adopted machine learning techniques are supervised learning and unsupervised learning. Supervised learning accounts for the vast majority of machine learning in use (about 70%), and unsupervised learning for 10-20%. Semi-supervised learning and reinforcement learning are also used.
Supervised learning algorithms
use labeled examples (inputs for which the desired output is known) for training. For example, a piece of equipment might have data points labeled "F" (failed) or "R" (running). The learning algorithm receives a set of inputs together with the corresponding correct outputs, and it learns by comparing its own output with the correct output to find errors. It then modifies the model accordingly. Using methods such as classification, regression, prediction, and gradient boosting, supervised learning applies the patterns it has found to predict label values for additional unlabeled data. Supervised learning is commonly used in applications where historical data predicts likely future events. For example, it can flag credit card transactions that are likely to be fraudulent or identify policyholders who are likely to file a claim.
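The compare-and-correct loop described above can be sketched with a perceptron, one of the simplest supervised learners. This is only an illustration under stated assumptions: the sensor features (temperature, vibration), the data values, and the function names are hypothetical, and the text does not say which algorithm a real system would use.

```python
# Hypothetical, normalized sensor readings: (temperature, vibration).
# Each example carries a known label: "F" (failed) or "R" (running).
training_data = [
    ((0.9, 0.8), "F"),
    ((0.8, 0.9), "F"),
    ((0.7, 0.95), "F"),
    ((0.2, 0.1), "R"),
    ((0.3, 0.2), "R"),
    ((0.1, 0.3), "R"),
]


def predict(weights, bias, features):
    """Return "F" when the weighted score is positive, otherwise "R"."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return "F" if score > 0 else "R"


def train(data, epochs=20, learning_rate=0.1):
    """Perceptron training: compare each output with the label, then adjust."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            target = 1 if label == "F" else 0
            output = 1 if predict(weights, bias, features) == "F" else 0
            error = target - output  # 0 when the prediction is already correct
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(training_data)

# Predict labels for new, unlabeled readings.
print(predict(weights, bias, (0.85, 0.9)))  # F
print(predict(weights, bias, (0.15, 0.2)))  # R
```

The model changes only when its output disagrees with the known label, which is the error-driven correction the paragraph describes. Real data is rarely this cleanly separable, so a practical system would likely need a more capable model.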
Semi-supervised learning
is used for the same kinds of applications as supervised learning. However, it uses both labeled and unlabeled data for training, typically a small amount of labeled data together with a large amount of unlabeled data (because unlabeled data is less expensive and takes less effort to obtain). This type of learning can be combined with methods such as classification, regression, and prediction. Semi-supervised learning is useful when labeling is too costly to allow training on labeled data alone. An early application is face recognition on webcams.
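One common way to combine a few labeled examples with many unlabeled ones is self-training: fit a model on the labeled data, label the unlabeled points it is confident about, and refit. The sketch below uses a nearest-centroid classifier for this. The data, the `min_margin` confidence threshold, and all function names are assumptions made for illustration; the text does not prescribe a particular method.

```python
import math

# Two labeled examples and several unlabeled points (all values hypothetical).
labeled = [((1.0, 1.0), "A"), ((5.0, 5.0), "B")]
unlabeled = [(1.2, 0.8), (0.9, 1.4), (1.5, 1.1),
             (4.8, 5.3), (5.4, 4.9), (4.6, 4.7), (3.1, 3.0)]


def centroids(examples):
    """Mean position of the examples for each label."""
    sums = {}
    for (x, y), label in examples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def classify(centers, point):
    """Return (nearest label, margin between nearest and second-nearest)."""
    ranked = sorted((math.dist(point, c), label) for label, c in centers.items())
    (d_best, best_label), (d_next, _) = ranked[0], ranked[1]
    return best_label, d_next - d_best


def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=5):
    """Repeatedly pseudo-label confident points and refit the centroids."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        centers = centroids(labeled)
        confident, uncertain = [], []
        for point in remaining:
            label, margin = classify(centers, point)
            if margin >= min_margin:
                confident.append((point, label))
            else:
                uncertain.append(point)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        remaining = uncertain
    return centroids(labeled), remaining


centers, still_unlabeled = self_train(labeled, unlabeled)
print(classify(centers, (1.1, 1.2))[0])  # label predicted for a new point
print(still_unlabeled)                   # points too ambiguous to pseudo-label
```

Points that sit roughly midway between the two groups never pass the confidence threshold, so they are left unlabeled rather than risking a wrong pseudo-label that would then be reinforced in later rounds.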
Unsupervised learning
is used for data that has no historical labels. The learning algorithm is not told the "right answer" and must work out for itself what the data shows. The goal is to explore the data and find some structure within it. Unsupervised learning works well on transactional data. For example, it can identify segments of customers with similar attributes who can then be treated similarly in a marketing campaign, or it can find the main attributes that separate customer segments from one another. Commonly used techniques include self-organizing maps (SOMs), nearest-neighbor mapping, k-means clustering, and singular value decomposition. These algorithms are also used to segment text topics, recommend products, and identify data outliers.
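As a rough illustration of the customer segmentation example, here is a minimal k-means sketch. The customer attributes (age, monthly spend) and their values are made up, and taking the first `k` points as starting centers is a simplification chosen to keep the run deterministic; practical implementations usually pick starting centers more carefully.

```python
import math

# Hypothetical customers described by (age, monthly spend). No labels are given.
customers = [(22, 15), (25, 18), (24, 12), (58, 80), (61, 85), (55, 78)]


def k_means(points, k, max_iterations=100):
    """Group points into k clusters by alternating assignment and averaging."""
    centers = list(points[:k])  # simplistic deterministic initialization
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing, so the clusters are stable
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment


centers, segments = k_means(customers, k=2)
print(segments)  # [0, 0, 0, 1, 1, 1]
print(centers)   # one (age, spend) center per segment
```

The algorithm is never told which customers belong together. The two segments emerge from the distances between the points, and the resulting centers summarize the attribute values that distinguish one segment from the other.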
Reinforcement learning
is often used in robotics, gaming, and navigation. Through trial and error, a reinforcement learning algorithm discovers which actions yield the greatest reward. This type of learning has three main components: the agent (the learner or decision maker), the environment (everything the agent interacts with), and actions (what the agent can do). The agent's objective is to choose actions that maximize the expected reward over a given period of time. The agent reaches this goal much faster by following a good policy, so the purpose of reinforcement learning is to learn the best policy.
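The agent, environment, action, and reward loop can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm. Everything specific here is an assumption for illustration: the five-cell corridor environment, the reward of 1.0 at the right-hand end, the hyperparameter values, and the function names.

```python
import random

N_STATES = 5                         # corridor cells 0..4 (hypothetical environment)
GOAL = N_STATES - 1                  # reaching the last cell ends the episode
ACTIONS = {"left": -1, "right": +1}  # what the agent can do


def step(state, action):
    """Environment: move the agent and return (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice: explore sometimes, otherwise pick the best action."""
    if rng.random() < epsilon:
        return rng.choice(list(ACTIONS))
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2,
               max_steps=1000, seed=0):
    """Learn action values by trial and error."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _step in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward = step(state, action)
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == GOAL:
                break
    return q


q_table = q_learning()
# The learned policy: the highest-valued action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(policy)
```

In this corridor the learned policy should be to move right from every cell, since that is the shortest route to the reward. The discount factor `gamma` makes rewards that arrive sooner worth more, which is what pushes the agent toward the faster path.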