
Python Machine Learning By Example

A machine learning system is fed with input data—this can be numerical, textual, visual, or audiovisual. The system usually has an output—this can be a floating-point number, for instance, the acceleration of a self-driving car, or an integer representing a category (also called a class), for example, a cat or tiger from image recognition.
The main task of machine learning is to explore and construct algorithms that can learn from historical data and make predictions on new input data. For a data-driven solution, we need to define (or have defined by an algorithm) an evaluation function called a loss or cost function, which measures how well the model is learning. In this setup, we create an optimization problem with the goal of learning most efficiently and effectively.
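To make the loss-and-optimization idea concrete, here is a minimal sketch (not taken from this book's examples) that fits a toy linear model by minimizing a mean squared error loss with a few gradient descent steps; the data, learning rate, and iteration count are made up purely for illustration:

```python
import numpy as np

# Toy data: inputs x and continuous targets y (made up for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

def mse_loss(w, b):
    """Mean squared error between predictions w*x + b and the targets y."""
    predictions = w * x + b
    return np.mean((predictions - y) ** 2)

# A few gradient descent steps: nudge w and b to reduce the loss
w, b, learning_rate = 0.0, 0.0, 0.01
for _ in range(1000):
    predictions = w * x + b
    grad_w = 2 * np.mean((predictions - y) * x)   # d(loss)/dw
    grad_b = 2 * np.mean(predictions - y)         # d(loss)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"learned w={w:.2f}, b={b:.2f}, loss={mse_loss(w, b):.4f}")
```

The printed loss shrinks as w and b approach the values that best explain the toy data, which is exactly the optimization problem described above.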
Depending on the nature of the learning data, machine learning tasks can be broadly classified into the following three categories: unsupervised learning, where the training data contains no labeled outputs and the goal is to discover structure in the data; supervised learning, where each training sample comes with a labeled output to predict; and reinforcement learning, where a model learns by interacting with an environment and receiving feedback on its actions.
Supervised learning is commonly used in daily applications, such as face and speech recognition, product or movie recommendations, sales forecasting, and spam email detection.
The following diagram depicts the types of machine learning tasks:
Figure 1.4: Types of machine learning tasks
As shown in the diagram, we can further subdivide supervised learning into regression and classification. Regression trains on and predicts continuous-valued responses, for example, predicting house prices, while classification attempts to find the appropriate class label, such as determining positive/negative sentiment or predicting whether a loan will default.
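As a minimal sketch of the contrast, the following uses scikit-learn's bundled toy datasets rather than any example from this book (fetch_california_housing downloads its data on first use):

```python
from sklearn.datasets import fetch_california_housing, load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict a continuous value (a median house price)
X_reg, y_reg = fetch_california_housing(return_X_y=True)
regressor = LinearRegression().fit(X_reg, y_reg)
print(regressor.predict(X_reg[:1]))   # a floating-point estimate

# Classification: predict a discrete class label (an iris species)
X_clf, y_clf = load_iris(return_X_y=True)
classifier = LogisticRegression(max_iter=1000).fit(X_clf, y_clf)
print(classifier.predict(X_clf[:1]))  # an integer class label
```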
If not all learning samples are labeled, but some are, we have semi-supervised learning. It makes use of a typically large amount of unlabeled data, in addition to a small amount of labeled data, for training. Semi-supervised learning is applied in cases where it is expensive to acquire a fully labeled dataset and more practical to label a small subset. For example, it often requires skilled experts to label hyperspectral remote sensing images, while acquiring unlabeled data is relatively easy.
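As a hedged sketch of this idea (again, not one of this book's own examples), scikit-learn's SelfTrainingClassifier accepts a partially labeled dataset in which unlabeled samples are marked with -1, and lets a base classifier label them iteratively; the 30% labeling ratio below is arbitrary:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = load_iris(return_X_y=True)

# Pretend labeling is expensive: keep labels for only ~30% of samples,
# and mark the rest as unlabeled with -1 (scikit-learn's convention)
rng = np.random.default_rng(42)
y_partial = y.copy()
unlabeled_mask = rng.random(len(y)) > 0.3
y_partial[unlabeled_mask] = -1

# Self-training: the base classifier iteratively labels the unlabeled samples
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)
print(model.predict(X[:5]))
```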
Feeling a little bit confused by the abstract concepts? Don’t worry. We will encounter many concrete examples of these types of machine learning tasks later in this book. For example, in Chapter 2, Building a Movie Recommendation Engine with Naïve Bayes, we will dive into supervised learning classification and its popular algorithms and applications. Similarly, in Chapter 5, Predicting Stock Prices with Regression Algorithms, we will explore supervised learning regression.
We will focus on unsupervised techniques and algorithms in Chapter 8, Discovering Underlying Topics in the Newsgroups Dataset with Clustering and Topic Modeling. Last but not least, the third machine learning task, reinforcement learning, will be covered in Chapter 15, Making Decisions in Complex Environments with Reinforcement Learning.
Besides categorizing machine learning based on the learning task, we can categorize it chronologically.
In fact, we have a whole zoo of machine learning algorithms that have experienced varying popularity over time. We can roughly categorize them into five main approaches: logic-based learning, statistical learning, artificial neural networks, genetic algorithms, and deep learning.
Logic-based systems were the first to become dominant. They used basic rules specified by human experts, and with these rules, they tried to reason using formal logic, background knowledge, and hypotheses.
Statistical learning theory attempts to find a function to formalize the relationships between variables. In the mid-1980s, artificial neural networks (ANNs) came to the fore. ANNs imitate animal brains: they consist of interconnected artificial neurons loosely modeled on biological neurons, and they try to model complex relationships between input and output values and capture patterns in data. ANNs were superseded by statistical learning systems in the 1990s.
Genetic algorithms (GAs) were popular in the 1990s. They mimic the biological process of evolution and try to find optimal solutions using methods such as mutation and crossover.
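To make the idea concrete, here is a minimal, hypothetical sketch of a genetic algorithm that evolves bit strings toward all ones; the fitness function, population size, and rates are chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
pop_size, n_bits, n_generations, mutation_rate = 30, 20, 50, 0.02

def fitness(individual):
    """Toy fitness: the number of 1 bits (higher is better)."""
    return individual.sum()

# Random initial population of bit strings
population = rng.integers(0, 2, size=(pop_size, n_bits))

for _ in range(n_generations):
    # Selection: keep the fitter half of the population as parents
    scores = np.array([fitness(ind) for ind in population])
    parents = population[np.argsort(scores)[-pop_size // 2:]]

    # Crossover: splice two random parents at a random cut point
    children = []
    while len(children) < pop_size:
        p1, p2 = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, n_bits)
        child = np.concatenate([p1[:cut], p2[cut:]])
        # Mutation: flip each bit with a small probability
        flips = rng.random(n_bits) < mutation_rate
        child[flips] = 1 - child[flips]
        children.append(child)
    population = np.array(children)

best = max(population, key=fitness)
print(f"best fitness: {fitness(best)} / {n_bits}")
```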
In the 2000s, ensemble learning methods, which combine multiple models to improve performance, gained attention.
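As a brief, hedged illustration (not an example from this book), an ensemble of decision trees typically outperforms a single tree trained on the same data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A single decision tree versus an ensemble of 100 trees
single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print("single tree:", cross_val_score(single_tree, X, y, cv=5).mean())
print("forest     :", cross_val_score(forest, X, y, cv=5).mean())
```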
We have seen deep learning become a dominant force since the late 2010s. The term deep learning was coined around 2006 and refers to deep neural networks with many layers. The breakthrough in deep learning was the result of the integration and utilization of Graphics Processing Units (GPUs), which massively speed up computation. The availability of large datasets has also fueled the deep learning revolution.
GPUs were originally developed to render video games and are very good at parallel matrix and vector algebra. It's believed that deep learning resembles the way humans learn. Therefore, it may be able to deliver on the promise of sentient machines. Of course, in this book, we will dig deep into deep learning in Chapter 11, Categorizing Images of Clothing with Convolutional Neural Networks, and Chapter 12, Making Predictions with Sequences Using Recurrent Neural Networks, after touching on it in Chapter 6, Predicting Stock Prices with Artificial Neural Networks.
Machine learning algorithms continue to evolve rapidly, with ongoing research in areas including transfer learning, generative models, and reinforcement learning, which are the backbone of AIGC. We will explore the latest developments in Chapter 13, Advancing Language Understanding and Generation with the Transformer Models, and Chapter 14, Building an Image Search Engine Using CLIP: a Multimodal Approach.
Some of us may have heard of Moore’s law—an empirical observation claiming that computer hardware improves exponentially with time. The law was first formulated by Gordon Moore, the co-founder of Intel, in 1965. According to the law, the number of transistors on a chip should double every two years. In the following diagram, you can see that the law holds up nicely (the size of the bubbles corresponds to the average transistor count in GPUs):
Figure 1.5: Transistor counts over the past decades
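As a quick back-of-the-envelope illustration of what doubling every two years implies (the starting count of roughly 2,000 transistors is only illustrative, in the ballpark of early 1970s microprocessors, and is not taken from the figure's data):

```python
# Doubling every two years: projected transistor count after a number of years
def projected_transistors(initial_count, years):
    return initial_count * 2 ** (years / 2)

# Roughly 2,000 transistors in 1971 projects to tens of billions by 2021
print(f"{projected_transistors(2_000, 2021 - 1971):,.0f}")
```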
The consensus seems to be that Moore’s law should continue to be valid for a couple of decades. This gives some credibility to Ray Kurzweil’s predictions of achieving true machine intelligence by 2029.