AI 101

This section provides an Artificial Intelligence 101, including a basic overview, a summary of Supervised, Unsupervised and Reinforcement Learning, as well as Deep Learning and Artificial Neural Networks.

Introduction

Artificial Intelligence (AI) is not a new concept. Over the last couple of decades, it has experienced several hype cycles, which alternated with phases of disillusionment and funding cuts ("AI winter"). The massive investments into AI by today's hyperscalers and other companies have significantly fueled the progress made with AI, with many practical applications now being deployed.

A highly visible breakthrough event was the development of AlphaGo (developed by DeepMind Technologies, which was later acquired by Google), which in 2015 became the first computer Go program to beat a human professional Go player without handicap on a full-sized 19×19 Go board. Until then, Go was thought to be "too deep" for a computer to master at the professional level. AlphaGo uses a combination of machine learning and tree search techniques.

Many modern AI methods are based on advanced statistical methods. However, finding a commonly accepted definition of AI is not easy. Tesler's Theorem quips that "AI is whatever hasn't been done yet": as computers become increasingly capable, tasks previously considered to require intelligence are often removed from the definition of AI. The traditional problems of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and the ability to move and manipulate objects [1].

Arguably the most relevant AI method today is Machine Learning (ML). ML refers to a set of algorithms that improve automatically through experience and by the use of data [2]. Within ML, an important category is Deep Learning (DL), which utilizes so-called multi-layered neural networks. Deep Learning includes Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs), amongst others. See below for an example of a CNN.

The three most common ML methods are Supervised, Unsupervised and Reinforcement Learning. Supervised Learning relies on manually labeled sample data, which are used to train a model so that it can then be applied to similar, but new and unlabeled data. Unsupervised Learning attempts to automatically detect structures and patterns in data. Reinforcement Learning combines a trial-and-error approach with rewards or penalties. Each method is discussed in more detail in the following sections. Some of the key concepts common to these ML methods are summarized in the following table.

Key AI Terms and Definitions

Supervised Learning

The first AI/ML method we want to look at is Supervised Learning. Supervised Learning requires a data set with some observations (e.g., images) and the labels of the observations (e.g., classes of objects on these images, such as "traffic light", "pedestrian", "speed limit", etc.).

Supervised Learning

The models are trained on these labeled data sets and can then be applied to previously unknown observations. The supervised learning algorithm produces an inference function to make predictions about new, unseen observations that are provided as input. The model can be improved further by comparing its actual output with the intended output: the so-called "backward propagation" of errors (backpropagation).

The two main types of supervised models are classification and regression:

  • Classification: The output variable is a category, e.g., "stop sign", "traffic light", etc.
  • Regression: The output variable is a real continuous value, e.g., electricity demand prediction

Some widely used examples of supervised machine learning algorithms are (a minimal code sketch follows the list):

  • Linear regression, mainly used for regression problems
  • Random forest, mainly used for classification and regression problems
  • Support vector machines, mainly used for classification problems
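
As a minimal sketch of the supervised workflow, assuming the scikit-learn library and using a synthetic, purely illustrative data set, a random forest classifier can be trained on labeled observations and then applied to previously unseen ones:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate a labeled toy data set: observations X and labels y.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train a random forest on the labeled training data ...
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# ... and apply it to previously unseen observations.
print("Accuracy on unseen data:", model.score(X_test, y_test))
```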

Unsupervised Learning

The next ML method is Unsupervised Learning, which is a type of algorithm that learns patterns from unlabeled data. The main goal is to uncover previously unknown patterns in data. Unsupervised Machine Learning is used when one has no data on desired outcomes.

Unsupervised Learning

Typical applications of Unsupervised Machine Learning include the following (a minimal clustering sketch follows the list):

  • Clustering: automatically split the data set into groups according to similarity; choosing a suitable similarity measure is often the hard part
  • Anomaly detection: used to automatically discover unusual data points in a data set, e.g., to identify a problem with a physical asset or equipment
  • Association mining: used to identify sets of items that frequently occur together in a data set, e.g., "people who buy X also tend to buy Y"
  • Latent variable models: commonly used for data preprocessing, e.g., reducing the number of features in a data set (dimensionality reduction)
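
The clustering sketch, again assuming scikit-learn and using synthetic data, shows how unlabeled observations can be split into groups automatically:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Unlabeled data: no desired outcomes are provided to the algorithm.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Automatically split the data set into 3 groups according to similarity.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
print(labels[:10])  # cluster assignment of the first 10 observations
```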

Reinforcement Learning

The third common ML method is Reinforcement Learning (RL). In RL, a so-called Agent learns to achieve its goals in an uncertain, potentially complex environment. This can be, for example, a game-like situation, where the agent is deployed into a simulation where it receives rewards or penalties for the actions it performs. The goal of the agent is to maximize the total reward.

Reinforcement Learning

One main challenge in Reinforcement Learning is to create a suitable simulation environment. For example, the RL environment for training autonomous driving algorithms must realistically simulate situations such as braking and collisions. The benefit is that it is usually much cheaper to train the model in a simulated environment, rather than risking damage to real physical objects by using immature models. The challenge is then to transfer the model out of the training environment and into the real world.
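
To make the core idea concrete, the following is a minimal sketch of tabular Q-learning, one classic RL algorithm, on a hypothetical five-state "corridor" environment (states, actions and reward values are illustrative assumptions): through trial and error, the agent learns that moving right toward the goal maximizes its total reward.

```python
import random

n_states, actions = 5, [0, 1]          # actions: 0 = move left, 1 = move right
Q = [[0.0, 0.0] for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

for episode in range(500):
    state = 0
    while state != n_states - 1:       # episode ends at the goal state
        # Trial and error: explore randomly with probability epsilon.
        action = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda a: Q[state][a])
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == n_states - 1 else 0.0  # reward only at the goal
        # Update the action-value estimate toward reward + discounted future value.
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print(Q)  # the learned values favor moving right toward the goal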

Deep Learning and Artificial Neural Networks

A specialized area within Machine Learning is so-called Artificial Neural Networks, or ANNs (often simply called Neural Networks). ANNs are vaguely inspired by the neural networks that constitute biological brains. An ANN is represented by a collection of connected nodes called neurons. The connections are referred to as edges. Each edge can transmit signals to other neurons (similar to the synapses in the human brain). The receiving neuron processes the incoming signal and then signals other neurons connected to it. Signals are numbers; the output of each neuron is computed as a (typically non-linear) function of the sum of its weighted inputs.

The edges between neurons usually carry weights, which increase or decrease the strength of the signals they transmit. The weights are adjusted as learning proceeds. Usually, neurons are aggregated into layers, where different layers perform different transformations on their input signals. Signals travel through these layers, potentially multiple times. The adjective "deep" in Deep Learning refers to the use of multiple layers in these networks.
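
A minimal sketch in plain NumPy may help to illustrate these concepts: signals travel as numbers through two weighted layers, with each neuron applying a simple non-linear function to the sum of its weighted inputs (the weights here are random; in practice they would be adjusted during learning).

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(4)                  # input signals (4 features)
W1 = rng.random((8, 4))            # weighted edges: input layer -> hidden layer
W2 = rng.random((3, 8))            # weighted edges: hidden layer -> output layer

hidden = np.maximum(0, W1 @ x)     # each neuron sums weighted inputs, then applies ReLU
output = W2 @ hidden               # the output layer transforms the hidden signals
print(output)
```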

A popular type of ANN is the Convolutional Neural Network (CNN), which is often used for processing visual and other two-dimensional data. Another example is the Generative Adversarial Network (GAN), in which multiple networks compete with each other in a game-like setup.

Example: Convolutional Neural Network

The example shows a CNN and its multiple layers. It is used to classify areas of an input image into different categories such as "traffic light" or "stop sign". There are four main operations in this CNN (a minimal code sketch follows the list):

  • Convolution: Extract features from the input image, preserving the spatial relationship between pixels by using small squares of input data. Convolution is a linear operation: it performs elementwise multiplication and summation.
  • Non Linearity: ReLU (Rectified Linear Unit) is an operation applied after the convolution operations. ReLU introduces non-linearity in the CNN, which is important because most real-world data are non-linear.
  • Spatial Pooling/down-sampling: This step reduces the dimensionality of each feature map, while retaining the most important information.
  • Classification (Fully Connected Layer): The outputs from the first three layers are high-level features of the input image. The Fully Connected Layer uses these features to classify the input image into various classes based on the training dataset.
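
The sketch below mirrors these four operations, assuming PyTorch; the layer sizes and the two example classes are illustrative assumptions, not the architecture from the figure.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=2):  # e.g., "traffic light", "stop sign"
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # 1. Convolution
        self.relu = nn.ReLU()                                   # 2. Non-linearity
        self.pool = nn.MaxPool2d(2)                             # 3. Spatial pooling
        self.fc = nn.Linear(16 * 16 * 16, num_classes)          # 4. Fully connected

    def forward(self, x):
        x = self.pool(self.relu(self.conv(x)))  # extract and down-sample features
        x = x.flatten(1)                        # flatten the feature maps per image
        return self.fc(x)                       # classify into the output classes

model = SimpleCNN()
scores = model(torch.randn(1, 3, 32, 32))  # one random 32x32 RGB "image"
print(scores.shape)                        # torch.Size([1, 2])
```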

A more detailed explanation of a similar example is provided by Ujjwal Karn on KDNuggets.

Deep Learning: Generative AI

Generative AI is a subset of Deep Learning. It is a type of Artificial Intelligence that creates new content based on what it has learned from existing content. During training on large-scale datasets, the learning process abstracts the probability distributions of the data, producing a statistical model. Users usually interact with a Generative AI via a so-called prompt, for example a question asked to the system. The prompt can include specifics such as how the answer should be structured, e.g., the number of words to be generated.

Given a prompt, the generative AI utilizes this statistical model to predict the expected response, thereby generating new content.

Generative models can be divided into two types: generative language models and generative image/video models.

Generative language models build on Natural Language Processing (NLP) techniques to generate new text by learning the rules and patterns of a given language. These models, also called Large Language Models (LLMs), usually have billions of parameters or more and are trained on large-scale textual data such as news, articles, novels, and web content. Examples of currently popular LLMs are GPT-3 and GPT-4 (OpenAI / ChatGPT), LaMDA (Google Bard), and LLaMA (Meta).

Example: ChatGPT

The fundamental concept behind these LLMs is a deep learning model known as the Transformer architecture. During model training, LLMs encode each word in the input text and convert it into a vector representation called a Word Embedding. The Transformer architecture uses an attention mechanism to better understand the correlation between different Word Embeddings and can effectively handle long-range dependencies in text. Through this mechanism, LLMs can infer the correct context in the training task to accurately predict the probability distribution of the next word. In a real-world application, such as a chatbot, the user input is first converted into a complete sequence of text and then fed to the LLM, which uses the previous words to predict the next one until a complete response is generated.
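
The core of the attention mechanism can be sketched in a few lines of NumPy. This is a simplified, illustrative version of scaled dot-product self-attention, with random vectors standing in for real Word Embeddings:

```python
import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # how strongly words relate
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the sequence
    return weights @ V                                # weighted blend of the values

rng = np.random.default_rng(0)
embeddings = rng.random((5, 64))   # 5 "words", each as a 64-dimensional vector
context = attention(embeddings, embeddings, embeddings)  # self-attention
print(context.shape)               # (5, 64): one context-aware vector per word
```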

Example: Stable Diffusion

Generative image models are based on Computer Vision (CV) techniques and generate new images by learning the features and structure of existing images. A more classical approach is the Generative Adversarial Network (GAN): it comprises a generator that produces fake images and a discriminator that distinguishes between real and fake images, with both networks competing over many training iterations to generate realistic images. Diffusion Models received extensive attention in 2023, most strikingly for their excellent performance on the text-to-image task. They pass randomly sampled Gaussian noise into the model and then generate data by learning the denoising process. The outputs of OpenAI's DALL-E 2, Google Brain's Imagen, and StabilityAI's Stable Diffusion are approaching the quality of real photographs and human-drawn art.
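
The adversarial idea behind GANs can be sketched as follows, assuming PyTorch; the 1-D toy data and all layer sizes are illustrative assumptions. A generator maps noise to fake samples, a discriminator learns to tell real from fake, and each network is trained against the other:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))  # noise -> fake sample
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # sample -> realness score
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) + 3.0           # toy stand-in for "real" data
    fake = G(torch.randn(64, 8))              # generator samples from noise
    # Train the discriminator to score real samples as 1 and fakes as 0.
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + \
             loss_fn(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Train the generator to make the discriminator label fakes as real.
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```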

Generative AI has a large market in text (general writing, note-taking, marketing, sales, support, etc.), code (code generation, code documentation, text-to-SQL, web application builders, etc.), images (image generation, media, advertising, design, etc.), speech, video, 3D, and more.

An outlook on the use of Generative AI in manufacturing is given by BCG here.

Summary: AI & Data Analytics

The field of data analytics has evolved over the past decades and is much broader than just AI and data science - so it is important to understand where AI/ML fits in. From the point of view of most AIoT use cases, there are four main types of analytics: descriptive, diagnostic, predictive and prescriptive. Descriptive analytics is the most basic one, using mostly visual analytics to address the question "What happened?". Diagnostic analytics utilizes data mining techniques to answer the question "Why did it happen?", providing some kind of root cause analysis. Data mining is the starting point of data science, with its own specific methods, processes, platforms and algorithms. AI - predominantly ML - often addresses the questions "What is likely to happen?" and "What should be done about it?". Predictive analytics provides forecasts and predictions. Prescriptive analytics can be utilized, for example, to obtain detailed recommendations as work instructions, or even to enable closed-loop automation.

Analytics

References

  1. Russell, Stuart J.; Norvig, Peter (2003), Artificial Intelligence: A Modern Approach (2nd ed.), Upper Saddle River, New Jersey: Prentice Hall, ISBN 0-13-790395-2
  2. Mitchell, Tom (1997). Machine Learning. New York: McGraw Hill. ISBN 0-07-042807-7. OCLC 36417892.