Artificial Intelligence 101
__NOTOC__
<imagemap>
File:0.2-AI 101.png|1200px|frameless|center|AI 101

rect 4 4 1005 124 [[AIoT_101|More...]]
rect 2759 230 3521 306 [[Artificial Intelligence 101|More...]]
rect 2759 306 3521 372 [[Internet of Things 101|More...]]
rect 2759 368 3521 438 [[Digital Twin 101|More...]]
rect 2759 438 3521 518 [[AIoT_Framework|More...]]
rect 2741 540 3539 673 [[Business Model Design|More...]]
rect 2741 673 3539 802 [[Product / Solution Design|More...]]
rect 2741 802 3539 935 [[Co-Creation|More...]]
rect 2741 935 3539 1068 [[Agile AIoT Grid|More...]]
rect 2741 1068 3539 1201 [[Service_Operations|More...]]
rect 2741 1201 3539 1334 [[Product_Organization|More...]]

desc none
</imagemap>


This section provides an Artificial Intelligence 101, including a basic overview, a summary of Supervised, Unsupervised and Reinforcement Learning, as well as Deep Learning and Artificial Neural Networks.


= Introduction =
Artificial Intelligence (AI) is not a new concept. Over the last couple of decades, it has experienced several hype cycles, which alternated with phases of disillusionment and funding cuts ("AI winter"). The massive investments into AI by today's hyperscalers and other companies have significantly fueled the progress made with AI, with many practical applications now being deployed.  


A highly visible breakthrough event was the development of AlphaGo (developed by DeepMind Technologies, which was later acquired by Google), which in 2015 became the first computer Go program to beat a human professional Go player without handicap on a full-sized 19×19 Go board. Until then, Go was thought of as being "too deep" for a computer to master on the professional level. AlphaGo uses a combination of machine learning and tree search techniques.


Many modern AI methods are based on advanced statistical methods. However, finding a commonly accepted definition of AI is not easy. A quip in [https://en.wikipedia.org/wiki/AI_effect Tesler's Theorem] says ''"AI is whatever hasn't been done yet"''. As computers are becoming increasingly capable, tasks previously considered to require intelligence are later often removed from the definition of AI. The traditional problems of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and the ability to move and manipulate objects<ref name="russel" />.


Probably the most relevant AI method today is Machine Learning (ML). ML refers to a set of algorithms that improve automatically through experience and by the use of data<ref name="mitchell" />. Within ML, an important category is Deep Learning (DL), which utilizes so-called ''multi-layered neural networks''. Deep Learning includes Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs), amongst others. See below for an example of a CNN.


The three most common ML methods are Supervised, Unsupervised and Reinforcement Learning. The Supervised Learning method relies on manually labeled sample data, which are used to train a model so that it can then be applied to similar, but new and unlabeled data. The unsupervised method attempts to automatically detect structures and patterns in data. With Reinforcement Learning, a trial and error approach is combined with rewards or penalties. Each method is discussed in more detail in the following sections. Some of the key concepts common to these ML methods are summarized in the table below.


[[File:1.2-AI Definitions.png|800px|frameless|center|link=|Key AI Terms and Definitions]]


= Supervised Learning =
The first AI/ML method we want to look at is Supervised Learning. Supervised Learning requires a data set with some observations (e.g., images) and the labels of the observations (e.g., classes of objects on these images, such as "traffic light", "pedestrian", "speed limit", etc.).


[[File:1.2-Supervised Learning.png|800px|frameless|center|link=|Supervised Learning]]
 
The models are trained on these labeled data sets, and can then be applied to previously unknown observations. The supervised learning algorithm produces an inference function to make predictions about new, unseen observations that are provided as input. The model can be improved further by comparing its actual output with the intended output: so-called "backward propagation" of errors.


The two main types of supervised models are regression and classification:
* Classification: The output variable is a category, e.g., "stop sign", "traffic light", etc.
* Regression: The output variable is a real continuous value, e.g., electricity demand prediction


Some widely used examples of supervised machine learning algorithms are:  
* Linear regression, mainly used for regression problems
* Random forest, mainly used for classification and regression problems
* Support vector machines, mainly used for classification problems
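To make the regression case concrete, here is a minimal sketch in Python of a supervised model fitted on labeled observations and then applied to a new, unseen input. The data set (electricity demand vs. temperature) and its values are hypothetical, chosen only for illustration:

```python
# A minimal supervised learning sketch: ordinary least squares regression,
# fitted on labeled observations (x, y) and applied to a new input.

def fit_linear(xs, ys):
    """Fit y = a*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def predict(model, x):
    a, b = model
    return a * x + b

# Hypothetical labeled training data: electricity demand (MW) vs. temperature (°C)
temps = [10, 15, 20, 25, 30]
demand = [500, 525, 550, 575, 600]

model = fit_linear(temps, demand)
print(predict(model, 22))  # demand estimate for a previously unseen observation
```

Real projects would of course use a library such as scikit-learn for this, but the principle is the same: train on labeled data, then predict on new data.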


= Unsupervised Learning =
The next ML method is Unsupervised Learning, which is a type of algorithm that learns patterns from unlabeled data. The main goal is to uncover previously unknown patterns in data. Unsupervised Machine Learning is used when one has no data on desired outcomes.   


[[File:1.2-Unsupervised Learning.png|800px|frameless|center|link=|Unsupervised Learning]]


Typical applications of Unsupervised Machine Learning include the following:
* Clustering: automatically split the data set into groups according to similarity (not always easy)
* Anomaly detection: used to automatically discover unusual data points in a data set, e.g., to identify a problem with a physical asset or equipment.
* Association mining: used to identify sets of items that frequently occur together in a data set, e.g., "people who buy X also tend to buy Y"
* Latent variable models: commonly used for data preprocessing, e.g., reducing the number of features in a data set (dimensionality reduction)
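As an illustration of clustering, the following Python sketch implements a naive one-dimensional k-means. The data points and the initialization strategy are hypothetical and chosen only to keep the example small:

```python
# Minimal unsupervised learning sketch: 1-D k-means clustering.
# No labels are provided; the algorithm discovers the grouping on its own.

def kmeans_1d(points, k=2, iters=20):
    centroids = points[:k]                       # naive initialization
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                         # assignment step: nearest centroid
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        for i, c in enumerate(clusters):         # update step: move centroids
            if c:
                centroids[i] = sum(c) / len(c)
    return centroids, clusters

# Two obvious groups (around 1 and around 10), but no labels anywhere
data = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]
centroids, clusters = kmeans_1d(data)
print(sorted(round(c, 2) for c in centroids))
```

The algorithm recovers the two groups purely from the structure of the data, which is exactly the point of unsupervised learning.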
 
= Reinforcement Learning =
The third common ML method is Reinforcement Learning (RL). In RL, a so-called Agent learns to achieve its goals in an uncertain, potentially complex environment. This can be, for example, a game-like situation, where the agent is deployed into a simulation where it receives rewards or penalties for the actions it performs. The goal of the agent is to maximize the total reward.
 
[[File:1.2-Reinforcement Learning.png|800px|frameless|center|link=|Reinforcement Learning]]


One main challenge in Reinforcement Learning is to create a suitable simulation environment. For example, the RL environment for training autonomous driving algorithms must realistically simulate situations such as braking and collisions. The benefit is that it is usually much cheaper to train the model in a simulated environment, rather than risking damage to real physical objects by using immature models. The challenge is then to transfer the model out of the training environment and into the real world.
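The reward-driven loop described above can be sketched with tabular Q-learning, one of the simplest RL algorithms. The environment here is a hypothetical five-state corridor (not from any real simulation environment), chosen only to keep the example self-contained:

```python
# Minimal reinforcement learning sketch: tabular Q-learning on a tiny
# corridor of 5 states; the agent earns a reward of +1 for reaching
# the rightmost state, and nothing otherwise.
import random

random.seed(0)
n_states, actions = 5, [-1, +1]          # move left / move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration

for _ in range(500):                     # training episodes
    s = 0
    while s != n_states - 1:
        # epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0   # reward only at the goal
        # Q-learning update rule
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

# After training, the greedy policy should always move right (toward the reward).
policy = [max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)]
print(policy)
```

Real RL environments (such as the autonomous driving simulations mentioned above) are vastly more complex, but the reward-maximization principle is the same.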


= Deep Learning and Artificial Neural Networks =
A specialized area within Machine Learning is that of so-called Artificial Neural Networks, or ANNs (often simply called Neural Networks). ANNs are vaguely inspired by the neural networks that constitute biological brains. An ANN is represented by a collection of connected nodes called neurons. The connections are referred to as edges. Each edge can transmit signals to other neurons (similar to the synapses in the human brain). The receiving neuron processes the incoming signal, and then signals other neurons that are connected to it. Signals are numbers, which are computed by statistical functions.


The connections between neurons usually carry weights, which increase or decrease the strength of the signals. The weights can be adjusted as learning proceeds. Usually, neurons are aggregated into layers, where different layers perform different transformations on their input signals. Signals travel through these layers, potentially multiple times. The adjective "deep" in Deep Learning refers to the use of multiple layers in these networks.
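This signal flow can be sketched in a few lines of Python: each neuron forms a weighted sum of its inputs and applies a non-linear function. The weights below are hypothetical; in a real network they would be learned during training (e.g., by backward propagation of errors):

```python
# Minimal sketch of signals flowing through a two-layer neural network.
import math

def sigmoid(x):
    """A common non-linear activation function, squashing values into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # One output per neuron: weighted sum of all inputs, plus a bias,
    # passed through the activation function.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hypothetical, hand-picked weights for illustration only
hidden = layer([0.5, -1.0],
               weights=[[0.8, 0.2], [-0.4, 0.9]], biases=[0.1, -0.1])
output = layer(hidden, weights=[[1.5, -2.0]], biases=[0.3])
print(output)
```

Training adjusts the weight values so that the network's output matches the intended output; the forward pass itself stays exactly this simple.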


== Example: Convolutional Neural Network ==
A popular type of ANN is the Convolutional Neural Network (CNN), which is often used for processing visual and other two-dimensional data. Another example is the Generative Adversarial Network (GAN), in which multiple networks compete with each other (e.g., in games).


[[File:1.2-CNN.png|800px|frameless|center|link=|Example: Convolutional Neural Network]]


The example shows a CNN and its multiple layers. It is used to classify areas of an input image into different categories such as "traffic light" or "stop sign". There are four main operations in this CNN:
* Convolution: Extract features from the input image, preserving the spatial relationship between pixels by using small squares of input data. Convolution is a linear operation: it performs elementwise matrix multiplication and addition.
* Non-linearity: ReLU (Rectified Linear Unit) is an operation applied after the convolution operations. ReLU introduces non-linearity in the CNN, which is important because most real-world data are non-linear.
* Spatial Pooling/down-sampling: This step reduces the dimensionality of each feature map, while retaining the most important information.
* Classification (Fully Connected Layer): The outputs from the first three layers are high-level features of the input image. The Fully Connected Layer uses these features to classify the input image into various classes based on the training dataset.
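The first three operations can be illustrated on a tiny, hypothetical 4×4 "image". The filter values below are made up for illustration; a real CNN learns them during training:

```python
# Sketch of the first three CNN operations on a tiny 4x4 "image":
# convolution (elementwise multiply-and-add), ReLU, then 2x2 max pooling.

def convolve(img, kernel):
    """Slide the kernel over the image; each output value is an
    elementwise multiply-and-add over one small patch."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(img[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(img[0]) - kw + 1)]
            for i in range(len(img) - kh + 1)]

def relu(fmap):
    """Non-linearity: clamp negative values to zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Down-sampling: keep only the maximum of each size x size block."""
    return [[max(fmap[i + di][j + dj] for di in range(size) for dj in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

image = [[1, 0, 0, 1],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [1, 0, 0, 1]]
kernel = [[1, -1], [-1, 1]]                     # hypothetical 2x2 filter

features = relu(convolve(image, kernel))        # 3x3 feature map
pooled = max_pool(features)                     # reduced, dominant activation kept
print(features)
print(pooled)
```

In a real CNN, many such filters run in parallel, and their pooled outputs feed the fully connected classification layer.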


A more detailed explanation of a similar example is provided by Ujjwal Karn on [https://www.kdnuggets.com/2016/11/intuitive-explanation-convolutional-neural-networks.html KDNuggets].


= Summary: AI & Data Analytics =
The field of data analytics has evolved over the past decades, and is much broader than just AI and data science - so it is important to understand where AI/ML fits in. From the point of view of most AIoT use cases, there are four main types of analytics: descriptive, diagnostic, predictive and prescriptive analytics. Descriptive analytics is the most basic type, using mostly visual analytics to address the question ''"What happened?"''. Diagnostic analytics utilizes data mining techniques to answer the question ''"Why did it happen?"'', providing some kind of root cause analysis. Data mining is the starting point of data science, with its own specific methods, processes, platforms and algorithms. AI - predominantly ML - often addresses the questions ''"What is likely to happen?"'' and ''"What to do about it?"''. Predictive analytics provides forecasts and predictions. Prescriptive analytics can be utilized, for example, to obtain detailed recommendations as work instructions, or even to enable closed-loop automation.
[[File:00 Analytics.png|800px|frameless|center|link=|Analytics]]


= References =


<references>
<ref name="russel">Russell, Stuart J.; Norvig, Peter (2003). ''Artificial Intelligence: A Modern Approach'' (2nd ed.). Upper Saddle River, New Jersey: Prentice Hall. ISBN 0-13-790395-2.</ref>
<ref name="mitchell">Mitchell, Tom (1997). ''Machine Learning''. New York: McGraw Hill. ISBN 0-07-042807-7. OCLC 36417892.</ref>
</references>

Revision as of 01:12, 24 June 2022
