The Main Concepts of AI and Machine Learning:
An Overview
By Martin Rupp
Blog Series 1: Machine Learning and Cyber Security: An Introduction
The WEF forecasts that the global value of AI in cyber security will reach 46 billion dollars by 2027, and that figure excludes the value AI creates by protecting critical assets. This article sheds some light on the main concepts of AI and Machine Learning.
In our first article, “Machine Learning and Cyber Security: An Introduction”, we gave a general introduction to the state of cyber security and explained why AI is increasingly entering the scene. We described specific applications and use cases where AI can complement cyber security solutions, as well as the importance of noise-free, uncompromised and properly treated data to keep AI models from drifting in the wrong direction.
Considering this substantial and growing role of AI, and particularly of Machine Learning, in cyber security, it is important to define the term AI and, more precisely, the specific machine-learning algorithms that can and will be used in cyber security applications.
Undoubtedly, AI (i.e. “Artificial Intelligence”) is a popular and very broad term. If you ask random people on the street for a definition, you may end up with something like this:
“AI? Well, that’s the machine that thinks like humans.”
“AI? Mmm, it means that computers can solve some problems, play chess, drive cars.”
“AI? That’s the brain of the robots.”
“AI? That’s how Siri works.”
… and so on…
In fact, very few people will associate AI with another term: ML (“Machine Learning”). True, AI is broader than ML; however, ML is concretely how most modern AI is done.
Of AI in general and of ML in particular
AI
There is no single definition of AI and, in fact, AI experts often battle with each other on the subject. In general, AI relates to “the intelligence of machines”, but since that intelligence is designed and programmed by humans, it is ultimately a reflection, direct or indirect, of how humans think.
Historically, AI was developed as a way to provide machines with the ability to think the same way humans do. This is very clear in the pioneering work of Alan Turing (see [1]). One of the leading edges of AI today is the artificial neural network, which started long ago with the perceptron, and the perceptron is nothing but a computer simulation of a biological neuron. This shows that AI is intrinsically linked to the human mind as a model, but only in the same way that a plane is linked to a bird.
However, AI also encompasses concepts inspired by how ‘nature thinks’ in general, such as ant colony or bee hive algorithms. Even then, it is still the human representation and interpretation of these processes that is involved.
Algorithms have existed for a long time. So where does the algorithm end and where does the AI begin? It is not always clear. Some experts dismiss AI as ‘only’ an old and banal statistical method.
Some researchers claim that AI does not yet exist, because true AI would involve machines that can think and act autonomously and eventually develop some sort of consciousness of their own, which would have far-reaching implications for humans…
Hence, what we call AI may still be at the early stages of a simulation of intelligence but, again, the debate is still ongoing.
At this moment in time, we should probably consider AI as a way for machines to solve problems in a smart way, to understand and complete certain tasks efficiently and, in other words… to do our work, or at least some part of it.
Machine Learning
When it comes to Machine Learning (ML), things are far less ambiguous. Machine Learning follows the founding principles of learning machines as enunciated by Alan Turing in his paper “Computing Machinery and Intelligence” [1]. In this paper, Turing develops most of the ideas that still rule ML today: he describes how a machine can learn by a process similar to the one humans follow and offers many analogies between machines and humans.
As in life, there are several ways machines can learn: either from lessons in a structured curriculum, i.e. from labelled examples (supervised learning), or from raw, unlabelled and unpredictable data (unsupervised learning).
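A minimal sketch of the two modes, assuming scikit-learn is available; the Iris toy dataset and the particular models are illustrative choices, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the "structured curriculum" is the set of labelled pairs (X, y).
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict(X[:3]))      # predictions guided by the labels it was taught

# Unsupervised: no labels at all; the model must find structure on its own.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print(km.labels_[:3])          # cluster ids discovered without any labels
```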
Turing mentions that the introduction of (artificial) randomness could play a significant role in the efficiency of a machine’s learning process. Following Turing, a machine that could learn would be a “child” machine and would experience, in the same way humans do, heredity, mutation, experiments and choices following those experiments. Complexity would be at stake: despite the machine’s strictly “formal” and “imperative” logic, a self-built model with its own proper logic would emerge as a result of the learning. This presupposes the initial ignorance of the child machine as well as the introduction of some randomness into the process.
There is a fundamental gap between machines that learn and machines that do not. Machines that learn are closer to our conception of ‘life’: they evolve dynamically, explore, and eventually become better and better (provided they learn from accurate material and not from fake or untrustworthy data). Machines that do not learn are timeless: they are an immutable logic, a set of algorithms, often very complex, but one that will not mutate, evolve or change.
An overview of a generic Machine Learning algorithm
Most ML algorithms can be represented in the same generic way. They all have an initial training set, which may (or may not) have been generated by an initial learning process. They all have a way to process new data, either by classifying it (the discrete case) or by regressing it (the continuous case). Finally, Machine Learning algorithms can feed themselves and keep learning throughout their lifetime, ideally improving their skills and increasing their accuracy.
This presupposes the following components, sketched in code after the list:
The data (“learning set”);
A model builder (the ML in itself);
Task(s) to perform in an environment;
The environment;
Feedback from the tasks.
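Here is a schematic sketch, not a real system, that maps these five components onto code, assuming scikit-learn is available; the synthetic data and the toy “environment” are invented purely for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# 1. The data ("learning set"): synthetic, invented for illustration.
X_train = rng.normal(size=(100, 4))
y_train = (X_train[:, 0] > 0).astype(int)

# 2. The model builder (the ML in itself).
model = SGDClassifier()
model.fit(X_train, y_train)

# 3.-5. Tasks performed in an environment, with feedback from each task.
for _ in range(10):
    x_new = rng.normal(size=(1, 4))         # the environment produces new data
    prediction = model.predict(x_new)       # the task: classify the new sample
    feedback = int(x_new[0, 0] > 0)         # feedback: the true outcome
    model.partial_fit(x_new, [feedback])    # continuous learning from feedback
```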
The different ML algorithms: a brief overview
The literature on ML is immense, so let us very briefly describe the main ML algorithms. Model-driven engineering in the context of Machine Learning is a careful and meticulous process of research and experimentation for selecting, validating and evaluating models. Both the type of input data and the type of output are critical in choosing the most suitable algorithm for a given purpose. For this reason, different algorithms can outperform one another for different prediction purposes and different training sets.
In most cases, the scope of model-driven engineering is mainly to design custom models that avoid overfitting to the training data set and that find the optimal balance between false positives and the accuracy of the output.
ANNs: Artificial Neural Networks
ANNs are based on a simulation of the known behavior of the human brain, with its synapses and neurons. This is an imperfect simulation which does not fully represent the real functioning of biological neurons, and it relies on an artificial mechanism known as back-propagation which does not exist in biological systems.
In ANNs, nodes (aka ‘neurons’) receive signals from other nodes, transform those signals using weight functions, and communicate the result to further nodes.
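A minimal sketch of a single artificial neuron in plain Python with NumPy; the input signals, weights, bias and the sigmoid activation are illustrative choices.

```python
import numpy as np

def neuron(inputs, weights, bias):
    # Weighted sum of the incoming signals (the "weight function") ...
    z = np.dot(weights, inputs) + bias
    # ... passed through a sigmoid activation before being sent onwards.
    return 1.0 / (1.0 + np.exp(-z))

# Made-up incoming signals and weights, purely for illustration.
signal = neuron(np.array([0.5, -1.2, 3.0]),
                np.array([0.4, 0.1, -0.7]),
                bias=0.2)
print(signal)  # the transformed signal communicated to the next nodes
```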
There are different types of ANNs. Here are some of the most important:
Recurrent (RNN)
Recursive
Convolutional (CNN)
Modular
A note about Deep Learning:
Deep learning generally refers to the class of ANNs that use a very large number of layers (hence the term “deep”), each layer being an aggregate of neurons. Deep learning is a very important area of application and active research in ML.
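To make the idea of depth concrete, here is a minimal sketch assuming scikit-learn; the three-hidden-layer shape (64, 64, 64) and the make_moons toy data are arbitrary illustrative choices.

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

# Three hidden layers of 64 neurons each: the stacked "aggregates of neurons".
deep_net = MLPClassifier(hidden_layer_sizes=(64, 64, 64),
                         max_iter=1000, random_state=0)
deep_net.fit(X, y)
print(deep_net.score(X, y))    # accuracy on the toy training data
```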
Bayesian Classifiers
One of the most important types of ML, using supervised learning and Bayesian probabilities as well as Bayesian statistics.
Bayesian network: a Bayesian classifier which is not “naive”, i.e. one where the conditional probabilities are not assumed to be independent.
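As a concrete illustration, here is a minimal naive Bayes sketch assuming scikit-learn; the four data points and their labels are invented for the example.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Invented toy data: two well-separated classes with two features each.
X = np.array([[1.0, 2.1], [1.2, 1.9], [7.8, 8.2], [8.1, 7.9]])
y = np.array([0, 0, 1, 1])

# GaussianNB estimates P(class | features) via Bayes' rule, under the
# "naive" assumption that features are conditionally independent.
nb = GaussianNB().fit(X, y)
print(nb.predict([[1.1, 2.0]]))   # -> class 0
```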
Hidden Markov Model Classifiers
A special type of Bayesian network used to model dynamical systems, applied for example to automatic recognition of handwriting, speech and other patterns.
Regressors
Regressors are used to predict continuous data (by statistical regression analysis), while classifiers work on a finite set of outputs which are sorted into categories.
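A minimal sketch of the distinction, assuming scikit-learn; the synthetic data and the 12.5 threshold are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.linspace(0, 10, 50).reshape(-1, 1)
y_continuous = 2.5 * X.ravel() + np.random.default_rng(0).normal(size=50)
y_categories = (y_continuous > 12.5).astype(int)   # the same data, binned

# Regressor: predicts a real number.
print(LinearRegression().fit(X, y_continuous).predict([[4.0]]))
# Classifier: predicts one of a finite set of categories.
print(LogisticRegression().fit(X, y_categories).predict([[4.0]]))
```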
Evolution learning
Evolution learning is a class of ML meta-heuristic algorithms which often use collective computational systems as found in nature, involving fitness and evolution mechanisms. Here is a list of evolutionary algorithms:
- Ant Colony Optimization
- Swarm Intelligence
- Genetic algorithms
- Bee colony algorithms
Additionally, ML algorithms can themselves evolve through a “Darwinian” process. Evolution learning is often linked to multi-agent AI.
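To make the fitness-and-mutation idea concrete, here is a toy genetic algorithm in plain Python; the genome length, population size, mutation rate and “count the ones” fitness function are all made-up illustrative choices.

```python
import random

def fitness(bits):
    # Fitness of an individual: here simply the number of 1s in its genome.
    return sum(bits)

def mutate(bits, rate=0.05):
    # Flip each bit with a small probability ("mutation").
    return [b ^ (random.random() < rate) for b in bits]

# Initial random population: 30 individuals, 20-bit genomes.
population = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]

for generation in range(50):
    population.sort(key=fitness, reverse=True)   # selection: fittest first
    parents = population[:10]
    # Reproduction with mutation ("heredity" plus random variation).
    population = [mutate(random.choice(parents)) for _ in range(30)]

print(fitness(max(population, key=fitness)))     # score of the best genome found
```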
Tree Search:
Tree search algorithms can be combined with Monte Carlo methods to perform Monte Carlo tree search (MCTS). Tree search is used in a wide range of Machine Learning algorithms; a short sketch of the tree-based classifiers follows the list below.
See also:
- Decision Tree Classifiers
- Random Forest Classifiers
- ID3 classifier
- AdaBoost
- XGBoost
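As a brief illustration of two entries from this list, here is a hedged sketch assuming scikit-learn; the load_wine toy dataset and the hyperparameters are arbitrary choices.

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_wine(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)          # a single tree
forest = RandomForestClassifier(n_estimators=100).fit(X, y)   # an ensemble
print(tree.score(X, y), forest.score(X, y))   # training accuracy of each
```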
SVMs:
SVMs are classifiers which construct hyperplanes in a multidimensional space to separate two categories of data. They are the best representatives of kernel-method-based ML; a short sketch follows the list below.
See also:
- KNN (K-nearest-neighbour) Classifiers
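A minimal sketch of the hyperplane idea, assuming scikit-learn; the blob data and the linear kernel are illustrative choices.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two synthetic clusters of points, one per category.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

svm = SVC(kernel="linear").fit(X, y)   # fits the separating hyperplane
print(svm.support_vectors_.shape)      # the few points that define the margin
```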
Deep Reinforcement Learning:
DRL is simply the combination of deep learning and reinforcement learning: deep neural networks approximate the value functions or policies that a reinforcement-learning agent uses to choose its actions.
Fusion Classifiers:
Meta-classifiers that combine several types of classifiers to get the “best of two worlds” (or the best of several worlds…) and maximise the overall classification accuracy; a voting sketch follows the list below.
See also:
- Voting methods
- Hierarchical mixture of experts (HME)
- Meta algorithms
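As an illustration of the simplest fusion strategy (majority voting), here is a sketch assuming scikit-learn; the particular trio of base classifiers is an arbitrary example, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three different classifier families vote; the majority label wins.
fusion = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier()),
], voting="hard")
fusion.fit(X, y)
print(fusion.score(X, y))
```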
In the next article of this series, we will explore why the support of machine learning is needed and how it is already creating value in cyber security.
References
[1] Turing, Alan (October 1950), “Computing Machinery and Intelligence”, Mind, LIX (236), pp. 433–460.
Author: Martin Rupp
Martin Rupp is a cryptographer, mathematician and cyber-scientist. He has been developing and implementing cybersecurity solutions for banks and security-relevant organizations for 20 years, both as an independent consultant and through Anevka and SCD, the companies he founded.
Martin currently researches the application of Machine Learning and Blockchain in Cybersecurity.