Artificial Intelligence: So many buzzwords, so little difference

May. 4, 2017 | by Dr. P. Bangert

Artificial intelligence, machine learning, deep learning - these terms are often used interchangeably when talking about intelligent analytics and machine automation. But are they equivalent? The answer is: mostly. Deep learning is a part of machine learning, which is itself a part of artificial intelligence. That answer is only somewhat useful, however, so let's take a deeper dive, starting from the most general of these terms and drilling down to the most specific.

The most general of these terms, artificial intelligence (AI) is the attempt to make machines that are able to think and learn like humans. AI in its complete form would be versatile and adaptable, able to generalize the knowledge acquired in solving one problem to the solution of a different, not obviously connected problem.

The first definition of artificial intelligence is often attributed to the famous test outlined by Alan Turing. That test posits that if a human being can have a conversation with a machine without realizing that it is a machine, then the machine is an AI. Thus the Turing test replaces the hard issue of "thinking" like a human being with the much simpler idea of "acting" like a human being.

One quickly runs into trouble with that test, however, as with any other proposed general "test" for the completeness of artificial intelligence. In its most common form (which, it should be noted, was developed from Turing's idea but is not a test he proposed), the Turing test could theoretically be satisfied by a sufficiently advanced chatbot that uses pattern matching to hold a coherent conversation. In fact, there have already been examples of chatbots arguably passing the test.
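To give a feel for what "pattern matching" means here, the following is a minimal sketch of a rule-based responder in Python. The rules, the canned replies, and the names RULES, FALLBACK and reply are invented for illustration. They are not taken from any real chatbot, and a program this small would of course fool nobody for long.

```python
import re

# Hypothetical rules: a regular expression and a reply template.
# The program reuses the user's own words and understands nothing.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE),
     "Hello. What would you like to talk about?"),
]
FALLBACK = "Tell me more."


def reply(message):
    """Return the reply of the first rule whose pattern matches the message."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Strip trailing punctuation from the captured text before reusing it.
            captured = [group.rstrip(".!?") for group in match.groups()]
            return template.format(*captured)
    return FALLBACK


if __name__ == "__main__":
    for line in ["Hi there", "I feel tired today.", "The weather is nice"]:
        print(line, "->", reply(line))
```

The replies look conversational only because they echo fragments of the input. Nothing in the program represents what the words mean.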

At the same time, however, grasping the actual meaning of informal sentences is tremendously difficult for machines. Every word matters, as do word order and the context in which the sentence is spoken. In fact, in terms of how specialists use the terminology, the term AI nowadays is almost exclusively reserved for natural language processing. That is because understanding natural language is what is often called "AI-complete," meaning that if you can solve it, you are much closer to solving the issue of general intelligence. Some specialists even think that cracking natural language processing is equivalent to creating a true AI. So the Turing test, in its spirit, is actually a pretty good idea of what we still think would constitute an intelligent machine. It just cannot be used as a practical test.

In practice, however, while most of the hype about AI has been concentrated on the sought-after (and feared, even by great minds like Stephen Hawking) general AI, most of the actual development has happened in the much more practical field of intelligent tools, also known as specialized (or weak) AI. From Siri to programs that automatically trade stocks, chess-playing algorithms, and parking assistance or autonomous driving software, weak AIs are all around us. They are specialized tools that excel at a particular task. Curiously, very often those tasks are ones humans are not particularly good at. That, again, reveals a fundamental problem in the way we understand human and machine intelligence. Tasks like playing chess or Go, or analyzing huge amounts of data, which are extremely challenging or downright impossible for our meat-based hardware, are much easier for modern computing algorithms. Yet we still do not have a computer that possesses the level of common sense or independent reasoning of a four-year-old.

Getting a bit more specific, machine learning is a branch of AI that is, at its core, a sophisticated approach to data analysis. It has existed since the 1980s, although it has really gained power and recognition only since 2010, as advances in computing power enabled it to reach its full potential. It broadly utilizes a set of algorithmic approaches called neural networks to automate the building of analytical models, using algorithms that can learn from data. It thereby enables programs to detect key insights without being told exactly what they are looking for. In our article on neural networks we explained how this kind of architecture is similar to the structure and functioning of the human brain. In short, neural networks are computer systems that work by classifying information and that can adapt themselves when exposed to new information. The neural network model also allows programs to project their learning into the future by making predictions based on their previous knowledge. Such an approach to machine intelligence has the advantage of simulating the versatility and adaptability of the human mind while still retaining the innate advantages computers have over us - speed and accuracy of processing, and lack of experience-based biases. It is not the approach that is likely to yield the independent, self-actualizing machine mind that science fiction authors envision, but it is an approach that is proving enormously powerful in solving many issues today.

There are several different types of neural networks. The simplest and most straightforward type is the perceptron network, which in its common multilayer form has one so-called hidden layer of neurons between the input and the output. In spite of their simplicity, these networks can represent most commonly occurring data relationships very accurately. Perceptron networks are responsible for the bulk of neural network methods currently used in practical applications.
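As an illustration of what a single hidden layer buys, here is a minimal sketch of such a network in plain Python. It computes the exclusive-or (XOR) of two binary inputs, a relationship that one neuron on its own cannot represent. The weights are picked by hand for clarity rather than learned from data, and all names (step, forward, HIDDEN_WEIGHTS and so on) are made up for this example.

```python
def step(value):
    """Threshold activation: the neuron fires (1) if its input is positive."""
    return 1 if value > 0 else 0


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: inputs -> hidden layer -> single output neuron."""
    hidden = [
        step(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return step(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hand-picked (not learned) weights, an assumption made for illustration.
# Hidden neuron 1 acts as OR, hidden neuron 2 acts as AND.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [-0.5, -1.5]
# The output neuron fires when OR is on and AND is off, which is XOR.
OUTPUT_WEIGHTS = [1.0, -1.0]
OUTPUT_BIAS = -0.5


def xor(a, b):
    return forward([a, b], HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor(a, b))
```

In practice the weights would be found by a training algorithm from example data. The structure of the computation stays the same.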

Deep learning, which is itself a type of machine learning, relies on another type of neural network that is enjoying quite the hype lately. The crucial characteristic of a deep learning network is that it contains multiple layers of hidden neurons between the input and the output, with each layer using the output of the previous layer as its input. Research in this area is mostly driven by the need to process large-scale unlabeled data, so as to tackle the problems that are most difficult for machine learning to handle - image classification, voice recognition, or even picking an eye-pleasing still frame from a video.
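The layer-feeds-layer idea can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names, the ReLU activation, and the tiny hand-picked weights in LAYERS are assumptions for the example, and real deep networks have many more neurons per layer and learn their weights from data.

```python
def relu(values):
    """Rectified linear activation: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + bias
        for row, bias in zip(weights, biases)
    ]


def deep_forward(inputs, layers):
    """Feed the output of every layer into the next one."""
    activations = inputs
    for weights, biases in layers[:-1]:  # hidden layers
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]  # output layer, no activation
    return dense(activations, weights, biases)


# Hypothetical hand-picked weights: two hidden layers and one output layer.
# Together they happen to compute 2 * |a - b| + 1 for an input [a, b].
LAYERS = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer 1: 2 neurons
    ([[1.0, 1.0]], [0.0]),                     # hidden layer 2: 1 neuron
    ([[2.0]], [1.0]),                          # output layer
]

if __name__ == "__main__":
    print(deep_forward([3.0, 1.0], LAYERS))
    print(deep_forward([1.0, 4.0], LAYERS))
```

Adding depth means adding entries to the list of layers. The loop that passes each output on as the next input does not change.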

And lastly, another interesting type of network, and one very relevant to algorithmica's work, is the recurrent neural network (RNN). We won't go into too much detail on the differences, as they can get excessively technical, but briefly: conventional neural networks relate input to output but cannot take into account that the various inputs might themselves be interrelated. If, for example, the inputs are observations of the same thing at different times, we would want to represent this inherent time-dependency in some way. This can be done with RNNs, as they contain loops in the network of interconnected neurons. Loops of short or long length in the network act essentially like short- or long-term memory in the brain. Modeling dynamical processes, in which cause-effect relationships need to be represented, calls for this facility. This makes RNNs very powerful tools for modeling many real-life problems, including most complex industrial processes.
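A minimal sketch of such a loop, in plain Python and reduced to a single recurrent neuron, might look as follows. The weights are arbitrary values chosen for illustration, and the names (run_rnn, input_weight, recurrent_weight) are invented for this example. The point is only that the neuron's previous output is fed back in alongside each new observation.

```python
import math


def run_rnn(sequence, input_weight=1.0, recurrent_weight=0.5, bias=0.0):
    """Process observations in time order, carrying a hidden state along.

    At each step the new state depends on the current observation and,
    through the loop, on the state left behind by all earlier observations.
    """
    state = 0.0
    states = []
    for observation in sequence:
        state = math.tanh(input_weight * observation
                          + recurrent_weight * state
                          + bias)
        states.append(state)
    return states


if __name__ == "__main__":
    # The same observations in a different order give different results,
    # because the network remembers what it has already seen.
    print(run_rnn([1.0, 0.0, 0.0]))
    print(run_rnn([0.0, 0.0, 1.0]))
```

In the first sequence the final observation is zero, yet the final state is not, because a fading trace of the earlier observation is still circulating in the loop. A conventional network that sees only the current input could not tell the two histories apart.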

Finally, it is important to note that while experts do distinguish between these terms and use them differently, in popular use they are indeed interchangeable. In fact, the term deep learning has largely been used to "spruce up" and rebrand neural networks, a technology that already existed in the 1980s, as something new and exciting.