Full disclosure: I have a history with AI, having flirted with it in the 1980s (remember expert systems?) and then having safely avoided the AI winter of the late 1980s by veering off into formal verification before finally landing on networking as my specialty in 1988.
And just as my Systems Approach colleague Larry Peterson has classics like the Pascal manual on his bookshelf, I still have a couple of AI books from the Eighties on mine, notably P. H. Winston’s Artificial Intelligence (1984). Leafing through that book is quite a blast, in the sense that much of it looks like it might have been written yesterday. For example, the preface begins this way:
The field of Artificial Intelligence has changed enormously since the first edition of this book was published. Subjects in Artificial Intelligence are de rigueur for undergraduate computer-science majors, and stories on Artificial Intelligence are regularly featured in most of the reputable news magazines. Part of the reason for change is that solid results have accumulated.
I was also intrigued to see some 1984 examples of “what computers can do.” One example was solving seriously hard calculus problems – notable because accurate arithmetic seems to be beyond the capabilities of today’s LLM-based systems.
If calculus was already solvable by computers in 1984, while basic arithmetic stumps the systems we view as today’s state of the art, perhaps the amount of progress in AI in the last 40 years isn’t quite as great as it first appears. (That said, there are even better calculus-tackling systems today; they just aren’t based on LLMs, and it’s unclear whether anyone refers to them as AI.)
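To make that concrete, here is a minimal sketch using the SymPy computer-algebra library (my choice for illustration; Winston's book and the 1984 systems used other tools) doing the kind of exact symbolic calculus that has been routine for decades, no neural network required.

```python
# Exact symbolic calculus with SymPy: rule-based computation, no LLM involved.
import sympy as sp

x = sp.symbols('x')

# An indefinite integral that would be tedious by hand
print(sp.integrate(x**2 * sp.exp(x), x))   # (x**2 - 2*x + 2)*exp(x)

# A limit and a derivative, also computed exactly rather than "guessed"
print(sp.limit(sp.sin(x) / x, x, 0))       # 1
print(sp.diff(sp.cos(x**2), x))            # -2*x*sin(x**2)
```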
One reason I picked up my old copy of Winston was to see what he had to say about the definition of AI, because that too is a controversial topic. His first take on this isn’t very encouraging:
Artificial Intelligence is the study of ideas that enable computers to be intelligent.
Well, OK, that’s pretty circular, since you need to define intelligence somehow, as Winston admits. But he then goes on to state two goals of AI:
- To make computers more useful.
- To understand the principles that make intelligence possible.
The first goal seems laudable, but it clearly applies to a lot of non-AI technology. The second amounts to saying that intelligence is hard to define, but maybe the study of AI will help us get a better understanding of what it is. I would go so far as to say that we are still having that debate about what constitutes intelligence 40 years later.
This debate over the meaning of “AI” continues to hang over the industry. I have come across plenty of rants to the effect that we wouldn’t need the term Artificial General Intelligence, aka AGI, if only “AI” hadn’t been so polluted by people marketing statistical models under that name. I don’t really buy this. As far as I can tell, AI has always covered a wide range of computing techniques, most of which wouldn’t fool anyone into thinking the computer was displaying human levels of intelligence.
When I started to re-engage with the field of AI about eight years ago, neural networks – which some of my colleagues were using in 1988, before they fell out of favor – had made a startling comeback, to the point where image recognition by deep neural networks had surpassed the speed and accuracy of humans, albeit with some caveats. This rise of AI led to a certain level of anxiety among my engineering colleagues at VMware, who sensed that an important technological shift was underway that (a) most of us didn’t understand, and (b) our employer was not positioned to take advantage of.
As I threw myself into the task of learning how neural networks operate (with a big assist from Rodney Brooks) I came to realize that the language we use to talk about AI systems has a significant impact on how we think about them. For example, by 2017 we were hearing a lot about “deep learning” and “deep neural networks”, and the use of the word “deep” has an interesting double meaning. If I say that I am having “deep thoughts” you might imagine that I am thinking about the meaning of life or something equally weighty, and “deep learning” seems to imply something similar.
But in fact the “deep” in “deep learning” is a reference to the depth, measured in number of layers, of the neural network that supports the learning. So it’s not “deep” in the sense of meaningful, but just deep in the same way that a swimming pool has a deep end – the one with more water in it. This double meaning contributes to the illusion that neural networks are “thinking.”
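To make “depth” concrete, below is a minimal sketch (my own, in plain NumPy, with made-up layer sizes) in which the only difference between a shallow network and a deep one is how many layers are stacked.

```python
# "Deep" is purely structural: more stacked layers, nothing philosophical.
import numpy as np

def init_network(layer_sizes, seed=0):
    """One (weights, biases) pair per layer; a longer list means a 'deeper' network."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    """Push the input through each layer in turn, with a ReLU between hidden layers."""
    for w, b in params[:-1]:
        x = np.maximum(0.0, x @ w + b)   # hidden layers
    w, b = params[-1]
    return x @ w + b                     # linear output layer

shallow = init_network([16, 8, 1])                # 2 weight layers: one hidden layer
deep    = init_network([16, 64, 64, 64, 64, 1])   # 5 weight layers: a "deeper" network

x = np.ones(16)
print(forward(shallow, x), forward(deep, x))      # same kind of output either way
```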
A similar confusion applies to “learning,” which is where Brooks was so helpful: A deep neural network (DNN) gets better at a task the more training data it is exposed to, so in that sense it “learns” from experience, but the way that it learns is nothing like the way a human learns things.
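Mechanically, that “learning” is just numerical optimization. Here is a toy sketch (entirely my own; it fits a straight line, not a Go engine) of the whole trick: nudge parameters to reduce an error measured on training examples, so more exposure to data means a better fit, with nothing resembling human understanding involved.

```python
# What "learning" means mechanically: adjust weights to reduce a loss on training data.
# Toy example: fit y = 3x + 1 by gradient descent on mean squared error.
import numpy as np

rng = np.random.default_rng(1)
xs = rng.uniform(-1, 1, size=200)
ys = 3 * xs + 1 + rng.normal(0, 0.05, size=200)   # noisy "training data"

w, b, lr = 0.0, 0.0, 0.1
for step in range(500):
    err = (w * xs + b) - ys
    # Gradients of mean squared error with respect to w and b
    w -= lr * 2 * np.mean(err * xs)
    b -= lr * 2 * np.mean(err)

print(round(w, 2), round(b, 2))   # approaches 3.0 and 1.0: "learned" from exposure, nothing more
```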
As an example of how DNNs learn, consider AlphaGo, the game-playing system that used neural networks to defeat the world’s top human Go players. According to the system’s developers, whereas a human would easily handle a change of board size (normally a 19×19 grid), even a small change would render AlphaGo impotent until it had time to train on new data from the resized board.
To me this neatly illustrates how the “learning” of DNNs is fundamentally unlike human learning, even if we use the same word. The neural network is unable to generalize from what it has “learned.” Underlining the point, an AlphaGo-style Go program was recently defeated by a human opponent who repeatedly used a style of play that had not been in its training data. This inability to handle new situations seems to be a hallmark of AI systems.
Language matters
The language used to describe AI systems continues to influence how we think about them. Unfortunately, given the reasonable pushback against recent AI hype, and some notable failures of AI systems, there may now be as many people convinced that AI is completely worthless as there are in the camp claiming AI is about to achieve human-like intelligence.
I am highly skeptical of the latter camp, as outlined above, but I also think it would be unfortunate to lose sight of the positive impact that AI systems – or, if you prefer, machine-learning systems – can have.
I am currently assisting a couple of colleagues who are writing a book on machine-learning applications for networking, and it should not surprise anyone to hear that there are lots of networking problems amenable to ML-based solutions. In particular, traces of network traffic are fantastic sources of data, and training data is the food on which machine-learning systems thrive.
Applications ranging from denial-of-service prevention to malware detection to geolocation can all make use of ML algorithms. The goal of the book is to help networking people understand that ML is not some magic powder you sprinkle on your data to get answers, but a set of engineering tools that can be selectively applied to produce solutions to real problems. In other words, it is neither a panacea nor an over-hyped placebo, and the book aims to show which ML tools are suitable for which classes of networking problems.
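As a flavor of what that looks like in practice, here is a hedged sketch using scikit-learn: a generic classifier over per-flow features. The feature names, labels, and data are invented purely for illustration and are not taken from the book; a real system would use carefully engineered features extracted from actual traffic traces.

```python
# Illustrative only: a generic malware/anomaly-style classifier over made-up
# per-flow features. The point is that ML here is ordinary engineering, not magic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([
    rng.lognormal(8, 1, n),     # bytes per flow (synthetic)
    rng.exponential(2.0, n),    # flow duration in seconds (synthetic)
    rng.integers(1, 1024, n),   # destination port (synthetic)
])
# Fake "malicious" label for the demo: the top 10% of flows by byte count
y = (X[:, 0] > np.quantile(X[:, 0], 0.9)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```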
One story that caught my eye some time back was the use of AI to help Network Rail in the UK manage the vegetation that grows alongside British railway lines. The key “AI” technology here is image recognition (to identify plant species) – leveraging the sort of technology that DNNs delivered over the past decade. Not perhaps as exciting as the generative AI systems that captured the world’s attention in 2023, but a good, practical application of a technique that sits under the AI umbrella.
My tendency these days is to try to use the term “machine learning” rather than AI when appropriate, hoping to avoid both the hype and the allergic reactions that “AI” now produces. And with the words of Patrick Winston fresh in my mind, I might just take to talking about “making computers useful.” ®