Can AI Neural Networks Help Explain How Humans Process Abstract Thought?

One of the oldest questions in philosophy concerns the source of human knowledge: Where does our knowledge come from? How do we gain knowledge about the world? How is that knowledge justified? Historically, answers to these questions have fallen into two main philosophical traditions: rationalism and empiricism.

According to rationalists such as Plato or René Descartes, knowledge is innate and independent of experience: it is gained through the rigorous application of logic rather than through the senses.

On the other hand, empiricists, such as John Locke or David Hume, argue that human knowledge begins primarily with experience. Empiricists have often argued that complex abstract thought derives from sensory experience coupled with a psychological faculty of “abstraction.” Rationalists, in turn, have criticized empiricist views for failing to adequately explain the nature of this faculty. Now, in a new paper published in the journal Synthese, philosopher Cameron Buckner argues that Deep Convolutional Neural Networks (DCNNs) in current AI research finally give empiricists the tools to provide a scientific account of how abstract thought is derived from sensory experience in humans.

In the paper, Buckner argues that the complex organization of DCNNs allows them to implement a form of hierarchical processing that he calls “transformational abstraction”: the conversion of a sensory-based representation into higher-order representations that are resistant to “nuisance variation.” Such representations allow an AI to more accurately identify objects in its environment, even given a wide range of differences in input. Buckner argues that the computational architecture underlying DCNNs can be applied to human thought to explain how humans go from having sensory experiences of a particular object to having an abstract general concept of that object that can be used to identify different members of that object class. In other words, Buckner’s work attempts to bridge the gap between modern work in AI and classical empiricist positions in the philosophy of mind to give a comprehensive empiricist account of how the human mind generates abstract concepts.

From Sensory Experience To Knowledge

Here is a thought experiment: Imagine (for the sake of the experiment) that you have never seen or heard of a chair. One day you walk into a restaurant with your friend and you ask what all the strangely shaped wooden objects on the ground are. They laugh and tell you those are chairs. Later, you and your friend go to a bar and you see the barstools. You ask again what those things are, and your friend tells you they are a kind of chair. Eventually, after seeing a number of different kinds of chairs, you come to be able to independently recognize chairs in your environment, even given the different appearances of different kinds of chairs. Where and how, exactly, during this process did you go from having separate experiences of chairs to unifying them under a general abstract concept of “chair” that you can use to identify objects in future experiences? This is the main question that Buckner’s work examines, and he argues that recent advances in AI neural networks can provide the answer.

AI researchers have noticed that computers actually have a very difficult time recognizing everyday shapes and objects like triangles, chairs, and cats because they can be encountered with so many different variations in vantage point, color, and orientation. A chair does not look the same from every direction, so how do you get an AI to unify those diverse perspectives under a single category, “chair”? Modern AI researchers employ DCNNs to get around these problems.

A DCNN is a kind of connectionist neural network. Connectionist neural networks describe mental phenomena as interconnected networks of simple, uniform units. The general idea of connectionist networks is that mental states can be represented as numeric “activation” values across a network of simple units. Connectionist networks are popular in neural modeling because the network can be seen as representing the brain’s neurons (individual units) and the synaptic connections between them (connections between the units).
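To make the idea of “activation values across a network of simple units” concrete, here is a minimal sketch (not from Buckner’s paper): each unit’s activation is a weighted sum of the activations feeding into it, passed through a simple nonlinearity. The `layer` function, the weight values, and the input activations are all invented for illustration.

```python
import numpy as np

# Each downstream unit's activation is a weighted sum of upstream
# activations, passed through a nonlinearity (here, ReLU: negative
# sums are clipped to zero, loosely analogous to a neuron not firing).
def layer(activations, weights):
    return np.maximum(0, weights @ activations)

# Three input units with some activation values (e.g., sensory input),
# connected to two downstream units by a matrix of connection weights.
inputs = np.array([0.2, 0.9, 0.5])
weights = np.array([[ 0.5, -0.3, 0.8],   # unit A's incoming connections
                    [-0.6,  0.7, 0.1]])  # unit B's incoming connections

print(layer(inputs, weights))  # activations of units A and B
```

Learning, in this picture, amounts to adjusting the connection weights so that the right downstream units become active for the right inputs.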

DCNNs differ from other connectionist networks in that they incorporate multiple levels of processing among the individual units in the network. They can do this because they are able to single out specific information in a given signal. DCNNs take some piece of visual information fed to a computer and, via a computation called a convolution operation, amplify the presence of certain features while minimizing information related to other features. For example, a DCNN fed visual information from a cube might focus on the corners of the cube and maximize the visual information that corresponds to those corners, while minimizing information related to the faces of the cube. A DCNN then takes different instances of this amplified information and feeds it into a “complex” unit, one that takes inputs from nearby simple units and aggregates them to detect those features across simple units. The result is that, over time, the network becomes more reliably able to detect objects as the complex units extract more and more features from the amplified information fed into them by the simple units. DCNNs operating according to these principles have shown remarkable success in recognizing everyday objects like chairs and tables.
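The simple-unit/complex-unit mechanism described above can be sketched in a few lines of code. The following toy example (my own illustration, not code from the paper) implements a convolution by hand and then aggregates its output with max pooling, a common way of modeling complex units. The helper names, the 2×2 “bright patch” detector, and the toy images are all assumptions made for the sake of the demonstration.

```python
import numpy as np

# "Simple units": slide a small feature detector (kernel) across the
# image, amplifying locations where the feature is present.
def convolve2d(image, kernel):
    h, w = kernel.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

# "Complex units": max pooling aggregates nearby simple-unit responses,
# so a pooled unit fires if the feature appears anywhere in its window.
def max_pool(fmap, size=2):
    H, W = fmap.shape
    return np.array([[fmap[i:i + size, j:j + size].max()
                      for j in range(0, W - size + 1, size)]
                     for i in range(0, H - size + 1, size)])

# Two 6x6 images containing the same 2x2 bright patch, shifted by one pixel.
img_a = np.zeros((6, 6)); img_a[0:2, 0:2] = 1.0
img_b = np.zeros((6, 6)); img_b[0:2, 1:3] = 1.0

detector = np.ones((2, 2))  # a crude "bright patch" detector
pooled_a = max_pool(convolve2d(img_a, detector))
pooled_b = max_pool(convolve2d(img_b, detector))

# The raw convolution maps differ, but the pooled responses peak at the
# same value in the same window: the feature is detected despite the shift.
print(pooled_a[0, 0], pooled_b[0, 0])  # 4.0 4.0
```

This is the kernel of what Buckner calls resistance to nuisance variation: the complex unit responds the same way even though the raw sensory input has changed.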

These AI techniques may sound impressive, but why should we think that they can be extended to explain aspects of actual human cognition? The connection lies in the finding that the mammalian visual cortex contains two different kinds of cells that respond to sensory information: simple cells and complex cells. Simple cells take in the immediate visual information from light and feed it into complex cells, which, in turn, modulate the firing of the simple cells via neurochemical inhibitory processes. Complex cells possess an invariant field of activation and inhibition, meaning that they will fire over a wider range of circumstances than simple cells. Essentially, this means that complex cells in the visual cortex respond to general features under a varying range of presentations.

The general structure of these feedback mechanisms looks remarkably similar to the structure of a DCNN with its simple and complex units, and in fact, initial research into DCNNs was motivated by the discovery of these two types of cells in the mammalian visual cortex. Buckner argues that DCNNs can be seen as representing the cognitive machinery that the human brain uses to generate general abstract concepts.

If correct, Buckner’s work provides a profound insight into the nature of mammalian cognition. Buckner originally began his career as a computer scientist, but his observation of the difference between early AI and how humans actually solve problems motivated his transition to philosophy. According to Buckner, DCNNs are unique “because they can acquire the kind of subtle, abstract, intuitive knowledge of the world that comes automatically to humans but has until now proven impossible to program into computers.” Modern AI is already able to perform some impressive tasks, but the Holy Grail of AI research is to create an intelligence that can faithfully recreate the kind of intelligence found in human beings.