There is so much excitement about deep learning networks in the last few years that we forget the bigger picture. Let’s look at examples of some things I would expect an intelligent agent should be able to do:
- We show the AI a picture of a dog standing on a skateboard. A modern state of the art image classifier can draw bounding boxes over the dog and over the skateboard and correctly label each bounding box with a wordnet synset for the detected object
- Then we tell the AI in English “The dog’s name is Spot.”
- Finally we ask our AI “Can Spot ride a skateboard?”
It should be clear the deep neural networks can play a large role in the above scenario. One of them will classify the image. And another will parse the English text and find the parts of speech and dependencies.
But one thing we absolutely do not want to use a deep learning network for is for the part where our AI learns that the dog in the image is a specific dog named “Spot” who was observed on a skateboard at some specific point in time at some specific place. Seriously we do not want to spend hours or days training a network with different permutations of the photograph and the word “Spot”. We expect to tell it once and have it remember. So this kind of information is stored someplace other then inside a huge network of floating point numbers. I’ll call this place a “datastore”(*)
What should this datastore look like?
Remember, one of my working theories is that an AI must exist inside a physical body or at least a body that is simulated. Because a body is a physical object must always exits at only one point in time and place. So if it sees a dog on a skateboard it saw it at one time and at one place.
A datastore “remembers” the context including time and place of anything that goes into it. Also by the above scenario we can see that it can store the result of an image classification system and of a natural language system. Implied by example #3 above, the datastore must be able to store and apply rules. In order to answer the question the AI must have rules like (a) if one object is on another object the first object is supporting the second. And (2) an object supported by a vehicle is riding in/on the vehicle. (3) a skateboard is a kind of vehicle.
Finally “The Answer” or at least the my current best answer: “Datalog”.
Prolog is a Turing complete general purpose logic programming language. Datalog is a subset of Prolog that is only declarative. As as example in prolog one can write “whacky(f(X), Y)” but in Datalog only (whacky(X,Y)” is allowed. Datalog because it is declarative allows some optimizations not possible with Prolog.
More on how I will use Prolog/Datalog later.
(*) I like the word “database” better than “datastore” but the word database has come to mean a specific kind of information storage system that is accessed by SQL or more recently by no-sql, While the underlying storage might very well by a SQL DBMS, I don’t want to imply a traditional DBMS.