A REPL for Conversations

This was originally posted on Vernacular.ai's blog.

A REPL, in programming, is an interactive environment where a programmer can go through the cycle of writing code, getting it Read, Evaluated, output Printed and then back in a Loop. Fundamentally, a REPL merges usage and extension of a system in a single interface. This merger has been important in providing easy extensibility to the end programmers in many programming languages.

1. Interaction is supervision

We can envision a similar merger in human intelligence involving languages. After getting to a certain level of language understanding, our interactive behavior can be manipulated itself by interactions. Interactions cover all possible dialogs. Manipulative interactions or supervision, as we like to call it, cover whatever is involved in updating the state of the internal model which, in turn, affects the behavior.

Most of the machine dialog platforms aren't designed to expose this equivalence explicitly. We have a different language for providing supervision and different for building the interaction experiences and that's it. Consider the chatbots you see on various websites these days. These have an interaction language which is a natural language, and a separate supervision language which stays at the programmer's beck in the background. I can't improve my personal interaction with the system or extend it's behavior by talking to it. I can do that with a human agent easily.

We definitely are moving towards such systems by adding control knobs piece by piece. But still aren't looking very actively for a general framework of user initiated supervision. Practically this is not going to be immediately helpful for many reasons known to all of us working in the field, but it's helpful to define various kinds of platforms for machine dialog involving users on a continuum who all program their desired changes using their own interactive languages.

From the technical side, of course there are problems that need to be solved for this. Since natural languages are already there, as compared to a language to be designed, this needs a lot of work in understanding and creating computational models of language acquisition from one end and works in zero and few shot learnings from the other.

Sublanguages are still going to be the more efficient way for working at a certain level, just like mathematical notations are for talking about mathematics, but we can target a natural language REPL layer on top of current dialog platforms and aim for a sort of interactive malleability for growing the system organically afterwards.

2. Growing a conversational agent

This malleability is not only a learning but also design problem at the moment¹. A problem similar to designing a programming language. What we want the language user to accomplish is our focus right now. How we can let the language itself grow is a problem we will be designing systems for in the future. Once this interaction language is capable enough, we can have an ecosystem of extensions and libraries making a minimal system malleable enough to allow building drastically different experiences on top of it.

And it's important that the system accepts growth at various levels if it has to generalize across many use cases. In his talk 'Growing a Language' (steele1999growing), Guy L. Steele cites lack of this growth as the difference between APL and Lisp. In APL "there was no way for a user to grow the language in a smooth way". The smoothness here focuses on extensibility in terms of language primitives. From the same talk:

If I want to help other persons to write all sorts of programs, should I design a small programming language or a large one?

I stand on this claim: I should not design a small language, and I should not design a large one. I need to design a language that can grow. I need to plan ways in which it might grow—but I need, too, to leave some choices so that other persons can make those choices at a later time.

Conversational systems are interesting from many more angles than just what their components look like at the moment, which includes Speech To Text, Language Understanding, Dialog Management, Text To Speech, etc. Improving metrics under the current framework is a decent and important goal, but there are many interesting challenges in the design aspects of how we build and let others build conversational experiences themselves.

./references.bib

Footnotes:

Many researches in self learning systems are orthogonal here as you can have low generalization and still make a natural language REPL.