
May 05, 2005

Highlights of RC3

Release Candidate 3 is out, and this one is really a big deal. As long as no more big bugs are uncovered in the near future, this is essentially what's going to be in the final release of 4.5. If possible, I'd still like to address the outstanding problems with the ICQ and AIM listeners, and I'm looking into installer packages. But even at rc3, 4.5 is a big step forward for Program D, and I wanted to highlight a couple of things that I'm glad I've been able to include.

First, the AIML testing framework. Albertas Mickensas from MegaLogika developed this using the 4.1.5 codebase. We got in touch, and I adapted it to the 4.5 architecture. The process of adaptation itself brought improvements to Program D, as it finally provided a concrete impetus for me to implement a real plugin architecture. The testing framework is the first use of this architecture. It is implemented as a plugin available from the Program D shell with the command /test.

The AIML community has been waiting for a tool like this for many years. The closest approximation we had, prior to Albertas's framework, was typified by the testcases.aiml file and its accompanying text script, developed by Tom Ringate some years back. To test Program D (or any interpreter that could read from stdin), you would “feed” the script to the engine and then visually inspect the results. Any bugs discovered this way indicated a functional problem in the implementation of the AIML spec; a successful run through all tests was supposed to indicate proper implementation. When Tom first put this out, it was a big improvement over the total absence of a test suite, and other people used the same approach for testing their own AIML sets. But because every result had to be checked by eye, this form of testing could never really scale to big knowledge bases.

Albertas's framework changes all that. Now, you can define a test case using a formal (but simple!) XML structure. Here's an example:

<TestCase>
  <Input>what's your name?</Input>
  <ExpectedAnswer>My name is Joebot.</ExpectedAnswer>
</TestCase>

This is the simplest test case: an input to send the engine, and an expected answer. Only an exact match constitutes a pass. It's also possible to specify keywords that should, or should not, appear in a response, and an expected length. I'll probably add an element letting you give a regular expression that should match the response.
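To make the pass/fail rules concrete, here is a minimal Python sketch of how checks like these might be evaluated. It is not the framework's code (Program D is written in Java), and the names `check_response`, `must_contain`, `must_not_contain`, and `expected_length` are hypothetical, chosen only for illustration. It also assumes that "expected length" means a character count.

```python
from typing import Optional, Sequence


def check_response(
    response: str,
    expected_answer: Optional[str] = None,
    must_contain: Sequence[str] = (),
    must_not_contain: Sequence[str] = (),
    expected_length: Optional[int] = None,
) -> bool:
    """Return True only if the response satisfies every check that was given.

    All parameter names are hypothetical; they are not the framework's own.
    """
    # Exact match: any difference at all (case, punctuation) is a failure.
    if expected_answer is not None and response != expected_answer:
        return False
    # Keywords that should appear in the response.
    if any(keyword not in response for keyword in must_contain):
        return False
    # Keywords that should not appear in the response.
    if any(keyword in response for keyword in must_not_contain):
        return False
    # Expected length, assumed here to be a character count.
    if expected_length is not None and len(response) != expected_length:
        return False
    return True


# The test case shown above passes only on an exact match.
print(check_response("My name is Joebot.", expected_answer="My name is Joebot."))
print(check_response("my name is joebot", expected_answer="My name is Joebot."))
```

A regular-expression check, if one were added, would slot in as one more optional condition of the same kind.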

I have adapted the old test cases mentioned above to produce the first-ever automated AIML Compliance Test Suite. This is now included in Program D as AIML.aiml and AIML.xml. The release is configured by default (in bots.xml) to load the AIML.aiml file, so you can test that Program D actually does what the AIML spec says by typing "/test" at a D shell prompt. Pretty cool. This helped me find a couple of significant bugs, which have been addressed in rc3.

The other big item I wanted to highlight is the Flash interface. A few years ago, Chris Fahey provided the world with the first Flash interface to an AIML interpreter. At the time, it was a novelty—now, with services like Pandorabots and Oddcast, we see more and more Flash bot interfaces. But Chris's interface hadn't been updated in a while, and when I got around to testing it with Program D, I found that it didn't work anymore. This was probably due to changes I had made to the program, primarily an insistence on XML compliance that caused me to include Chris's original template inside a sort of "dummy" tag so that the Flash chat template became a valid XML file.

In any case, Daniel Ireland, an interactive designer from Australia, zeroed in on the Flash interface for the first release candidate of 4.5, and found it lacking. A series of exchanges with him finally resulted in a fixed version of the Flash toolkit, which is now included in Program D for the first time. Many thanks to Daniel for fixing what was broken and bringing this important client up to date!

In rc3, I have added parameters to the flash-responder.xml file to allow you to convert HTML line breaking elements (br and p) to actual line breaks, and to strip out other markup, since the Flash client itself does not do anything with these items. These parameters are set to true as shipped. If you wish to modify the Flash client (the source is also provided, of course), then you can turn off these parameters and make use of HTML tags, or whatever else you choose to pass in the response.
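For a rough idea of what those two parameters do to a response, here is a small Python sketch. It is only an illustration under assumptions: the function name `flatten_markup` and the flags `convert_breaks` and `strip_markup` are hypothetical and are not the parameter names used in flash-responder.xml, and Program D's own implementation (in Java) may handle edge cases differently.

```python
import re


def flatten_markup(text: str, convert_breaks: bool = True, strip_markup: bool = True) -> str:
    """Turn br/p elements into line breaks and drop any remaining tags.

    Both flag names are hypothetical stand-ins for the responder's parameters.
    """
    if convert_breaks:
        # <br>, <br/> and <br /> each become a newline.
        text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
        # An opening <p ...> is dropped; the closing </p> ends the line.
        text = re.sub(r"<p(\s[^>]*)?>", "", text, flags=re.IGNORECASE)
        text = re.sub(r"</p\s*>", "\n", text, flags=re.IGNORECASE)
    if strip_markup:
        # Anything else that looks like a tag is removed, keeping its text content.
        text = re.sub(r"<[^>]+>", "", text)
    return text


print(repr(flatten_markup("Hello<br/>there")))
print(repr(flatten_markup("<p>Hi <b>Joe</b></p>")))
```

With both flags off, the text passes through untouched, which corresponds to the case where a modified Flash client wants to receive the HTML tags itself.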

Beyond being happy that Program D now ships with a working Flash client, however, I'm also very glad about how this experience validates, at least in my mind, the XML template processor that was created for 4.5. This is currently used by the HTMLResponder and the FlashResponder, but it should be very easily usable by any responder someone might care to build, such as (for instance) an interface to an external SMS system for mobile phones, in which one would like to send a simple XML message with the bot's response. Perhaps there's still something missing here for the sending part—one would also want to be able to receive an XML message—but with all the infrastructure reworking that's been done with this version, this should be far easier than it was before.
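As a sketch of the kind of "simple XML message" such a responder might exchange, the Python below wraps a bot response in a small XML envelope and reads one back. Everything here is an assumption for illustration: the element names `message`, `to`, and `body`, and the functions `build_message` and `parse_message`, are invented, and this is not how Program D's template processor is implemented.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def build_message(recipient: str, response: str) -> str:
    """Wrap a bot response in a small XML envelope (hypothetical element names)."""
    message = ET.Element("message")
    ET.SubElement(message, "to").text = recipient
    ET.SubElement(message, "body").text = response
    # ElementTree escapes &, < and > in text, so the result is well-formed XML.
    return ET.tostring(message, encoding="unicode")


def parse_message(xml_text: str) -> Tuple[str, str]:
    """Read the recipient and body back out of a message in the same format."""
    message = ET.fromstring(xml_text)
    return message.findtext("to", default=""), message.findtext("body", default="")


outgoing = build_message("+15550100", "My name is Joebot.")
print(outgoing)
print(parse_message(outgoing))
```

The receiving half is the part the post describes as possibly still missing; `parse_message` only shows that the same envelope can be read as easily as it is written.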

Lastly, I'd like to mention that the major revision of the Program D architecture to a "component-friendly" form seems to be getting favorable press. I've been in touch with a few people who, over the years, have indicated an interest in integrating Program D with other kinds of systems. It appears that the re-architecture work has indeed answered a lot of their qualms about the original structure: now, I'm gathering, it should be far simpler to deploy Program D as a JavaBean, an EJB, or whatever. I know little about these technologies, so I'm looking to others for guidance, but it appears this is in the pipeline. That's exciting to me. The only person I haven't heard back from is the creator of "ChatterBean", an effort which I found very inspiring. I still hope he'll write me back and we can do something "synergistic".

Anyway, I'll be looking forward to your comments on Program D, by way of the mailing list, this blog, private email, or whatever suits you. Thanks to all those who've participated so far!

Noel

Posted by Noel at May 5, 2005 10:58 PM
