Category: AIML

Dusting Off Alicebot Program V

Posted by Noah Petherbridge on Tuesday, December 02 2014 @ 12:16:34 PM

In an older blog post I mentioned getting Alicebot Program V, a Perl implementation of an Alice AIML bot, up and running on a modern version of Perl, but I didn't go into any details on what I actually did to get this to work (which wasn't very nice of me ;) ).

Unfortunately I didn't write down any notes for myself either, so I had to figure it all out again from scratch. This time I took notes, and I published a new, updated version of Program V which you can check out on GitHub -- or for the lazy, get a zipped release of Program V 0.09.

This blog post is about what I needed to do to get Program V up and running, some of which is also outlined in UpdateNotes.md on the GitHub repo.

So, first of all, Program V was programmed for Perl version 5.6, which is ancient. By the time I discovered Program V, the current version of Perl was 5.8, and already Program V would not run out-of-the-box on it. I was a newbie back then and had no experience to draw from to figure out how to get it to work. I wanted to use Perl to make chatbots, and Program V was the only Perl AIML implementation around, but I couldn't use it, so I eventually created my own chatbot language to replace AIML -- RiveScript.

Enough changed between Perl 5.6 and newer versions to break Program V, and that's the basic background on why it was so hard to get working.

So the first thing I needed was to install the Perl modules Unicode::String and Unicode::Map8 - I learned this the hard way by trying to run build.pl and getting missing module errors, but the README also mentioned I needed these things. And then I got an error message that probably looks familiar to a lot of people who've tried (and failed) with Program V:

[~/g/programv]$ perl build.pl
Constant subroutine AIML::Common::LOCK_SH redefined at lib/AIML/Common.pm line 52.
Prototype mismatch: sub AIML::Common::LOCK_SH () vs none at lib/AIML/Common.pm line 52.
Constant subroutine AIML::Common::LOCK_EX redefined at lib/AIML/Common.pm line 53.
Prototype mismatch: sub AIML::Common::LOCK_EX () vs none at lib/AIML/Common.pm line 53.
Constant subroutine AIML::Common::LOCK_NB redefined at lib/AIML/Common.pm line 54.
Prototype mismatch: sub AIML::Common::LOCK_NB () vs none at lib/AIML/Common.pm line 54.
Constant subroutine AIML::Common::LOCK_UN redefined at lib/AIML/Common.pm line 55.
Prototype mismatch: sub AIML::Common::LOCK_UN () vs none at lib/AIML/Common.pm line 55.
Died at lib/AIML/Unicode.pm line 102.
Compilation failed in require at lib/AIML/Common.pm line 109.
BEGIN failed--compilation aborted at lib/AIML/Common.pm line 109.
Compilation failed in require at build.pl line 32.
BEGIN failed--compilation aborted at build.pl line 32.

The most interesting line to me was when it said Died at lib/AIML/Unicode.pm line 102. without any error message (looks like a die; statement in Perl with no message), so I looked at this line of this file, and it was doing this:

my $changer = do ( $pkg ) or die;

So I debugged it (read: added a print "$pkg\n"; above that line) and found it was trying to import unicode/To/Upper.pl which was nowhere to be found. On my Linux system I ran a locate Upper.pl command, and found that a similarly named file existed at /usr/share/perl5/unicore/To/Upper.pl - the key difference is that "unicode" is spelled "unicore".
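A bare die like that hides the reason for the failure. As a minimal sketch (this is not code from Program V, and load_or_explain is a made-up helper name), the pattern documented in perldoc -f do reports whether the file couldn't be read, couldn't be compiled, or simply returned a false value:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Program V: load a Perl file with do()
# and die with a message that says what actually went wrong.
sub load_or_explain {
    my ($file) = @_;
    my $result = do $file;
    return $result if $result;

    die "couldn't parse $file: $@\n" if $@;                 # compile error
    die "couldn't do $file: $!\n" unless defined $result;   # e.g. file not found
    die "couldn't run $file: it returned a false value\n";
}

# Example: the path that no longer exists on modern Perl.
my $changer = eval { load_or_explain('unicode/To/Upper.pl') };
print "load failed: $@" unless $changer;
```

With a message like this, the missing file shows up in the error itself and the extra print statement isn't needed.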

There were 3 lines of code in AIML::Unicode that reference files like this and I changed all three to point to the "unicore" versions - fixed it! build.pl would now make some progress.

It then complained about a couple of the AIML files containing illegal characters in their <pattern> tags. No big deal, went in and fixed them. build.pl runs successfully now! And then I ran shell.pl and was dropped into an interactive chat session with an Alice bot.

As for those AIML::Common::LOCK_SH warnings: I have to assume that in Perl 5.6, file locking (flock) was either a new feature or not widely implemented, so the code ran a few Perl 5.6-style checks and, if flock didn't appear to be available, shimmed those constants itself. On modern Perl flock is available, but the checks still came back negative, so AIML::Common was redefining constants that already existed. I just fixed the checks and problem solved. ;)
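For reference, on any modern Perl the LOCK_* constants come from the core Fcntl module, so nothing needs to define them by hand. The following is a minimal sketch of that conventional approach, not the actual patch made to AIML::Common; the scratch file is only there for the demonstration:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);          # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw(tempfile);

# Scratch file just for the demonstration.
my ($fh, $path) = tempfile(UNLINK => 1);

# Take an exclusive lock without blocking, write, then release it.
flock($fh, LOCK_EX | LOCK_NB) or die "can't lock $path: $!";
print {$fh} "locked write\n";
flock($fh, LOCK_UN) or die "can't unlock $path: $!";
close $fh or die "can't close $path: $!";
```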

So once I got it working, I published the updated version of Program V on my GitHub account for others to check it out.

This doesn't mean Program V is actively maintained again. It was abandoned by its author in 2002 and hasn't been able to run on modern Perl ever since; I simply updated it so it works again. If that inspires you, feel free to fork it and maintain it on your own - it was released by its author under the same terms as Perl itself.

Alicebot Program V

Posted by Noah Petherbridge on Thursday, August 20 2009 @ 01:45:26 PM
Update (12/20/2011): It seems this blog post has been linked by AliceBot.org as a place to find the Program V software. If you're looking to download Program V, I have it hosted at RiveScript.com: programv-0.08.tar.

Update #2 (12/02/2014): I've updated ProgramV 0.08 and released a new version, 0.09, which should work out-of-the-box on modern Perl. If you're interested in ProgramV, check it out on GitHub or read my blog post about the update.

The original blog post follows.


Numerous years ago (2002 or 03) while I was a newb at programming my own chatterbots in Perl (and a newb at Perl in general), there was this program called Alicebot Program V - an implementation of an Alice AIML chatbot programmed in Perl.

When I moved from RunABot (hosted AIML bots) and Alicebot Program D (a Java AIML bot) to Perl, I had to give up using AIML for my bots' response engines because there weren't any simple Perl solutions for parsing AIML code. There was only Program V, and Program V is a monster! I could never figure out how to get it to actually run, and, while it had a dozen Perl modules with it that deal with the AIML files, these modules are too integrated together to separate and use in another program.

And then Program V's site went down and for several years I couldn't find a copy of Program V anymore to give it another shot.

Since I effectively could not use AIML for my Perl bots, I eventually developed an alternative bot response language called RiveScript, and it's text-based instead of XML-based, so it's super easy to deal with. At this point I don't care much for AIML any more, because my RiveScript is more powerful than it anyway.

The only thing RiveScript is missing though is Alice - the flagship bot personality of AIML. Alice has about 40,000 patterns that it can respond to. Users chatting with Alice won't get bored with the conversation for a long, long time. Alice has a large enough reply set that you'll have to chat with it a lot before you can start predicting how she might reply to your next message.

RiveScript, being (relatively) new (and not yet as popular as AIML), hasn't seen any large projects like Alice. Alice was written by Dr. Wallace, who created AIML; should I, as the creator of RiveScript, create an Alice-sized reply base myself? Ha. I wish I had that kind of free time on my hands.

So for a really long period of time I was trying to create an AIML-to-RiveScript translator, so that Alice's 40,000 responses of AIML code could become 40,000 responses of RiveScript code. I've finally accomplished this with a really good degree of success recently. So mission accomplished.

Now, while searching for something unrelated, I managed to come across a site that hosted a copy of the Program V code. Now that I'm much more awesome at Perl than I was back then, I downloaded it and tried getting it set up. It took a bit of tinkering (it was programmed on Perl 5.6 and some things have changed between then and 5.10) but I got it up and running.

Program V works in two parts: first you run a "build script," which reads and processes the AIML code and builds a kind of database file (really it's just the result of Data::Dumper, dumping out a large Perl data structure). And then you run the actual bot script, which just loads this data structure from disk. This is because the Alice AIML set (40K replies) takes about 3 to 4 minutes to load, but the Perl data structure takes milliseconds to load. So you build it first to save lots of time when actually running it.
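The build-once, load-fast idea can be sketched in a few lines. This is only an illustration under assumptions: the tiny $brain structure and the temp file are invented, and loading the dump back with do is one common way to read Data::Dumper output, not necessarily how Program V does it.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# "Build" step: pretend this structure took minutes to produce from AIML.
my $brain = {
    matches   => { ITS => ['BORING <that> * <topic> * <pos> 0'] },
    templates => ['What would make it less boring?'],
};

# Dump it to disk as plain Perl source ("$VAR1 = { ... };").
local $Data::Dumper::Indent = 1;
my ($out, $cache_file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$out} Dumper($brain);
close $out or die "can't write $cache_file: $!";

# "Run" step: evaluating the dumped file rebuilds the structure directly,
# with no AIML parsing involved.
my $loaded = do $cache_file
    or die "can't load $cache_file: " . ($@ || $!);
print $loaded->{templates}[0], "\n";
```

The expensive parsing happens once at build time; every later start only pays for Perl compiling one data file.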

I was impressed at how fast the bot could reply, too. Most of its replies were coming back in 8 milliseconds or less. In contrast, when I load Alice's brain in RiveScript... it takes 5 seconds to load all the RiveScript code from disk (much faster than Program V's loading of AIML), and then Alice will usually reply in less than 1 second, but longer than 8 milliseconds. So, I had a look at this data structure that Program V creates.

In the data structure, all the patterns in AIML are put into a hash, and categorized by the first word in the pattern. Here's just a snippet:

The following patterns are represented here:
ITS *
ITS BORING
ITS FUN
ITS GOOD *

$data = {
   aiml => {
      matches => {
         'ITS' => [
            '* <that> * <topic> * <pos> 17818',
            'BORING <that> * <topic> * <pos> 17819',
            'FUN <that> * <topic> * <pos> 17820',
            'GOOD * <that> * <topic> * <pos> 17821',
         ],
      },
   },
};

So, since all these patterns begin with the word "ITS", they're all filed under the "ITS" key. Each item in the array begins with the rest of the pattern (after the word ITS), followed by separators for the "that", "topic", and "pos" (position). None of the patterns in this example had 'that' or 'topic' tags associated with them, so there are only *'s in those slots. The position is an array index.

The templates (responses to these patterns) are all thrown together into a single large array. All those positions listed in the "matches" structure? Those are array indices.

You might need to know a little Perl to see the performance boost here. A good number of patterns in Alice's brain begin with the same word. So when it's time to match a reply to the human's message, the program can use the first word of the message as a hash key (if you said "It's good to be the king", it would look up the array above under the key "ITS"). With Alice's brain, there'd be only a few hundred unique first words across all the patterns, so this is a relatively small hash, and looking up a key such as "ITS" is really fast. Each of the arrays is relatively small too, and the program just loops through one of them to find out whether any entry matches your message (taking the "that" and "topic" into account too).

When it finds a match, it has an array index of the template for that match. Pulling an array item by index in Perl is even more wicked fast than a hash. So almost instantaneously you can get a response back.
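Here's a rough sketch of that lookup, using the same entry format as the snippet above. It is not Program V's actual matching code: the positions are renumbered 0 to 3, the reply texts are invented, the "that" and "topic" parts are ignored, and preferring the match with the most literal words is an assumption made to keep the example short.

```perl
use strict;
use warnings;

# Toy data in the shape described above. Positions index @templates.
my %matches = (
    'ITS' => [
        '* <that> * <topic> * <pos> 0',
        'BORING <that> * <topic> * <pos> 1',
        'FUN <that> * <topic> * <pos> 2',
        'GOOD * <that> * <topic> * <pos> 3',
    ],
);
my @templates = (
    'Tell me more.',
    'What would make it less boring?',
    'Glad you are enjoying it.',
    'I am happy to hear that.',
);

# Uppercase the message and drop punctuation, so "It's" becomes "ITS".
sub normalize {
    my ($text) = @_;
    $text = uc $text;
    $text =~ s/[^A-Z0-9 ]//g;
    return split ' ', $text;
}

sub find_reply {
    my ($message) = @_;
    my ($first, @rest) = normalize($message);
    return undef unless defined $first;
    my $candidates = $matches{$first} or return undef;   # one hash lookup
    my $input = join ' ', @rest;

    my ($best_pos, $best_score);
    for my $entry (@$candidates) {                        # short linear scan
        my ($pattern, $pos) =
            $entry =~ /^(.*?) <that> .*? <topic> .*? <pos> (\d+)$/
            or next;
        my @words = split ' ', $pattern;
        my $regex = join ' ', map { $_ eq '*' ? '.+' : quotemeta($_) } @words;
        next unless $input =~ /^$regex$/;
        my $score = grep { $_ ne '*' } @words;            # count literal words
        ($best_pos, $best_score) = ($pos, $score)
            if !defined $best_score || $score > $best_score;
    }
    # The template comes straight out of the array by index.
    return defined $best_pos ? $templates[$best_pos] : undef;
}

print find_reply("It's good to be the king"), "\n";
```

Only the patterns filed under the first word are ever examined, which is why the scan stays short even with tens of thousands of patterns in total.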

Compared to the data structure in the Perl module RiveScript.pm, the one used by Program V is much more efficient. In RiveScript.pm everything is arranged in a hierarchy, sorted by topic, then pattern, then reply. RiveScript uses arrays in the end to organize the patterns in the most efficient way possible, but when it comes to actually digging out data from this giant hashref structure, it's a little slower than just using an array like Program V.

Still, though, 1 second response times for a brain that contains 40,000 patterns isn't bad. But I might need to think about recoding RiveScript.pm to use more efficient data structures like Program V.

I have a copy of Program V hosted on RiveScript.com here: programv-0.08.tar.
