Category: RiveScript

RiveScript T-Shirt

Noah Petherbridge
kirsle
Posted by Noah Petherbridge on Friday, October 29 2010 @ 05:34:36 PM
I created a RiveScript-branded t-shirt using Cafe Press and it arrived today!

[Photos: front of the shirt, back of the shirt, and a zoom-in on the back.]

If you want one it's $20 at cafepress.com/rivescript.

Java RiveScript

Posted by Noah Petherbridge on Thursday, October 21 2010 @ 10:57:25 PM
A couple weeks ago I started over on my Java RiveScript library and now it's feature-complete enough that I'm releasing a beta version. I've created a Google Code project for it at http://code.google.com/p/rivescript-java/.

It supports all the directives and tags that the Perl version does, including the trickier things like the %previous tag and topic inheritance/includes. The only notable thing it doesn't support yet is object macros, because Java isn't a dynamic language and can't dynamically execute more Java code. So, eventually I'll be adding JavaScript support to it for object macros, and maybe even find a clever way to get Perl macros to work with it too.

With this I eventually plan on creating some RiveScript-enhanced Android apps, like a "personal assistant" chatbot that can talk out loud and listen to you using Android's text/speech converter libraries.

RiveScript Licensing, Cont.

Posted by Noah Petherbridge on Thursday, September 23 2010 @ 08:30:52 PM
One of my friends started taking some classes to learn C++ recently and would ask me for some homework help now and then. I know much of the syntax of C++ pretty well due to the numerous times I've attempted to learn it by following tutorials, but this all inspired me to try diving back into it again.

On my many attempts to learn C++ I never actually made anything useful that wasn't a tutorial-like program (with the exception being a single-threaded CyanChat library I was working on but never completed). So this time I decided to just skip the tutorial b/s and dive right in to making a RiveScript C++ library, and Google things as I run into them.

I knew about C++ strings (who doesn't?), which are a big improvement over what C has, but I newly discovered vectors (dynamically resizable arrays) and maps (associative arrays). Between them, these three cover all the basic data types I'm used to in Perl.

I hit a roadblock though when it came to constructing the large data structure that the Perl RiveScript module makes and which I'm familiar with, but looking into C++'s struct solved that pretty quickly.

All this attention to RiveScript lately, though, made me remember that the Perl RiveScript module (the only feature-complete implementation of RiveScript to date) is released under the GNU General Public License, which would demand that any application that uses it also be released as open source.

I was thinking of re-licensing RiveScript under a more open license to see if it could drive up usage of it for non-GPL projects. I was considering something like the LGPL, Apache or BSD license. But, I think I have a better idea!

At work I was using ExtJS for a while and I saw how their licensing scheme works: they dual-license their code; the GPL licensed version is free to use, but being GPL code it demands that your entire application be made open source. They then have a commercial license that allows you to use their library in a non-open application, for a fee.

I think this may be the better way to go. For the open source folks who use RiveScript today nothing changes, but if somebody wants to create a commercial application with RiveScript they wouldn't be able to use the GPL-licensed version. After watching all the news about SmarterChild and its parent company (Colloquis, now owned by Microsoft) over the past decade, with their chatterbot patent and their commercial SDK, giving out free code that can be used in a commercial closed-source product doesn't seem like a very smart move. ;)

The chatterbot patent now owned by Microsoft is groundless anyway, because the Net::AIM module on CPAN, released in 1999, ships with an example script for an AIM chatterbot, which pre-dates ActiveBuddy's inception in 2000. ActiveBuddy's claim that they invented the AIM chatterbot falls flat in the face of prior art.

Relicensing RiveScript

Posted by Noah Petherbridge on Thursday, July 08 2010 @ 03:33:58 PM
It occurred to me today, after having read a lot about other software projects and their licensing schemes, that I should change the license used for my RiveScript library to something less restrictive.

The Android OS project, for example, opted to release the core Android components under the Apache license, rather than the GNU General Public License (GPL) which is usually popular with open source GNU/Linux software.

Using the Apache license was a strategic move for Google, because the Apache license is less restrictive than the GPL; with the GPL, any software you distribute that incorporates GPL code must also be released, in its entirety, under the GPL. Just by using GPL code in an application you ship, your whole application must be made open source under the same terms as the GPL'd code you borrowed for it.

To put this in perspective, let's imagine a hypothetical story in which Adobe decides to add a new image format to Photoshop, and they use some code for this image format which is released under the GPL. Because Photoshop is using GPL'd code--even though they're using only a tiny bit in comparison to the rest of Photoshop's code--Adobe would be forced to release Photoshop in its entirety under the GPL; they would have to provide all of its source code to its users, and you could therefore just download it and compile your own Photoshop from source code.

Proprietary software companies like Adobe of course wouldn't like this, so they don't generally like to use GPL-licensed code in their products; the GPL is considered viral because it "infects" the entire application, forcing all of it to become open source just because one part of it is. The GPL is a "copyleft" license.

Android's code is released under the Apache license instead because, that way, individual vendors can make their own modifications and additions to the base Android code, and add proprietary features to keep an edge over their competition. If their contributions to the Android core were forced to be made open source, as the GPL would require, they would be less inclined to spend time and money developing their new features in the first place, because their competitors could easily just take their code for themselves in their own Android devices.

So what does this all mean for RiveScript?

RiveScript has always been released under the GNU General Public License, which would be great if RiveScript was, itself, a complete software application. It isn't. RiveScript is just a library, a Perl module, and it doesn't do anything until you write an actual Perl application that loads the module.

So if somebody wanted to create and sell a closed source "desktop assistant" type application, with a pretty GUI and an animated face and it could speak to you out loud and understand you when you speak to it out loud, and it simply wanted to use my RiveScript module for the artificial intelligence part... they wouldn't be able to release it as a closed source product. The GPL license governing the RiveScript library would force their entire application to be made open source.

This obviously isn't ideal; and so, usually, libraries are released under the GNU Lesser General Public License, which is an ideal license to be used for software libraries.

Under the LGPL, an application would be allowed to use my RiveScript library and still be, overall, a closed source application. The only restriction would be if they wanted to modify RiveScript itself, to add new features to it or extend its functionality in any way; if they touch my RiveScript module itself, their changes would have to be released to the public as open source. But, it wouldn't touch the rest of their product at all; they could still keep their overall product closed source and sell it if they wanted to.

So, I'm thinking I should relicense RiveScript and maybe see if that will help drive more developers to use it in their projects. The LGPL isn't particularly ideal for a Perl module, though, since the license deals a lot with terminology like "linking" which mostly applies to compiled code like C/C++. But maybe the old tried-and-true Artistic License that Perl itself is released under would do just as well.

I'll work out the details later but expect RiveScript 1.22 to be licensed more openly; or, at the very least, dual-licensed to be used in both open and closed source projects.

Alicebot Program V

Posted by Noah Petherbridge on Thursday, August 20 2009 @ 01:45:26 PM
Update (12/20/2011): It seems this blog post has been linked by AliceBot.org as a place to find the Program V software. If you're looking to download Program V, I have it hosted at RiveScript.com: programv-0.08.tar.

Update #2 (12/02/2014): I've updated ProgramV 0.08 and released a new version, 0.09, which should work out-of-the-box on modern Perl. If you're interested in ProgramV, check it out on GitHub or read my blog post about the update.

The original blog post follows.


Numerous years ago (2002 or 03) while I was a newb at programming my own chatterbots in Perl (and a newb at Perl in general), there was this program called Alicebot Program V - an implementation of an Alice AIML chatbot programmed in Perl.

When I moved from RunABot (hosted AIML bots) and Alicebot Program D (a Java AIML bot) to Perl, I had to give up using AIML for my bots' response engines because there weren't any simple Perl solutions for parsing AIML code. There was only Program V, and Program V is a monster! I could never figure out how to get it to actually run, and, while it had a dozen Perl modules with it that deal with the AIML files, these modules are too integrated together to separate and use in another program.

And then Program V's site went down and for several years I couldn't find a copy of Program V anymore to give it another shot.

Since I effectively could not use AIML for my Perl bots, I eventually developed an alternative bot response language called RiveScript, and it's text-based instead of XML-based, so it's super easy to deal with. At this point I don't care much for AIML any more, because my RiveScript is more powerful than it anyway.

The only thing RiveScript is missing though is Alice - the flagship bot personality of AIML. Alice has about 40,000 patterns that it can respond to. Users chatting with Alice won't get bored with the conversation for a long, long time. Alice has a large enough reply set that you'll have to chat with it a lot before you can start predicting how she might reply to your next message.

RiveScript, being (relatively) new (and not yet as popular as AIML), hasn't seen any large projects like Alice. Alice was written by Dr. Wallace, who created AIML; should I, as the creator of RiveScript, create an Alice-sized reply base myself? Ha. I wish I had that kind of free time on my hands.

So for a really long period of time I was trying to create an AIML-to-RiveScript translator, so that Alice's 40,000 responses of AIML code could become 40,000 responses of RiveScript code. I've finally accomplished this with a really good degree of success recently. So mission accomplished.

Now, while searching for something unrelated, I managed to come across a site that hosted a copy of the Program V code. Now that I'm much more awesome at Perl than I was back then, I downloaded it and tried getting it set up. It took a bit of tinkering (it was programmed on Perl 5.6 and some things have changed between then and 5.10) but I got it up and running.

Program V works in two parts: first you run a "build script," which reads and processes the AIML code and builds a kind of database file (really it's just the result of Data::Dumper, dumping out a large Perl data structure). And then you run the actual bot script, which just loads this data structure from disk. This is because the Alice AIML set (40K replies) takes about 3 to 4 minutes to load, but the Perl data structure takes milliseconds to load. So you build it first to save lots of time when actually running it.
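The build-then-load idea can be sketched in a few lines of Perl. This is a minimal illustration, not Program V's actual code: the toy `$brain` hash, the `templates` key, and the temporary cache file are all assumptions made for the example. Only the use of Data::Dumper and the `$data = {` shape of the dump come from the description above.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# "Build" step: pretend this small hash is the result of the slow AIML parse.
# The 'templates' key and its contents are made up for illustration.
my $brain = {
    aiml => {
        matches => {
            'ITS' => [ 'BORING <that> * <topic> * <pos> 0' ],
        },
        templates => [ 'Then let us talk about something else.' ],
    },
};

# Dump the structure as Perl source of the form "$data = { ... };".
my ($fh, $cache_file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} Data::Dumper->Dump([$brain], ['data']);
close $fh or die "Cannot close $cache_file: $!";

# "Run" step: evaluate the cached file instead of re-parsing the AIML.
# do FILE returns the value of the last statement, here the hashref.
my $loaded = do $cache_file;
die "Could not load $cache_file: " . ($@ || $!) unless ref $loaded eq 'HASH';

print $loaded->{aiml}{templates}[0], "\n";
```

The trade-off is the usual one for a cache: the expensive parse happens once, and every later start-up only pays for Perl compiling one file of plain data.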

I was impressed at how fast the bot could reply, too. Most of its replies were coming back in 8 milliseconds or less. In contrast, when I load Alice's brain in RiveScript... it takes 5 seconds to load all the RiveScript code from disk (much faster than Program V's loading of AIML), and then Alice will usually reply in less than 1 second, but longer than 8 milliseconds. So, I had a look at this data structure that Program V creates.

In the data structure, all the patterns in AIML are put into a hash, and categorized by the first word in the pattern. Here's just a snippet:

The following patterns are represented here:
ITS *
ITS BORING
ITS FUN
ITS GOOD *

$data = {
   aiml => {
      matches => {
         'ITS' => [
            '* <that> * <topic> * <pos> 17818',
            'BORING <that> * <topic> * <pos> 17819',
            'FUN <that> * <topic> * <pos> 17820',
            'GOOD * <that> * <topic> * <pos> 17821',
         ],
      },
   },
};

So, since all these patterns began with the word "ITS", they're all categorized under the "ITS" key. Each item in the array begins with the rest of the pattern (after the word ITS), and then there are separators for the "that", "topic", and "pos" (position). In all the examples here, these patterns had no 'that' or 'topic' tags associated with them, so there's only *'s here. The position is an array index.
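To make the entry format concrete, here is a rough sketch of how one of those strings could be split back into its parts. The `parse_entry` function and the field names in the hash it returns are hypothetical; only the `<that>`, `<topic>`, and `<pos>` separators come from the dump above, and Program V itself may well do this differently.

```perl
use strict;
use warnings;

# Split one "matches" entry into its parts. $first_word is the hash key the
# entry was filed under; $entry is the string stored in the array.
sub parse_entry {
    my ($first_word, $entry) = @_;
    my ($rest, $that, $topic, $pos) =
        $entry =~ /^(.*?) <that> (.*?) <topic> (.*?) <pos> (\d+)$/
        or die "Unrecognized entry: $entry";
    return {
        pattern => "$first_word $rest",   # the full original pattern
        that    => $that,
        topic   => $topic,
        pos     => $pos,                  # index into the templates array
    };
}

my $parsed = parse_entry('ITS', 'GOOD * <that> * <topic> * <pos> 17821');
printf "%s => template #%d\n", $parsed->{pattern}, $parsed->{pos};
# prints "ITS GOOD * => template #17821"
```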

The templates (responses to these patterns) are all thrown together into a single large array. All those positions listed in the "matches" structure? Those are array indices.

You might need to know a little Perl to see the performance boost here. A good number of patterns in Alice's brain begin with the same word. So when it's time to match a reply from the human, the program can use the first word in your message as a hash key (say you said "It's good to be the king", it would look up the array above based on the word "ITS"). With Alice's brain, there'd be only a few hundred unique first words across all the patterns. So this is a relatively small hash, and looking up one of these keys such as "ITS" is really fast. Then, each of the arrays here is relatively small, and the program just loops through them to find out if any of them match your message (taking into account the `that`s and `topic`s too).

When it finds a match, it has an array index of the template for that match. Pulling an array item by index in Perl is even more wicked fast than a hash. So almost instantaneously you can get a response back.
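Put together, the lookup described above might look something like the sketch below. It is a simplified illustration under several assumptions: the `templates` key, the reply texts, the small position numbers, and the `reply` function are invented; the `that` and `topic` fields are ignored; the wildcard handling is a naive regex; and the candidate list is ordered with the catch-all `*` last, which may not reflect how Program V or AIML actually prioritizes patterns.

```perl
use strict;
use warnings;

# Toy brain in the shape described above (reply texts are made up).
my $data = {
    aiml => {
        matches => {
            'ITS' => [
                'BORING <that> * <topic> * <pos> 0',
                'GOOD * <that> * <topic> * <pos> 1',
                '* <that> * <topic> * <pos> 2',
            ],
        },
        templates => [
            'Let us talk about something else then.',
            'I am glad you think so.',
            'It is?',
        ],
    },
};

sub reply {
    my ($message) = @_;

    # Rough normalization: uppercase and strip punctuation.
    (my $text = uc $message) =~ s/[^A-Z0-9 ]//g;
    my ($first, @rest) = split ' ', $text;
    return undef unless defined $first;

    # Step 1: one hash lookup narrows all the patterns to a short list.
    my $candidates = $data->{aiml}{matches}{$first} or return undef;

    # Step 2: loop over that short list until a pattern matches.
    my $remainder = join ' ', @rest;
    for my $entry (@$candidates) {
        my ($pattern, $pos) = $entry =~ /^(.*?) <that> .*? <pos> (\d+)$/
            or next;
        # Turn each "*" wildcard into a regex for "one or more characters".
        my $regex = join '.+', map { quotemeta($_) } split /\*/, $pattern, -1;
        next unless $remainder =~ /^$regex$/;

        # Step 3: the position is a plain array index into the templates.
        return $data->{aiml}{templates}[$pos];
    }
    return undef;
}

print reply("It's good to be the king"), "\n";
# prints "I am glad you think so."
```

The expensive part, scanning patterns, only ever runs over the handful of entries that share the message's first word; everything else is a hash lookup and an array index.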

Compared to the data structure used by the Perl module, RiveScript.pm, the one used by Program V is much more efficient. In RiveScript.pm everything is arranged in a hierarchy, sorted by: topic, pattern, reply. RiveScript uses arrays in the end to organize the patterns in the most efficient way possible, but when it comes to actually digging out data from this giant hashref structure, it's a little slower than just using an array like Program V.

Still, though, 1 second response times for a brain that contains 40,000 patterns isn't bad. But I might need to think about recoding RiveScript.pm to use more efficient data structures like Program V.

I have a copy of Program V hosted on RiveScript.com here: programv-0.08.tar.
