Kirsle.net

Tagged as: RiveScript

Relicensing RiveScript
July 8, 2010 by Noah
It occurred to me today, after having read a lot about other software projects and their licensing schemes, that I should change the license used for my RiveScript library to something less restrictive.

The Android OS project, for example, opted to release the core Android components under the Apache license, rather than the GNU General Public License (GPL) which is usually popular with open source GNU/Linux software.

Using the Apache license was a strategic move for Google, because the Apache license is less restrictive than the GPL; with the GPL, any software you write and distribute that incorporates GPL code must also be released, in its entirety, under the GPL. Just by using GPL code in your application, your whole application must be made open source under the same terms as the GPL'd code you borrowed for it.

To put this in perspective, let's imagine a hypothetical story in which Adobe decides to add a new image format into Photoshop, and they use some code for this image format which is released under the GPL license. Because Photoshop is using GPL'd code--even though they're using only a tiny bit in comparison to the rest of Photoshop's code--Adobe would be forced to release Photoshop in its entirety under the GPL license; they would have to provide all of its source code to its users, and you could therefore just download it and compile your own Photoshop from source code.

Proprietary software companies like Adobe of course wouldn't like this, so they don't generally like to use GPL-licensed code in their products; the GPL is considered viral because it "infects" the entire application and forces it all to become open source just because the borrowed code is. The GPL is a "copyleft" license.

Android's code is released under the Apache license instead because, that way, individual vendors can make their own modifications and additions to the base Android code, and add proprietary features to keep an edge over their competition. If their contributions to the Android core were forced to be made open source, as the GPL would require, they would be less inclined to spend time and money developing their new features in the first place, because their competitors could easily just take their code for themselves in their own Android devices.

So what does this all mean for RiveScript?

RiveScript has always been released under the GNU General Public License, which would be great if RiveScript was, itself, a complete software application. It isn't. RiveScript is just a library, a Perl module, and it doesn't do anything until you write an actual Perl application that loads the module.

So if somebody wanted to create and sell a closed source "desktop assistant" type application, with a pretty GUI and an animated face, that could speak to you out loud and understand you when you speak to it, and they simply wanted to use my RiveScript module for the artificial intelligence part... they wouldn't be able to release it as a closed source product. The GPL license governing the RiveScript library would force their entire application to be made open source.

This obviously isn't ideal; and so, usually, libraries are released under the GNU Lesser General Public License, which is an ideal license to be used for software libraries.

Under the LGPL, an application would be allowed to use my RiveScript library and still be, overall, a closed source application. The only restriction would be if they wanted to modify RiveScript itself, to add new features to it or extend its functionality in any way; if they touch my RiveScript module itself, their changes would have to be released to the public as open source. But, it wouldn't touch the rest of their product at all; they could still keep their overall product closed source and sell it if they wanted to.

So, I'm thinking I should relicense RiveScript and maybe see if that will help drive more developers to use it in their projects. The LGPL isn't particularly ideal for a Perl module, though, since the license deals a lot with terminology like "linking", which mostly applies to compiled languages like C/C++. But maybe the old tried-and-true Artistic License that Perl itself is released under would do just as well.

I'll work out the details later but expect RiveScript 1.22 to be licensed more openly; or, at the very least, dual-licensed to be used in both open and closed source projects.

Tags: 2 comments | Permalink
Alicebot Program V
August 20, 2009 by Noah
Update (12/20/2011): It seems this blog post has been linked by AliceBot.org as a place to find the Program V software. If you're looking to download Program V, I have it hosted at RiveScript.com: programv-0.08.tar.

Update #2 (12/02/2014): I've updated ProgramV 0.08 and released a new version, 0.09, which should work out-of-the-box on modern Perl. If you're interested in ProgramV, check it out on GitHub or read my blog post about the update.

The original blog post follows.


Numerous years ago (2002 or 03) while I was a newb at programming my own chatterbots in Perl (and a newb at Perl in general), there was this program called Alicebot Program V - an implementation of an Alice AIML chatbot programmed in Perl.

When I moved from RunABot (hosted AIML bots) and Alicebot Program D (a Java AIML bot) to Perl, I had to give up using AIML for my bots' response engines because there weren't any simple Perl solutions for parsing AIML code. There was only Program V, and Program V is a monster! I could never figure out how to get it to actually run, and, while it had a dozen Perl modules with it that deal with the AIML files, these modules are too integrated together to separate and use in another program.

And then Program V's site went down and for several years I couldn't find a copy of Program V anymore to give it another shot.

Since I effectively could not use AIML for my Perl bots, I eventually developed an alternative bot response language called RiveScript, and it's text-based instead of XML-based, so it's super easy to deal with. At this point I don't care much for AIML any more, because my RiveScript is more powerful than it anyway.

The only thing RiveScript is missing though is Alice - the flagship bot personality of AIML. Alice has about 40,000 patterns that it can respond to. Users chatting with Alice won't get bored with the conversation for a long, long time. Alice has a large enough reply set that you'll have to chat with it a lot before you can start predicting how she might reply to your next message.

RiveScript, being (relatively) new (and not yet as popular as AIML), hasn't seen any large projects like Alice. Alice was written by Dr. Wallace, who created AIML; should I, as the creator of RiveScript, create an Alice-sized reply base myself? Ha. I wish I had that kind of free time on my hands.

So for a really long period of time I was trying to create an AIML-to-RiveScript translator, so that Alice's 40,000 responses of AIML code could become 40,000 responses of RiveScript code. I've finally accomplished this with a really good degree of success recently. So mission accomplished.

Now, while searching for something unrelated, I managed to come across a site that hosted a copy of the Program V code. Now that I'm much more awesome at Perl than I was back then, I downloaded it and tried getting it set up. It took a bit of tinkering (it was programmed on Perl 5.6 and some things have changed between then and 5.10) but I got it up and running.

Program V works in two parts: first you run a "build script," which reads and processes the AIML code and builds a kind of database file (really it's just the result of Data::Dumper, dumping out a large Perl data structure). And then you run the actual bot script, which just loads this data structure from disk. This is because the Alice AIML set (40K replies) takes about 3 to 4 minutes to load, but the Perl data structure takes milliseconds to load. So you build it first to save lots of time when actually running it.
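
The build-then-load idea can be sketched in a few lines of Perl. This is only an illustration, not Program V's actual code: the tiny `$brain` structure, the `templates` key, the temp file, and the choice to load the dump back with `do` are all assumptions made for the example.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the parsed AIML set (the slow part in real life).
my $brain = {
    aiml => {
        matches   => { 'ITS' => [ 'FUN <that> * <topic> * <pos> 0' ] },
        templates => [ 'I am glad you are having fun.' ],
    },
};

# "Build" step: write the structure to disk as plain Perl source.
my ($fh, $cache_file) = tempfile(UNLINK => 1);
print {$fh} Data::Dumper->Dump([$brain], ['data']);
close $fh or die "Can't close $cache_file: $!";

# "Run" step: evaluate the dumped file. The file ends with an assignment
# to $data, so `do` returns the rebuilt hash reference.
my $loaded = do $cache_file;
die "Couldn't load $cache_file: " . ($@ || $!) unless ref $loaded eq 'HASH';

print $loaded->{aiml}{templates}[0], "\n";
```

The bot script never touches the AIML files at all; it only pays the cost of evaluating one Perl file, which is why the split saves so much startup time.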

I was impressed at how fast the bot could reply, too. Most of its replies were coming back in 8 milliseconds or less. In contrast, when I load Alice's brain in RiveScript... it takes 5 seconds to load all the RiveScript code from disk (much faster than Program V's loading of AIML), and then Alice will usually reply in less than 1 second, but longer than 8 milliseconds. So, I had a look at this data structure that Program V creates.

In the data structure, all the patterns in AIML are put into a hash, and categorized by the first word in the pattern. Here's just a snippet:

The following patterns are represented here:
ITS *
ITS BORING
ITS FUN
ITS GOOD *

$data = {
   aiml => {
      matches => {
         'ITS' => [
            '* <that> * <topic> * <pos> 17818',
            'BORING <that> * <topic> * <pos> 17819',
            'FUN <that> * <topic> * <pos> 17820',
            'GOOD * <that> * <topic> * <pos> 17821',
         ],
      },
   },
};

So, since all these patterns begin with the word "ITS", they're all categorized under the "ITS" key. Each item in the array begins with the rest of the pattern (after the word ITS), and then there are separators for the "that", "topic", and "pos" (position). In all the examples here, the patterns had no 'that' or 'topic' tags associated with them, so there are only *'s here. The position is an array index.

The templates (responses to these patterns) are all thrown together into a single large array. All those positions listed in the "matches" structure? Those are array indices.

You might need to know a little Perl to see the performance boost here. A good number of patterns in Alice's brain begin with the same word. So when it's time to match a reply to the human's message, the program can use the first word in your message as a hash key (say you said "It's good to be the king": it would look up the array above based on the word "ITS"). With Alice's brain, there'd be only a few hundred unique first words to patterns. So this is a relatively small hash, and looking up one of these keys such as "ITS" is really fast. Then, each of the arrays here is relatively small, and the program just loops through them to find out if any of them match your message (taking into account the "that"s and "topic"s too).

When it finds a match, it has an array index of the template for that match. Pulling an array item by index in Perl is even more wicked fast than a hash. So almost instantaneously you can get a response back.
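
Here is a rough Perl sketch of that lookup, under some loud assumptions: the miniature `%matches` and `@templates` data, the small array indices, the helper names `tail_to_regex` and `find_reply`, and the rule of preferring the candidate with the most literal words are all made up for illustration. It also ignores the "that" and "topic" fields entirely. It is not Program V's real matching code.

```perl
use strict;
use warnings;

# Hypothetical miniature of the structure shown above (small indices for brevity).
my %matches = (
    'ITS' => [
        '* <that> * <topic> * <pos> 0',
        'BORING <that> * <topic> * <pos> 1',
        'FUN <that> * <topic> * <pos> 2',
        'GOOD * <that> * <topic> * <pos> 3',
    ],
);
my @templates = (
    'reply for ITS *',
    'reply for ITS BORING',
    'reply for ITS FUN',
    'reply for ITS GOOD *',
);

# Turn the rest of a pattern into an anchored regex; '*' stands for one or more words.
sub tail_to_regex {
    my ($tail) = @_;
    my $re = join ' ', map { ($_ eq '*') ? '.+' : quotemeta($_) } split ' ', $tail;
    return qr/^$re$/;
}

sub find_reply {
    my ($message) = @_;

    # Normalize the message: uppercase, drop punctuation.
    (my $norm = uc $message) =~ s/[^A-Z0-9 ]//g;
    my ($first, $rest) = split ' ', $norm, 2;
    return unless defined $first;
    $rest = '' unless defined $rest;

    # One hash lookup narrows thousands of patterns down to a short array.
    my $candidates = $matches{$first} or return;

    my $best;    # [ number of literal words, template index ]
    for my $entry (@$candidates) {
        my ($tail, $that, $topic, $pos) =
            $entry =~ /^(.*?) <that> (.*?) <topic> (.*?) <pos> (\d+)$/ or next;
        next unless $rest =~ tail_to_regex($tail);

        # Assumption: prefer the match with the most literal (non-*) words.
        my $score = grep { $_ ne '*' } split ' ', $tail;
        $best = [ $score, $pos ] if !$best || $score > $best->[0];
    }

    # The winning entry carries an array index straight to its template.
    return $best ? $templates[ $best->[1] ] : undef;
}

print find_reply("It's good to be the king"), "\n";   # prints "reply for ITS GOOD *"
print find_reply("Its fun"), "\n";                    # prints "reply for ITS FUN"
```

The expensive part (looping and pattern matching) only ever runs over the handful of entries that share the message's first word, and fetching the template afterwards is a plain array access.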

Compared to the data structure used by my Perl module, RiveScript.pm, the one used by Program V is much more efficient. In RiveScript.pm everything is arranged in a hierarchy, sorted by: topic, pattern, reply. RiveScript uses arrays in the end to organize the patterns in the most efficient way possible, but when it comes to actually digging out data from this giant hashref structure, it's a little slower than just using an array like Program V.

Still, though, 1 second response times for a brain that contains 40,000 patterns isn't bad. But I might need to think about recoding RiveScript.pm to use more efficient data structures like Program V.

I have a copy of Program V hosted on RiveScript.com here: programv-0.08.tar.

Tags: 3 comments | Permalink
C++ CyanChat Library?
July 10, 2009 by Noah
I've finally motivated myself to sit down and put some time into learning C++. To that end, I found a good C++ tutorial at learncpp.com which is much better than the tutorial at cplusplus.com, which I had tried following in the past.

The cplusplus.com tutorial only covers the syntax of C++, but doesn't go into anything practical, such as working with more than one source file, or how header files work, or how to compile a program that has multiple files. The tutorial at learncpp.com though covers all of these things and then some -- it even explains how C++ programs are organized and a bunch of other helpful techniques that clear up a lot of the fuzziness that other tutorials leave ya with.

So after a week and change, I've gotten about halfway through the tutorials and am attempting to write my own programs from scratch -- actual programs, not tutorial-like things. So, I've decided to start piecing together a CyanChat library. Why CyanChat? Because of its simplicity:

  • CyanChat is a stupid-simple text-based chat protocol.
  • All it requires programmatically is basic access to sockets, and sending/receiving lines of text over the network.

A CyanChat client was the first network program I wrote in Perl, the first library I wrote in Java, and it's going to be the first library I write in C++ (my Java library is available only here for now: /projects/Java/).

Programming this CyanChat library so far is a bit more tricky than it was in Perl and Java. There is a full C++ CyanChat client available named Magenta, but looking at its source code doesn't help me very much -- this is a Windows application, and the source file that handles the sockets is an override of the Win32 CSocket library, which is Windows-specific. I want to use something cross-platform.

Right now I'm using the rudesocket C++ library, although I may need to ditch it for something different, because its setTimeout() function doesn't seem to work and so reading from the socket hangs until the server sends data. This isn't scalable.

When I get the library completed, I'll try building it on Windows to get that experience (that'll be a blast...), and then I'll attempt to build a dynamic Windows DLL file from it -- and then link that DLL into Perl using Win32::API -- to see that everything is successful.

My eventual goal in C++ programming is to build a RiveScript interpreter library in it, and build it as a dynamic DLL, so that practically every programming language will then be able to link it and use it (or C/C++ programs can statically link it from its source code if they want). I could even create a Perl module named RiveScript::XS, which compiles statically with the C++ RiveScript interpreter, which might give it additional speed over the pure-Perl RiveScript module -- or again, just to see that it all works how I want it to.

Tags: 0 comments | Permalink
AiRS - Artificial Intelligence: RiveScript
April 30, 2009 by Noah
After having not run a chatterbot full-time for the past three years or so, I decided to create one again. The initial reason for having a bot running 24/7 is that, if I was away from home and my home IP address changed (which it does every now and then) -- therefore breaking the DNS on the hostname I gave my PC -- there could be a bot on AIM running from my PC that I could ask to see what my new IP address is.

And today is one such day that I tried to SSH home and got a "Connection refused", because my IP had changed. My bot told me what my new address was, and all was good.

Which brings me to my next point: the program I wrote for my bots I named AiRS (Artificial Intelligence: RiveScript). If any of my readers know me from when I used to run AiChaos.com, this bot is sort of like my Juggernaut and Leviathan programs. The bot can run multiple connections (mirrors) to multiple listeners (so far, AIM and HTTP, but I'll be adding MSN Messenger support shortly). Unlike Juggernaut and Leviathan, though, the program uses RiveScript and RiveScript only as the reply engine.

By the time I release the program it will definitely have support for AIM, MSN, IRC, and HTTP. Other listeners? Maybe, maybe not. Jabber is a possibility. It will depend on what existing modules are available, how usable they are, how well they work, etc.

At any rate, you can chat with my AIM bot by sending an IM to AiRS Aiden. So far it just chats, but it can also tell you how the weather is, play mad libs, and do a couple less cool things.

Tags: 0 comments | Permalink
Java!
December 17, 2008 by Noah
Recently I got the bright idea to just sit down and put some time into learning how to program in Java. Why? The language I ultimately wanna figure out is C++, but every time I try, it gives me a new reason to hate it (for instance, I wrote some code in Dev-C++ 4 which compiled and executed, but would not compile in Dev-C++ 5 anymore). Also, Java is under the GNU General Public License now, it runs the same on multiple platforms without much thought (like Perl), and it's less picky than C++ (and is therefore a good stepping stone along my journey to finally tackling C++).

I started with Sun's tutorials, beginning with basic command-line apps to get the syntax down, then moving into the GUI tutorials with Swing. I'm not putting high priority on learning Java applet programming right at the moment, because nobody likes Java applets anymore.

And so now I'm at the point where I'm attempting to program my own things from scratch. A logical place to start was to create a Java class for the CyanChat protocol. The goal of it is to match the functionality of Net::CyanChat, and then one day I might even program a "Java CyanChat Client", to complement my current Perl CyanChat Client (and by Java CC Client, I don't mean an applet; the standard CyanChat client is an applet -- I mean a GUI application).

My CyanChat package is named org.kirsle.network.CyanChat for right now. Eventually I intend to program a RiveScript interpreter in Java, to open the door up to Java developers to get into the world of RiveScript (and because the only RiveScript interpreter currently in existence is written in Perl). Then, one of my goals in C++ is to compile a "RiveScript.dll" file, which can be dynamically linked with C/C++ programs or any other language that can dynamically link a DLL. :)

Since I'm serious about Java development, I made a nice lil avatar for Java-related blog posts, and spent more than 5 minutes creating it too.

Tags: 0 comments | Permalink
Project RiveScript
June 19, 2008 by Noah
I've added a page about my RiveScript project. It's something I started work on a good 4 or 5 years ago, then abandoned, then started work on again. I thought it needed a page.

So here's its page.

Tags: 0 comments | Permalink