Welcome!

Welcome to Kirsle.net! This is the personal website of Noah Petherbridge, and it's where my various software projects and web blog live.

RiveScript Rewritten in CoffeeScript

Noah Petherbridge
kirsle
Posted by Noah Petherbridge on Wednesday, April 22 2015 @ 07:47:51 PM

I've spent the last few months porting over the JavaScript version of my chatbot scripting language, RiveScript, into CoffeeScript. I mainly did this because I didn't like to maintain raw JavaScript code (especially after I ran a linter on it and had to rearrange my variable declarations because JavaScript's variable scoping rules are stupid), but I also took the opportunity to restructure the code and make it more maintainable in the future.

For some historical background, the first RiveScript implementation was written in Perl, and I designed it to be one big monolithic file (RiveScript.pm), because then I could tell noobs that they can update to a new version of RiveScript in their bots by just dropping in a new file to replace the old one, and avoid overwhelming them with the complexity of doing a CPAN installation (especially if they're generally new to Perl as well, and all they really wanna do is just run a chatbot.)

Source File Restructuring

The Python and JavaScript ports are more-or-less direct ports of the Perl version: I literally read the Perl source from top to bottom and translated it into each respective language. So there was rivescript.js, a huge 2,900-line wall of JavaScript. One of my goals in the refactor was to spread the logic out into multiple files, so that if I have a bug to fix in the reply matching code, I don't have to scroll through pages and pages of loading/parsing code to get to the relevant part.

The new file structure is like this:

  • rivescript.js - The user-facing API, has all the same public functions as the old one
  • parser.js - A self-contained module that loads RiveScript code into an "abstract syntax tree" (a JSON serializable blob that represents ALL of the parsed RiveScript code in a program-friendly format).
  • sorting.js - The rat's nest that is the implementation behind sortReplies() is contained here.
  • inheritance.js - Functions related to topic inheritance/includes are here.
  • brain.js - The code that actually gets a reply is here.
  • utils.js - All those miscellaneous internal utility functions, like quotemeta() and such.
  • lang/javascript.js - The implementation for JavaScript object macros in your RiveScript code.
  • lang/coffee.js - This is new - you can use CoffeeScript in your object macros (it's not enabled by default, you'll probably have to snipe lang/coffee.js and include it in your own project for now, but the built-in shell.coffee uses it out-of-the-box).

Logic Refactoring

Another thing that was messy in the Perl, Python and JS versions of RiveScript was how it lays out its internal data structures in memory. If you'd ever run the equivalent of Data::Dumper you'd see reply and trigger data appearing in multiple places depending on whether it had a %Previous tag on it or not.

It looked something like this:

$RiveScript = {
    "topics" => {
        # Most reply data is under here, BUT NOT triggers with %Previous!
        "random" => { # (topic names)
            "hello bot" => { # (trigger texts)
                "reply" => { # (replies to the trigger)
                    0 => "Hello, human!",
                    1 => "Hi there!",
                },
                "condition" => { # (conditions)
                    0 => "<get name> != undefined => Hello, <get name>!",
                },
                "redirect" => undef, # (if there's an @redirect)
            },
            # ...
        },
    },
    "thats" => {
        # This is like 'topics' but ONLY for triggers with %Previous
        "random" => { # (topic names like before)
            "who is there" => { # (the %Previous text)
                "*" => { # (trigger text)
                    # then things were like the above
                    "reply" => { 0 => "<sentence> who?" },
                    "condition" => {},
                    "redirect" => undef,
                }
            }
        }
    },

    # And then sorting! Again the 'normal' replies are completely
    # segregated from those with %Previous
    "sorted" => {
        "random" => [ # (topic names)
            "hello bot", # (triggers, sorted)
            "*",
        ],
    },
    "sortsthat" => {
        # This one is simply a trigger list for ones that have %Previous
        "random" => [ # (topic names)
            "*",
        ]
    },
    "sortedthat" => {
        # This one is, in case one %Previous has more than one answer,
        # if we ONLY had the above sort we'd overwrite the first answer
        # with the second.. this one keeps track of all "replies" with
        # the same %Previous
        "random" => {
            "who is there" => [
                "*",
            ]
        }
    },
}

As you can see, it was a mess. There were a ton of different places where reply data was kept, and triggers had to be sorted many different ways for various edge cases. Additionally, the record of which topics inherit or include others was kept in a separate data structure from the topics themselves!

In the new refactor of RiveScript-js I eliminated as much duplication as possible. Now the entirety of the reply base exists under the topics key. The old layout used lots of nested dictionaries, e.g. ->{topic}->{trigger}->{reply}, which had the issue of one trigger overwriting the data for another if they both happened to have the same text (e.g. when you have two answers to the same %previous question). The new layout keeps each topic's triggers in an array instead, so the ordering of the triggers is preserved and nothing gets overwritten. The data structure ends up looking like this:

{
    "topics": { // main reply data
      "random": { // (topic name)
        "includes": {}, // included topics
        "inherits": {}, // inherited topics
        "triggers": [ // array of triggers
          {
            "trigger": "hello bot",
            "reply": [], // array of replies
            "condition": [], // array of conditions
            "redirect": "",  // @ redirect command
            "previous": null, // % previous command
          },
          ...
        ]
      }
    }
}

Additionally, whenever the code refers to reply data (for example, in the sort buffers and the %previous tree), it refers to a specific index in the single topic structure that holds the reply data. This has the side benefit that, while getting a reply for the user, a matching trigger already comes with the pointer to its responses. The code doesn't have to look them up from the central topic structure like before (and since the replies are kept in an array, this would be impractical now anyway).

Keeping the replies in an array also naturally takes care of the issue with multiple responses to the same %previous without needing a third sort buffer (we still do need a separate sort buffer for %previous replies themselves, though).
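To make the layout concrete, here is a minimal sketch in Python (used purely for illustration; the real code is CoffeeScript). The helper names add_trigger and build_previous_buffer and the sample triggers are hypothetical and are not RiveScript's API. The sketch only shows the idea described above: one array of trigger records per topic, with the sort buffer holding references into that array.

```python
# Illustrative sketch only: these helpers are hypothetical, not RiveScript's API.

def add_trigger(topics, topic, trigger, replies, previous=None):
    """Append a trigger record to the topic's array and return the record."""
    bucket = topics.setdefault(
        topic, {"includes": {}, "inherits": {}, "triggers": []}
    )
    record = {
        "trigger": trigger,
        "reply": list(replies),
        "condition": [],
        "redirect": "",
        "previous": previous,
    }
    bucket["triggers"].append(record)
    return record


def build_previous_buffer(topics, topic):
    """Group the %previous triggers of a topic by their previous text.

    Each entry is a (trigger text, record) pair. The record is the very same
    object stored in the topic's trigger array, so a match needs no second
    lookup to reach its replies.
    """
    buffer = {}
    for record in topics[topic]["triggers"]:
        if record["previous"] is not None:
            buffer.setdefault(record["previous"], []).append(
                (record["trigger"], record)
            )
    return buffer


# Made-up sample data: two answers to the same %previous question.
topics = {}
add_trigger(topics, "random", "hello bot", ["Hello, human!", "Hi there!"])
add_trigger(topics, "random", "yes", ["Great!"], previous="do you like robots")
add_trigger(topics, "random", "no", ["Oh well."], previous="do you like robots")

previous_buffer = build_previous_buffer(topics, "random")
```

Both answers to "do you like robots" survive, in the order they were added, and each buffer entry points straight at the record that holds its replies.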

Other Implementations

The Perl and Python versions probably won't get updated anytime too soon to fit this new structure, but any future ports of RiveScript to other languages that I work on will be based off this new CoffeeScript version. I have some vague plans right now to port RiveScript over to Google's Go language, and I wanted to get the refactor out of the way first so I have a new "template" to reference when writing a new port.

So, Why CoffeeScript?

To elaborate more on why I rewrote it in CoffeeScript instead of just doing this restructure in Node-style JavaScript:

  • A while back I started JS-linting the JavaScript codebase and fixing all of the mistakes it pointed out.
  • JavaScript has really weird variable scoping issues, where every variable declared within a function, no matter where it's declared, gets implicitly "hoisted" to the top of the function as though you had declared it up there instead. Sane programming languages have lexical scoping where variables declared inside of a { block } of code (including loop variables, i.e. for (my $i = 0; ...)) die with that block's closing brace; Perl, for example, does it this way.
  • In numerous places in my code, I would have a long function that loops over "something" in more than one place, always doing for (var i = 0; ...). Declaring var i multiple times in the same function is an error as far as the linter is concerned, so I had to take all the reused variable names and put an ugly var i, iend, j, jend, match; at the tops of functions.
  • CoffeeScript has a clean syntax compared to JS (it looks quite like Python), and it automagically handles your variable declarations. You simply use a variable (like Python), and CoffeeScript handles when and where to declare it in the JavaScript output.

As for all the haters that say things like, "ECMAScript 6 makes CoffeeScript obsolete because it adds classes and the arrow operator and everything else into the core JavaScript language": I see no reason at all that CoffeeScript can't one day compile into ECMAScript 6 the way that it does into ES5 right now, so it's not like CoffeeScript is a dying language that will no longer be maintained in the future. In a worst case scenario, I could program a CoffeeScript to ES6 compiler myself if nobody else will. It's not as if I have no experience writing scripting language parsers. ;)

And with GitHub backing it (they use CoffeeScript to program my favorite text editor, Atom), the language isn't going anywhere anytime soon.

Linux Desktop Remote Code Execution via File URI

Posted by Noah Petherbridge on Friday, March 27 2015 @ 09:04:08 PM

I've discovered a sort of "remote code execution" vulnerability that affects Linux desktops, particularly Fedora and Ubuntu, though most likely all desktop Linux distributions are affected, except for maybe Arch or Gentoo with extremely customized installations.

First and foremost: this requires the victim to click not one, but two random links sent to them over Pidgin (or any other program that does URL auto-linking the way Pidgin does). So it's not exactly the most severe vulnerability, but I found it interesting nonetheless.

Demonstration video:

Key Ingredients

A few different parts are responsible for this:

Application Launchers (.desktop files)

Graphical desktop environments use .desktop files as their application launchers. They're plain text, INI-style files that describe the application's name, icon, and executable command.

A vulnerability was discovered a while back in application launchers, wherein a launcher file could be double-clicked and it would run the command inside, in exactly the same way that a .exe file can be double-clicked on Windows. And, like its Windows counterpart, it was easy to accidentally double-click an application launcher that you didn't intend to.

In the linked article's example, you could attach a .desktop launcher to an e-mail in much the same way as a virus would, and a careless Linux user could accidentally run it just as easily.

In response, all of the Linux desktop environments made a change, so that you must explicitly mark an application launcher as "executable" before it can be double-clicked to run. Trying to run a non-executable launcher would result in a warning pop-up, telling you that the launcher is not trusted and giving you options to mark it executable, launch it anyway, or cancel.
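The rule described above boils down to checking the file's execute permission before honoring its Exec line. A minimal Python sketch of that check follows. The launcher_command helper is hypothetical and is not the code any desktop environment actually uses; it also ignores details such as field codes and localized keys.

```python
import configparser
import os
import stat


def launcher_command(path):
    """Return the Exec command of a .desktop file, or None if it is untrusted.

    Hypothetical helper: a launcher is treated as trusted only when its
    owner-executable permission bit is set.
    """
    mode = os.stat(path).st_mode
    if not mode & stat.S_IXUSR:
        return None  # not marked executable: refuse to run it

    # interpolation=None keeps "%U"-style field codes in Exec lines intact;
    # optionxform = str keeps keys case-sensitive ("Exec", not "exec").
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str
    parser.read(path, encoding="utf-8")
    return parser.get("Desktop Entry", "Exec", fallback=None)
```

A caller would show the "untrusted launcher" warning whenever this returns None instead of running anything.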

Pidgin (well, not just Pidgin)

I'm only picking on Pidgin here, but other apps are possible attack vectors too.

Pidgin will take any text that looks like a URL, in the format protocol://urlpath and make it into a clickable link. This includes file://, but thankfully does NOT include javascript:.
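For comparison, a chat client could restrict auto-linking to an allow-list of schemes. The sketch below is a hypothetical filter, not Pidgin's actual code; the ALLOWED_SCHEMES set and the find_linkable_urls name are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Assumption for this sketch: only web links should become clickable.
ALLOWED_SCHEMES = {"http", "https"}

# Anything shaped like protocol://urlpath, up to the next whitespace.
URL_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://\S+")


def find_linkable_urls(message):
    """Return the URLs in a message whose scheme is on the allow-list."""
    return [
        url
        for url in URL_PATTERN.findall(message)
        if urlsplit(url).scheme.lower() in ALLOWED_SCHEMES
    ]
```

With this filter, a file:// URI in a message stays plain text while ordinary web links remain clickable.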

On Linux desktops, the underlying mechanism Pidgin uses is xdg-open, which is the Linux version of the open command on Mac OS X, or the rough equivalent of the start command on Windows. It will open anything using your preferred application: for example, HTTP URLs will open in your default browser, text files in your default text editor, etc.

Also, Adium does the same thing for the Mac OS X users; have fun poking at that.

Freedesktop.org's xdg-open

Pidgin is only guilty as far as making file:// URIs clickable; xdg-open is the next part.

These are some of the behaviors I've seen with xdg-open, and the kinds of file:// links that work in Pidgin:

  • A link to a binary executable file (file:///usr/bin/gedit) would execute the program on click.
  • However, a link to an executable text file/shell script (file:///usr/bin/firefox -- a shell script), would NOT execute when clicked. Nothing would happen.
  • A link to a file of a known type would open that file in your preferred program.

But most interestingly, xdg-open will execute an application launcher (.desktop) even when it isn't marked as executable.

The Perfect Storm

Put all of these together and here's an attack path:

  1. Send the victim an HTTP link to a URL that forces a download of an application launcher .desktop file, by using the Content-Disposition: attachment response header (this would likely need to be done via a PHP or CGI script).
  2. Victim clicks the link. Google Chrome and Chromium will automatically download the launcher with no prompt. Firefox will ask to open it or save it. On most Linux desktops, the download will go into ~/Downloads in their home folder.
  3. Send the victim a file link to the launcher in their Downloads folder, like file://Downloads/Pwned.desktop
  4. Victim clicks on the file link and the launcher executes and can run whatever code it wants.
  5. ???
  6. Profit

The CGI script that forces the application launcher download can be as simple as this:

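#!/usr/bin/env python3
# CGI script: serves a .desktop launcher as a forced download.

# CGI response headers, then a blank line, then the launcher file as the body.
print("""Content-Type: application/octet-stream
Content-Disposition: attachment; filename=Pwned.desktop

[Desktop Entry]
Type=Application
Name=Pwned
Exec=/usr/bin/bash -c 'echo "I pwned you :)" | /usr/bin/gedit -'
""")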

Bug Tickets?

Pidgin should be aware of this problem because it's come up multiple times for them, and yet they're still doing it as of Pidgin v2.10.11.

For the xdg-open bug I filed a Red Hat bug ticket here. I'm half expecting to get a response back such as, "xdg-open simply forwards the request to your desktop environment's native file opener," but it is what it is.

The Good: AIM blocks file URIs

AOL Instant Messenger seems to block any IMs sent that contain a file:// URI in them. When I was initially testing this, I was sending IMs to myself and noticed that any time I used a file URI, I didn't get my own message echoed back to me. For the proof of concept video, I used my personal XMPP server.

Minecraft Map: Swampcore

Posted by Noah Petherbridge on Friday, March 06 2015 @ 05:12:00 PM

I've created a downloadable Minecraft map that implements "Swampcore" (older blog post about Swampcore). It's a superflat swamp biome preset with a 24/7 thunderstorm, making for an extremely hostile map where you have to scratch and claw your way into having so much as a simple dirt shack to call "home".

It runs on vanilla Minecraft (no mods or anything needed) and was created on Minecraft v1.8.3, but will probably work on later versions too.

Superflat Preset

This is the preset code that was used to generate the map:

3;9*minecraft:bedrock,minecraft:dirt,minecraft:grass;6;biome_1,decoration,lake,lava_lake

From the bottom up, you have:

  • 9 layers of bedrock
  • 1 layer of dirt
  • 1 layer of grass
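The preset string can be read mechanically: its fields are separated by semicolons (format version, layer list, biome ID, feature list), and a layer entry written as count*block repeats that block. Below is a small Python sketch of a parser; parse_superflat_preset is a hypothetical helper written for this post, not part of Minecraft, and it only handles presets shaped like the one above.

```python
def parse_superflat_preset(preset):
    """Split a version-3 superflat preset string into its four fields."""
    version, layers_field, biome_id, features_field = preset.split(";")
    layers = []
    for entry in layers_field.split(","):
        # "9*minecraft:bedrock" means 9 layers; a bare block name means 1.
        count, sep, block = entry.partition("*")
        if not sep:
            count, block = "1", entry
        layers.append((block, int(count)))
    return {
        "version": int(version),
        "layers": layers,  # listed from the bottom of the world upward
        "biome": int(biome_id),
        "features": features_field.split(","),
    }


PRESET = ("3;9*minecraft:bedrock,minecraft:dirt,minecraft:grass;"
          "6;biome_1,decoration,lake,lava_lake")
world = parse_superflat_preset(PRESET)

# Total number of layers, which is also the Y level of the world surface.
surface_y = sum(count for _, count in world["layers"])
```

Summing the layer counts gives 11, which matches the world surface at Y=11.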

World Features

The superflat swamp includes lakes and lava lakes, oak trees, vines, sugar canes, mushrooms and tall grass. The water lakes can sometimes spawn with sand or clay, and the lava lakes spawn with stone around them. The stone can occasionally be an ore block instead (such as iron or redstone). Diamonds are able to spawn around the lava lakes as well (world surface is at Y=11) but are extraordinarily rare.

There is a 24/7 thunderstorm which keeps the sky dark and prevents zombies and skeletons from burning up during the day, and even allows for hostile mobs to spawn during the day. Players will frequently get killed while just trying to build a cheap dirt shack to live in.

The only stone on the map is around lava lakes, so players will need to loot a lava lake in order to build a furnace or stone tools. The main source of fuel will be charcoal from burning wood logs in a furnace, as coal ore will be extremely rare. Zombies can rarely drop iron ingots when killed, or players can try to find iron ore around lava lakes.

The nether portal can be built and activated, but since diamonds are practically nonexistent (so no diamond pickaxe for mining obsidian), players would have to build a mould for a portal and pour lava and water in using buckets to slowly cast the frame. Since gravel is extremely hard to find (impossible?), there's no flint to craft a Flint & Steel, so players would have to use lava with wood planks to set a fire near the portal and wait for it to spread into the frame and activate it (pretty tedious).

With the presence of sugar canes and access to the nether, splash potions of weakness and golden apples can be crafted and a zombie villager could be cured, so players can build their own NPC village. With villagers, players could trade for diamond gear and lots of other items that are otherwise impossible to get on this world.

In short, it's entirely possible to get to the late game on this world (the main thing missing is access to the End Dimension, since no strongholds exist, but that's probably for the better in a multiplayer server anyway). It's just very, very difficult. Even when you manage to secure a safe perimeter, venturing outside to get any more resources or loot more lava lakes remains just as dangerous as on day one.

Spawn Room

This map is especially designed to be played on multiplayer survival (SMP). All new players who join the server (and players who die without having a bed to reset their spawn point) will be placed in the spawn room, which is a radially symmetrical room with 12 golden pressure plates along the walls. Each pressure plate teleports the player to a different, hard-to-predict place in the swamp, spread over a large radius (each destination is about 2,000 blocks away from the next).

Screenshot: The spawn room that new players will find themselves in.

The idea is to distribute the players around the world. If you die and reappear in the spawn room, it's difficult to find your way back to where you left off because it's hard to know exactly which pressure plate you used the last time.

Screenshot: Where a player might end up when stepping on a pressure plate.

Download

Place it in your .minecraft/saves folder for single player or use it as your world on multiplayer.

Technical Bits

  • I moved the world spawn point to be at coordinate 0,0 on the X/Z plane.
  • A large 40x40 floating platform is centered on 0,250,0 (250 blocks in the air) which encompasses the spawn region for players on multiplayer. All newly joining players land on this platform at first before being teleported into the spawn room.
  • The spawn room is around 0,4,0 inside the bedrock layers beneath the world spawn point.
  • Command blocks that enforce the 24/7 thunderstorm and teleport players into the spawn room are around 0,1,0 beneath the spawn room, using a hopper clock.
  • Players in creative mode are allowed to approach the 40x40 spawn platform at the top of the world without being teleported to the spawn room. This is reserved for any maintenance tasks, etc., by server operators.

RiveScript.com Makeover

Posted by Noah Petherbridge on Sunday, January 25 2015 @ 09:11:03 PM

I've just spent pretty much the whole day redoing the website for RiveScript.com, and I think it looks pretty nice.

Screenshot

RiveScript.com was the final website on my server that was still running on my legacy PerlSiikir CMS, and it's been on my to-do list for a while to get it migrated over to my new Python CMS, Rophako, which is what Kirsle.net currently runs on. The old Perl code was clunky and ugly and memory-leaky, and now I'll be at ease if I ever need to migrate to a new web server, as my Python web apps are extremely quick to get up-and-running, whereas it was an hours-long ordeal to get PerlSiikir to run.

So, the bulk of the work actually needed for RiveScript.com was purely front-end. I revamped the whole web design to use Twitter Bootstrap and make it look all hip and edgy like how all the other small software project websites are these days.

Besides the programming language on the back-end, I had other reasons for why I wanted to simplify RiveScript.com: I don't have the motivation or energy to do as much with that site as I did previously.

I used to run a YaBB Forum on RiveScript.com, but it wasn't extremely active and it was getting hit by too many spam bots, so several months ago I shut that down and linked to the RiveScript forum at Chatbots.org.

More recently I had programmed a chatbot hosting service that was on RiveScript.com, but that wasn't very popular either. I know nobody was using it because it had been broken for months and I hadn't heard any complaints. ;) A couple months ago I sunsetted that feature by turning off new site registrations and removing some references to the feature. And now that's officially gone! If you actually had a bot hosted there, contact me and I can get you your bot's reply files back.

So, the new site is simple and minimalistic and is just about the RiveScript language itself. It was an ordeal rewriting all of the pages from scratch (well, most of them) but now that it's done, the site should be very low-maintenance for me.

One of the most fun parts of it was that I ported over my "Try RiveScript Online" page to use JavaScript and run in the browser, whereas the old version had a Perl back-end (because a JavaScript version of RiveScript didn't exist at the time), so that's even one less thing for me to maintain and make sure it doesn't break. :)

And, the front-end pages for the new site are also open source, FWIW.

Brief GNOME 3.14 Review

Posted by Noah Petherbridge on Thursday, December 18 2014 @ 12:39:53 AM

I jumped ship from GNOME 2 to XFCE when GNOME 3 was announced and have ranted about it endlessly, but then I decided to give GNOME 3.14 (Fedora 21) a try.

I still installed Fedora XFCE on all the PCs I care about, and decided my personal laptop was the perfect guinea pig for GNOME because I never do anything with that laptop and wouldn't mind re-formatting it again for XFCE if I turn out not to like GNOME.

After scouring the GNOME Shell extensions I installed a handful that made my desktop somewhat tolerable:

Screenshot

And then I found way too many little papercuts, some worse than others. My brief list:

Settings weren't always respected very well, and some apps would need to be "coerced" into actually looking at their settings. For example, I configured the GNOME Terminal to use a transparent background. It worked when I first set it up, but then it would rarely work after that. If I opened a new terminal, the background would be solid black. Adjusting the transparency setting now had no effect. Sometimes, opening and closing a tab would get GNOME Terminal to actually read its settings and turn transparent. Most of the time though, it didn't, and nothing I could do would get the transparency to come back on. It all depended on the alignment of the stars and when GNOME Terminal damn well feels like it.

Also, I use a left-handed mouse, and GNOME Shell got completely confused after a reboot. The task bar and window buttons (maximize, close, etc.) and other Shell components would be right-handed, while the actual apps I use would be left-handed. So, clicking the scrollbar and links in Firefox would be left-handed (right mouse button is your "left click"), and when I wanted to close out of Firefox, I'd instead get a context menu popup when clicking the "X" button. Ugh!

I wanted to write this blog post from within GNOME but it just wasn't possible. With different parts of my GUI using right-handed buttons and other parts using left-handed ones, I had context menus popping up when I didn't want them and none popping up when I did. After a while I thought to go into the Mouse settings and switch it back; this didn't help, instead, the parts that used to be right-handed switched to left-handed, and vice versa. It was impossible to use. I just had to painstakingly get a screenshot off the laptop and to my desktop and deal with it over there instead.

These things just lead me to believe the GNOME developers only develop for their particular workflows and don't bother testing any features that other mere mortals might like to use. All the GNOME developers are probably right-handed, and they have no idea about the left-handed bugs. None of the GNOME developers seem to use transparency in their terminals, as evidenced by the fact that the transparency option disappeared in GNOME 3.0 and has only recently made a comeback (in GNOME 3.12/Fedora 20).

XFCE is going back on this laptop.
