Thursday, August 31, 2006

Performance: Block Variables Breakdown

In response to my previous blog, Chris Nokleberg noted that if we're using String#equals a lot, interning will have an additional benefit...namely because String#equals short-circuits if both String objects are ==. I had forgotten about that benefit, so I thought I'd poke around for places it might have an effect.
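A minimal sketch of the benefit Chris describes, using plain Java strings rather than anything from JRuby. The class name `InternIdentityDemo` and the helper `sameObject` are made up for illustration. Two strings with equal contents are normally distinct objects, so an equality check has to compare contents. Once both are interned they are the same object, which is the case String#equals can short-circuit on.

```java
// Illustrative only: shows that interning yields one shared object for equal contents.
final class InternIdentityDemo {
    // Returns true when both references point at the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents.
        String a = new String("each");
        String b = new String("each");

        System.out.println(sameObject(a, b));                   // false: distinct objects
        System.out.println(a.equals(b));                        // true, after comparing contents
        System.out.println(sameObject(a.intern(), b.intern())); // true: one canonical object
    }
}
```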

In digging a little, I was reminded that DynamicVariableSet, which holds block-local variables, does a linear search for retrievals and mutations. Compare that to method-local variables in Scope, for which all clients must pass an index to get or set a value. I'm not sure if there's a reason why DynVars must be retrieved by name and Scope vars can be directly indexed...it's probably historical from the C code or a limitation of the current parser. In any case, a linear search could theoretically represent a performance issue when accessing or modifying block-local variables. So I thought I'd run some numbers to see how frequently we have to search past index 0 in the DynVarSet. Here's the breakdown while gem installing rake:

Count of block var retrieves from indices 0 through 6:
516436
226049
42687
1170
656
122
1

So a total of 787121 block variable lookups, with percentages below:

index 0: 65.6%
index 1: 28.7%
index 2: 5.4%
...with the remainder filling out the last 0.3% of accesses

What can we learn from this? Well, DynVarSet currently allocates initial space for up to 8 variables, based on someone's long-past examination of code that showed a maximum of 6 was a good estimate. The numbers above show that, for RubyGems + RDoc at least, a maximum of 3 would cover 99.7% of all cases without requiring expansion of the array, and that with a subsequent expansion to 6 slots we would cover all but 0.000127% of cases (the single access at index 6 out of 787121). So using 6 as a maximum and always allocating 8 slots is perhaps a bit generous. More profiling on more code is warranted here, but it might be safe to drop the initial size to 3 and leave in the doubling when expanding.

It also shows us that we pay the cost of two String#equals calls for 28.7% of lookups, and pay for three in 5.4% of lookups, with higher counts statistically insignificant. So although it might seem like we're unlikely to suffer much of a performance hit because most block var lookups are at index 0, our cost to look up or modify block variables is doubled in over a quarter of cases and tripled in a twentieth of cases. If every lookup cost the same as a hit at index 0, the total would drop from X * 139.2 (65.6 + 2 * 28.7 + 3 * 5.4) to X * 99.7, which translates to roughly a 28% reduction in block variable lookup cost. Note that I'm not measuring block variable modifications here either, which would conceivably see a similar boost.
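The arithmetic above can be rechecked with a short sketch. It assumes, as the paragraph does, that a hit at index i costs i + 1 String#equals calls. The class name `LookupCost` and its methods are made up for illustration. Unlike the rounded percentages, it uses the raw counts for all seven indices, so the result differs slightly: roughly 1.40 comparisons per lookup on average, and a saving of a little under 29% if every lookup cost a single comparison.

```java
// Recomputes the lookup cost from the raw counts, assuming one equals call per slot examined.
final class LookupCost {
    // Block variable retrievals at indices 0 through 6, from the gem install run above.
    static final long[] COUNTS = {516436, 226049, 42687, 1170, 656, 122, 1};

    // Total number of lookups recorded.
    static long total() {
        long sum = 0;
        for (long count : COUNTS) {
            sum += count;
        }
        return sum;
    }

    // Average number of name comparisons per lookup.
    static double averageComparisons() {
        long comparisons = 0;
        for (int i = 0; i < COUNTS.length; i++) {
            comparisons += (i + 1) * COUNTS[i];
        }
        return (double) comparisons / total();
    }

    // Fraction of comparison cost saved if every lookup needed only one comparison.
    static double savings() {
        return 1.0 - 1.0 / averageComparisons();
    }

    public static void main(String[] args) {
        System.out.printf("lookups: %d%n", total());
        System.out.printf("average comparisons: %.3f%n", averageComparisons());
        System.out.printf("potential saving: %.1f%%%n", savings() * 100);
    }
}
```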

And of course what it doesn't show (but which we know to be true) is that if we were able to eliminate both the linear search and the call to String#equals, we'd save a very large percentage of block var lookup time, since we'd reduce it to a simple array indexing. I think it's worth exploring how we could do that--especially since we already do it for Scope in all the same places.
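To make the difference concrete, here is a minimal sketch of the two access styles. `BlockVars` is a hypothetical stand-in, not JRuby's actual DynamicVariableSet or Scope. It starts with 3 slots and doubles on expansion, as suggested above, and it assumes some earlier step (the parser, for example) could hand callers a slot index.

```java
import java.util.Arrays;

// Hypothetical block-local variable store; names and layout are assumptions for illustration.
final class BlockVars {
    private String[] names = new String[3];
    private Object[] values = new Object[3];
    private int size = 0;

    // Name-based lookup: linear scan with one String#equals call per slot examined.
    Object getByName(String name) {
        for (int i = 0; i < size; i++) {
            if (names[i].equals(name)) {
                return values[i];
            }
        }
        return null;
    }

    // Index-based lookup: a single array access, assuming the caller already knows the slot.
    Object getByIndex(int index) {
        return values[index];
    }

    // Adds or updates a variable and returns its slot index, doubling the arrays when full.
    int set(String name, Object value) {
        for (int i = 0; i < size; i++) {
            if (names[i].equals(name)) {
                values[i] = value;
                return i;
            }
        }
        if (size == names.length) {
            names = Arrays.copyOf(names, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        names[size] = name;
        values[size] = value;
        return size++;
    }
}
```

With a store like this, a caller that kept the index returned by `set` could use `getByIndex` afterwards and skip the string comparisons entirely.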

Isn't optimizing legacy code fun?

Performance: Interning Strings

A while ago, we decided to intern all appropriate symbolic strings as they entered the AST. This appeared to help performance a measurable amount, presumably for two reasons:

  1. The AST would take up less space in the long term
  2. Since Strings cache their hashcodes, having each identical string in the AST be the same object would reduce the number of hash calculations.

And the numbers seem to bear it out. Here's a local rake install:

Without interning AST strings:
real 1m19.894s
user 1m18.649s
sys 0m0.920s

With interning AST strings:
real 1m15.021s
user 1m13.785s
sys 0m0.948s

So it's very measurable...in this case 4-5 seconds out of 80 seconds, or about a 6% gain with interning.

However this week I realized the fallacy of the second point above. In order to intern each incoming string, Java must hash them. This causes all strings entering the AST to be hashed once anyway in order to get the interned version. In fact, it could reduce performance, since we're now forced to hash strings we might never encounter during execution.
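The hashing argument can be sketched with a hand-rolled intern table. This is an assumption about how such a table might look, not a description of how the JVM's native String.intern() actually works, and `InternTable` is a made-up name. The point is only that a hash-based table has to call hashCode() on every string handed to it, whether or not that string is ever looked at again.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical intern table; the JVM's native String.intern() is not necessarily built this way.
final class InternTable {
    private final Map<String, String> canonical = new HashMap<>();

    // Returns the canonical instance for s. The map operation hashes s,
    // so every incoming string pays for at least one hash calculation.
    String intern(String s) {
        String existing = canonical.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }
}
```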

I can't confirm that this is 100% true, but it's a reasonable theory. String.intern() is native code, but logic dictates that the fastest way to intern strings would be to use an internal hash/symbol table. So I proceeded this evening to try removing interning to see how it affected performance. I hoped to get a small gain during execution due to a large gain during parsing. The numbers above show that I lose a measurable amount overall by removing interning. However, I do see substantial parse performance gains:

With interning AST strings (Time.now - t method of measuring 100 parses):
11.101

Without interning AST strings:
9.404

About 1.7 seconds out of 11, or a whopping 15% gain in parse speed without interning. AARGH.

At this point I'm leaning toward removing interning and swallowing the 6% performance hit temporarily until we can figure out where it's coming from. Logically, interning everything should only add overhead, or so my brain tells me. Where's the flaw in my logic?

My London Schedule

So here's the times and places. I want to get together JRubyists to just chat a bit at some point, and it seems like Thursday or Friday would work best. The only concrete suggestion so far is the Fitzroy Tavern on the 15th, around 6:30PM-7:00PM (thanks Damian). This works for me, and it's after the conference. Here's a goofy Google Map to Fitzroy from the conference center.

I'm also planning to attend the "Pizza on Rails" event on the 13th, so folks can find me there.

I'll keep this entry updated as things come in, so it can be used for reference.

13 September, 6:00AM: Arrival in London (Heathrow)

Other than getting checked into the hotel, I have that whole day free. I don't plan to cram for my presentation or anything, since it's not until Friday (and I don't really work that way). To avoid the lag I'll probably sleep as much as I can on the plane. Getting out and about once I get there ought to get the circadians whipped into shape.

13 September, 6:00PM: Pizza on Rails

Pizza on Rails is around 6PM, with pizza at 7PM.

14 September, 9:00AM: RailsConf Day One

The "Welcome" is at 9, and it goes straight through until 8:30PM with a lunch break from 12:30PM to 2:00PM and a dinner break from 5:50PM to 7:30PM. I'll probably try to grab meals with folks from the conf.

15 September, 9:30AM: RailsConf Day Two

Lunch break is the same as Day One, and the conf is scheduled to go through until 6PM, with "Closing Remarks" coming after that. Probably be out by 6:30, and the Fitzroy is pretty close. Maybe someone not going to the conf can get there early to scope it out.

15 September, 3:00PM: JRuby on Rails

I'm in the last set of speakers, before the final plenary session with James Duncan Davidson and Dave Thomas. I'm up against cleaning up rails, adding unicode to rails, and project management for rails. I wish I could attend the unicode session, but I'll try to grab Dominic offline.

15 September, 6:30PM: JRuby Meetup

So far I think we'll just be at Fitzroy Tavern, as mentioned at the top of this post. All are welcome...I don't fly out until 1:25PM the next day, so I can stick around for a while.

16 September, 10:00AM: Departure

Given the recent TERROR scare, I'm going to be extra careful and head to the airport nice and early. I'm not particularly worried about TERROR, but I am over-cautious about giving security enough time to scour my person for soda pop and lip balm.

Thus ends my first trip to London. It's too bad it will be so short, but I have business to attend to stateside immediately after the conference. Hopefully I'll stop back in when I'm in the EU for JavaPolis.

I hope to see plenty of JRubyists in London!

Wednesday, August 30, 2006

IronPython Demo for Jon Udell

Tim Bray tossed me a link to an IronPython screencast given by Jim Hugunin for Jon Udell.

On the surface, it does look fairly impressive. However I'm not impressed for the reasons some folks might be. So here's the notes I took while watching this demonstration...take them for what they're worth.

IronPython with Avalon demo

Here Jim demonstrates instantiating and manipulating a small Avalon UI in an interactive python session. This is essentially the same as the demonstration I gave at JavaOne using IRB to script Swing components (and no, I'm not claiming to be the first to do such a demo). It shows that Avalon produces very impressive and beautiful UIs, and the XAML under the covers is pretty big and ugly. It also shows that IronPython has some nice integrations with .NET code. Beyond that, I wasn't any more impressed than I was when I got the IRB stuff working. It's cool, to be sure, but it's not new.

Notes:

  • Tab completion is nice, but in an interactive shell it isn't very exciting. IRB already does this in Ruby (not in JRuby unless you have a ReadLine library wired in), and completion against running code is easy.
  • Importing a whole namespace from .NET is really, really nice. We would love to be able to efficiently support the same, but there's currently no capability in Java to get all classes or subpackages in a package. That's the reason why you have to include each class you want. You can include a whole package with JRuby, but for each package you include we have to brute-force search them for all Java classes from then on. O(n) to locate a class, where n = number of packages imported. It sucks. Including the classes directly has no unreasonable overhead, though.
  • It looks like IronPython has a few nice wrappers around the delegate-based events in .NET. That's the b.Click += some_handler...some_handler is python code that becomes a delegate instance; the += adds it to the Click event. Nice and simple. Everyone feels differently about events+delegates versus listener interfaces, however, and in Java we've got the latter. We'll need a different way to make things nice and simple.
  • Most of the UI interaction stuff we can already do with JRuby, though it's not wrappered quite as much; you're basically just calling Swing methods directly. We could add a lot here by making something smart that handles common UI use cases for you. This demo somewhat reminds me of the Groovy demo at JavaOne 2006, where the presenter used OLE automation to work with an Excel document. It was a great OLE demo. It was a pretty good Excel automation demo. However it was only a mildly entertaining Groovy demo because it just called OLE methods like any other language could. The Avalon stuff in this demo is similar.
  • Almost all the functionality shown, including XAML manipulation, etc, is .NET code, not IronPython code. Ignore anything about XAML and Avalon to see what's really interesting...the various little ways IronPython makes .NET components available to python. Again, an impressive Avalon or XAML or .NET demo...but only a pretty nice IronPython demo. But pretty nice is pretty good. IronPython looks pretty nice so far.
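The package-import cost mentioned in the notes above can be sketched as follows. This is a hypothetical resolver, not JRuby's actual implementation, and the names `PackageImports`, `importPackage`, and `resolve` are made up for illustration. Since Java offers no way to list the classes in a package, the resolver has to try each imported package in turn, so a lookup costs up to one failed Class.forName call per imported package.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resolver showing the O(n) search over imported packages.
final class PackageImports {
    private final List<String> packages = new ArrayList<>();

    // Records a package to search on later lookups.
    void importPackage(String packageName) {
        packages.add(packageName);
    }

    // Tries each imported package in order; returns null if no package has the class.
    Class<?> resolve(String simpleName) {
        for (String pkg : packages) {
            try {
                return Class.forName(pkg + "." + simpleName);
            } catch (ClassNotFoundException e) {
                // Not in this package; keep searching.
            }
        }
        return null;
    }
}
```

Including a class directly avoids the loop, because the fully qualified name is known up front.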

Visual Studio demo

VS 2005+ is very nice, but I think most folks already knew that. However the python support looks much better than I expected, on par with what RDT has built for Ruby in Eclipse, though maybe a bit more polished.

Notes:

  • It was a bit of a dodge that he didn't admit how Jython compares performance-wise to IronPython. Of course, Jython has been around longer, and I'd wager that it's faster. I've made it perfectly clear that Ruby is still faster than JRuby, because I don't think distracting folks from JRuby's performance issues will make them any less of a concern. It's too slow. We'll fix it.
  • I've given a lot of thought to where we'd need to hook into JRuby to be able to online debug code, and it won't be terribly difficult to do. We also might be able to simply get Ruby's normal debugging hooks working with a little effort, and just use those remotely. Either way integration into the IDE is just icing on the cake.
  • I completely disagree with the long-standing idea that programs should be implemented in Python or Ruby (or some other dynlang) and then migrated toward C# or Java (or some other static lang). I think application code should *stay* Python or Ruby, because much of the value of those languages is in the long-term maintenance of the code you write. The minute you port that code to C# or Java (or C) you've lost that benefit. You've set it in stone, as far as I'm concerned. Is any application code ever at a point where you want to set it in stone? Wouldn't we rather just code in dynlang all the time?
  • I didn't see any code completion or refactoring of python in Visual Studio, unfortunately. There was a little C# completion only, which is well-known and deserving of praise. That's a big area Ruby IDEs could jump ahead. The editing capabilities for python are also only touched on briefly, since he presents a pre-written python script and then writes and calls some C# code.
  • The online debugging in Visual Studio looks really nice...I want that for Ruby. The JRuby compiler I've been working on will include .rb filenames and line numbers in the eventual stack trace from the compiled code, so eventually you'll be able to step through Ruby code using my compiler...even after it's compiled to Java class files:
    at MyCompiledScript.fib_java (samples/test_fib_compiler.rb:2)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

PowerShell

I read a bit about the PowerShell stuff the other day and I don't really have much to say here. It's basically a set of .NET objects that are interactively available to dynamic .NET languages for interacting with the system...a nice set of OO components to provide what UNIX command-line utilities do on UNIX, in a sense. It's neat and all, but it's not relevant to IronPython except that you can script those components in python.

Other notes

Using import clr in IronPython is a clever way to add in .NET decoration of python types; we could probably add that, so that Ruby strings automatically have Java String methods.

Conclusion

My opinions are sour grapes, perhaps...but I could do almost all of the non-IDE stuff in a screencast of my own. The IDE stuff will come soon. It's still a very nice demo, and kudos to Jim for what IronPython has become. I'm impressed.

I think it's safe to say that we're behind IronPython on having a fast, well-integrated implementation (a detriment of the legacy code from the original port) but way ahead on completeness and compatibility with C Ruby (a benefit of the legacy code from the original port). The IDE stuff is not yet as nice for us, even counting the Eclipse-based RDT. NetBeans and Eclipse are in many ways not as nice as Visual Studio, so there's some additional catchup to do there.

I'd like to know how folks out there feel about our approach to JRuby, working to make it a drop-in replacement for C Ruby so it can take advantage of all the libraries and applications out there. As far as I know, neither Jython nor IronPython can actually run most of the major Python applications, and certainly nothing as complicated as Rails. Our choice has made some things harder, but I think the value of making Ruby work on the JVM would be greatly diluted if you couldn't run real Ruby apps.

FWIW, I've had a long-standing task to do a nice screencast like this for JRuby using Swing and other libraries, and I did a very similar UI demo for JavaOne. Hopefully soon I'll have time to do a longer, fancier one.

Tuesday, August 29, 2006

From Eclipse to NetBeans: Part 2

I had planned to post a "Part 1.5" that summarized all the emailed and commented help I got for my first post, but I've been too busy working on JRuby on Rails to try most of those suggestions out. Several folks offered a whole bundle of useful tips that will address many of the "bad" points in Part 1. I will revisit those as I have time, and hopefully use them to compile a "NetBeans for Eclipse Users Walkthrough". For now, however, I continue to muddle through with NetBeans as-is.

NetBeans continues to surprise me in both good and bad ways. Some features seem to be very nice and very well thought out, while others...well I wonder if anyone's actually using them, because they're very cumbersome. Most of the UI seems to be solid and clean under Java 6, but there are some occasional jarring glitches and (in my opinion) poorly-designed interfaces. Sometimes NetBeans seems to know exactly what I want to do...sometimes it seems to make what I want to do as hard as possible. It's still been worlds better than earlier versions, but I'd have expected a level of polish I'm just not seeing yet. Is it because I'm running a beta IDE on a beta JVM? Perhaps. Perhaps not.

However I'm going to be a trooper about this and do what I can to fight through issues. I think NetBeans really has a lot of promise if these things can be ironed out, and truth be told, most stuff works fine. However the stuff that doesn't work...REALLY doesn't work.

And again, please don't consider this an attack on the NetBeans product or team. NetBeans is very impressive...but there's room for a lot of good improvement. I hope this hard look at NetBeans vs Eclipse will help make progress to that end.

The Bad:

  1. "Extract Method" should have an option to replace all instances of identical code with calls to the new method. I use that feature in Eclipse *very* frequently.
  2. No "Inline Method" refactoring? I really need that one. Here's a typical use case for me:
    • N methods all have the same code, for whatever reason...perhaps I've refactored and simplified them each down to the same chunk of code.
    • Extract Method to create a new method that does that duplicate block, and replace all instances of duplicate code so all the old methods call the new one.
    • Inline all the old methods so all callers now call the single new method. The old methods are eliminated.
    In this way, I manage to make all callers calling duplicative methods call a single new method without ever manually modifying a single line of code. I have many such refactoring tricks I have learned over the years, many of which depend on being able to inline methods and extract methods globally. JRuby would not be where it is today without those refactorings.
  3. The "Find Usages" dialog has a grand total of two or three checkboxes (depending on visibility), one of which I will probably rarely use (find in comments) and the other which should probably just be on by default (find overrides). I would much prefer to just go straight to the search results, rather than always have to hit "Next" on this dialog.
  4. There's no way to look at the project view and know whether I have errors in my project or not, even after running a build. In Eclipse, if I save a class with errors, the entire project and all directories leading to that class get marked with a big red X. In NetBeans I can see if the current file has errors, which is good...but being able to see errors anywhere in the project is a must when I'm making refactoring changes that can affect more than one file. I need to know at a glance if I've broken something else. For larger projects, re-running the build can be prohibitively slow.
  5. My Projects and Files views only show the little "versioning cylinder" icon next to SOME of the items in my project. Even more confounding, in the Projects view, only SOME packages show up as versioned. I haven't figured out a pattern to it.
  6. I should be able to change the delay between typing code and having NetBeans re-parse it and display error squiggles. By my estimate, it takes about three seconds for it to realize I've typed bad code, by which time I've almost certainly moved on to the next line or a different file completely. Yes, it checks when I save, but checking as I type *very slowly* is not really useful. EDIT: I did find this setting...it's at 2000ms by default. I've set it to 1000ms, so we'll see if it's disruptive. Is it set so long because it causes a lot of overhead?
  7. There are surprisingly few "non-advanced" settings and a baffling array of "advanced" settings. Almost everything I need to do is in the "advanced" section. Perhaps I could just turn on an "always advanced" view and save myself that button-press?
  8. Why can I right-click items in the Advanced Settings tree, to bring up the exact same properties listing I see on the right-hand pane? Prefer one way to do things...the right-click properties dialog should go.
  9. I can't find how to change the default L&F. Maybe it's in the Advanced Settings somewhere, but I couldn't locate it. Metal is an OK L&F, but it's jarring when the rest of my system looks completely different. It also seems too "bright" to me, for some reason...I prefer softer earth tones.
  10. Code completion brings up two popups: one for the list of methods, and one for documentation on the selected method. One displays itself above the line I'm typing and one displays below the line. This effectively hides all contextual code around the line I'm typing, which is frustrating. The documentation window is also enormous, and usually mostly empty.
  11. There doesn't appear to be a way to revert changes when the changes are new files. Modified files get reverted, but not files or directories that are new. Oddly enough, they DO show up as changes when choosing "Show Changes".
  12. I selected all new items in the "Show Changes" pane and hit the delete key...BEEP, nothing happens. I right clicked and selected Delete...and it appeared to proceed with the delete operation. The Delete key should map to a logical Delete operation whenever possible.
  13. After deleting, the view continued to show new files that were no longer visible in the Files view, and refreshing did not help.
  14. After manually cleaning out the new files, my NetBeans project appeared to be totally hosed, returning a goofy error about a parameter field not being normalized (filed as bug 83665). I had to completely wipe out my working copy and re-get it...
  15. ...only to find that once I had deleted my project, NetBeans totally forgot my SVN URL and I had to retype it. Thankfully it remembered my credentials, but why didn't my old URL show up in the list of possible URLs anymore?
  16. Then it again asked me where I wanted to check things out, defaulting to my home dir. Can't I just set a single work location and be done with it?

At this point I figured I should call it a night.

The Good:

  1. NetBeans shuts down WAY faster than Eclipse. That's a huge benefit. I don't know what the hell Eclipse is doing that takes as much as 15 seconds to shut down, even if it was sitting idle.
  2. NetBeans ships with way more capabilities than stock Eclipse. I found the Module Manager today to turn some stuff off and was absolutely stunned to see how much stuff comes in the base install. However there's a bad side to this too: NetBeans could be perceived as having a slower startup time because of all these modules, even though most folks will never use most of them. Maybe have a few coarse-grained use cases people can choose to enable AFTER installing and starting up the first time. At any rate, I'm impressed with what ships out-of-the-box.
  3. I can confirm that antialiasing appears to work for the whole IDE under Java 6, as reported by one commenter. Very nice!
  4. I like being able to turn on line numbers for all editors in one place. However, others may want to be more picky.
  5. NetBeans seems to be much more deterministic than Eclipse. For example, when Eclipse is under heavy load, occasionally mouse clicks and keypresses will disappear into the ether. When Eclipse gets laggy I frequently have to double-check that the code or shortcut I just typed was actually received. This sometimes leads to me typing only half of a line of text or typing a search query into my editor. NetBeans so far does not seem to "lose" UI events like Eclipse does.
  6. I see that right-clicking editor tabs DOES allow me access to Subversion. I just don't have it in the main pane's context menu. That's totally acceptable, so mark another item off the bad list from Part 1.

Subversion...Wherefore Art Thou

The subversion issues have been severe enough I figured I should rant a bit before closing this article out. As far as I can tell, SVN support is simply broken in 5.5b2, to the point that I'm now afraid to perform many operations out of fear that they'll nuke my working copy in some heinous way. In this case, I had no local changes to lose (or to struggle to save), but I could picture some painful corruption like the "parameter field" error coming right in the middle of some multi-file refactoring. I pray that doesn't happen.

Painful to live in fear, isn't it?

Jack-of-all-Trades...but...

NetBeans seems to have a vast range of capabilities, rivaling multiple major Eclipse-related projects all rolled into one. However there's a lot to be said for polishing the core pieces to death before launching into a new set of features. Subversion should just work. The UI should just work, without glitches. Professional-grade refactorings should work like you expect them to (and Eclipse's complement of refactorings is actually overshadowed by some other IDEs, so NB is well behind the curve here). Under no circumstances should a working copy be so badly damaged by an operation that it must be flushed out and re-gotten. And so on...

These key features need to work perfectly, all the time, before anyone will consider launching into more advanced features like Java EE support or Database Browsing. If I can't even do basic SCM operations on my project, why would I ever invest time in anything more than simple Java code editing?

That said, I think NetBeans could really shine like a star with some polishing...polishing that may be coming in NB6, I really don't know. Most of these issues are obvious or minor, and many of them are in newer features that have perhaps not been tested as well as one would like. Hopefully the excellent improvements in NetBeans over the past year will draw more users...and more testers...to the project.

Monday, August 28, 2006

RailsConf Europe 2006 - I Will Be There!

It's official! Travel arrangements are currently being made for me to attend and present at RailsConf Europe 2006 in London!

I had not expected to make it, so this comes as a bit of a surprise. I guess I need to put together some sort of presentation now, don't I? Oh, and make sure I'm happy enough with JRuby's Rails support to be comfortable...

I'll take comments on this off-the-top-of-my-head agenda:

  • Who Am I
  • What is JRuby (dime tour, nothing too deep)
  • Why Rails on JRuby (probably a few slides on JDBC, alt persistence layers, EJB integration, legacy code integration, why JRuby is better in some ways than Ruby)
  • Why not Java-based web framework X
  • Demos
  • What doesn't work
  • What's next, future plans, how can you help
  • Q/A

I'm also hoping the JRubyist Mongrelites are able to get things working before then. I'll probably try to help them, wrap up enough bugs so that most walkthroughs in AWDwR work nicely, and create the presentation over the next two weeks.

I hope to see you in London!

P.S. If anyone wants to give me any pointers on travel from the US to the UK, please do. I'll be travelling alone and I've never been across the pond. Thankfully, I already have my passport, but other than that I'll be flying blind. If any Brits feel like taking me under their wing once I arrive (or just grabbing a pint or two), that would also be much appreciated. I plan to arrive during the evening of the 13th, flying back the morning of the 16th.

Sunday, August 27, 2006

From Eclipse to NetBeans, Part 1

I am attempting to make the switch from Eclipse to NetBeans, and this is a raw dump of the pros and cons so far during that process. Note that these are not meant to question design decisions behind various NetBeans features; they are simply differences that have made the conversion harder or easier.

Stats

Opteron 150 w/ 2G memory
Ubuntu Linux 6, AMD64
Java 5, x86_64 version
NetBeans 5.5b2

The Bad

I'll start off with the bad, because these are more obvious. It's much harder to list good things, since the "best" features will be those that are intuitive to an Eclipse user and that require no re-learning. Please keep in mind that I'm a rank newbie when it comes to NetBeans, but I'm a pretty solid developer. In other words, if I have trouble with these things, most new users will too. There will be more to come, and I'm willing to discuss these with any of the core NetBeans folks at length.

The list, in the order in which I encountered them this evening:

  1. Why isn't antialiasing turned on by default, and why is it a bit hard to find? I don't think of this as an editor setting...I want antialiasing on everywhere. To be honest, I don't understand why this is even a setting. Why would I choose the "make everything uglier" option?
  2. SVN should be available in the default install. Perhaps I'm biased because I need it, but I really, really loathe having to install additional plugins whenever I set up a new workspace.
  3. The concept of multiple workspaces with their own plugins and settings is VERY attractive. NetBeans seems to lack this concept.
  4. I have always preferred to keep the concepts of repository management and project sources separate, rather than making a hard link in my IDE between the two. NetBeans appears to require me to go through a lot of anguish each time I want to switch a project from trunk to branch or back.
  5. Also, there has developed in the Subversion world a standard of using top-level "trunk", "branches", "tags" dirs, and at least one SVN plugin for Eclipse uses that convention to make switching branches, merging, and updating more intuitive. NetBeans should do the same, allowing the typical branch/tag operations against this de-facto standard layout.
  6. I tried to delete a CVS-based project to re-checkout via SVN, and it didn't appear to delete all the files successfully. I had to do it manually AND go back through the Subversion wizard again.
  7. I hate wizards. Don't make me go through a wizard for every one or two settings when I could easily enter them all at once. Wizards like this are totally unhelpful.
  8. Some time back, Eclipse made the wise decision to also show all non-source files in the source or "package" view. It was widely considered a very good idea, and I miss it. I'd like to have a single view that has source or "Project" smarts without masking files and dirs I really want to see.
  9. Eclipse recently added a full-text search to their settings dialog that works very well for finding the location of hard-to-reach settings. I would have found two NetBeans settings much more quickly with such a search by entering "antialias" and "browser".
  10. Browser selection should prefer the settings of the host platform (I'm on Ubuntu Linux).
  11. NetBeans is a goofy name for an IDE. When I think NetBeans I think "beans for doing network stuff". Forte and Eclipse are better names, though completely undescriptive. Something along the lines of "UberIDE" would be even better. Names like Visual Studio, JBuilder, IDEA, and so on have far less ambiguity about what they mean. NetBeans seems misleading, and I doubt anyone would guess it's an IDE if asked.
  12. Eclipse keybindings should include C-A-t (open a specific class) and C-A-r (open a specific file/resource anywhere in the project)
  13. Eclipse provides options for sorting the Outline or "Members View" of a class by name or by visibility, which is nice.
  14. Eclipse provides options to show packages in a hierarchy rather than as expanded names, which is very nice when expanded package names take up a lot of space (as in JRuby). There are also features to allow using a hierarchy but still collapsing empty packages, so I could have org.jruby as a top-level node (because the org package is empty) and then source files and subpackages under it.
  15. Editors are anti-aliased, but nothing else is. Why?
  16. Eclipse supports collapsing multiline /* */ comments, which is very nice for us since every JRuby file includes a reasonably long licensing block at the top.
  17. Eclipse makes better use of color, bolding, and italics to differentiate kinds of variables, methods, keywords, and type names. I would rather see type names, static variables and methods, constants, and fields offset in text than method calls and names. It appears that NetBeans by default only offsets the following:
    • Comments: grey (I think green is better/easier to read, but that's a matter of opinion)
    • Method names, both declarations and calls: bold black
    • Keywords: blue
    Perhaps accessibility plays a role in the more drab and colorblind-friendly defaults?
  18. Idle/startup memory use seems a lot higher in NetBeans. Mine's idling at 399MB, where Eclipse hovered under 200MB most of the time with several other projects checked out.
  19. FIXME-style task tags in comments are really, really nice, and I know a LOT of projects that use them (basically, EVERY project that uses Eclipse)
  20. I would like to be able to manually clear search results from previous searches to eliminate text highlighting.
  21. Being able to right click within the file and have SCM actions available in that menu is extremely nice, so I can go to an editor for a file I know I want to commit or compare and do that action right away.
  22. Many files in my project do not have a Subversion submenu in any view...they have a CVS view, and it wants to try to add or delete them from CVS. This isn't even a CVS project, so I have no idea why the CVS submenu would EVER be appropriate to show up. As it is, a large number of files show the CVS submenu and can't be committed directly; I have to commit at a higher level. This is a major goofy bug, perhaps in the Subversion module.
  23. NetBeans takes longer than Eclipse to start up.
The Good

It's only fair to sugar the salt a bit, since it's so far been a pretty good migration. NetBeans has come a long way in the past year or two, and I'm very impressed.
  1. I'm absolutely stunned at how responsive the UI is most of the time. I frequently have to double-check that an operation has finished successfully because they happen so fast. Sluggish response was one of my biggest reasons for not using NetBeans in the early days.
  2. I appreciate the fact that NB uses the ant script for builds. I think this is "The Right Way", though I have my doubts about Ant as the "Right Tool" for building software in general. Seamless maven2 integration might be a good thing to add.
  3. The Runtime pane is nicer and includes more useful things than stock Eclipse provides.
  4. The default monospaced editor font is much neater and more compact than most Courier fonts, which is what Eclipse uses by default. I dislike serifed fonts for code, even more than I dislike non-monospaced fonts.
  5. Automatically updating the UI based on outside changes to the project's files and dirs is a no-brainer; I hate having to refresh in Eclipse.
  6. Undoing all changes to an unsaved modified file causes it to be marked as unmodified again; this is a Very Good Thing.

Friday, August 18, 2006

To Multibyte, Or Not To Multibyte

We've been wrestling with parser speed this past week on the JRuby project, tweaking the lexer, fiddling with the grammar and parser generator, and micro-optimizing all the various support classes. None of those experiments have helped much; performance in each case improved by only a few percentage points.

We've also been wrestling with the issue of Unicode support, since Java supports it well and Ruby does not. We're caught between worlds here, not wanting to create an incompatible Ruby but realizing the absurdity of our lacking Unicode support under Java.

It seems that solutions for the two issues may be mutually exclusive.

Character Pain

After a recent speed comparison between Java, Ruby, C, and a few other languages erupted on the ruby-talk mailing list, it became quickly apparent how expensive writing UTF-16 character sequences to a single-byte encoding can be. The best optimization of the program on that thread pre-encoded and cached all strings to be written (none dynamically generated) as byte[], saving the cost of encoding them later during a stream write. Because of that version's success, I started to wonder what might be the cost of reading and writing char versus byte in Java. The results surprised me.

When I first suggested this comparison on the JRuby dev list, Ola Bini quickly tested out a version of the lexer that used only streams and byte[], rather than readers and Strings. With no other optimizations, that change improved the overall parse performance by almost 20% (Java 5 on Windows x86). Shocking, to say the least.

Given that surprising speed boost, I thought I'd run a few microbenchmarks on reading and writing bytes and characters.

The Test

The source files come in two flavors:

  • yes.txt, an ISO-8859-1-encoded file filled with "y" characters
  • yes2.txt, the same file encoded in UTF-16.
Eight scenarios were tested:
  • reading bytes straight out of the file
  • reading characters straight out of the file
  • buffered reading of bytes straight out of the file
  • buffered reading of characters straight out of the file
  • reading bytes from a byte array
  • reading characters from a byte array
  • writing bytes to a byte array
  • writing characters to a byte array
I didn't play with various types of streams much; I mainly just ran with a few basic ones I'm familiar with. If there's an optimal way to perform each of these scenarios, please let me know.

Each cycle was run 1000 times, reconstructing streams and readers each time. Each cycle read the equivalent of 10 million characters, which in the case of yes2.txt meant reading 20 million bytes.
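
The size relationship between the two files can be checked with a small sketch. This is plain Ruby (1.9 or later) rather than the Java used for the actual test, the character count is scaled down, and the choice of UTF-16BE (which writes no byte-order mark) is an assumption about how yes2.txt was encoded:

```ruby
chars = 10_000          # scaled down from the 10 million used in the test
text  = "y" * chars

single_byte = text.encode("ISO-8859-1")   # one byte per character
double_byte = text.encode("UTF-16BE")     # two bytes per character, no BOM

puts "ISO-8859-1: #{single_byte.bytesize} bytes"
puts "UTF-16:     #{double_byte.bytesize} bytes"
```

Same character count, twice the bytes, which is why every byte-oriented scenario has twice as much work to do against yes2.txt.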

All tests were run on an Opteron 150, 2.6GHz, running 64-bit Linux and 64-bit Java 5.

Results: yes.txt, ISO-8859-1, 10 million characters (10 million bytes)

I first ran against the single-byte version:

1000 direct byte reads from file: 22435
1000 direct char reads from file: 112372
1000 buffered byte reads from file: 22625
1000 buffered char reads from file: 112594
1000 buffered byte reads from array: 9477
1000 buffered char reads from byte array: 107975
1000 buffered byte writes to array: 8556
1000 buffered char writes to byte array: 16198

Ouch. In both buffered and unbuffered direct reads from a file, characters fare rather poorly, taking roughly five times as long. Note that buffering here didn't really help, since filesystem IO is apparently not a limiting factor on this machine.

Notice also how little character reads improved from an in-memory byte array. In this case, I had the code read from an InputStreamReader wrapped around a ByteArrayInputStream. It's certainly possible this number would improve if simply passing the byte[] directly to a String constructor, but the current code seems far slower than I expected.

Not terribly surprising is how much better character writes to a byte array performed. Down-encoding from UTF-16 to a single-byte encoding--especially when we're dealing with all ASCII characters--is pretty cheap. Still, it took twice as long.
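
For reference, the char-to-byte ratios behind those remarks can be worked out directly from the listing. A quick sketch in Ruby; the scenario labels are shortened for illustration and the timings are copied as reported:

```ruby
# Timings copied from the yes.txt results above (units as reported by the test).
RESULTS = {
  "direct reads from file"   => { byte: 22_435, char: 112_372 },
  "buffered reads from file" => { byte: 22_625, char: 112_594 },
  "reads from byte array"    => { byte: 9_477,  char: 107_975 },
  "writes to byte array"     => { byte: 8_556,  char: 16_198 }
}

# How many times longer the char variant took than the byte variant.
RATIOS = RESULTS.transform_values { |t| (t[:char].to_f / t[:byte]).round(2) }

RATIOS.each { |scenario, ratio| puts "#{scenario}: chars took #{ratio}x as long" }
```

That comes to about 5x for both file cases, more than 11x for reads from the in-memory array, and a little under 2x for writes.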

Results: yes2.txt, UTF-16, 10 million characters (20 million bytes)

1000 direct byte reads from file: 44717
1000 direct char reads from file: 126998
1000 buffered byte reads from file: 44595
1000 buffered char reads from file: 126401
1000 buffered byte reads from array: 17423
1000 buffered char reads from byte array: 122082
1000 buffered byte writes to array: 17893
1000 buffered char writes to byte array: 57915

Here the character reads fare better, but not by much. While the byte reads took twice as long (duh, we're reading twice as many bytes) the character reads have increased by only about 13%. Since the work done for character reads should be a superset of the work done for byte reads, this shows that it's obviously faster reading from UTF-16 into UTF-16. Unfortunately any speed gains are wiped out when we have to read twice as much data.

The write numbers are confusing, and could indicate an error in my test. Where the byte writes took about twice as long, the character writes took more than three and a half times as long. Either I'm doing something wrong or someone else is. If anything, I would have expected the performance of character writes to decrease no more than the performance of byte writes, since no down-encoding was now necessary. And if I had wired the test wrong and down-encoding is actually occurring, the numbers should have matched the single-byte file.
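
To see how each scenario changed between the two files, the timings from both listings can be divided pairwise. Again a sketch in Ruby using only the numbers reported above:

```ruby
# [yes.txt timing, yes2.txt timing], copied from the two result listings.
TIMINGS = {
  "direct byte reads from file"   => [22_435, 44_717],
  "direct char reads from file"   => [112_372, 126_998],
  "buffered byte reads from file" => [22_625, 44_595],
  "buffered char reads from file" => [112_594, 126_401],
  "byte reads from array"         => [9_477, 17_423],
  "char reads from byte array"    => [107_975, 122_082],
  "byte writes to array"          => [8_556, 17_893],
  "char writes to byte array"     => [16_198, 57_915]
}

# Growth factor going from the single-byte file to the UTF-16 file.
GROWTH = TIMINGS.transform_values { |(single, utf16)| (utf16.to_f / single).round(2) }

GROWTH.each { |scenario, factor| puts "#{scenario}: #{factor}x" }
```

The byte scenarios land close to 2x, the char reads at roughly 1.12x to 1.13x, and the char writes at about 3.6x, which is the outlier discussed above.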

How This Affects JRuby

MRI (Matz's Ruby Interpreter) currently has poor support for Unicode, mostly cobbled together from various community projects. Ruby 2.0 promises support for every string encoding possible, including Unicode encodings and many others, but we're unlikely to see it for well over a year. Because JRuby runs on Java, our toeing the line and also avoiding Unicode support simply doesn't make sense. As much as we'd like to avoid diverging from MRI, many Javaists simply can't use JRuby effectively without Unicode.

A number of different schemes have been discussed for supporting Unicode in JRuby. Some are based on the Ruby 2.0 plans, or as far as we can take them without causing incompatibility with Ruby 1.8, its libraries, or applications written for it. Some leverage the fact that our Ruby String implementation is using Java's UTF-16 String, simply allowing incoming files to be in any encoding and allowing the parser to work with full UTF-16 characters rather than with our present 0xFF-masked byte-in-a-char. Still others propose we support multibyte encodings, but only in literal strings...which matches Ruby 2.0 plans to only allow single-byte-encoded identifiers in code, but any encoding for embedded literal strings.
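
The difference between the "byte-in-a-char" representation and real decoding is easiest to see on a string with one non-ASCII character. The following is only an illustration in plain Ruby (1.9 or later, which has per-string encodings); it is not JRuby's actual String code, and the sample word is made up:

```ruby
# "café" as UTF-8 bytes: the final character occupies two bytes.
source_bytes = "caf\u00E9".encode("UTF-8").bytes   # [99, 97, 102, 195, 169]

# Byte-in-a-char: every byte becomes its own character (code points 0..255).
masked = source_bytes.map { |b| b & 0xFF }.pack("U*")

# Full decoding: bytes 195 and 169 collapse into the single character U+00E9.
decoded = source_bytes.pack("C*").force_encoding("UTF-8")

puts "masked:  #{masked.length} characters"
puts "decoded: #{decoded.length} characters"
```

The masked form keeps the byte count (five "characters"), so byte-oriented Ruby code keeps working, while the decoded form has the four characters a Java developer would expect.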

The simplest to support, obviously, is to just allow Java to handle decoding the incoming stream, possibly allowing a pragma line (Ruby 2.0-style) to specify a specific encoding. While reading, we handle the pragma and set the remainder of the file to read with the given encoding into full UTF-16 Strings. This achieves the primary goal of Unicode string literals, but has the side effect of allowing Unicode identifiers, something which is so far not supported for Ruby 2.0.

The Ruby 2.0-ish way to handle encodings would be to read the file in as a single-byte encoding first, only using specialized encodings when encountering string literals. Say what you want about that method; I won't comment on its quality, but I will say it would be considerably more difficult for us to implement, and I'm not sure how you would embed non-ASCII-compatible string literals into an ASCII-compatible script file.

I am leaning toward the full Unicode support, where incoming files can be any encoding Java supports and all text can use the full complement of UTF-16-compatible Unicode characters. The compatibility with existing Ruby code is apparent: almost everything out there right now is in an ASCII-compatible format, which we'd be able to support without any work at all. However JRuby scripts that use Unicode characters would almost certainly be incompatible with MRI if any of those characters require multiple bytes; it would be impossible, for example, to take a UTF-16 encoded JRuby file and run it under MRI without modification.

So What?

There are two conflicting goals here: performance and Unicode support.

On the performance front, we would like to always read, parse, and store simple bytes, rather than paying the thunk cost for every character. Perhaps more serious and drastic, we'd like to use a byte[]-based UTF-8 String implementation internally, since Ruby uses String as a general-purpose byte-buffer (for which we currently pay the thunk cost on every read or write operation). The cost of using all characters internally, when everything else comes in the form of bytes, is apparent from the benchmark numbers.

On the Unicode front, we'd like seamless, Java-style Unicode support without quirks or gotchas. We'd like to continue using Java's String internally, and do all our parsing through readers. We would have to suck up the (sometimes large) thunk cost, but we'd have arguably the best Unicode support of any Ruby implementation currently available. We would unfortunately also then support writing scripts that are incompatible with MRI.

What To Do?

All these numbers and all these ideas boil down to a few key questions:
  • Is the ability to create incompatible scripts for JRuby a showstopper? Is it enough to warn people that we support Unicode more fully than MRI, but that support comes at a price? Is full Unicode support more important than backward-compatibility for JRuby scripts under Ruby 1.8 (or even forward-compatibility for JRuby scripts under Ruby 2.0 as currently specified)?
  • Is there anything that can be done about the dismal performance of byte-to-char thunking? It worries me for parsing, but worries me even more for our String implementation, which uses the char-based StringBuffer internally as a byte buffer for all Ruby's IO operations. Are parse and Ruby IO performance more important than full Unicode support? Should we hobble JRuby for (perhaps large) performance gains?
I'm anxious to solve both issues, but we may end up having to choose one or the other. However, if we could resolve the character-thunking performance issue, the answer would be clear.

Update: The source code for the test, as it was run, is available here.

Thursday, August 17, 2006

Ola Bini: JRuby Goes Camping

Ola Bini on Java, Lisp, Ruby and AI

As part of his series of JRuby "howto" articles, Ola has put together an outstanding walkthrough for getting Camping running under JRuby. It has all the trimmings, including ActiveRecord over JDBC. It took surprisingly little work for us to support in JRuby. I made a few tricky interpreter fixes, and Ola solved some other good bugs, but ActiveRecord has been working since June. Ultimately it seems that a number of recent fixes made on my JRuby branch solved the last few problems, and we can now say JRuby supports Camping.

Thanks go to Ola for his ongoing contributions and again to Evan Buswell for his WEBrick-enabling NIO work in the past.

This is another very compelling application and use case for JRuby. Things are getting very exciting.

Tuesday, August 15, 2006

InfoQ: The Resurgence of Java the Platform

A prescient post from Scott Delap, InfoQ's newest Java editor. As you can probably guess, I also believe Java the platform is entering a renaissance with Sun's recent promise for the platform to be "multilingual" and projects like JRuby finally coming into their own. Java the platform--the sleeping giant beneath Java the language--is awakening...and it will speak in dynamic tongues.

Friday, August 11, 2006

.NET and J2EE to get better dynamic language support

Digg: Microsoft and Sun Microsystems have observed growing interest in dynamic programming, and plan to integrate more extensive support for dynamic language features in their respective managed language platforms.

It's interesting to see this kind of article make it to the Digg front page. The links to the eWeek articles on Microsoft and Sun's efforts are also very interesting to read. The Sun article goes into more depth about what changes might be made to the JVM.

Nibbling Away at Performance

JRuby's performance has never been stellar. Even before the current performance-hindering refactoring and "correctification" work began, it was almost an order of magnitude slower than MRI ("Matz's Ruby Interpreter"). When I started working on my parts of the JRuby internal redesign, I knew things were going to get worse before they got better...but I think they're finally starting to get better.

I ran some quick numbers comparing performance of JRuby 0.9.0 versus current trunk:

Under 090, gem install rake-0.7.1.gem:
real 1m39.088s
user 1m37.666s
sys 0m1.128s

Under trunk:
real 1m16.388s
user 1m15.233s
sys 0m0.924s

That equates to about a 23% reduction in run time. Considering that we've only been nibbling at performance and that our large-scale performance-related refactoring has just begun, things are looking a lot better than they were six months ago.
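
The arithmetic behind that figure, using the two "real" timings above. The helper name is made up for the sketch:

```ruby
# Convert a "1m39.088s"-style timing into seconds.
def seconds(minutes, secs)
  minutes * 60 + secs
end

before = seconds(1, 39.088)   # JRuby 0.9.0
after  = seconds(1, 16.388)   # trunk

time_saved_pct      = ((before - after) / before * 100).round(1)
throughput_gain_pct = ((before / after - 1) * 100).round(1)

puts "run time reduced by #{time_saved_pct}%"
puts "work per second up by #{throughput_gain_pct}%"
```

Measured as time saved, the gain is just under 23%; measured as throughput, the same run gets through nearly 30% more work per second.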

The current goal is to get interpreted-mode JRuby as close as possible to MRI performance before we commit to a bytecode compiler. Because the eventual compiler will have to appropriately hook into JRuby's runtime, this only makes sense: if we go full-bore on a compiler now we may see great improvement in performance, but we'll have a much harder time evolving the runtime. By making the interpreter runtime as well-designed and as fast as possible now, we run less of a risk that compilation later on will tie us to a poor runtime design. I believe too many language projects fall into the trap of immediately diving into compilation without first considering how a language should best be represented on the target machine. When we do the hard work of improving the interpreter first, we learn the nuances of the language and gain a better understanding of how that compiler should eventually look. It may even be the case that we find a more direct mapping from the language to the platform that allows us to minimize or eliminate the runtime entirely for compiled code. We'd never reach that conclusion if we prematurely optimized by banking on a compiler too early.

At any rate, things are looking good for JRuby performance, both for small-scale optimizations and large-scale refactorings. The compiler will just make good...better.

Tuesday, August 08, 2006

Interfaces Should Be Modules

Currently, in order to implement a Java interface in JRuby, you extend from it like so:

require 'java'

include_class "java.awt.event.ActionListener"

class MyActionListener < ActionListener
  def actionPerformed(event)
    puts event
  end
end

While documenting a JRUBY-66 workaround and thinking about a longterm fix, it hit me like a diamond bullet through my forehead: interfaces should be treated like modules.

My justification:

  1. You can include many modules, but only extend one class...just like interfaces. Currently in JRuby you can only implement one interface, which is stupid.
  2. Modules imply a particular set of behaviors not specific to a given class hierarchy...just like interfaces.
  3. Ruby implementations of Java interfaces can't extend any other classes; you can't both extend Array and implement Collection, if that were your goal.
  4. Ruby implementations of Java interfaces have bugs when defining initialize, since they don't really just implement that interface...they extend one of our JavaSupport proxies.
Item #1 will be of particular importance as we start using JRuby more and more to implement Java services. In my opinion, this is an unacceptable limitation on JRuby's Java integration capabilities.

Item #3 limits your ability to re-open core Ruby classes and add new Java interfaces to them, something that might greatly simplify mapping Ruby types to Java-land.

Item #4 is the cause of JRUBY-66, since we need to make sure the proxy's initializer is called.

In our defence, we inherited much of this Java integration behavior from the original project owners; however I think mapping interfaces to modules allows for much more powerful and uniform Java integration support.

I know it would be a fairly significant change to make Java interfaces act like modules, but it seems much more logical to me. It's also primarily a new feature we could phase in, with the < syntax continuing to work for old style interface implementation.

Thoughts?
# yes, I know encapsulation would be better...this is just an example
...
include_class "javax.swing.JButton"

class MyActionRecorder < Array
  include ActionListener

  def actionPerformed(event)
    self << event
  end
end
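
The same shape can be tried in plain Ruby, without any Java integration, to see what the module approach buys. In this sketch the two modules are hypothetical stand-ins for Java interfaces (they are ordinary Ruby modules, not JRuby proxies), and the second "interface" is added only to show that more than one can be mixed in:

```ruby
# Stand-ins for Java interfaces: modules whose methods must be overridden.
module ActionListener
  def actionPerformed(event)
    raise NotImplementedError, "#{self.class} must implement actionPerformed"
  end
end

module Runnable
  def run
    raise NotImplementedError, "#{self.class} must implement run"
  end
end

# One superclass, any number of "interfaces".
class MyActionRecorder < Array
  include ActionListener
  include Runnable

  def actionPerformed(event)
    self << event
  end

  def run
    each { |event| puts event }
  end
end

recorder = MyActionRecorder.new
recorder.actionPerformed("click")
recorder.run
```

The recorder is an Array, an ActionListener, and a Runnable all at once, which covers items #1 and #3 from the list above.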

Monday, August 07, 2006

Distributed Ruby (DRb) "Working Well"

A new member of the JRuby community, Blane Dabney, submitted a patch for JRuby socket IO to resolve a DRb issue he'd been having. Our original implementation of a "write" method was not properly handling line terminators, and would end up blocking on write calls with nothing coming out the other end. After some investigation by us both, Blane managed to put together a simple, working patch that solves the issue.

According to him, DRb from a Ruby client to a JRuby backend now "seems to be working well." I'm letting the patch stew for a bit, but it will likely be committed to trunk in the next couple days.

The ability to use DRb from Ruby to JRuby opens up a whole new world of integration with Java services. I guess it's time one of us got busy on a DRb-to-EJB gateway, don't you think?

Conference Updates

RubyConf*MI

I am registered for RubyConf*MI, though it's still uncertain if I'll attend. The registration cost is a measly $20, but it sounds like it will be a good time. Grand Rapids is about a 9-hour drive, however, so I'm looking for someone to share transportation with from Minneapolis. I probably won't go if it's just me alone.

MinneDemo

I'll be doing a quick (<15 mins) JRuby demo at Minnesota's first DemoCamp. I have no idea what I'll demo yet, but perhaps a more elaborate IRB-based Swing demo like the one I did at JavaOne.

RailsConf Europe

Various events that are in motion and which I won't elaborate on may lead to me attending and presenting at RailsConf Europe. Hopefully those events pan out (and hopefully there's enough time between now and the conf to get Rails working suitably well).

RubyConf

I may not be presenting, but I shall attend! I managed to secure one of the coveted registrations before they sold out two hours later. Regardless of corporate reimbursement, I'm going to make the trek to Denver. I hope to do an unofficial or "lightning" session on JRuby as well, since I know there are many attendees interested in hearing about it.

Sunday Night Niblets

Camping

After a minor fix provided by Ola Bini (thanks Ola!) we now have Camping running under JRuby, using the ActiveRecord JDBC adapter. According to Ola, Camping under JRuby seems to run very well, and feels very snappy. I'm going to be playing with it a bit soon, and may have a demo site up by tomorrow.

JRuby Extras

The JRuby Extras project is officially launched on RubyForge. The ActiveRecord JDBC adapter is there and has received modifications to work with Oracle as well. The work thus far on Mongrel is also there, and it appears that we may have Mongrel working under JRuby shortly. If you have a particular Ruby app that needs some JRuby-specific modifications or extensions, please let me know; this is a community-driven project to make Ruby apps spectacular under JRuby.

Thursday, August 03, 2006

Calling all Tor Norbyes

Ok, perhaps the blogosphere can help with this one. I've been trying to respond to an email from Tor Norbye at Sun Microsystems since Sunday. Unfortunately, my emails seem to be shuffled off into the ether. He sent another email to me today, saying he hasn't heard back from me.

Tor! I'm right here! Give me an alternate way to contact you...your Sun address seems to be kaput.

Wednesday, August 02, 2006

JRuby: It's pretty much my favorite animal.

Some of the local Rubyists and I were talking about publishers for a future JRuby book...ligers were brought up...and, well, here's the result.



It's just perfect...the lion of Java mated with the tiger of Ruby...and the magical JRuby is their offspring. It brings a tear to your eye, doesn't it?

Feel free to digg it.

Update: A couple folks have noted that O'Reilly's Java animal is already a tiger, so I guess I've got the roles reversed!

Busy Bees

I haven't posted anything substantive in a while, but things are moving rapidly forward. Here's a quick summary of what's been going on with JRuby:

RubySpec

I have launched an effort to build up a Ruby specification, Wiki-style. At The RubySpec Wiki, contributors can write short pages/articles on any aspect of Ruby: the language, the libraries, or the implementations. The eventual goal of this is to create a comprehensive library of content describing in detail how every aspect of Ruby is *supposed* to work. This in turn will help alternative implementations like JRuby, Ruby.NET, Cardinal, and others ensure they are functioning correctly.

CONTRIBUTORS ARE NEEDED! Please create an account and add whatever you can. Found out about a new feature, quirk, or bug in Ruby? Add it! Feel like porting over some core docs in a more spec-like format (i.e. including edge cases and formal semantics)? Go for it! The Spec will only succeed with user contributions. I may sponsor contests to see who can contribute the most...so keep an eye out!

The New RubyTests

The RubyTests project on RubyForge mainly houses the Rubicon suite, a collection of tests originally created for the first PickAxe book and based on Ruby 1.6. Over the past several years, it's been slowly, slowly updated for 1.8, but the library is showing its age. To complicate matters, other test libraries have sprung up to remedy some of Rubicon's deficiencies: BFTS, from the MetaRuby guys, and now a RubyTests project from the Ruby.NET team out of QUT. In addition, contributors to the RubySpec have called for a place to keep tests that go along with the specification. Something had to be done.

This past week, I sent out a proposal to all the RubyTests project members and the MetaRuby guys about finally unifying all our efforts under one grand test suite. The response so far has been excellent...Ryan Davis of MetaRuby told me he agrees with my plan, and others on the RubyTests project also agree this is the way to go. The wheels are in motion!

I will act as steward for the new RubyTests project, but only to foster community collaboration. We'll initially consider pulling all the myriad projects under the RubyTests umbrella, and then start discussing issues like what testing framework to use, how or whether to generate tests, and how to provide traceability back to items in the nascent RubySpec. I encourage anyone interested in seeing Ruby improve and flourish on all platforms to join the project and contribute.

Block Refactoring Work

It recently became apparent that the current block-management code in JRuby (modeled almost exactly on C Ruby) is rather inefficient; doubly so in JRuby because we don't have C tricks like unions and longjmps. Tom also discovered after some research that much of the block-scoping semantics can be pulled out during the parsing process and stored, saving many searches later on. To these ends, we have both been working on refactoring JRuby internals to improve how blocks function.

I have been working to modify the call chain to pass blocks along as part of the frame. This simplifies a great many things, since the correct block to which to yield is now just a field-access away (and eventually, just an argument-access away). Previously, multiple stack pushes and pops were necessary to get the correct frame, causing great undue overhead. Also, I have rewired how Proc instances are invoked, so instead of two pushes and two pops on our internal "block stack", it now just calls the proc directly. Much cleaner. The eventual goal of this is to eliminate the "block stack" and also the "iter stack", which maintains a stack of flags indicating whether a block is available or currently executing.

Tom's work will make static much of the information about how blocks are scoped, since their relative orientation in the original source provides almost all the information we need. This will allow us to automatically or more quickly locate the appropriate variable when accessing such from within a block, as well as ensuring our variable scoping is handled correctly with multiple nested blocks. He is also keeping in mind that evals can change the list of variables, so the end result should work fine in those cases as well. The end result is that variable scoping will be much more reliable and performant when blocks are involved.
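
As a rough illustration of what "making the scoping static" could mean: if each block's variable names are known at parse time, a name can be resolved once into a (depth, index) pair instead of being searched for on every access. The class and method names below are hypothetical and this is not JRuby's actual code:

```ruby
# Hypothetical parse-time scope: a list of names plus a link to the enclosing scope.
class StaticScope
  attr_reader :parent, :names

  def initialize(parent = nil)
    @parent = parent
    @names = []
  end

  # Declare a variable in this scope and return its slot index.
  def declare(name)
    @names.index(name) || (@names << name).size - 1
  end

  # Walk outward once and return [depth, index], or nil if the name is unknown.
  def resolve(name, depth = 0)
    index = @names.index(name)
    return [depth, index] if index
    parent && parent.resolve(name, depth + 1)
  end
end

method_scope = StaticScope.new
method_scope.declare(:total)
outer_block = StaticScope.new(method_scope)
outer_block.declare(:row)
inner_block = StaticScope.new(outer_block)
inner_block.declare(:cell)

p inner_block.resolve(:cell)    # found in the innermost scope
p inner_block.resolve(:total)   # found two scopes out
```

At run time the pair can be used to index straight into the right variable array. An eval that introduces new names would still need a slower fallback path, which is the case mentioned above.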

RubyInline for JRuby

After doing a bit of exploration on how Ruby extensions are written, I stumbled across yet another post from Ryan Davis about the beauty and simplicity of RubyInline. I am not a huge C fan, having had my fill of it during my old LiteStep days (I was lead LiteStep dev during the "great redesign" period), but the attraction of RubyInline is undeniable.

Ryan and I had a brief discussion over IM, during which we agreed that adding JRuby/Java support to RubyInline would be a really great idea. Then instead of just specifying C code in your RubyInline blocks, you could easily do the following:

class Example
  inline(:C) do |builder|
    builder.c "int test1() {
      int x = 10;
      return x;
    }"
  end
  inline(:java) do |builder|
    builder.java "public int test1() {
      int x = 10;
      return x;
    }"
  end
end

...and know that whether under JRuby or C Ruby, your inlined code would shine through. Look for this effort to pick up soon; Ryan has agreed to include it in RubyInline once it's ready.

Mongrel for JRuby

Danny Lagrouw and Ola Bini, perennial JRuby community superstars, have been working on implementing the native bits of Mongrel in Java. Danny put together a YACC-based HTTP request parser (since we don't have a Java Ragel yet) and today Ola implemented a quick ternary search tree in Java. With these two pieces working, we just have to wire up Mongrel and try it out in JRuby. It's very close.
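
For anyone unfamiliar with the data structure, here is a minimal ternary search tree sketched in Ruby. It is not Ola's Java implementation and only supports exact-match lookups; the route strings and handler symbols in the usage lines are invented for the example:

```ruby
# Minimal ternary search tree: each node holds one character and three links.
class TernarySearchTree
  Node = Struct.new(:char, :low, :equal, :high, :value)

  def initialize
    @root = nil
  end

  # Store value under key (a non-empty String).
  def []=(key, value)
    raise ArgumentError, "key must not be empty" if key.empty?
    @root = insert(@root, key, 0, value)
  end

  # Return the value stored under key, or nil if the key was never stored.
  def [](key)
    return nil if key.empty?
    node = @root
    i = 0
    while node
      c = key[i]
      if c < node.char
        node = node.low
      elsif c > node.char
        node = node.high
      elsif i == key.length - 1
        return node.value
      else
        node = node.equal
        i += 1
      end
    end
    nil
  end

  private

  def insert(node, key, i, value)
    c = key[i]
    node ||= Node.new(c)
    if c < node.char
      node.low = insert(node.low, key, i, value)
    elsif c > node.char
      node.high = insert(node.high, key, i, value)
    elsif i < key.length - 1
      node.equal = insert(node.equal, key, i + 1, value)
    else
      node.value = value
    end
    node
  end
end

routes = TernarySearchTree.new
routes["/"] = :root_handler
routes["/posts"] = :posts_handler
routes["/pages"] = :pages_handler

p routes["/posts"]
p routes["/po"]
```

Keys that share a prefix share nodes, so "/posts" and "/pages" diverge only at their third character, and a prefix that was never stored on its own (like "/po") simply finds no value.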

What's the value of Mongrel when we have servlet containers to host Rails apps? That question answers itself. Name one Rails developer who's enamored of servlet containers. Yeah, I didn't think so. WEBrick is a poor substitute for a real container, and almost all Rails deployments are going Mongrel now. Not supporting Mongrel would be a showstopper for many, many Rails projects. Therefore, we're making it happen.

JRuby Extras!

I have requested a new project on RubyForge called "JRuby Extras". This project is intended to be a JRuby community love-fest, hosting all the bits and pieces needed to support Ruby apps running under JRuby. It will hold such juicy tidbits as:

  • The upcoming Mongrel support libraries (at least until they're hopefully included in Mongrel proper)
  • Nick Sieger's excellent ActiveRecord JDBC adapter (until included in Rails)
  • Any other JRuby-related extensions that don't have good homes elsewhere
  • Any Java or JRuby-related updates to other projects (like RubyInline) until included directly into those projects
Where the main JRuby project has only Tom and me as gatekeepers, the jruby-extras project will be more community-oriented. If you've got a good idea for how JRuby can be improved (think like Groovy, with its ten-thousand add-ons), toss us an email...and get busy!

The project, once approved, will be jruby-extras on RubyForge.

Standardizing JRuby Extensions

There are a number of extensions to JRuby internally, to replace missing C functionality from C Ruby. There are also a number of extensions being developed externally, to support things like Mongrel. Unfortunately, there is no standard way to write JRuby extensions like there is for C Ruby. The APIs that we expose are subject to change, and the Java world brings along its own conventions and expectations for how plugins ought to work. In order to settle this question, I have kicked off a thread on the JRuby dev mailing list.

We're going to figure out the best way to support JRuby extensions, along these rough lines:
  • Requiring an extension will look for an extension library just as it does in Ruby; however, it will be looking for a jar file in the load or class paths containing an appropriately-named entry point.
  • require "my/extension" will most likely look for extension.jar under the my/ load or class path, and then load my.ExtensionLibrary contained therein
  • Since we'll want to use direct invocation now in JRuby, we'll want an easy way for extensions to have the same benefits. Rather than having them implement direct-callable interfaces, we'll likely build a code generator that can take a class and a list of method mappings and generate all stubs and callables needed for JRuby. This will also simplify our own code classes and extensions.
  • There are at least two ways within JRuby to define a new class, its metaclass, and their methods. One is easy but a bit broken; the other is correct but cumbersome. Extension writers will get something in the middle...easy but correct, via various helpers and factories. The same model will also be applied internally. Unification!
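
The naming convention in the second bullet could be expressed as a small mapping function. This is only a sketch of the rule as described: the helper name is hypothetical, the handling of underscores in names is a guess, and nothing here is settled JRuby behavior:

```ruby
# Hypothetical helper: map a require path to the jar and entry-point class
# that the proposed convention would look for.
def extension_target(feature)
  *packages, name = feature.split("/")
  # Assumption: snake_case names become CamelCase before "Library" is appended.
  class_name = name.split("_").map(&:capitalize).join + "Library"
  {
    jar: "#{feature}.jar",
    entry_point: (packages + [class_name]).join(".")
  }
end

target = extension_target("my/extension")
puts target[:jar]
puts target[:entry_point]
```

For "my/extension" this yields my/extension.jar and my.ExtensionLibrary, matching the example in the list.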
Making a Move

On a more personal note, there are events afoot that may give me more time to work on JRuby. I won't go into specifics...just let your imagination run wild.