There had been a problem with multi-byte characters in RedCloth when it was compiled for JRuby, but someone contributed a fix (switching out bytelists for character arrays), which I pulled into a branch called jruby-mbc.
The problem now is extra whitespace. I want to get it fixed so I can release it!
It seems to happen mostly when there's HTML in the input. When it's a standalone HTML tag (just a block tag on a line by itself), it puts two BRs after the tag. When it's an HTML block (start tag, contents, end tag), it puts a BR inside the beginning of the next block. When just one newline ends the document, it puts a BR inside the end of the last block; two newlines before EOF behave fine though.
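If it helps, here is a rough sketch of a script for spotting those three symptoms in the generated HTML. The sample inputs and the stray_breaks helper are made up for illustration (they are not taken from the test suite), and the list of block tags is an assumption. The only RedCloth call used is RedCloth.new(text).to_html.

```ruby
# Hypothetical inputs, one per symptom described above (not from the test suite).
SAMPLES = [
  [:standalone_tag,          "<hr />\n\nA paragraph.\n\n"],
  [:html_block,              "<div>\ncontents\n</div>\n\nNext paragraph.\n\n"],
  [:single_trailing_newline, "Last paragraph.\n"]
].freeze

# Assumed set of block-level tags to inspect; extend as needed.
BLOCK = "(?:p|div|li|blockquote|h[1-6])"
BR    = "<br\\s*/?>"

# Each pattern flags a BR in a place a line break should not normally appear.
CHECKS = [
  [:doubled_br,        /#{BR}\s*#{BR}/],
  [:br_at_block_start, /<#{BLOCK}\b[^>]*>\s*#{BR}/],
  [:br_at_block_end,   /#{BR}\s*<\/#{BLOCK}>/]
].freeze

# Returns the names of the checks that match the given HTML string.
def stray_breaks(html)
  CHECKS.select { |_name, pattern| html =~ pattern }.map { |name, _pattern| name }
end

if __FILE__ == $0
  begin
    require 'rubygems'
    require 'redcloth'
    SAMPLES.each do |name, textile|
      html = RedCloth.new(textile).to_html
      puts "#{name}: #{stray_breaks(html).inspect}"
      puts html
    end
  rescue LoadError
    warn "RedCloth is not installed; skipping the live run"
  end
end
```

Running it under both MRI and JRuby and comparing the output should show which samples pick up the extra BRs. A BR in the middle of a paragraph is normal Textile output for a single newline, so the helper only flags BRs that are doubled or that sit right at the edge of a block.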
I'd greatly appreciate anyone who can help this Java dunce (me). Here's the fast way to get it checked out and set up:
git clone git@github.com:jgarber/redcloth.git
cd redcloth
git checkout jruby-mbc
rvm install jruby
rvm use jruby-1.5.3@redcloth --create
gem install bundler
Hopefully you can get it figured out quickly and make some fast cash. Marek told me he thought the html_esc function needed to be rewritten for multi-byte characters, but that's just a guess. To get paid, you need to deliver a patch that makes all existing tests pass in JRuby (except "RedCloth should have EXTENSION_LANGUAGE"), with no brittle workarounds like stripping characters after the fact. The 33 failing tests are attached.
Skills: JRuby, Textile