March 24, 2005

charsets galore

after researching charsets for the [expletive deleted] time to help somebody on the forums, i decided it was time to create a tool to do away with some of that tedious labor. so, building on the API for java.nio.charset.Charset, i whipped up a small CFC to poke and prod the charsets available on a given server (or, to be more precise, the charsets supported by cf's JRE). you can see it here. it can be used to list the available charsets on a cf server, determine if a charset is supported, and find out if one charset contains another.
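for the curious, here's a rough java sketch of the three things the CFC does. it is not the CFC's actual code: the class and method names (CharsetProbe, availableNames and so on) are made up for illustration, and only the java.nio.charset.Charset calls underneath are the real API.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// hypothetical helper for illustration only, not the CFC from this post
final class CharsetProbe {

    // canonical names of every charset this JVM knows about, in sorted order
    static List<String> availableNames() {
        return new ArrayList<>(Charset.availableCharsets().keySet());
    }

    // true if this JVM supports the named charset
    // (throws IllegalCharsetNameException if the name itself is malformed)
    static boolean isSupported(String name) {
        return Charset.isSupported(name);
    }

    // true if the JVM reports that every character in "inner" is also in "outer"
    static boolean contains(String outer, String inner) {
        return Charset.forName(outer).contains(Charset.forName(inner));
    }
}
```

something like CharsetProbe.contains("UTF-8", "US-ASCII") should come back true, which is one more reason to Just Use Unicode.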

oh yeah, once again, in case you haven't been paying attention: Just Use Unicode. it will save you a lot of trouble over the long run.

on another note, this CFC (100+ lines) was also the first piece of code i wrote from start to finish with cfeclipse. while it wasn't an entirely unpleasant experience, i think it will take me quite a bit more "getting used to" before i give up cfstudio for good.

March 21, 2005

diversity as wallpaper

starting off with the idea of printing all of unicode's characters on a 36 inch by 36 inch poster, ian albert ends up with 6 foot by 12 foot wallpaper printed at Kinko's. imagine that, most of humanity's writing systems printed at Kinko's for 20 bucks. i wonder what the clerk made of it?

March 08, 2005

cultural bias, leaping leap years batman!

pretty much everybody knows what a leap year is and when one occurs. and in case you don't, coldfusion has a function isLeapYear() that will tell you if a given year is a leap year in the gregorian calendar. in fact most calendars have the concept of a leap "something". the chinese and hebrew calendars have a "leap month" but apparently no concept of a leap year (though the icu4j HebrewCalendar class API is full of references to leap years). the civil version of the islamic calendar has a "leap day" which is added to the last month of 11 out of every 30 years but again no leap year. the persian calendar does have the concept of a leap year, handled via the PersianCalendarHelper class isLeapYear method.

which brings us to the point of this blog entry: this method expects the year argument to be a persian calendar "year" (right now it's 1383 in the persian calendar). i didn't quite grasp that at first, as the other calendars with leap years (gregorian, buddhist and japanese) have an isLeapYear method that expects a gregorian year (yes, even the buddhist and japanese calendar classes expect a gregorian year; i imagine this is because these calendars extend the gregorian calendar class). and that's the way i expected the new persian calendar to behave (my own cultural bias--i use the buddhist and gregorian calendars on a daily basis). but it doesn't, and why the heck would it? it is a persian calendar after all. so that got me thinking about the other calendars, the way these "should" work, and what other cultural biases have leaked into our code and test harnesses--especially the tests.

first thing i did was to rewrite the i18nIsLeapYear functions across all the calendars to expect a year argument in that calendar's system (they convert to a gregorian year as needed and now automatically return false for calendars lacking the concept of a "leap year").
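to make the "year in that calendar's own system" idea concrete, here's a minimal java sketch. big assumption: it uses the java.time.chrono classes as a stand-in, not icu4j and not the i18nCalendars CFCs, and the LeapYearSketch class and its method names are invented for this example. the java.time chronologies happen to take the year in their own numbering, so buddhist 2547 (gregorian 2004) is the leap year, not "2004".

```java
import java.time.chrono.Chronology;
import java.time.chrono.IsoChronology;
import java.time.chrono.ThaiBuddhistChronology;

// hypothetical sketch: java.time stands in for the icu4j calendars here
final class LeapYearSketch {

    // the year argument is expressed in the calendar's OWN numbering
    static boolean isLeapYear(Chronology calendar, long yearInThatCalendar) {
        return calendar.isLeapYear(yearInThatCalendar);
    }

    // gregorian (ISO) year in, e.g. 2004
    static boolean gregorian(long year) {
        return isLeapYear(IsoChronology.INSTANCE, year);
    }

    // buddhist era year in, e.g. 2547 (which is gregorian 2004)
    static boolean buddhist(long year) {
        return isLeapYear(ThaiBuddhistChronology.INSTANCE, year);
    }
}
```

the point of the single isLeapYear signature is that the caller never has to know which calendars quietly want a gregorian year.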

then i went a-hunting for any other places where my cultural bias might have leaked thru... and promptly found one in the getYear function. the getYear function takes a gregorian year value and returns the year in that calendar's system. i was doing that by creating a date:

thisDate=createDate(arguments.thisYear,1,2);

(and just in case you were wondering, the 2 for the day value is to make sure the date value created fell into that year, given that we're using UTC as the time zone standard for all the calendars). and then setting the calendar object to that date and returning the value for that calendar object's YEAR field:

tCalendar.setTime(thisDate);
return tCalendar.get(tCalendar.YEAR);


this worked swell for the gregorian, buddhist and japanese calendars because these calendars' years start at the same time. but after looking at the year values of formatted dates from the other calendars i realized that the getYear function was returning horrible nonsense for the other 4 calendars. without realizing it, i'd let my calendar bias creep in and assumed the calendars were all the same as far as years were concerned. gregorian 2-jan actually falls into different calendar years depending on the calendar (of course, they're different freaking calendars). and the tests were only reporting whether the getYear function "worked" by checking if the year was a positive integer, with no eyeball comparisons against the year bits of the formatted date strings. there's a lesson here somewhere.
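here's a small java sketch of why the 2-jan trick falls over. same caveats as before: it leans on java.time.chrono rather than icu4j or the CFCs, the CalendarYearSketch class and its methods are made up for illustration, and java's HijrahChronology is the umm al-qura flavour of the islamic calendar, which may not match the islamic calendars in the i18nCalendars package day for day.

```java
import java.time.LocalDate;
import java.time.chrono.Chronology;
import java.time.temporal.ChronoField;

// hypothetical sketch: java.time stands in for the icu4j calendars here
final class CalendarYearSketch {

    // the calendar's own year number for a given gregorian date
    static int yearOn(Chronology calendar, LocalDate gregorianDate) {
        return calendar.date(gregorianDate).get(ChronoField.YEAR);
    }

    // does 2-jan land in the same calendar year as 30-dec of that gregorian year?
    // if not, "the" year for a gregorian year is ambiguous in that calendar
    static boolean sameYearAllYear(Chronology calendar, int gregorianYear) {
        return yearOn(calendar, LocalDate.of(gregorianYear, 1, 2))
            == yearOn(calendar, LocalDate.of(gregorianYear, 12, 30));
    }
}
```

for ThaiBuddhistChronology.INSTANCE the check passes, since its year rolls over with the gregorian one. for HijrahChronology.INSTANCE it fails, because the islamic new year lands somewhere inside the gregorian year. a test that compares years this way would have caught what the "is it a positive integer" test missed.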

so better grab the new code and maybe give those calendars a good poking to make sure no other cultural bias is left in them.

March 05, 2005

persianCalendar update

a few days ago Dr. Ghasem Kiani updated his persianCalendar class to be "more" icu4j like. i wrapped it up in a CFC and added it to the i18nCalendars package (which now contains 7, count 'em, 7 calendars). you can see it on its own in a simple testbed here. you can download the persian calendar class from Dr. Ghasem's sourceforge project.

note that this version of the persian calendar uses a "well-known arithmetic algorithm for calculating the leap years" rather than astronomical calculations.

i'd like to publicly thank Dr. Ghasem Kiani for his work on this project; we've been waiting quite a while for a persian calendar to round off our i18n calendars. thanks.