June 24, 2010

sending multipart SMS

let's see if i remember how this blogging thing works ;-)

even in this age of twitter, SMS is still a popular form of mobile communication and one which ColdFusion handles quite nicely via its SMS gateway. While it's shockingly easy to send and receive SMS via ColdFusion, one thing that i did struggle with recently was how to send a multipart or concatenated SMS using ColdFusion. hopefully this post will save someone else a few headaches.

what's a multipart SMS? well it's an SMS with message content that is longer than 160 chars which is sent as more than 1 SMS and reassembled on the receiving device using data in the SMS itself. say you had a message with 200 chars, this would be sent as 2 separate SMS and reassembled as one 200 char message on the receiving device, provided of course that your SMS vendor and the receiving device supported this (these days most devices do).

while many SMS vendors do support multipart SMS, they don't all follow the same technique. there are two basic multipart flavors, TLV (tag length value) and UDH (user data header).

ColdFusion easily supports the TLV flavor via the following optional fields:

  • sarMsgRefNum (split and reassemble message reference number): a reference number for a particular concatenated short message.
  • sarTotalSegments (split and reassemble total segments): indicates the total number of short messages within the concatenated short message.
  • sarSegmentSeqnum (split and reassemble segment sequence number): indicates the sequence number of a particular short message fragment within the concatenated short message.

here's an example of sending a multipart SMS with 2 parts using the TLV method.

// should be unique between different SMS but same within each batch
testMsg.sarMsgRefNum=javacast("string",int(randRange(0,127)));
// how many parts in total?
testMsg.sarTotalSegments=javacast("string",2);
for (i=1; i LTE 2; i=i+1) {
   // which part is this?
   testMsg.sarSegmentSeqnum=javacast("string","#i#");
   testMsg.shortMessage="But skeletons ain’t got nowhere to stick their money, nobody makes britches that size #i#.";
   writeoutput("returned ID #msgID# for SMS sent #now()#<br>");
}

note that there's a gotcha involved with this method: while the SMPP 3.4 specs detail these optional fields as integers, ColdFusion wants a string for any SMS sent after the first one (yes, the first one goes through just fine as an int). easy enough to fix, just cast the values as strings--and that's headache number one solved.
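
to make the bookkeeping a bit more concrete, here's a minimal sketch (in python rather than cfscript, and not tied to any SMS library) of the three sar values each part of a batch would carry, all as strings per the gotcha above. the helper name sar_fields is made up for illustration.

```python
import random


def sar_fields(total_segments: int, ref_num: int) -> list:
    """Return one dict of sar* values per part, every value cast to a string.

    sar_fields is a hypothetical helper, not part of ColdFusion or SMPP.
    """
    ref = str(ref_num)  # same reference number for every part of the batch
    return [
        {
            "sarMsgRefNum": ref,
            "sarTotalSegments": str(total_segments),
            "sarSegmentSeqnum": str(seq),  # 1-based position of this part
        }
        for seq in range(1, total_segments + 1)
    ]


# a 2-part batch with a random reference number, as in the example above
for fields in sar_fields(2, random.randint(0, 127)):
    print(fields)
```

only sarSegmentSeqnum changes from part to part; the other two values stay fixed for the whole batch.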

before i show you an example using the UDH method, i guess i'd better explain what exactly a UDH is. a UDH is a chunk of binary data (let's call it a smeg) prepended to each part of a multipart SMS that the receiving device uses to reassemble the separate SMS into one larger message. a typical UDH might consist of:

0x05 total length of header (the following 5 bytes to be exact)
0x00 information element identifier (IEI)
0x03 length of "header"
0x02 byte used as reference ID (00-FF), same for all parts of multipart, "2" in this case
0x02 total number of parts for this multipart SMS, say there are 2
0x01 this part's number in the sequence, say it's the 1st one
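
as a rough illustration of that byte layout (a python sketch, not the ColdFusion code, and build_concat_udh is just a name i made up), the six bytes can be assembled like so:

```python
def build_concat_udh(ref_id: int, total_parts: int, part_number: int) -> bytes:
    """Build the 6-byte UDH laid out in the list above.

    build_concat_udh is a hypothetical helper used only for illustration.
    """
    if not 0 <= ref_id <= 0xFF:
        raise ValueError("reference ID must fit in one byte (00-FF)")
    if not 1 <= part_number <= total_parts <= 0xFF:
        raise ValueError("part number must be between 1 and the total parts")
    return bytes([
        0x05,         # total length of header (the 5 bytes that follow)
        0x00,         # information element identifier (IEI)
        0x03,         # length of the "header" (the 3 bytes that follow)
        ref_id,       # same for all parts of the multipart SMS
        total_parts,  # total number of parts
        part_number,  # this part's number in the sequence
    ])


# the header from the list above: reference ID 2, part 1 of 2
print(build_concat_udh(ref_id=2, total_parts=2, part_number=1).hex())  # 050003020201
```

the message text for each part simply follows these six bytes.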

now let's see an example of this method:

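function splitMessage(msgText,msgLength) {
   var tmpStr="";
   var msgParts=arrayNew(1);
   var splitSize=160; // var to account for UDH
   var msgTxt=arguments.msgText;
   // assumed: anything over 160 chars gets split at the caller's msgLength
   if (len(msgTxt) GT 160)
      splitSize=arguments.msgLength;
   while (len(trim(msgTxt))) {
      // assumed loop body: peel off one part at a time
      tmpStr=left(msgTxt,splitSize);
      arrayAppend(msgParts,tmpStr);
      msgTxt=removeChars(msgTxt,1,len(tmpStr));
   }
   return msgParts;
}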
// main code
msgLength=121; // know this works, anything larger doesn't send
randomize(now().getTime()/1000);
msgText="So I'll meet you at the bottom if there really is one They always told me when you hit it you'll know it But I've been falling so long it's like gravity's gone and I'm just floating";
// cf7 has no byte datatype for javacast so...
b=createObject("java","java.lang.Byte");
writeoutput("multipart message ID: #id#<br>");
for (i=1; i LTE arrayLen(msgs); i=i+1) {
   testMsg.esmClass=64; // 64 for SMS with UDH
   bb=createObject("java","org.smpp.util.ByteBuffer");
   writeoutput("returned ID #msgID# for SMS sent #now()#: message part length:#len(msgs[i])#<br>");
}

there are a few things you should take note of:
  • the esmClass must be set to 64 (0x40), the UDH indicator bit, to flag that the message content starts with a UDH; that's how the receiving device knows this is a multipart SMS
  • i used the Logica SMPP java library's ByteBuffer class as a convenience in creating the UDH and the message content. this library comes with ColdFusion (from 7 onwards), it powers the SMS bits.
  • since the message contents need to be binary, you will have to use the messagePayload field to hold them

first off, the ColdFusion docs for messagePayload state "it must start with 0x0424, followed by two bytes specifying the payload length, followed by the message contents". ignore that for UDH SMS, as 0x0424 & the message length value are sent as part of the SMS, which plays havoc with the UDH itself. furthermore i'm not exactly sure if this is ever needed, as 0x0424 is actually the TLV tag for messagePayload in the first place (TLVs are used for conveying many types of information in SMS). headache number two, solved ;-) on to the next one.
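
for reference, this is roughly what a TLV looks like on the wire per the SMPP 3.4 specs: a 2-byte tag, a 2-byte length, then the value, all big-endian. the python sketch below only shows that generic layout (encode_tlv is a made-up helper); it says nothing about what ColdFusion does internally.

```python
import struct

MESSAGE_PAYLOAD_TAG = 0x0424  # SMPP 3.4 tag for the message_payload optional parameter


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Encode a tag-length-value triple: 2-byte tag, 2-byte length, then the value.

    encode_tlv is a hypothetical helper used only for illustration.
    """
    return struct.pack(">HH", tag, len(value)) + value


print(encode_tlv(MESSAGE_PAYLOAD_TAG, b"hello").hex())  # 0424000568656c6c6f
```

those leading 4 bytes are exactly the "0x0424 followed by two bytes specifying the payload length" from the docs, which is why having them show up inside the message content as well lands them right where the UDH should start.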

according to the SMPP 3.4 specs and this, each part of the multipart SMS can hold no more than 153 7-bit chars or 134 8-bit chars to allow for the UDH & any padding. however, ColdFusion seems to only allow 121 chars no matter the encoding (i tested with the nifty jCharset java library which has 7-bit GSM charset among others). this 121 char limit actually isn't that big a deal for SMS with 320 chars. even if ColdFusion could send 153 7-bit chars, it would take 3 (2.09 to be exact) separate SMS to send all 320 chars. even with the 121 char limit, it still takes 3 SMS (2.64 SMS) to handle all 320 chars.
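
the arithmetic is just a division rounded up. a quick python sketch (parts_needed is a made-up name) for the 320 char case at both limits:

```python
import math


def parts_needed(message_length: int, chars_per_part: int) -> int:
    """Number of separate SMS needed to carry message_length chars."""
    return math.ceil(message_length / chars_per_part)


# 153 is the 7-bit limit per part, 121 is what ColdFusion seemed to allow
for limit in (153, 121):
    print(limit, round(320 / limit, 2), parts_needed(320, limit))
```

which prints 153 2.09 3 and 121 2.64 3, so both limits come out at 3 SMS for this message.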

finally, i imagine a question some folks are asking themselves is whether ColdFusion's SMS gateway can receive multipart SMS. unfortunately, right now the answer is "not really". while the gateway sees the correct esmClass so the onIncomingMessage() method can determine it's a multipart SMS, somewhere in the incoming SMS handling the 2nd byte in the UDH, 0x00 (the IEI), is getting stripped out, ending up with something along the lines of:

0x05 total length of header
0x03 length of "header"
0x05 reference ID
0x02 total number of parts for this multipart SMS
0x02 sequence number
message text t and have some luck of course And it helps to have a tall man sitting on the horse'

which i guess is making a mess of multipart SMS processing. the data above was taken from a multipart SMS with 2 parts. the first, longer part, was never processed correctly and was never seen by the onIncomingMessage() method. only the final, shorter part was received properly.
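
to show why one missing byte matters, here's a small python sketch of a strict parser for the 6-byte UDH (parse_concat_udh is a made-up helper, and this is only my guess at the failure mode, not the gateway's actual code). with the IEI stripped, every following byte shifts left by one and the header no longer lines up:

```python
def parse_concat_udh(payload: bytes) -> tuple:
    """Return (ref_id, total_parts, part_number, body) for a payload that
    starts with the 6-byte UDH; raise ValueError otherwise.

    parse_concat_udh is a hypothetical helper used only for illustration.
    """
    if len(payload) < 6:
        raise ValueError("payload too short to hold a UDH")
    udh_length, iei, ie_length = payload[0], payload[1], payload[2]
    if (udh_length, iei, ie_length) != (0x05, 0x00, 0x03):
        raise ValueError("not a well-formed UDH")
    return payload[3], payload[4], payload[5], payload[6:]


# what should arrive: reference ID 5, part 2 of 2
good = bytes([0x05, 0x00, 0x03, 0x05, 0x02, 0x02]) + b"tail of the message"
print(parse_concat_udh(good))  # (5, 2, 2, b'tail of the message')

# what seems to arrive: the same header with the 0x00 IEI stripped out
stripped = bytes([0x05, 0x03, 0x05, 0x02, 0x02]) + b"tail of the message"
try:
    parse_concat_udh(stripped)
except ValueError as err:
    print("rejected:", err)
```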

and yes, Adobe already knows about these issues.

many thanks to the fine folks over at textit.com.au for their working through this with us.

ps: the SMS sample texts are drive-by truckers lyrics just in case you were wondering.


January 16, 2009

icu4j 4.01 released

the icu4j project has just released version 4.01. it's a regular maintenance release with the following changes (common across all flavors):
  • Unicode 5.1
  • locale data: Common Locale Data Repository (CLDR) 1.6
  • charset converter file size improvement
  • date interval formatting (note only gregorian calendar is supported in this release)
  • improved plural support
specific icu4j changes include:
  • charset
    • ICU2022 converter
    • HZ converter
    • SCSU/BOCU-1 converter
    • charset converter callback
  • thai dictionary break iterator (yeah)
  • JDK TimeZone support (this is pretty decent as you can now share tz IDs between coldfusion/core java & icu4j)
  • locale service provider
  • more convenient formatting of year+month, day+month, and other combinations
  • simple duration formatting
i guess it's time to update the icu4j CFCs for the new formatting bits. as usual you can download the new version from here. btw you can still get a hold of the icu4j tools here.

January 06, 2009

timezone spatial locator

one of the challenges of trying to use timezones (tz) in an application is that there are so darned many of them--most of which aren't relevant to a given user. for example, have a look at the timezone CFC testbed. the list of tz just goes on and on (and on). you can narrow down the list somewhat if you know the user's location (maybe via geoLocation or, oh yeah, simply asking them), but big countries like the US will still have many tz and often tz IDs won't have much meaning to users. one idea to help with this issue that i've been kicking around for a while is to use a map to help a user pick a tz relevant to them. easing back into work this week, i decided now would be as good a time as any to put this idea into code. i've put up an alpha version here. it's a flex 3.x "mashup" using google maps as the UI plus geonames to supply the tz info once a user clicks on the map. some things to note:
  • it really is alpha quality: error handling, etc. isn't very pretty
  • the normal double click to zoom in google maps behavior doesn't work as my single mouse click listener seems to be swallowing all the double clicks, so use the zoom control instead
  • it's really too big to be of much practical use, ie a much smaller widget would be better for use on a page (i am as they say "design challenged")
once i iron out all the kinks i'll post the source in the usual places.