Java i18n Pluralisation using ChoiceFormat

Betfair’s site is hugely popular all around the world, so it needs to provide a fully localised experience for users across different locales. Yesterday I was looking at how best to provide internationalisation (i18n) support within our core platform and came across the really useful ChoiceFormat class. I had seen it before but hadn’t explored what it can do until now. It’s a very basic class in terms of the functionality it provides, but it is as powerful as it is simple.

I’m sure we all know about Java resource bundles for i18n and have probably used them quite a lot, so I won’t go into any detail on that and will just assume it’s common ground. However, if you haven’t come across java.text.ChoiceFormat, you’re missing out! This is the guts of pluralisation within the i18n stable, and it is definitely your friend when you realise that users don’t like messages that say “Updated 1 second(s) ago”. You know it’s 1 second, right? So why not just say that and skip the “(s)” bit? I’m sure many have tried to solve this the hard way by writing the logic themselves (defining multiple keys and choosing which one to use based on the argument being passed in), but this is where ChoiceFormat comes into play.

As per the docs, “The choice is specified with an ascending list of doubles, where each item specifies a half-open interval up to the next item”. So let’s use the example above to show how it would be defined in a resource bundle:

lastUpdated=Updated {0} {0,choice,0#seconds|1#second|1<seconds} ago

This creates a set of contiguous ranges (in ascending order) and finds the best match for the number passed in. I say best match because sometimes there isn’t an exact match. In the example above, negative numbers don’t fall into any range, but because they’re smaller than the start of the first range, the first choice is selected. The same logic applies to values that are larger than the highest range: the last choice is selected (which can’t happen in this example, as the highest range extends to positive infinity).
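To see the pattern produce a whole message, here is a minimal sketch that feeds it through java.text.MessageFormat. The class name LastUpdatedDemo and the method lastUpdated are made up for illustration, the pattern is inlined rather than loaded from a resource bundle, and Locale.ENGLISH is an assumption to keep the number formatting predictable.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical demo class; in a real application the pattern would come from a resource bundle.
class LastUpdatedDemo {
    // Same pattern as the lastUpdated entry above.
    static final String LAST_UPDATED =
            "Updated {0} {0,choice,0#seconds|1#second|1<seconds} ago";

    static String lastUpdated(long seconds) {
        // Locale.ENGLISH is an assumption; pass the user's locale in real code.
        MessageFormat format = new MessageFormat(LAST_UPDATED, Locale.ENGLISH);
        // The same argument is used twice: once as a number, once to pick the word.
        return format.format(new Object[] {seconds});
    }

    public static void main(String[] args) {
        System.out.println(lastUpdated(0)); // Updated 0 seconds ago
        System.out.println(lastUpdated(1)); // Updated 1 second ago
        System.out.println(lastUpdated(5)); // Updated 5 seconds ago
    }
}
```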

So let’s run through the scenarios very quickly:

  • If you pass in a negative number, the first choice “seconds” is returned (because it is too small to match anything and the ranges work in ascending order)
  • If you pass in a number between 0 (inclusive) and 1 (exclusive), the first choice matches and “seconds” is returned
  • If you pass in exactly 1, the second choice matches and “second” is returned
  • If you pass in a number greater than 1, the last choice matches and “seconds” is returned
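The four scenarios above can be checked directly against ChoiceFormat, without going through a resource bundle. This is only a sketch: the class name ChoiceScenarios and the helper unit are hypothetical, and the sample values are arbitrary picks from each range.

```java
import java.text.ChoiceFormat;

// Hypothetical demo class that exercises each range of the pattern.
class ChoiceScenarios {
    static String unit(double n) {
        // Only the choice part of the message pattern is needed here.
        ChoiceFormat choice = new ChoiceFormat("0#seconds|1#second|1<seconds");
        return choice.format(n);
    }

    public static void main(String[] args) {
        System.out.println(unit(-3));  // seconds (too small, falls back to the first choice)
        System.out.println(unit(0.5)); // seconds (0 <= n < 1)
        System.out.println(unit(1));   // second  (exactly 1)
        System.out.println(unit(42));  // seconds (n > 1)
    }
}
```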

You may have noticed that the definition is different for the last range: it uses “<” rather than “#”. What this is effectively doing in code is calling ChoiceFormat.nextDouble(1), which returns the smallest double greater than 1, and using that as the start of the range. This is not restricted to the last range; it can be used anywhere. There is a similar ChoiceFormat.previousDouble(double d), which returns the largest double less than d.
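As a rough illustration of what “1<” amounts to, the same format can be built from explicit arrays with ChoiceFormat.nextDouble supplying the third limit. The class name NextDoubleDemo and the method fromArrays are invented for this sketch.

```java
import java.text.ChoiceFormat;

// Hypothetical demo class: builds the format from arrays instead of a pattern string.
class NextDoubleDemo {
    static ChoiceFormat fromArrays() {
        // The third limit is the smallest double strictly greater than 1.
        double[] limits = {0, 1, ChoiceFormat.nextDouble(1)};
        String[] formats = {"seconds", "second", "seconds"};
        return new ChoiceFormat(limits, formats);
    }

    public static void main(String[] args) {
        ChoiceFormat choice = fromArrays();
        System.out.println(choice.format(1.0));                         // second
        System.out.println(choice.format(ChoiceFormat.nextDouble(1)));  // seconds
        // previousDouble steps the other way: the largest double below 1.
        System.out.println(ChoiceFormat.previousDouble(1) < 1.0);       // true
    }
}
```

The array form is handy when the limits are computed at runtime; for text that lives in a resource bundle, the pattern string is usually the more convenient of the two.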

Nifty! Even more so when you consider that some languages have multiple pluralisations to consider (e.g. Russian). So you can’t reasonably assume that you’re always dealing with a simple singular/plural as we have in English – sometimes there are many different plurals to take into account.


10 Comments on “Java i18n Pluralisation using ChoiceFormat”

  1. Tony Yunnie says:

    Hi Stuart, thanks, useful info! Tony Y.

  2. Paolo says:

    But can you express the rules for Slavic/Baltic languages? In some cases they’re too complicated to be written as a list of ranges (e.g. Lithuanian).

    • I think that’s a language-specific question more than a technical one (and is likely to come up for other languages too), so you’d probably need to ask someone who speaks one of those languages. The behaviour and functionality provided by the ChoiceFormat class are fairly well explained, so it’s really just a matter of applying that to your specific scenario to see if it fits.

    • Tony Yunnie says:

      Hi Paolo

      It is interesting to hear what you have to say. Can you give an example?

      TY (preferably in English:-))

  3. Paweł Dyda says:

    Thanks Stuart, it is surprising but I was able to learn something (I am talking about PluralFormat here). Certainly, your comment was very helpful 🙂

  4. […] ChoiceFormat class that provides much of the functionality you need. Being a good lad he wrote a blog post about it and in a nice show of karma, he was given an even better library to use. Supported by IBM, […]

  5. […] week I wrote about Java’s built-in ChoiceFormat class and the support it provides for pluralisation. It is a […]