Advanced Java Internationalization: Locales, Resource Bundles, and Formatting

BCP 47 Extensions in Java

Since Java SE 7, the Locale class adheres to IETF BCP 47 standards, allowing the addition of extensions to locale identifiers. While extensions are represented by single-character keys, specific keys are reserved for standard use. The Unicode locale extension is denoted by the key 'u', while the private use extension uses 'x'.

Unicode locale extensions, defined by the Unicode Common Locale Data Repository (CLDR), handle non-linguistic data such as calendar types or currency variations. Private use extensions allow developers to define arbitrary metadata, like platform specifics (Windows, UNIX) or build versions.

Extensions follow a key-value structure where the key is a character (like 'u' or 'x'). The value syntax is defined as:

SUBTAG ('-' SUBTAG)*

For the private use key ('x'), each SUBTAG is 1 to 8 alphanumeric characters. For every other key, including 'u', each SUBTAG is 2 to 8 alphanumeric characters. Keys and values are case-insensitive, and the Locale API normalizes both to lowercase.

To interact with extensions, use getExtensionKeys() to retrieve the set of keys and getExtension(char key) to get the associated value string.
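
As a minimal sketch of these two methods, the class below parses a language tag and prints each extension key with its value. The class name, the helper method, and the sample tag are illustrative assumptions, not part of the Locale API.

```java
import java.util.Locale;
import java.util.Set;

class ExtensionInspector {
    // Returns the raw value of one extension, or null if the locale has none for that key.
    static String extensionValue(String languageTag, char key) {
        return Locale.forLanguageTag(languageTag).getExtension(key);
    }

    public static void main(String[] args) {
        // Hypothetical tag: one Unicode extension plus one private use extension.
        Locale locale = Locale.forLanguageTag("en-US-u-ca-gregory-x-linux");
        Set<Character> keys = locale.getExtensionKeys();
        for (char key : keys) {
            System.out.println(key + " -> " + locale.getExtension(key));
        }
    }
}
```

Because keys and values are normalized, a tag written in uppercase should yield the same lowercase extension value.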

Unicode Locale Extensions

Identified by the 'u' key or the Locale.UNICODE_LOCALE_EXTENSION constant, these extensions contain key-type pairs. Keys are two-character codes such as 'ca' for calendar algorithm, 'nu' for numbering system, and 'cu' for currency type.

For instance, a locale for German users in Germany utilizing a Buddhist calendar and Thai digits might be represented as:

Locale germanBuddhist = new Locale.Builder()
    .setLanguage("de")
    .setRegion("DE")
    .setUnicodeLocaleKeyword("ca", "buddhist")
    .setUnicodeLocaleKeyword("nu", "thai")
    .build();

You can retrieve specific attributes using methods like getUnicodeLocaleKeys(), getUnicodeLocaleType(String), and getUnicodeLocaleAttributes().
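
The following sketch rebuilds the German locale from the example above and reads its keywords back. The class and method names are assumptions made for illustration.

```java
import java.util.Locale;

class UnicodeKeywordDemo {
    // Builds de-DE with a Buddhist calendar and Thai digits.
    static Locale germanBuddhist() {
        return new Locale.Builder()
            .setLanguage("de")
            .setRegion("DE")
            .setUnicodeLocaleKeyword("ca", "buddhist")
            .setUnicodeLocaleKeyword("nu", "thai")
            .build();
    }

    public static void main(String[] args) {
        Locale locale = germanBuddhist();
        // The BCP 47 form of the locale, including the 'u' extension.
        System.out.println(locale.toLanguageTag());
        // Each Unicode locale key paired with its type.
        for (String key : locale.getUnicodeLocaleKeys()) {
            System.out.println(key + " = " + locale.getUnicodeLocaleType(key));
        }
    }
}
```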

Private Use Extensions

Private use extensions, keyed by 'x', allow for custom data formats. Examples include x-lvariant or x-java. These are useful for application-specific locale variations not covered by standard Unicode definitions.
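
One way to attach a private use extension is through Locale.Builder.setExtension. The sketch below assumes a hypothetical "build-42" value that stands in for application-specific metadata; the class and method names are also illustrative.

```java
import java.util.Locale;

class PrivateUseDemo {
    // Copies the base locale and adds a private use ('x') extension to it.
    static Locale withPrivateUse(Locale base, String value) {
        return new Locale.Builder()
            .setLocale(base)
            .setExtension(Locale.PRIVATE_USE_EXTENSION, value)
            .build();
    }

    public static void main(String[] args) {
        // "build-42" is a made-up value; any well-formed subtags would do.
        Locale tagged = withPrivateUse(Locale.US, "build-42");
        System.out.println(tagged.toLanguageTag());
        System.out.println(tagged.getExtension(Locale.PRIVATE_USE_EXTENSION));
    }
}
```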

Identifying Supported Locales

Creating a Locale object does not guarantee that the underlying runtime or host operating system supports it. The Locale object is merely an identifier. To determine which locales are supported by a locale-sensitive class (like DateFormat), invoke the getAvailableLocales() method.

import java.text.DateFormat;
import java.util.Arrays;
import java.util.Locale;

public class LocaleScanner {
    public static void main(String[] args) {
        Locale[] systemLocales = DateFormat.getAvailableLocales();
        Arrays.stream(systemLocales)
              .limit(5)
              .forEach(loc -> System.out.println(loc.getDisplayName()));
    }
}

When displaying locale names to end-users, avoid using the raw toString() output. Instead, use getDisplayName() to present the locale information in a format readable by the user.
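
To illustrate the difference, this sketch prints the raw identifier of a locale next to its display name in two user interface languages. The class and helper names are assumptions, and the exact display strings depend on the locale data shipped with your JDK.

```java
import java.util.Locale;

class LocaleDisplayDemo {
    // Returns the name of 'locale' as it should be shown to a user whose UI language is 'uiLocale'.
    static String label(Locale locale, Locale uiLocale) {
        return locale.getDisplayName(uiLocale);
    }

    public static void main(String[] args) {
        Locale locale = Locale.CANADA_FRENCH;
        // Raw identifier, meant for programs rather than people.
        System.out.println(locale.toString());
        // Human-readable names for an English and a French interface.
        System.out.println(label(locale, Locale.ENGLISH));
        System.out.println(label(locale, Locale.FRENCH));
    }
}
```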

Language Tag Filtering and Lookup

Java supports IETF BCP 47 language tags and, since Java SE 8, can match language preferences (Language Ranges) against a set of available tags (Locales). The matching rules are defined by RFC 4647.

Understanding Language Tags and Ranges

A language tag is a string formatted to identify a specific language, such as "en-US" or "zh-Hans-CN". Use Locale.forLanguageTag(String) to parse these strings into Locale objects.
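
As a small sketch, the class below parses a tag and reads back its language, script, and region subtags. The class and method names are chosen for illustration only.

```java
import java.util.Locale;

class LanguageTagDemo {
    // Parses a BCP 47 tag into a Locale.
    static Locale parse(String tag) {
        return Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        Locale simplifiedChinese = parse("zh-Hans-CN");
        System.out.println("language: " + simplifiedChinese.getLanguage());
        System.out.println("script:   " + simplifiedChinese.getScript());
        System.out.println("region:   " + simplifiedChinese.getCountry());
        // toLanguageTag() converts the Locale back into its tag form.
        System.out.println("tag:      " + simplifiedChinese.toLanguageTag());
    }
}
```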

A Language Range represents a set of language tags matching a specific pattern. Ranges can be basic (e.g., "en") or extended (e.g., "zh-*-*"). Users often provide a prioritized list of ranges (Language Priority List).

import java.util.List;
import java.util.Locale;

public class LocaleTagProcessing {
    public static void main(String[] args) {
        // Parse a range string with weights (q-values)
        String userPrefs = "en-US;q=1.0,en-GB;q=0.8,fr;q=0.5";
        List<Locale.LanguageRange> priorityList = Locale.LanguageRange.parse(userPrefs);
        
        System.out.println("Priority List: " + priorityList);
    }
}

Filtering

Filtering compares a priority list against a collection of locales and returns all matching locales. The Locale.filter method performs this operation.

List<Locale> available = Arrays.asList(
    Locale.US,
    Locale.UK,
    Locale.FRANCE,
    Locale.CANADA
);

List<Locale> matched = Locale.filter(priorityList, available);
matched.forEach(System.out::println);

Lookup

Unlike filtering, lookup (via Locale.lookup) returns only the single best match. This is useful when only one locale can be selected, such as for an application's main interface language.

Locale bestMatch = Locale.lookup(priorityList, available);
System.out.println("Best Match: " + bestMatch);
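
The fragments in this section can be combined into one runnable sketch. The class name, helper methods, and the fixed list of available locales are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class LocaleMatchingDemo {
    // Locales the hypothetical application ships resources for.
    static final List<Locale> AVAILABLE =
        Arrays.asList(Locale.US, Locale.UK, Locale.FRANCE, Locale.CANADA);

    // Returns every available locale that matches the priority list.
    static List<Locale> filter(String prefs) {
        return Locale.filter(Locale.LanguageRange.parse(prefs), AVAILABLE);
    }

    // Returns the single best match, or null if nothing matches.
    static Locale lookup(String prefs) {
        return Locale.lookup(Locale.LanguageRange.parse(prefs), AVAILABLE);
    }

    public static void main(String[] args) {
        String prefs = "en-US;q=1.0,en-GB;q=0.8,fr;q=0.5";
        System.out.println("Filtered:   " + filter(prefs));
        System.out.println("Best match: " + lookup(prefs));
    }
}
```

Note that lookup shortens the range rather than the tag, so a range such as "fr" matches fr-FR when filtering but is not expected to find it through lookup. Always handle a null result from Locale.lookup.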

Locale Scope and Defaults

Java applications are not required to use a single global Locale. You can assign specific locales to individual locale-sensitive objects, facilitating multilingual applications. However, most applications rely on the default locale.

The default locale is initialized when the JVM starts, typically based on the host operating system's settings. You can retrieve it using Locale.getDefault(). As of Java SE 7, defaults are categorized into Locale.Category.DISPLAY (for UI resources) and Locale.Category.FORMAT (for date/number formatting). You can query and set these independently.
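
A brief sketch of the two categories follows. It temporarily changes the FORMAT default and restores it afterwards; the class and method names are illustrative assumptions.

```java
import java.text.NumberFormat;
import java.util.Locale;

class DefaultCategoryDemo {
    // Formats a number with the given locale installed as the FORMAT default,
    // then restores the previous default so other code is unaffected.
    static String formatWithFormatDefault(Locale formatLocale, double value) {
        Locale previous = Locale.getDefault(Locale.Category.FORMAT);
        try {
            Locale.setDefault(Locale.Category.FORMAT, formatLocale);
            // The no-argument factory picks up the FORMAT default.
            return NumberFormat.getNumberInstance().format(value);
        } finally {
            Locale.setDefault(Locale.Category.FORMAT, previous);
        }
    }

    public static void main(String[] args) {
        System.out.println("DISPLAY default: " + Locale.getDefault(Locale.Category.DISPLAY));
        System.out.println("FORMAT default:  " + Locale.getDefault(Locale.Category.FORMAT));
        System.out.println(formatWithFormatDefault(Locale.GERMANY, 1234.5));
    }
}
```

Changing a default affects the whole JVM, so passing an explicit Locale to each formatter is usually the safer choice in shared or multi-threaded code.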

In distributed systems, it is often best practice to keep data exchange locale-agnostic, performing localization only on the client side or presentation layer.

Locale Sensitive Services SPI

Java allows third-party providers to plug in implementations of locale-sensitive classes via the Service Provider Interface (SPI) mechanism. By extending the abstract classes in java.text.spi and java.util.spi, developers can add support for currencies, time zones, and break iterators in locales the JDK does not cover.

Common SPI classes include CurrencyNameProvider, TimeZoneNameProvider, and DateFormatProvider. The runtime discovers and loads these implementations dynamically, without modifications to the core JDK.
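
As a rough sketch of what a provider looks like, the class below extends CurrencyNameProvider for a single locale. Everything specific in it is an assumption for illustration: the class name, the choice of Esperanto ("eo") as the supported locale, and the made-up symbol for the ISO 4217 test code XTS. The example calls the provider directly; a real provider would be a public class registered in a META-INF/services/java.util.spi.CurrencyNameProvider file, and whether the runtime consults it depends on your JDK version and configuration.

```java
import java.util.Locale;
import java.util.spi.CurrencyNameProvider;

// Hypothetical provider; one discovered through ServiceLoader would need to be public.
class DemoCurrencyNameProvider extends CurrencyNameProvider {
    // Assumed for illustration: the only locale this provider serves.
    private static final Locale SUPPORTED = Locale.forLanguageTag("eo");

    @Override
    public Locale[] getAvailableLocales() {
        return new Locale[] { SUPPORTED };
    }

    @Override
    public String getSymbol(String currencyCode, Locale locale) {
        if (!SUPPORTED.equals(locale)) {
            throw new IllegalArgumentException("Unsupported locale: " + locale);
        }
        // XTS is the currency code reserved for testing; "T$" is a made-up symbol.
        return "XTS".equals(currencyCode) ? "T$" : null;
    }

    public static void main(String[] args) {
        DemoCurrencyNameProvider provider = new DemoCurrencyNameProvider();
        System.out.println(provider.getSymbol("XTS", Locale.forLanguageTag("eo")));
    }
}
```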

Isolating Locale-Specific Data

Locale-specific data, such as UI labels or error messages, must be separated from the business logic. The ResourceBundle class facilitates this by storing objects keyed by strings, retrievable based on the runtime Locale.

ResourceBundles fall into two main categories: PropertyResourceBundle, backed by simple .properties files, and ListResourceBundle, backed by Java classes that can handle complex object types.

ResourceBundle Naming and Fallback

ResourceBundles share a base name. The fully qualified name is constructed by appending the language, country, and variant codes. For example, a base name of "Messages" with a French Canada locale becomes Messages_fr_CA.

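The getBundle method implements a fallback search strategy. If Messages_fr_CA is not found, it searches for Messages_fr. If no bundle exists for the requested locale, it repeats the search for the JVM's default locale before finally falling back to the base bundle, Messages. Providing a base bundle ensures that an application can always fall back to a base set of resources.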

try {
    ResourceBundle uiLabels = ResourceBundle.getBundle("Messages", currentLocale);
    String windowTitle = uiLabels.getString("main.title");
} catch (MissingResourceException e) {
    // Handle missing bundle or key
}
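
To see the search order for a given locale without creating any bundle files, you can ask the default ResourceBundle.Control for its candidate locales. This is a sketch; the class and method names are assumptions.

```java
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

class FallbackChainDemo {
    private static final ResourceBundle.Control CONTROL =
        ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);

    // Returns the locales getBundle would try for the requested locale, most specific first.
    static List<Locale> candidates(String baseName, Locale locale) {
        return CONTROL.getCandidateLocales(baseName, locale);
    }

    public static void main(String[] args) {
        for (Locale candidate : candidates("Messages", Locale.CANADA_FRENCH)) {
            // toBundleName appends the locale suffix to the base name.
            System.out.println(CONTROL.toBundleName("Messages", candidate));
        }
    }
}
```

The list ends with Locale.ROOT, which corresponds to the base bundle name without any suffix.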

Using PropertyResourceBundle

For simple strings, .properties files are preferred due to their ease of translation. Keys are strings, and values are the text displayed to the user.

Example Messages.properties (Default):

greeting=Welcome
farewell=Goodbye

Example Messages_fr.properties (French):

greeting=Bienvenue
farewell=Au revoir
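
In an application these files are normally loaded by name through ResourceBundle.getBundle. To keep the following sketch self-contained, it instead feeds the same key-value text to PropertyResourceBundle through a StringReader; the class and method names are assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class PropertiesBundleDemo {
    // Builds a bundle from properties-formatted text instead of a file on the classpath.
    static ResourceBundle fromText(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle french = fromText("greeting=Bienvenue\nfarewell=Au revoir\n");
        System.out.println(french.getString("greeting"));
        System.out.println(french.getString("farewell"));
    }
}
```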

Using ListResourceBundle

When resources are not simple strings (e.g., arrays, custom objects), ListResourceBundle is used. You create a subclass that overrides getContents().

import java.util.ListResourceBundle;

public class AppStats_en extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
        return new Object[][] {
            { "currency", "USD" },
            { "tax_rate", 0.07 },
            { "supported_regions", new String[]{"US", "CA", "MX"} }
        };
    }
}
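
Reading typed values back might look like the sketch below. It uses a nested bundle class with the same contents and instantiates it directly so the example runs on its own; in an application you would usually obtain the bundle through ResourceBundle.getBundle. The class names are assumptions.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

class TypedBundleDemo {
    // Same contents as the AppStats_en example, nested here to keep the sketch in one file.
    static class Stats extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                { "currency", "USD" },
                { "tax_rate", 0.07 },
                { "supported_regions", new String[] { "US", "CA", "MX" } }
            };
        }
    }

    static ResourceBundle stats() {
        return new Stats();
    }

    public static void main(String[] args) {
        ResourceBundle bundle = stats();
        String currency = bundle.getString("currency");
        // Non-string values come back as Object and need a cast.
        double taxRate = (Double) bundle.getObject("tax_rate");
        String[] regions = bundle.getStringArray("supported_regions");
        System.out.println(currency + " " + taxRate + " " + String.join(",", regions));
    }
}
```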

Custom ResourceBundle Loading

The ResourceBundle.Control class allows developers to customize the loading process, including cache management and the fallback search path (Candidate Locales).

ResourceBundle bundle = ResourceBundle.getBundle(
    "MyResources", 
    targetLocale, 
    new ResourceBundle.Control() {
        @Override
        public List<Locale> getCandidateLocales(String baseName, Locale locale) {
            // Custom logic to modify the fallback list
            if (locale.getLanguage().equals("ja")) {
                return Arrays.asList(locale, Locale.JAPAN, Locale.ROOT);
            }
            return super.getCandidateLocales(baseName, locale);
        }
    }
);

Formatting Data

Displaying numbers, currencies, dates, and messages requires formatting that adheres to cultural conventions. Java uses the java.text package for this purpose.

Numbers and Currency

The NumberFormat class provides factory methods to format numbers, currencies, and percentages based on a Locale.

double monetaryValue = 123456.78;
Locale usLocale = Locale.US;
Locale frLocale = Locale.FRANCE;

NumberFormat usCurrency = NumberFormat.getCurrencyInstance(usLocale);
NumberFormat frCurrency = NumberFormat.getCurrencyInstance(frLocale);

System.out.println("US: " + usCurrency.format(monetaryValue));
System.out.println("FR: " + frCurrency.format(monetaryValue));

For finer control, use DecimalFormat to apply patterns (e.g., "#,##0.00") and DecimalFormatSymbols to change separators.
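
A short sketch of that combination: the pattern fixes two decimal places, and the symbols object swaps the grouping separator for an apostrophe. The class name, method name, and choice of separator are assumptions for illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class DecimalPatternDemo {
    // Formats with grouping and exactly two decimals, using an apostrophe as the grouping separator.
    static String formatWithApostrophes(double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        symbols.setGroupingSeparator('\'');
        symbols.setDecimalSeparator('.');
        return new DecimalFormat("#,##0.00", symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatWithApostrophes(123456.78));
        System.out.println(formatWithApostrophes(0.5));
    }
}
```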

Dates and Times

The DateFormat class handles date and time localization. It provides styles (SHORT, MEDIUM, LONG, FULL).

Date now = new Date();
DateFormat dateFormatter = DateFormat.getDateInstance(DateFormat.LONG, frLocale);
System.out.println(dateFormatter.format(now));

For custom patterns, use SimpleDateFormat.

SimpleDateFormat customFormatter = new SimpleDateFormat("EEEE, MMM d, yyyy", usLocale);
System.out.println(customFormatter.format(now));
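
Because the snippets above format the current time, their output changes on every run. The sketch below pins both the date and the time zone so the result is reproducible; the class and method names are assumptions.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class FixedDateDemo {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Custom pattern with US English day and month names.
    static String usCustom(Date date) {
        SimpleDateFormat formatter = new SimpleDateFormat("EEEE, MMM d, yyyy", Locale.US);
        formatter.setTimeZone(UTC);
        return formatter.format(date);
    }

    // Predefined LONG style for France.
    static String frenchLong(Date date) {
        DateFormat formatter = DateFormat.getDateInstance(DateFormat.LONG, Locale.FRANCE);
        formatter.setTimeZone(UTC);
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // 1970-01-01T00:00:00Z
        System.out.println(usCustom(epoch));
        System.out.println(frenchLong(epoch));
    }
}
```

Setting the time zone explicitly matters here: without it, the formatter uses the JVM's default zone, and the same instant can fall on a different calendar day.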

Messages

Compound messages containing variables (e.g., "The disk '{0}' contains {1} files.") require MessageFormat. This class allows you to substitute variables and format them (numbers, dates) according to the locale within the message string.

String pattern = "At {0, time, short} on {0, date}, {1} files were found.";
MessageFormat msgFormat = new MessageFormat(pattern, currentLocale);

Object[] params = { new Date(), 105 };
String result = msgFormat.format(params);
System.out.println(result);

Handling pluralization correctly often requires combining MessageFormat with ChoiceFormat to select the correct phrase based on a numeric value.
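
One way to combine the two is the choice format element inside a MessageFormat pattern, sketched below for English. The class and method names are assumptions. ChoiceFormat selects by numeric range only, so this approach may not be enough for languages with more elaborate plural rules.

```java
import java.text.MessageFormat;
import java.util.Locale;

class FileCountMessage {
    // Three branches: zero, exactly one, and more than one.
    private static final String PATTERN =
        "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    static String describe(int count) {
        return new MessageFormat(PATTERN, Locale.US).format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(describe(0));
        System.out.println(describe(1));
        System.out.println(describe(1273));
    }
}
```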
