Encoding Java Strings into GBK Format

Java represents String contents as UTF-16 code units. When interacting with legacy systems or region-specific formats such as GBK (an extension of GB2312 that adds Traditional Chinese characters and other symbols), explicit charset conversion becomes necessary. The String.getBytes(Charset) method encodes the text into a byte sequence using the specified charset.

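import java.nio.charset.Charset;

// Look the charset up once; forName throws UnsupportedCharsetException if the runtime lacks GBK.
Charset gbk = Charset.forName("GBK");
String message = "Welcome, 世界";
byte[] gbkEncodedBytes = message.getBytes(gbk);                 // encode the text as GBK bytes
String reconstructedMessage = new String(gbkEncodedBytes, gbk); // decode the GBK bytes back into a String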

This encodes the UTF-16 text as a byte array formatted according to the GBK standard; the bytes can then be decoded back into a Java String.
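The round trip described above can be checked with a small self-contained program. This is a minimal sketch, not a definitive implementation: the class name GbkRoundTrip and the helper methods encodeGbk and decodeGbk are hypothetical names introduced for illustration, and the code assumes the runtime ships the GBK charset. The Chinese characters from the example are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GbkRoundTrip {
    // Assumption: the runtime provides GBK. If it does not, Charset.forName
    // throws UnsupportedCharsetException while this class is being initialized.
    static final Charset GBK = Charset.forName("GBK");

    // Hypothetical helper: encode a String as GBK bytes.
    static byte[] encodeGbk(String text) {
        return text.getBytes(GBK);
    }

    // Hypothetical helper: decode GBK bytes back into a String.
    static String decodeGbk(byte[] bytes) {
        return new String(bytes, GBK);
    }

    public static void main(String[] args) {
        // "Welcome, " followed by the two Chinese characters from the example above.
        String message = "Welcome, \u4e16\u754c";

        byte[] gbkBytes = encodeGbk(message);
        byte[] utf8Bytes = message.getBytes(StandardCharsets.UTF_8);

        // The same text occupies a different number of bytes in each encoding.
        System.out.println("GBK bytes:   " + gbkBytes.length);
        System.out.println("UTF-8 bytes: " + utf8Bytes.length);

        // Decoding with the same charset restores the original text.
        System.out.println("Round trip intact: " + decodeGbk(gbkBytes).equals(message));
    }
}
```

For this message, GBK uses one byte for each of the nine ASCII characters and two bytes for each Chinese character, 13 bytes in total, while UTF-8 needs three bytes per Chinese character, 15 in total. Characters that GBK cannot represent are not preserved: String.getBytes(Charset) substitutes the charset's replacement bytes for them, so a round trip is lossless only for text that GBK covers.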

stateDiagram-v2
    [*] --> InMemoryRepresentation
    InMemoryRepresentation --> UTF16
    UTF16 --> ByteExtraction
    ByteExtraction --> GBKByteArray
    GBKByteArray --> [*]

sequenceDiagram
    participant App as Application
    participant Str as Java String (UTF-16)
    participant ByteArr as Byte Array (GBK)
    App->>Str: Provide text
    Str->>ByteArr: getBytes(Charset.forName("GBK"))
    ByteArr->>App: Return GBK encoded bytes
    App->>ByteArr: Construct new String
    ByteArr->>Str: Decode bytes back to UTF-16
