
Distributed Architecture: Distributed ID Generation and Distributed Transactions


Scenarios for Distributed Transactions

Distributed transactions typically arise in these common scenarios:

  1. Cross-JVM inter-process communication, where a single business operation spans multiple services
  2. A monolithic application accessing multiple independent database instances
  3. Multiple microservices accessing the same shared database instance

CAP Theorem

Core Definitions

  • Consistency (C): After a successful write operation, any subsequent read across all distributed nodes will return the latest updated data value. For example, after writing data to the primary database, all replica nodes must reflect the new data before allowing reads, or lock replicas during sync to prevent stale reads.
  • Availability (A): Every client request receives a non-error response within a reasonable time frame, even during partial node failures. Replicas should not be locked during data sync, and may return stale data or predefined defaults instead of throwing errors or timing out.
  • Partition Tolerance (P): The system can continue operating normally even when network partitions disrupt communication between some nodes. This is achieved via asynchronous data replication and adding redundant replica nodes to handle individual node failures.

Valid CAP Combinations

In practice, a distributed system can guarantee only two of the three CAP properties at any time. Since network partitions can never be ruled out, partition tolerance is effectively mandatory, so real-world systems end up choosing between CP (consistency first) and AP (availability first).


BASE Theory

BASE (Basically Available, Soft state, Eventually consistent) is an alternative to ACID for distributed systems, prioritizing availability and partition tolerance over strong consistency, allowing for gradual data synchronization across nodes.


Distributed Transaction Solutions

2PC (Two-Phase Commit) & XA Protocol

The XA protocol defines a standard interface for distributed transactions, requiring support from underlying databases. A key limitation is that resource locks are held until the entire two-phase commit process completes, leading to poor performance under high concurrency.
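The two phases can be sketched with plain Java objects. Note that `Participant` and `TwoPhaseCoordinator` below are illustrative names for this sketch, not part of the XA API:

```java
import java.util.List;

// Illustrative sketch of the two-phase commit flow; Participant and
// TwoPhaseCoordinator are hypothetical names, not part of the XA API.
interface Participant {
    boolean prepare();   // phase 1: vote yes/no while holding resource locks
    void commit();       // phase 2a: make changes durable, release locks
    void rollback();     // phase 2b: undo changes, release locks
}

class TwoPhaseCoordinator {
    // Returns true if the global transaction committed.
    boolean execute(List<Participant> participants) {
        // Phase 1: every participant must vote to commit.
        for (Participant p : participants) {
            if (!p.prepare()) {
                // Any "no" vote aborts the whole transaction.
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        // Phase 2: all voted yes, so all must commit. Locks taken in
        // prepare() are held until this point, which is why 2PC performs
        // poorly under high concurrency.
        participants.forEach(Participant::commit);
        return true;
    }
}
```

The sketch makes the performance problem concrete: every lock acquired during `prepare()` stays held until the slowest participant finishes phase 2.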

Seata Distributed Transaction Framework

Developed by Alibaba, Seata provides both AT (2PC-based) and TCC mode distributed transaction solutions. It models distributed transactions as a global transaction coordinating multiple branch transactions, ensuring all branches either commit successfully or roll back entirely.

Sample Seata Implementation

We'll create two microservices for cross-account transfer:

  1. seata-bank-a-service: Handles deducting funds from the source account and calls the target account service
  2. seata-bank-b-service: Handles adding funds to the target account

Configuration Files

Required config files: application.yml, application-local.yml, file.conf
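A minimal sketch of the Seata-related part of application.yml, assuming the seata-spring-boot-starter is used (the group names and TC address below are illustrative placeholders, not values from this project):

```yaml
# Illustrative sketch only; group names and the TC address are assumptions.
spring:
  application:
    name: seata-bank-a-service

seata:
  tx-service-group: my_tx_group            # transaction group this service belongs to
  service:
    vgroup-mapping:
      my_tx_group: default                 # map the group to a TC cluster
    grouplist:
      default: 127.0.0.1:8091              # Seata TC (transaction coordinator) address
```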


Bank A Service Code

Account Data Access Object

@Mapper
@Component
public interface BankAccountADao {
    @Update("UPDATE bank_account SET account_balance = account_balance + #{changeAmount} WHERE account_number = #{accountNum}")
    int updateAccountBalance(@Param("accountNum") String accountNum, @Param("changeAmount") Double changeAmount);
}

Account Service Interface

public interface BankAccountAService {
    void adjustAccountBalance(String accountNum, Double changeAmount);
}

Account Service Implementation

@Service
@Slf4j
public class BankAccountAServiceImpl implements BankAccountAService {
    @Autowired
    private BankAccountADao accountDao;

    @Autowired
    private BankBClient bankBClient;

    @Transactional
    @GlobalTransactional(rollbackFor = Exception.class)
    @Override
    public void adjustAccountBalance(String accountNum, Double changeAmount) {
        // Deduct funds from source account
        int updateResult = accountDao.updateAccountBalance(accountNum, changeAmount * -1);
        if (updateResult <= 0) {
            throw new RuntimeException("Failed to deduct account balance");
        }

        // Call remote Bank B service to add funds
        String transferResponse = bankBClient.processTransfer(changeAmount);
        if ("fallback".equals(transferResponse)) {
            throw new RuntimeException("Remote transfer service call failed");
        }
    }
}

Feign Client for Bank B

@FeignClient(value = "seata-bank-b-service", fallback = BankBClientFallback.class)
public interface BankBClient {
    @GetMapping("/api/bank-b/transfer")
    String processTransfer(@RequestParam("amount") Double transferAmount);
}

Fallback Handler for Failed Feign Calls

@Component
public class BankBClientFallback implements BankBClient {
    @Override
    public String processTransfer(Double transferAmount) {
        return "fallback";
    }
}

Bank A Service Startup Class

@SpringBootApplication
@EnableDiscoveryClient
@EnableHystrix
@EnableFeignClients(basePackages = "com.example.seata.banka.client")
public class BankAApplication {
    public static void main(String[] args) {
        SpringApplication.run(BankAApplication.class, args);
    }
}

Bank B Service Code

Account Data Access Object

@Mapper
@Component
public interface BankAccountBDao {
    @Update("UPDATE bank_account SET account_balance = account_balance + #{addAmount} WHERE account_number = #{accountNum}")
    int updateAccountBalance(@Param("accountNum") String accountNum, @Param("addAmount") Double addAmount);
}

Account Service Interface

public interface BankAccountBService {
    void addAccountBalance(String accountNum, Double addAmount);
}

Account Service Implementation

@Service
@Slf4j
public class BankAccountBServiceImpl implements BankAccountBService {
    @Autowired
    private BankAccountBDao accountDao;

    @Transactional
    @Override
    public void addAccountBalance(String accountNum, Double addAmount) {
        int updateResult = accountDao.updateAccountBalance(accountNum, addAmount);
        if (updateResult <= 0) {
            throw new RuntimeException("Failed to add account balance");
        }
    }
}

Distributed ID Generation Solutions

1. Database Auto-Increment Primary Key

Step 1: Create ID Sequence Table
CREATE TABLE `global_id_sequence` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `stub_column` char(10) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniq_stub` (`stub_column`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The stub_column is a placeholder with a unique constraint to ensure atomic ID generation.

Step 2: Generate ID via REPLACE INTO

Use REPLACE INTO instead of standard INSERT INTO to handle duplicate key conflicts:

  1. Attempt to insert a new row
  2. If a duplicate unique key is detected, delete the conflicting row first, then re-insert the new row
BEGIN;
REPLACE INTO global_id_sequence (stub_column) VALUES ('default_stub');
SELECT LAST_INSERT_ID();
COMMIT;

Pros: simple implementation, ordered IDs, low storage overhead.
Cons: low concurrency support, the database is a single point of failure, IDs carry no business meaning, predictable IDs raise security concerns, and every ID costs a database round-trip, increasing load.


2. Database Segment Pattern

This solution reduces database access by pre-fetching batches of IDs and storing them in local memory for fast retrieval. Popular implementations include Didi's Tinyid.

Step 1: Create Segment ID Table
CREATE TABLE `id_segment_generator` (
  `id` int NOT NULL,
  `current_max_id` bigint NOT NULL COMMENT 'Current maximum allocated ID',
  `step` int NOT NULL COMMENT 'Batch ID segment size',
  `version` int NOT NULL COMMENT 'Optimistic lock version number',
  `biz_type` int NOT NULL COMMENT 'Business type identifier',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Step 2: Initialize Default Record
INSERT INTO `id_segment_generator` (`id`, `current_max_id`, `step`, `version`, `biz_type`)
VALUES (1, 0, 1000, 0, 101);
Step 3: Fetch and Update ID Segment
-- Fetch current segment for business type 101
SELECT `current_max_id`, `step`, `version` FROM `id_segment_generator` WHERE `biz_type` = 101;

-- Advance the segment for the next batch, guarded by the version read above (optimistic lock)
UPDATE `id_segment_generator`
SET `current_max_id` = `current_max_id` + `step`, `version` = `version` + 1
WHERE `biz_type` = 101 AND `version` = 0;  -- 0 is the version value just returned by the SELECT
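The pattern can be sketched in plain Java. `SegmentIdGenerator` below is an illustrative in-memory stand-in; a real implementation would back `fetchNextSegment()` with the SELECT plus optimistic-lock UPDATE shown above:

```java
// Illustrative in-memory sketch of the segment pattern; a real implementation
// would back fetchNextSegment() with the SELECT + optimistic-lock UPDATE.
class SegmentIdGenerator {
    private final int step;
    private long currentMaxId;   // stands in for the id_segment_generator row
    private long nextId;         // next ID to hand out from the cached segment
    private long segmentEnd;     // exclusive upper bound of the cached segment

    SegmentIdGenerator(int step) {
        this.step = step;
        this.nextId = 1;
        this.segmentEnd = 1;     // force a segment fetch on first use
    }

    // Simulates "UPDATE ... SET current_max_id = current_max_id + step".
    private void fetchNextSegment() {
        long start = currentMaxId + 1;
        currentMaxId += step;
        nextId = start;
        segmentEnd = currentMaxId + 1;
    }

    synchronized long nextId() {
        if (nextId >= segmentEnd) {
            fetchNextSegment();  // one DB round-trip per `step` IDs
        }
        return nextId++;
    }
}
```

With step = 1000, the database is touched once per thousand IDs instead of once per ID, which is the whole point of the pattern.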

Pros: fewer database queries, lower DB load, ordered IDs.
Cons: single point of failure risk, no business meaning, security concerns from predictable IDs.


3. NoSQL Based Generation

Redis Based ID Generation

Use Redis's INCR or INCRBY commands for atomic, ordered ID generation. For high availability, use Redis Cluster or Codis in large-scale deployments, and enable Redis persistence (RDB, AOF, or mixed mode) to prevent ID loss on server restart.

Pros: high performance, ordered IDs.
Cons: similar drawbacks to the database auto-increment solution.
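The atomic-increment idea can be mimicked locally with an AtomicLong to show the shape of the generated IDs; with Redis, the `incrementAndGet()` call would be replaced by an INCR on a shared key. The prefix scheme below is a hypothetical example, not a standard:

```java
import java.util.concurrent.atomic.AtomicLong;

// Local stand-in for Redis INCR: an atomic counter. In production the
// counter would live in Redis so all service instances share one sequence.
class CounterIdGenerator {
    private final AtomicLong counter = new AtomicLong(0);

    // Hypothetical scheme: business prefix + zero-padded sequence number.
    String nextId(String bizPrefix) {
        return bizPrefix + String.format("%08d", counter.incrementAndGet());
    }
}
```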

MongoDB ObjectId

MongoDB's built-in ObjectId is a 12-byte unique identifier:

  • Bytes 0-3: Unix timestamp (seconds)
  • Bytes 4-6: Machine identifier
  • Bytes 7-8: Process ID
  • Bytes 9-11: Counter value

Pros: high performance, roughly ordered by creation time.
Cons: risk of duplicate IDs if the system clock is wrong; the predictable pattern poses security risks.

4. UUID

UUID (Universally Unique Identifier) is a 128-bit identifier, canonically written as 32 hexadecimal digits grouped 8-4-4-4-12 (36 characters including hyphens). Java's JDK provides built-in generation via UUID.randomUUID(), which produces version 4 UUIDs from random data.

// Example output: 550e8400-e29b-41d4-a716-446655440000
UUID uniqueId = UUID.randomUUID();
int version = uniqueId.version(); // Returns 4

Pros: fast generation, easy to implement.
Cons: large storage overhead, unordered, no business meaning, risk of MAC address leakage (for version 1 UUIDs), potential duplicates in edge cases. Not recommended for MySQL primary keys due to size and ordering issues.


5. Snowflake Algorithm

Twitter's Snowflake algorithm generates 64-bit signed integers, structured as follows:

  1. 1 bit: Sign bit, always 0 for positive IDs
  2. 41 bits: Millisecond timestamp (supports ~69 years of operation)
  3. 10 bits: Combination of data center ID and worker ID (5 bits for data center, 5 for worker, adjustable per deployment)
  4. 12 bits: Sequence number (supports 4096 unique IDs per worker per millisecond)

Many optimized open-source implementations exist, such as Meituan's Leaf and Baidu's UidGenerator, which address original Snowflake issues like clock rollback and fixed worker IDs. Seata also provides a modified Snowflake implementation with improved QPS and rollback protection.

Pros: fast generation, ordered IDs, flexible customization.
Cons: risk of duplicate IDs on clock rollback; dependence on fixed worker IDs complicates dynamic scaling.
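A minimal implementation following the bit layout above can look like this. The custom epoch is an arbitrary assumption, and clock rollback is handled by simply throwing, which is exactly the weakness the optimized implementations fix:

```java
// Minimal Snowflake sketch following the 1 + 41 + 5 + 5 + 12 bit layout;
// the epoch is an arbitrary assumption, and clock rollback simply throws.
class Snowflake {
    private static final long EPOCH = 1700000000000L; // custom epoch (assumption)
    private static final long WORKER_BITS = 5, DATACENTER_BITS = 5, SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long datacenterId;   // 0..31
    private final long workerId;       // 0..31
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    Snowflake(long datacenterId, long workerId) {
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long ts = System.currentTimeMillis();
        if (ts < lastTimestamp) {
            // The clock-rollback weakness mentioned above.
            throw new IllegalStateException("Clock moved backwards");
        }
        if (ts == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {               // 4096 IDs used this millisecond
                while (ts <= lastTimestamp) {  // spin until the next millisecond
                    ts = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = ts;
        return ((ts - EPOCH) << (DATACENTER_BITS + WORKER_BITS + SEQUENCE_BITS))
                | (datacenterId << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

IDs from one worker are strictly increasing: within a millisecond the sequence grows, and across milliseconds the timestamp portion grows.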


TCC Distributed Transaction Notes

TCC (Try-Confirm-Cancel) pattern requires handling three critical edge cases:

  1. Empty rollback
  2. Idempotent execution
  3. Suspended transactions
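The three edge cases above can be sketched with an in-memory per-transaction state log; the class and method names below are illustrative, not an actual Seata or Hmily API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative TCC participant guarding the three edge cases with a
// per-transaction state log; not an actual Seata/Hmily API.
class TccParticipant {
    enum State { TRIED, CONFIRMED, CANCELLED }
    private final Map<String, State> txLog = new ConcurrentHashMap<>();

    boolean tryReserve(String txId) {
        // Suspension guard: if cancel already arrived, a late try is refused.
        // Also idempotent: a second try for the same txId returns false.
        return txLog.putIfAbsent(txId, State.TRIED) == null;
    }

    void confirm(String txId) {
        // Idempotency: replacing TRIED with CONFIRMED succeeds exactly once.
        if (txLog.replace(txId, State.TRIED, State.CONFIRMED)) {
            // apply the real business commit here
        }
    }

    void cancel(String txId) {
        State prev = txLog.putIfAbsent(txId, State.CANCELLED);
        if (prev == null) {
            // Empty rollback: cancel arrived before try. Record the state and
            // do nothing, so a late tryReserve(txId) will be rejected.
        } else if (txLog.replace(txId, State.TRIED, State.CANCELLED)) {
            // undo the real business reservation here (runs at most once)
        }
    }
}
```

A real participant would persist the log in a database table keyed by transaction ID rather than in memory, so the guards survive restarts.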

Sample TCC Service Startup Class

@SpringBootApplication
@EnableDiscoveryClient
@EnableHystrix
@EnableAspectJAutoProxy
@EnableFeignClients(basePackages = "com.example.tcc.client")
@ComponentScan(basePackages = {"com.example.tcc", "org.dromara.hmily"})
public class TccBankApplication {
    public static void main(String[] args) {
        SpringApplication.run(TccBankApplication.class, args);
    }
}
