Fading Coder

An Old Coder’s Final Dance

You are here: Home > Tech > Content

Data Types in ClickHouse

Tech 3

ClickHouse offers a diverse set of data types categorized into basic, composite, and specialized types. Data types supported by ClickHouse along with their metadata can be explored using the system table data_type_families. Below is an SQL query to retrieve these details:

SELECT * FROM system.data_type_families;

This query fetches data type features such as names, case sensitivity, and aliases. Note that different versions of ClickHouse may include variations in supported data types. Observe some sample entries:

┌─name────────────────────┬─case_insensitive─┬─alias_to────┐
│ String                  │                0 │             │
│ Decimal                 │                1 │             │
│ IPv4                    │                0 │             │
│ UInt64                  │                0 │             │
│ FLOAT                   │                1 │ Float32     │
│ BIGINT                  │                1 │ Int64       │
│ TIMESTAMP               │                1 │ DateTime    │
└─────────────────────────┴──────────────────┴─────────────┘

Numeric Data Types

Signed Integer Types

These types are fixed-length integers that store signed numerical values. Examples include:

| Type | Bytes | Range | |---------|-------|-------------------| | Int8 | 1 | [-128 to 127] | | Int16 | 2 | [-32,768 to 32,767] | | Int32 | 4 | [-2,147,483,648 to 2,147,483,647] | | Int64 | 8 | [-9.2e+18 to 9.2e+18] | | Int128 | 16 | [-2^127 to 2^127-1] |

Unsigned Integer Types

These types store non-negative integers in a fixed-length format:

| Type | Bytes | Range | |---------|-------|-------------------| | UInt8 | 1 | [0 to 255] | | UInt16 | 2 | [0 to 65,535] | | UInt32 | 4 | [0 to 4,294,967,295] | | UInt64 | 8 | [0 to 1.8e+19] |

Floating Point Types

Float32

This type stores single-precision floating-point numbers. It provides up to 7 digits of precision:

| Type | Bytes | Precision (digits) | |---------|-------|--------------------| | Float32 | 4 | 7 |

SELECT toFloat32(0.123456789) AS result;
-- Output: 0.12345679

Float64

Double-precision floating-point type capable of storing values with up to 16 digits of precision:

| Type | Bytes | Precision (digits) | |---------|-------|--------------------| | Float64 | 8 | 16 |

SELECT toFloat64(0.1234567890123456789) AS result;
-- Output: 0.12345678901234568

Decimal Data Types

These types allow for signed fixed-point arithmetic while maintaining high precision for operations like multiplication. ClickHouse supports varying scales and precision levels:

| Precision Range | Example (P, S) | Equivalent Type | |-----------------|------------------|-----------------| | 1 to 9 | Decimal(9, 2) | Decimal32(2) | | 10 to 18 | Decimal(18, 2) | Decimal64(2) | | 19 to 38 | Decimal(38, 2) | Decimal128(2) | | 39 to 76 | Decimal(76, 2) | Decimal256(2) |

Operations Behavior

When performing arithmetic on decimals of different precisions, results conform to the precision rules below:

  • Addition/Subtraction: Scale matches the larger of the operands.
  • Multiplication: Scale equals the sum of operand scales.
  • Division: Result adheres to the scale of the dividend.
SELECT
    toDecimal64(2, 3) AS value1,
    toDecimal32(2, 2) AS value2,
    value1 + value2 AS result;
-- Output: Decimal(18, 3)

String Types

String

ClickHouse supports arbitrarily long strings. Strings may contain any byte sequence, which makes them a versatile replacement for traditional textual types such as VARCHAR or TEXT in other databases.

FixedString

A fixed-length string type useful for scenarios with clear size constraints:

CREATE TABLE example_table (fixed_col FixedString(6));
-- Note: trailing null bytes are appended for shorter values
SELECT toFixedString('hello', 6) AS padded_string;
-- Output: 'hello\0'

UUID Type

The UUID type is a 128-bit identifier written in the format 8-4-4-4-12. It is commonly used as a primary key.

CREATE TABLE uuids (id UUID);
INSERT INTO uuids VALUES (generateUUIDv4());
-- Example UUID: 'b6b019b5-ee5c-4967-9c4d-8ff95d332230'

Date and Time Types

Date

Represents dates within a range from 1970-01-01 onward, stored in two bytes. No timezone information is included.

CREATE TABLE dates (x Date);
INSERT INTO dates VALUES ('2020-10-01');
-- Retrieve data
SELECT x FROM dates;
-- Output: 2020-10-01

DateTime

Stores date-time values as Unix timestamps, allowing timezone-aware queries:

CREATE TABLE events (occurrence DateTime);
INSERT INTO events VALUES ('2020-10-01 00:00:00');
SELECT toDateTime(occurrence, 'Asia/Shanghai') AS localized_time;
-- Output: DateTime('Asia/Shanghai')

Boolean Type

ClickHouse lacks a native boolean type but utilizes UInt8 to represent boolean values (0 = false, 1 = true).

Array Type

Allows storage of values of a specified type T within arrays:

SELECT array(10, 20, 30) AS scores;
-- Output: Array(UInt32)

Tuple Type

Tuples group values of different types, unlike arrays:

CREATE TABLE tuple_test (info Tuple(String, Int8));
INSERT INTO tuple_test VALUES (('Alice', 25));
SELECT * FROM tuple_test;
-- Output: ('Alice', 25)

Specialized Types

Nullable

Nullable allows columns to store null values:

CREATE TABLE with_nulls (col1 Int8, col2 Nullable(Int8));
INSERT INTO with_nulls VALUES (42, NULL);
SELECT col1 + col2 AS result FROM with_nulls;
-- Output: NULL for rows where col2 is NULL

Domain Types

IPv4 and IPv6 are domain-specific types for storing optimized binary-compatible representations of IP addresses:

CREATE TABLE ip_records (url String, source IPv4);
INSERT INTO ip_records VALUES ('example.com', '192.168.0.1');
SELECT * FROM ip_records;
-- Output: Compact representation of IP addresses
Tags: ClickHouse

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.