
ClickHouse Reader

The ClickHouseReader plugin reads data from a ClickHouse database.

Example

Table structure and data

Assume the table to be read has the following structure and data:

CREATE TABLE ck_addax (
    c_int8 Int8,
    c_int16 Int16,
    c_int32 Int32,
    c_int64 Int64,
    c_uint8 UInt8,
    c_uint16 UInt16,
    c_uint32 UInt32,
    c_uint64 UInt64,
    c_float32 Float32,
    c_float64 Float64,
    c_decimal Decimal(38,10),
    c_string String,
    c_fixstr FixedString(36),
    c_uuid UUID,
    c_date Date,
    c_datetime DateTime('Asia/Chongqing'),
    c_datetime64 DateTime64(3, 'Asia/Chongqing'),
    c_enum Enum('hello' = 1, 'world'=2)
) ENGINE = MergeTree() ORDER BY (c_int8, c_int16) SETTINGS index_granularity = 8192;

insert into ck_addax values(
    127,
    -32768,
    2147483647,
    -9223372036854775808,
    255,
    65535,
    4294967295,
    18446744073709551615,
    0.999999999999,
    0.99999999999999999,
    1234567891234567891234567891.1234567891,
    'Hello String',
    '2c:16:db:a3:3a:4f',
    '5F042A36-5B0C-4F71-ADFD-4DF4FCA1B863',
    '2021-01-01',
    '2021-01-01 11:22:33',
    '2021-01-01 10:33:23.123',
    'hello'
);

Configure the JSON file

The following configuration file reads the specified table from the ClickHouse database and prints its rows to the terminal.

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1,
        "bytes": -1
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.02
      }
    },
    "content": {
      "reader": {
        "name": "clickhousereader",
        "parameter": {
          "username": "root",
          "password": "root",
          "column": [
            "*"
          ],
          "connection": {
            "table": [
              "ck_addax"
            ],
            "jdbcUrl": "jdbc:clickhouse://127.0.0.1:8123/default"
          }
        }
      },
      "writer": {
        "name": "streamwriter",
        "parameter": {
          "print": true
        }
      }
    }
  }
}

Save the configuration file above as job/clickhouse2stream.json
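If the same job layout is needed for several tables, the file can also be generated rather than written by hand. The following is a minimal sketch, not part of Addax: `build_job` and `save_job` are hypothetical helper names, and the keys and values are copied from the example configuration above.

```python
import json
from pathlib import Path


def build_job(table: str, jdbc_url: str, username: str, password: str) -> dict:
    """Return a job definition with the same layout as the example above."""
    return {
        "job": {
            "setting": {
                "speed": {"channel": 1, "bytes": -1},
                "errorLimit": {"record": 0, "percentage": 0.02},
            },
            "content": {
                "reader": {
                    "name": "clickhousereader",
                    "parameter": {
                        "username": username,
                        "password": password,
                        "column": ["*"],
                        "connection": {"table": [table], "jdbcUrl": jdbc_url},
                    },
                },
                "writer": {"name": "streamwriter", "parameter": {"print": True}},
            },
        }
    }


def save_job(job: dict, path: str) -> None:
    """Write the job as pretty-printed JSON, creating the parent directory if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(job, indent=2), encoding="utf-8")


# Hypothetical usage: same table, URL, and credentials as the example above.
job = build_job("ck_addax", "jdbc:clickhouse://127.0.0.1:8123/default", "root", "root")
job_text = json.dumps(job, indent=2)
print(job_text)
```

Calling `save_job(job, "job/clickhouse2stream.json")` would then write the file to the path used in this example.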

Run the collection command

Run the following command to collect the data

bin/addax.sh job/clickhouse2stream.json

The output is as follows (non-essential information removed)

2021-01-06 14:39:35.742 [main] INFO  VMInfo - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl

2021-01-06 14:39:35.767 [main] INFO  Engine -
{
    "content":
        {
            "reader":{
                "parameter":{
                    "column":[
                        "*"
                    ],
                    "connection":[
                        {
                            "jdbcUrl":[
                                "jdbc:clickhouse://127.0.0.1:8123/"
                            ],
                            "table":[
                                "ck_addax"
                            ]
                        }
                    ],
                    "username":"default"
                },
                "name":"clickhousereader"
            },
            "writer":{
                "parameter":{
                    "print":true
                },
                "name":"streamwriter"
            }
    },
    "setting":{
        "errorLimit":{
            "record":0,
            "percentage":0.02
        },
        "speed":{
            "channel":3
        }
    }
}

127 -32768  2147483647  -9223372036854775808    255 65535   4294967295  18446744073709551615    1   1   1234567891234567891234567891.1234567891Hello String 2c:16:db:a3:3a:4f   
5f042a36-5b0c-4f71-adfd-4df4fca1b863    2021-01-01  2021-01-01 00:00:00 2021-01-01 00:00:00 hello

Job start time                  : 2021-01-06 14:39:35
Job end time                    : 2021-01-06 14:39:39
Total elapsed time              :                  3s
Average throughput              :               77B/s
Record write speed              :              0rec/s
Total records read              :                   1
Total read/write failures       :                   0
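In the printed record, the two float columns both appear as 1 even though 0.999999999999 and 0.99999999999999999 were inserted. A likely explanation is ordinary IEEE 754 rounding: neither literal is representable at the precision of Float32 and Float64 respectively, so each is stored as the nearest value, which is exactly 1. The sketch below reproduces the rounding with the Python standard library and also confirms that the integer literals in the sample row are the bounds of their types. `to_float32` is a hypothetical helper name.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Float32 column: the literal is closer to 1.0 than to the next float32 below 1.0.
f32 = to_float32(0.999999999999)

# Float64 column: the literal is closer to 1.0 than to the next float64 below 1.0.
f64 = float("0.99999999999999999")

print(f32, f64)

# The integer literals in the sample row are the limits of each type.
bounds = {
    "Int8 max": 2**7 - 1,
    "Int16 min": -(2**15),
    "Int32 max": 2**31 - 1,
    "Int64 min": -(2**63),
    "UInt8 max": 2**8 - 1,
    "UInt16 max": 2**16 - 1,
    "UInt32 max": 2**32 - 1,
    "UInt64 max": 2**64 - 1,
}
for name, value in bounds.items():
    print(name, value)
```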

Parameters

This plugin is built on RDBMS Reader, so refer to the RDBMS Reader documentation for all available parameters.

Supported data types

Addax internal type | ClickHouse data types
Long                | UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Enum8, Enum16
Double              | Float32, Float64, Decimal
String              | String, FixedString(N)
Date                | Date, DateTime, DateTime64
Boolean             | UInt8
Bytes               | String

Limitations

Apart from the field types listed above, no other types are supported, for example Array and Nested.
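Before running a job, it can help to check a table's column types against the list of supported types. The sketch below is an illustration only and does not reflect how the plugin resolves types internally: `TYPE_TABLE` is transcribed from the table above, `addax_type` is a hypothetical helper, and where a ClickHouse type appears in two rows (UInt8, String) the first row is kept. Types not named in the table, including UUID from the sample table, come back as `None`, so treat the result as a hint.

```python
import re
from typing import Optional

# Transcribed from the supported data types table (Addax type -> ClickHouse types).
TYPE_TABLE = {
    "Long": ["UInt8", "UInt16", "UInt32", "UInt64",
             "Int8", "Int16", "Int32", "Int64", "Enum8", "Enum16"],
    "Double": ["Float32", "Float64", "Decimal"],
    "String": ["String", "FixedString"],
    "Date": ["Date", "DateTime", "DateTime64"],
    "Boolean": ["UInt8"],
    "Bytes": ["String"],
}

# Reverse lookup; setdefault keeps the first Addax type listed for a ClickHouse type.
LOOKUP = {}
for addax_name, ch_types in TYPE_TABLE.items():
    for ch_type in ch_types:
        LOOKUP.setdefault(ch_type, addax_name)


def addax_type(declaration: str) -> Optional[str]:
    """Map a column type such as 'Decimal(38,10)' to an Addax type, or None."""
    match = re.match(r"\s*(\w+)", declaration)
    if match is None:
        return None
    # Only the leading type name is used; parameters in parentheses are ignored.
    return LOOKUP.get(match.group(1))


for decl in ["Decimal(38,10)", "DateTime64(3, 'Asia/Chongqing')",
             "FixedString(36)", "Array(Int32)", "Nested(a Int8)"]:
    print(decl, "->", addax_type(decl))
```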