跳转至

快速使用

安装 Addax

如果你不想编译,你可以执行下面的命令,直接从下载已经编译好的二进制文件

curl -sS -o addax-4.0.2.tar.gz \ 
  https://github.com/wgzhao/Addax/releases/download/4.0.2/addax-4.0.2.tar.gz`

tar -xzf addax-4.0.2.tar.gz
cd addax-4.0.2
git clone https://github.com/wgzhao/addax.git
cd addax
git checkout 4.0.2
mvn clean package -pl '!:addax-docs'
mvn package assembly:single
cd target/addax/addax-4.0.2

开始第一个采集任务

要使用 Addax 进行数据采集,只需要编写一个任务采集文件,该文件为 JSON 格式,以下是一个简单的配置文件,该任务的目的是从内存读取读取指定内容的数据,并将其打印出来,文件保存在 job/test.json

{
  "job": {
    "setting": {
      "speed": {
        "byte": -1,
        "channel": 1
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.02
      }
    },
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "column": [
              {
                "value": "addax",
                "type": "string"
              },
              {
                "value": 19890604,
                "type": "long"
              },
              {
                "value": "1989-06-04 00:00:00",
                "type": "date"
              },
              {
                "value": true,
                "type": "bool"
              }
            ],
            "sliceRecordCount": 10
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": true
          }
        }
      }
    ]
  }
}

将上述文件保存为 job/test.json

然后执行下面的命令:

bin/addax.sh job/test.json

如果没有报错,应该会有类似这样的输出

  ___      _     _
 / _ \    | |   | |
/ /_\ \ __| | __| | __ ___  __
|  _  |/ _` |/ _` |/ _` \ \/ /
| | | | (_| | (_| | (_| |>  <
\_| |_/\__,_|\__,_|\__,_/_/\_\

:: Addax version ::    (v4.0.3-SNAPSHOT)

2021-08-23 13:45:17.199 [        main] INFO  VMInfo               - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl
2021-08-23 13:45:17.223 [        main] INFO  Engine               -
{
    "content":[
        {
            "reader":{
                "parameter":{
                    "column":[
                        {
                            "type":"string",
                            "value":"addax"
                        },
                        {
                            "type":"long",
                            "value":19890604
                        },
                        {
                            "type":"date",
                            "value":"1989-06-04 00:00:00"
                        },
                        {
                            "type":"bool",
                            "value":true
                        }
                    ],
                    "sliceRecordCount":10
                },
                "name":"streamreader"
            },
            "writer":{
                "parameter":{
                    "print":true
                },
                "name":"streamwriter"
            }
        }
    ],
    "setting":{
        "errorLimit":{
            "record":0,
            "percentage":0.02
        },
        "speed":{
            "byte":-1,
            "channel":1
        }
    }
}

2021-08-23 13:45:17.238 [        main] INFO  PerfTrace            - PerfTrace traceId=job_-1, isEnable=false, priority=0
2021-08-23 13:45:17.239 [        main] INFO  JobContainer         - Addax jobContainer starts job.
2021-08-23 13:45:17.240 [        main] INFO  JobContainer         - Set jobId = 0
2021-08-23 13:45:17.250 [       job-0] INFO  JobContainer         - Addax Reader.Job [streamreader] do prepare work .
2021-08-23 13:45:17.250 [       job-0] INFO  JobContainer         - Addax Writer.Job [streamwriter] do prepare work .
2021-08-23 13:45:17.251 [       job-0] INFO  JobContainer         - Job set Channel-Number to 1 channels.
2021-08-23 13:45:17.251 [       job-0] INFO  JobContainer         - Addax Reader.Job [streamreader] splits to [1] tasks.
2021-08-23 13:45:17.252 [       job-0] INFO  JobContainer         - Addax Writer.Job [streamwriter] splits to [1] tasks.
2021-08-23 13:45:17.276 [       job-0] INFO  JobContainer         - Scheduler starts [1] taskGroups.
2021-08-23 13:45:17.282 [ taskGroup-0] INFO  TaskGroupContainer   - taskGroupId=[0] start [1] channels for [1] tasks.
2021-08-23 13:45:17.287 [ taskGroup-0] INFO  Channel              - Channel set byte_speed_limit to -1, No bps activated.
2021-08-23 13:45:17.288 [ taskGroup-0] INFO  Channel              - Channel set record_speed_limit to -1, No tps activated.
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
addax   19890604    1989-06-04 00:00:00 true
2021-08-23 13:45:20.295 [       job-0] INFO  AbstractScheduler    - Scheduler accomplished all tasks.
2021-08-23 13:45:20.296 [       job-0] INFO  JobContainer         - Addax Writer.Job [streamwriter] do post work.
2021-08-23 13:45:20.297 [       job-0] INFO  JobContainer         - Addax Reader.Job [streamreader] do post work.
2021-08-23 13:45:20.302 [       job-0] INFO  JobContainer         - PerfTrace not enable!
2021-08-23 13:45:20.305 [       job-0] INFO  StandAloneJobContainerCommunicator - Total 10 records, 220 bytes | Speed 73B/s, 3 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.011s | Percentage 100.00%
2021-08-23 13:45:20.307 [       job-0] INFO  JobContainer         -
任务启动时刻                    : 2021-08-23 13:45:17
任务结束时刻                    : 2021-08-23 13:45:20
任务总计耗时                    :                  3s
任务平均流量                    :               73B/s
记录写入速度                    :              3rec/s
读出记录总数                    :                  10
读写失败总数                    :                   0
Back to top