[TOC]

<br >

*****

# **Installing a Single-Node Kafka 2.3.1 Environment on CentOS 7**

<br >

## **Prerequisites**

Assume the following are already in place (if not, see *Appendix A*):

* A virtual machine with 2 CPU cores and 4 GB of RAM, IP address 192.168.80.81
* CentOS 7.7 64-bit installed
* root privileges
* JDK 1.8.0\_221 installed

<br >

## **Installation**

1. Download Kafka

Download the recommended binary package [kafka\_2.12-2.3.1.tgz](https://www.apache.org/dyn/closer.cgi?path=/kafka/2.3.1/kafka_2.12-2.3.1.tgz) from the [Apache Kafka](http://kafka.apache.org/downloads) website, upload it to the /opt directory, and unpack it:

~~~
# cd /opt
# tar -xzf kafka_2.12-2.3.1.tgz
# chown -R lemon:oper /opt/kafka_2.12-2.3.1
# cd /opt/kafka_2.12-2.3.1
~~~

<br >

2. Start the servers

Kafka uses [ZooKeeper](https://zookeeper.apache.org/), so if you do not already have a ZooKeeper server you need to start one first. You can use the convenience script packaged with Kafka to bring up a quick single-node ZooKeeper instance.

~~~
$ nohup bin/zookeeper-server-start.sh config/zookeeper.properties > zookeeper.log &
...
[2020-01-11 21:35:36,457] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
~~~

Wait a moment for ZooKeeper to finish starting, then start the Kafka server:

~~~
$ export JMX_PORT=9988
$ nohup bin/kafka-server-start.sh config/server.properties > kafka-server.log &
...
[2020-01-11 21:37:22,229] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
~~~

<br >

## **Testing**

1. Create a topic

Let's create a topic named "test" with a single partition and a single replica:

~~~
$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Created topic test.
~~~

We can now run the list command to see this topic:

~~~
$ bin/kafka-topics.sh --list --zookeeper localhost:2181
test
~~~

<br >

2. Send some messages

Kafka ships with a command-line client that takes input from a file or from standard input and sends it to the Kafka cluster as messages. By default, each line is sent as a separate message.

Run the producer in another terminal, then type a few messages into the console to send them to the server:

~~~
$ cd /opt/kafka_2.12-2.3.1/
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
> Hello World!
> Hello China!
^C
~~~

<br >

3. Start a consumer

Kafka also provides a command-line consumer that dumps messages to standard output:

~~~
$ cd /opt/kafka_2.12-2.3.1/
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Hello World!
Hello China!
^C
~~~

If you run each of the above commands in a different terminal, you can now type messages into the producer terminal and watch them appear in the consumer terminal.

All of the command-line tools have additional options; running a command with no arguments prints detailed usage information.

<br >

4. Producer throughput test

kafka-producer-perf-test is the script Kafka provides for testing producer performance; it makes it easy to measure a producer's throughput and average latency over a period of time.

~~~
$ bin/kafka-producer-perf-test.sh --topic test --num-records 500000 --record-size 200 --throughput -1 --producer-props bootstrap.servers=localhost:9092 acks=-1
221506 records sent, 44063.3 records/sec (8.40 MB/sec), 2382.8 ms avg latency, 3356.0 ms max latency.
500000 records sent, 64102.564103 records/sec (12.23 MB/sec), 2078.40 ms avg latency, 3356.00 ms max latency, 1841 ms 50th, 3250 ms 95th, 3343 ms 99th, 3354 ms 99.9th.
~~~

The output shows that a Kafka producer on this test machine sends *64102* records per second on average, for an average throughput of *12.23 MB/s* (roughly 97.84 Mb/s of bandwidth), with an average latency of *2078* ms and a maximum latency of *3356* ms; 50% of messages were sent within *1841* ms, 95% within *3250* ms, 99% within *3343* ms, and 99.9% within *3354* ms.

<br >

5. Consumer throughput test

Similar to kafka-producer-perf-test, Kafka also provides a convenient performance-testing script for consumers, kafka-consumer-perf-test. Let's use it to measure the consumer's throughput against the Kafka environment we just set up:

~~~
$ bin/kafka-consumer-perf-test.sh --broker-list localhost:9092 --messages 500000 --topic test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2020-01-13 12:52:48:112, 2020-01-13 12:52:49:295, 95.3675, 80.6149, 500002, 422655.9594, 25, 1158, 82.3553, 431780.6563
~~~

<br >

## **Shutdown**

1. First, stop the Kafka server using the kafka-server-stop script:

~~~
$ bin/kafka-server-stop.sh
~~~

<br >

2. Then, after a short wait, stop ZooKeeper using the zookeeper-server-stop script (a sketch that automates this ordering follows this section):

~~~
$ bin/zookeeper-server-stop.sh
~~~
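The ordering matters: the broker should have fully exited before ZooKeeper goes away. The following is a minimal bash sketch (not part of the Kafka distribution) that enforces this order by waiting for the broker process to disappear before stopping ZooKeeper; the pattern `kafka.Kafka` matches the broker's main class.

~~~
#!/usr/bin/env bash
# Sketch: stop the Kafka broker first, wait for it to exit, then stop ZooKeeper.
cd /opt/kafka_2.12-2.3.1

bin/kafka-server-stop.sh

# Wait until the broker process (main class kafka.Kafka) has actually exited.
while pgrep -f 'kafka\.Kafka' > /dev/null; do
    echo "waiting for the Kafka broker to shut down..."
    sleep 2
done

bin/zookeeper-server-stop.sh
~~~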
<br >

## **Tuning**

### **Operating System Tuning**

1. Raise the maximum number of open file descriptors:

~~~
# ulimit -n 100000
~~~

> Kafka users sometimes run into a "too many open files" error, which means the file-descriptor limit on the broker machine needs to be raised. A rough sizing formula is: (maximum possible number of partitions on the broker) \* (average data per partition / average log segment size + 3). In practice this value is usually set very high, for example 1000000.

<br >

2. Disable swap:

~~~
# sysctl vm.swappiness=1
# vim /etc/sysctl.conf
vm.swappiness=1
~~~

> Turning off swap is a standard tuning step for disk-intensive applications. Setting vm.swappiness to a small value such as 1 drastically reduces the use of swap space, which would otherwise severely degrade performance. You can verify the setting with `free -m`.

<br >

3. Optimize the /data partition:

Adjust the /data mount options: edit the file with *vim /etc/fstab* and append `,noatime,largeio` after defaults:

~~~
/dev/mapper/centos-data /data xfs defaults,noatime,largeio 0 0
~~~

> Disable atime updates: because Kafka relies heavily on physical disks for message persistence, the file system and its configuration are an important tuning step. On any Linux file system, Kafka recommends mounting with the noatime option, which disables updates to a file's atime (last access time) attribute. Skipping atime updates avoids the inode write on every access, greatly reducing the number of file-system write operations and improving cluster performance. Kafka does not use atime, so disabling it is safe. You can verify with `ls -l --time=atime`.
> The largeio option affects the I/O size reported by stat calls; for high-volume disk writes it can provide a modest performance gain.

Remount the /data partition (a combined verification sketch follows this subsection):

~~~
# mount -o remount /data
~~~

<br >

4. Producer throughput test

~~~
$ cd /opt/kafka_2.12-2.3.1
$ bin/kafka-producer-perf-test.sh --topic test --num-records 500000 --record-size 200 --throughput -1 --producer-props bootstrap.servers=localhost:9092 acks=-1
442836 records sent, 88496.4 records/sec (16.88 MB/sec), 1311.3 ms avg latency, 1937.0 ms max latency.
500000 records sent, 91642.228739 records/sec (17.48 MB/sec), 1328.20 ms avg latency, 1937.00 ms max latency, 1368 ms 50th, 1886 ms 95th, 1930 ms 99th, 1935 ms 99.9th.
~~~

<br >

5. Consumer throughput test

Re-run the kafka-consumer-perf-test script to measure consumer throughput after the OS-level tuning:

~~~
$ bin/kafka-consumer-perf-test.sh --broker-list localhost:9092 --messages 500000 --topic test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2020-01-13 13:17:34:483, 2020-01-13 13:17:35:580, 95.4016, 86.9659, 500181, 455953.5096, 20, 1077, 88.5809, 464420.6128
~~~
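Before moving on to JVM tuning, it can be worth double-checking that the OS settings above took effect and will survive a reboot. Below is a minimal verification sketch in bash; the /etc/security/limits.conf entries and the `lemon` user name are assumptions based on this setup (adjust the user and values to your environment, and don't append the entries more than once).

~~~
#!/usr/bin/env bash
# Sketch: verify the OS-level tuning applied above (run as root).

# 1. Current file-descriptor limit for this shell
ulimit -n

# 2. One common way to make the limit persistent across logins:
#    add nofile entries to /etc/security/limits.conf (assumes the broker
#    runs as the "lemon" user).
cat >> /etc/security/limits.conf <<'EOF'
lemon soft nofile 100000
lemon hard nofile 100000
EOF

# 3. Confirm swappiness is 1 and that swap is effectively unused
sysctl vm.swappiness
free -m

# 4. Confirm /data is mounted with the noatime and largeio options
mount | grep ' /data '
~~~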
<br >

### **JVM Tuning**

1. Adjust the JVM startup parameters:

Edit your profile with `vim ~/.bash_profile` and add the following:

~~~
export KAFKA_HEAP_OPTS="-Xmx1g -Xms1g -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=85"
~~~

Reload .bash\_profile:

~~~
source ~/.bash_profile
~~~

<br >

2. Producer throughput test

~~~
$ cd /opt/kafka_2.12-2.3.1
$ bin/kafka-producer-perf-test.sh --topic test --num-records 500000 --record-size 200 --throughput -1 --print-metrics --producer-props bootstrap.servers=localhost:9092 acks=-1
377763 records sent, 75552.6 records/sec (14.41 MB/sec), 1507.6 ms avg latency, 1979.0 ms max latency.
500000 records sent, 84160.915671 records/sec (16.05 MB/sec), 1466.62 ms avg latency, 1979.00 ms max latency, 1507 ms 50th, 1952 ms 95th, 1965 ms 99th, 1978 ms 99.9th.

Metric Name                                                                 Value
app-info:commit-id:{client-id=producer-1} : 18a913733fb71c01
app-info:start-time-ms:{client-id=producer-1} : 1579140833567
app-info:version:{client-id=producer-1} : 2.3.1
kafka-metrics-count:count:{client-id=producer-1} : 102.000
producer-metrics:batch-size-avg:{client-id=producer-1} : 16337.068
producer-metrics:batch-size-max:{client-id=producer-1} : 16377.000
producer-metrics:batch-split-rate:{client-id=producer-1} : 0.000
producer-metrics:batch-split-total:{client-id=producer-1} : 0.000
producer-metrics:buffer-available-bytes:{client-id=producer-1} : 33554432.000
producer-metrics:buffer-exhausted-rate:{client-id=producer-1} : 0.000
producer-metrics:buffer-exhausted-total:{client-id=producer-1} : 0.000
producer-metrics:buffer-total-bytes:{client-id=producer-1} : 33554432.000
producer-metrics:bufferpool-wait-ratio:{client-id=producer-1} : 0.081
producer-metrics:bufferpool-wait-time-total:{client-id=producer-1} : 2834023273.000
producer-metrics:compression-rate-avg:{client-id=producer-1} : 1.000
producer-metrics:connection-close-rate:{client-id=producer-1} : 0.000
producer-metrics:connection-close-total:{client-id=producer-1} : 0.000
producer-metrics:connection-count:{client-id=producer-1} : 2.000
producer-metrics:connection-creation-rate:{client-id=producer-1} : 0.056
producer-metrics:connection-creation-total:{client-id=producer-1} : 2.000
producer-metrics:failed-authentication-rate:{client-id=producer-1} : 0.000
producer-metrics:failed-authentication-total:{client-id=producer-1} : 0.000
producer-metrics:failed-reauthentication-rate:{client-id=producer-1} : 0.000
producer-metrics:failed-reauthentication-total:{client-id=producer-1} : 0.000
producer-metrics:incoming-byte-rate:{client-id=producer-1} : 10062.678
producer-metrics:incoming-byte-total:{client-id=producer-1} : 360586.000
producer-metrics:io-ratio:{client-id=producer-1} : 0.012
producer-metrics:io-time-ns-avg:{client-id=producer-1} : 23895.015
producer-metrics:io-wait-ratio:{client-id=producer-1} : 0.098
producer-metrics:io-wait-time-ns-avg:{client-id=producer-1} : 204193.407
producer-metrics:io-waittime-total:{client-id=producer-1} : 3537650773.000
producer-metrics:iotime-total:{client-id=producer-1} : 413981132.000
producer-metrics:metadata-age:{client-id=producer-1} : 5.829
producer-metrics:network-io-rate:{client-id=producer-1} : 358.811
producer-metrics:network-io-total:{client-id=producer-1} : 12858.000
producer-metrics:outgoing-byte-rate:{client-id=producer-1} : 2939361.696
producer-metrics:outgoing-byte-total:{client-id=producer-1} : 105329087.000
producer-metrics:produce-throttle-time-avg:{client-id=producer-1} : 0.000
producer-metrics:produce-throttle-time-max:{client-id=producer-1} : 0.000
producer-metrics:reauthentication-latency-avg:{client-id=producer-1} : NaN
producer-metrics:reauthentication-latency-max:{client-id=producer-1} : NaN
producer-metrics:record-error-rate:{client-id=producer-1} : 0.000
producer-metrics:record-error-total:{client-id=producer-1} : 0.000
producer-metrics:record-queue-time-avg:{client-id=producer-1} : 1458.931
producer-metrics:record-queue-time-max:{client-id=producer-1} : 1977.000
producer-metrics:record-retry-rate:{client-id=producer-1} : 0.000
producer-metrics:record-retry-total:{client-id=producer-1} : 0.000
producer-metrics:record-send-rate:{client-id=producer-1} : 13975.068
producer-metrics:record-send-total:{client-id=producer-1} : 500000.000
producer-metrics:record-size-avg:{client-id=producer-1} : 286.000
producer-metrics:record-size-max:{client-id=producer-1} : 286.000
producer-metrics:records-per-request-avg:{client-id=producer-1} : 77.809
producer-metrics:request-latency-avg:{client-id=producer-1} : 4.382
producer-metrics:request-latency-max:{client-id=producer-1} : 136.000
producer-metrics:request-rate:{client-id=producer-1} : 179.406
producer-metrics:request-size-avg:{client-id=producer-1} : 16383.432
producer-metrics:request-size-max:{client-id=producer-1} : 16431.000
producer-metrics:request-total:{client-id=producer-1} : 6429.000
producer-metrics:requests-in-flight:{client-id=producer-1} : 0.000
producer-metrics:response-rate:{client-id=producer-1} : 179.411
producer-metrics:response-total:{client-id=producer-1} : 6429.000
producer-metrics:select-rate:{client-id=producer-1} : 481.330
producer-metrics:select-total:{client-id=producer-1} : 17325.000
producer-metrics:successful-authentication-no-reauth-total:{client-id=producer-1} : 0.000
producer-metrics:successful-authentication-rate:{client-id=producer-1} : 0.000
producer-metrics:successful-authentication-total:{client-id=producer-1} : 0.000
producer-metrics:successful-reauthentication-rate:{client-id=producer-1} : 0.000
producer-metrics:successful-reauthentication-total:{client-id=producer-1} : 0.000
producer-metrics:waiting-threads:{client-id=producer-1} : 0.000
producer-node-metrics:incoming-byte-rate:{client-id=producer-1, node-id=node--1} : 12.335
producer-node-metrics:incoming-byte-rate:{client-id=producer-1, node-id=node-0} : 10064.949
producer-node-metrics:incoming-byte-total:{client-id=producer-1, node-id=node--1} : 442.000
producer-node-metrics:incoming-byte-total:{client-id=producer-1, node-id=node-0} : 360144.000
producer-node-metrics:outgoing-byte-rate:{client-id=producer-1, node-id=node--1} : 1.702
producer-node-metrics:outgoing-byte-rate:{client-id=producer-1, node-id=node-0} : 2943220.331
producer-node-metrics:outgoing-byte-total:{client-id=producer-1, node-id=node--1} : 61.000
producer-node-metrics:outgoing-byte-total:{client-id=producer-1, node-id=node-0} : 105329026.000
producer-node-metrics:request-latency-avg:{client-id=producer-1, node-id=node--1} : NaN
producer-node-metrics:request-latency-avg:{client-id=producer-1, node-id=node-0} : 4.382
producer-node-metrics:request-latency-max:{client-id=producer-1, node-id=node--1} : NaN
producer-node-metrics:request-latency-max:{client-id=producer-1, node-id=node-0} : 136.000
producer-node-metrics:request-rate:{client-id=producer-1, node-id=node--1} : 0.056
producer-node-metrics:request-rate:{client-id=producer-1, node-id=node-0} : 179.590
producer-node-metrics:request-size-avg:{client-id=producer-1, node-id=node--1} : 30.500
producer-node-metrics:request-size-avg:{client-id=producer-1, node-id=node-0} : 16388.521
producer-node-metrics:request-size-max:{client-id=producer-1, node-id=node--1} : 37.000
producer-node-metrics:request-size-max:{client-id=producer-1, node-id=node-0} : 16431.000
producer-node-metrics:request-total:{client-id=producer-1, node-id=node--1} : 2.000
producer-node-metrics:request-total:{client-id=producer-1, node-id=node-0} : 6427.000
producer-node-metrics:response-rate:{client-id=producer-1, node-id=node--1} : 0.056
producer-node-metrics:response-rate:{client-id=producer-1, node-id=node-0} : 179.615
producer-node-metrics:response-total:{client-id=producer-1, node-id=node--1} : 2.000
producer-node-metrics:response-total:{client-id=producer-1, node-id=node-0} : 6427.000
producer-topic-metrics:byte-rate:{client-id=producer-1, topic=test} : 2934343.237
producer-topic-metrics:byte-total:{client-id=producer-1, topic=test} : 104981998.000
producer-topic-metrics:compression-rate:{client-id=producer-1, topic=test} : 1.000
producer-topic-metrics:record-error-rate:{client-id=producer-1, topic=test} : 0.000
producer-topic-metrics:record-error-total:{client-id=producer-1, topic=test} : 0.000
producer-topic-metrics:record-retry-rate:{client-id=producer-1, topic=test} : 0.000
producer-topic-metrics:record-retry-total:{client-id=producer-1, topic=test} : 0.000
producer-topic-metrics:record-send-rate:{client-id=producer-1, topic=test} : 13975.459
producer-topic-metrics:record-send-total:{client-id=producer-1, topic=test} : 500000.000
~~~

> throughput: used for throttling. When the value is negative there is no throttling; when it is positive, the producer is blocked for a while whenever the send throughput exceeds that value.
> print-metrics: when this flag is specified, a large set of metrics is printed after the test completes, which can be a useful reference for many test scenarios.
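The full --print-metrics dump is long. If you only care about a handful of headline producer metrics, one option (a sketch using plain grep, not part of the Kafka tooling itself) is to filter the output; the metric names below are taken from the dump above:

~~~
# Sketch: re-run the same test and keep only a few key producer metrics.
$ bin/kafka-producer-perf-test.sh --topic test --num-records 500000 --record-size 200 --throughput -1 --print-metrics --producer-props bootstrap.servers=localhost:9092 acks=-1 \
  | grep -E 'record-send-rate|record-queue-time-avg|request-latency-avg|batch-size-avg'
~~~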
<br >

3. Consumer throughput test

Re-run the kafka-consumer-perf-test script once more to measure consumer throughput with the new JVM settings:

~~~
$ bin/kafka-consumer-perf-test.sh --broker-list localhost:9092 --messages 500000 --topic test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2020-01-13 13:24:56:267, 2020-01-13 13:24:57:259, 95.4016, 96.1710, 500181, 504214.7177, 26, 966, 98.7594, 517785.7143
~~~

<br >

# **References**

* [Apache Kafka QuickStart](http://kafka.apache.org/quickstart)

<br >