定稿人 | 定稿日期 | 系统环境
| :--------: | :-----: | :----: |
黄镇城 | 2018.2.25 | ubuntu14.04 + docker1.13 + docker-compose1.16
## cAdvisor+influxdb+grafana
#### 工具
* `cAdvisor`用来分析运行中的Docker容器的资源占用以及性能特性的工具。用来收集swarm节点性能数据保存到infuxdb中.
* `InfuxDB`是一个开源分布式时序数据库, 用来保存性能数据.
* `Grafana`性能绘图仪表盘工具, 读取Influxdb性能数据,绘图展示.
#### 创建overlay网络
```powershell
$ docker network create --driver overlay Monitor
5o2srynkuvecbs0x5wjccuvga
$ docker network ls | grep "Monitor"
5o2srynkuvec Monitor overlay swarm
```
#### influxDB on swarm
```powershell
$ docker service create --network logging -p 8083:8083 -p \
8086:8086 --mount source=influxdb-vol,type=volume,target=/var/lib/influxdb \
--name=influxdb --constraint 'node.hostname==Mai-II' influxdb:1.3
2ihgni3msvgxd4ljeb88jk3kn
Since --detach=false was not specified, tasks will be created in the background.
In a future release, --detach=false will become the default.
$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
2ihgni3msvgx influxdb replicated 1/1 influxdb:1.3 *:8083->8083/tcp,*:8086->8086/tcp
$ docker service ps influxdb
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ur5tk4emglcu influxdb.1 influxdb:1.3 Mai-II Running Running about a minute ago
```
### 扩展
* influxDB相当于数据库。
![](https://box.kancloud.cn/37f966d9ededa1d794ddb14fa1298c0a_1539x559.png)
![](https://box.kancloud.cn/6e65f3760b9408323c49db31f23022c2_1538x392.png)
#### cAdvisor on swarm 【没用到,可以不看】
```powershell
$ docker service create --network logging --name cadvisor --mode global --mount source=/var/run,type=bind,target=/var/run,readonly=false --mount source=/,type=bind,target=/rootfs,readonly=true --mount source=/sys,type=bind,target=/sys,readonly=true --mount source=/var/lib/docker,type=bind,target=/var/lib/docker,readonly=true google/cadvisor:v0.23.8 -storage_driver=influxdb -storage_driver_host=influxdb:8086 -storage_driver_db=cadvisor
itcf68rul93zwpybg6vhvt17e
Since --detach=false was not specified, tasks will be created in the background.
In a future release, --detach=false will become the default.
$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
2ihgni3msvgx influxdb replicated 1/1 influxdb:1.3 *:8083->8083/tcp,*:8086->8086/tcp
itcf68rul93z cadvisor global 2/2 google/cadvisor:v0.23.8
```
- `--mode global` 指定service运行在每个swarm节点上
- `--mount` 挂载本地docker socket用于监控docker性能
- `-storage_driver=influxdb` 指定存储驱动,使cadvisor将数据存储到数据库中
- `-storage_driver_host=influxdb:8086` InfluxDB地址
- `-storage_driver_db=cadvisor` 数据库名称
#### grafana on swarm
```powershell
$ docker service create --network logging -p 3000:3000 -e "GF_SECURITY_ADMIN_PASSWORD=admin" --constraint 'node.hostname==Mai-II' --name grafana grafana/grafana
h5mig45awt7svl4jy43bvahzl
Since --detach=false was not specified, tasks will be created in the background.
In a future release, --detach=false will become the default.
$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
2ihgni3msvgx influxdb replicated 1/1 influxdb:1.3 *:8083->8083/tcp,*:8086->8086/tcp
h5mig45awt7s grafana replicated 1/1 grafana/grafana:latest *:3000->3000/tcp
itcf68rul93z cadvisor global 2/2 google/cadvisor:v0.23.8
```
* y
![](https://box.kancloud.cn/cfe13bba75298b81982919c1c1126fd8_1542x731.png)
![](https://box.kancloud.cn/1910e407d2d17b01759185bbefd95bde_1008x1024.png)
![](https://box.kancloud.cn/efbbb7358405886a1351dd3373a56328_1240x582.png)
![](https://box.kancloud.cn/efbbb7358405886a1351dd3373a56328_1240x582.png)
![](https://box.kancloud.cn/d79ebb5f70c1f048d9875f7557ed6897_1868x1013.png)
![](https://box.kancloud.cn/524881476bfb93b3c9e2470b3a5a8b8d_1867x1014.png)
## telegraf+influxdb+grafana
* 采用这个方案进行监控展示
#### 启动influxdb
```powershell
$ docker run -itd -p 8083:8083 -p 8086:8086 -e ADMIN_USER="root" -e INFLUXDB_INIT_PWD="root" -e PER_CREATE_DB="telegraf" --name influxdb influxdb:1.3
# 第一次telegraf容器
$ docker run -itd --name telegraf telegraf
# 将telegraf容器里的telegraf.conf配置文件复制出来
$ docker cp telegraf:/etc/telegraf/telegraf.conf ./telegraf.conf
# 再次启动telegraf容器前需要修改telegraf配置文件。如下【在正式启动telegraf容器前需要修改配置】
$ docker run -itd --name=telegraf -v /home/hzy/telegraf.conf:/etc/telegraf/telegraf.conf -v /var/run:/var/run telegraf
# 启动Grafana服务,将密码设置为“admin”
$ docker run -itd -p 3000:3000 -e "GF_SECURITY_ADMIN_PASSWORD=admin" --name grafana grafana/grafana
```
#### 在正式启动telegraf容器前需要修改配置
![](https://box.kancloud.cn/94667ec8a5754330ddebf25ee0d4af81_966x180.png)
![](https://box.kancloud.cn/5da02ee6bc7f97c72c077e20ce6db62b_871x632.png)
#### 在Grafana添加influxdb的数据源
![](https://box.kancloud.cn/d9cff7ea65c98ddbd9c561e406f33a6e_1242x1014.png)
#### 在Grafana官网下载监控配置文件
> https://grafana.com/dashboards
* grafana官网有许多比较完善的配置文件可以下载后直接导入
①筛选对应的插件配置
![](https://box.kancloud.cn/f68d6cca347f9bddf4e838167179d2cb_1210x700.png)
②注意版本号
![](https://box.kancloud.cn/024a1a8d1a207a85ee0b62f023aa92c2_1203x829.png)
#### 导入json配置文件
* 导入json文件的入口
![](https://box.kancloud.cn/35279d01f8f459259093b0e50a0acc83_582x470.png)
* 选择导入文件和influxdb源
![](https://box.kancloud.cn/bf1d437aa0c90c58c0dfa070f11e404e_748x442.png)
* 成功后的效果
![](https://box.kancloud.cn/6d1f8d4e0a41660a919afcc3e96ec552_1850x1015.png)
## 扩展
* 监控的目的可以使计算机的资源占用和容器的运行状态起到监督的作用。Grafana的数据展示使其各个资源和状态一目了然。不需要通过N个命令行分别查看。
* 未来:制作监控报警,比如资源占比90%的阈值时通过邮箱方式报警。