Configuration · Prometheus

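## Configuration

---

Prometheus is configured via command-line flags and a configuration file. The command-line flags configure system parameters (such as storage locations and the amount of data to keep on disk and in memory), while the configuration file defines everything related to scraping [jobs and their instances](https://prometheus.io/docs/concepts/jobs_instances/), as well as which [rule files](https://prometheus.io/docs/querying/rules/#configuring-rules) to load.

To view all available command-line flags, run `prometheus -h`.

Prometheus can reload its configuration at runtime. If the new configuration is not well-formed, the changes are not applied. A reload is triggered by sending the `SIGHUP` signal to the Prometheus process or by sending an HTTP POST request to the `/-/reload` endpoint. This also reloads any configured rule files.

### Configuration file

To specify which configuration file to load, use the `-config.file` command-line flag.

The file is written in [YAML](http://en.wikipedia.org/wiki/YAML) format, defined by the scheme described below. Brackets indicate that a parameter is optional. For non-list parameters the value is set to the specified default.

Generic placeholders are defined as follows:

- `<boolean>`: a boolean that can take the values `true` or `false`
- `<duration>`: a duration matching the regular expression `[0-9]+(ms|[smhdwy])`
- `<labelname>`: a string matching the regular expression `[a-zA-Z_][a-zA-Z0-9_]*`
- `<labelvalue>`: a string of unicode characters
- `<filename>`: a valid path in the current working directory
- `<host>`: a valid string consisting of a hostname or IP address followed by an optional port number
- `<path>`: a valid URL path
- `<scheme>`: a string that can take the values `http` or `https`
- `<string>`: a regular string

The other placeholders are specified separately.

A valid example configuration file can be found [here](https://github.com/prometheus/prometheus/blob/master/config/testdata/conf.good.yml).

The global configuration specifies parameters that are valid in all other configuration contexts. They also serve as defaults for other configuration sections.

```
global:
  # How frequently to scrape targets by default.
  [ scrape_interval: <duration> | default = 1m ]

  # How long until a scrape request times out.
  [ scrape_timeout: <duration> | default = 10s ]

  # How frequently to evaluate rules.
  [ evaluation_interval: <duration> | default = 1m ]

  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

# Rule files specifies a list of file path globs. Rules and alerts are read
# from all matching files.
rule_files:
  [ - <filepath_glob> ... ]

# A list of scrape configurations.
scrape_configs:
  [ - <scrape_config> ... ]

# Alerting settings.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

# Settings related to the experimental remote write feature.
remote_write:
  [ url: <string> ]
  [ remote_timeout: <duration> | default = 30s ]
  tls_config:
    [ <tls_config> ]
  [ proxy_url: <string> ]
  basic_auth:
    [ username: <string> ]
    [ password: <string> ]
  write_relabel_configs:
    [ - <relabel_config> ...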
    ]
```

#### `<scrape_config>`

A `<scrape_config>` section specifies a set of targets and the parameters describing how to scrape them. In the general case, one scrape configuration specifies a single job. In advanced configurations, this may change.

Targets may be statically configured via the `static_configs` parameter or dynamically discovered using one of the supported service discovery mechanisms.

Additionally, `relabel_configs` allow advanced modifications to any target and its labels before scraping.

```
# The job name assigned to scraped metrics by default.
job_name: <job_name>

# How frequently to scrape targets from this job.
[ scrape_interval: <duration> | default = <global_config.scrape_interval> ]

# Timeout for each scrape of a target in this job.
[ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]

# The HTTP resource path on which to fetch metrics from targets.
[ metrics_path: <path> | default = /metrics ]

# honor_labels controls how Prometheus handles conflicts between labels that are
# already present in scraped data and labels that Prometheus would attach
# server-side ("job" and "instance" labels, manually configured target
# labels, and labels generated by service discovery implementations).
#
# If honor_labels is set to "true", label conflicts are resolved by keeping label
# values from the scraped data and ignoring the conflicting server-side labels.
#
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels. This is useful for use cases such as federation, where all labels
# specified in the target should be preserved.
#
# Note that any globally configured "external_labels" are unaffected by this
# setting. In communication with external systems, they are always applied only
# when a time series does not have a given label yet and are ignored otherwise.
[ honor_labels: <boolean> | default = false ]

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Optional HTTP URL parameters.
params:
  [ <string>: [<string>, ...] ]

# Sets the `Authorization` header on every scrape request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every scrape request with
# the configured bearer token.
# It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every scrape request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured targets for this job.
static_configs:
  [ - <static_config> ... ]

# List of target relabel configurations, applied before scraping.
relabel_configs:
  [ - <relabel_config> ... ]

# List of metric relabel configurations.
metric_relabel_configs:
  [ - <relabel_config> ... ]

# Per-scrape limit on number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabelling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]
```

Note that `<job_name>` must be unique across all scrape configurations.

#### `<tls_config>`

A `<tls_config>` allows configuring TLS connections.

```
# CA certificate to validate the server certificate with.
[ ca_file: <filename> ]

# Certificate and key files.
[ cert_file: <filename> ]
[ key_file: <filename> ]

# ServerName extension to indicate the name of the server.
# http://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]

# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> ]
```

#### `<azure_sd_config>`

**Azure SD is in beta: breaking changes to configuration are still likely in future releases.**

Azure SD configurations allow retrieving scrape targets from Azure VMs.

The following meta labels are available on targets during relabeling:

- `__meta_azure_machine_id`: the machine ID
- `__meta_azure_machine_location`: the location the machine runs in
- `__meta_azure_machine_name`: the machine name
- `__meta_azure_machine_private_ip`: the machine's private IP
- `__meta_azure_machine_resource_group`: the machine's resource group
- `__meta_azure_tag_<tagname>`: each tag value of the machine

See below for the configuration options for Azure discovery:

```
# The information to access the Azure API.
# The subscription ID.
subscription_id: <string>
# The tenant ID.
tenant_id: <string>
# The client ID.
client_id: <string>
# The client secret.
client_secret: <string>

# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 300s ]

# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
```

#### `<consul_sd_config>`

Consul SD configurations allow retrieving scrape targets from Consul's Catalog API.

The following meta labels are available on targets during relabeling:

- `__meta_consul_address`: the address of the target
- `__meta_consul_dc`: the datacenter name for the target
- `__meta_consul_node`: the node name defined for the target
- `__meta_consul_service_address`: the service address of the target
- `__meta_consul_service_id`: the service ID of the target
- `__meta_consul_service_port`: the service port of the target
- `__meta_consul_service`: the name of the service the target belongs to
- `__meta_consul_tags`: the list of tags of the target joined by the tag separator

```
# The information to access the Consul API.
server: <host>
[ token: <string> ]
[ datacenter: <string> ]
[ scheme: <string> ]
[ username: <string> ]
[ password: <string> ]

# A list of services for which targets are retrieved. If omitted, all services
# are scraped.
services:
  [ - <string> ]

# The string by which Consul tags are joined into the tag label.
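[ tag_separator: <string> | default = , ]
```

Note that the IP address and port used to scrape the targets are assembled as `<__meta_consul_address>:<__meta_consul_service_port>`. However, in some Consul setups, the relevant address is in `__meta_consul_service_address`. In those cases, you can use the [relabel](https://prometheus.io/docs/operating/configuration/#relabel_config) feature to replace the special `__address__` label.

#### `<dns_sd_config>`

A DNS-based service discovery configuration allows specifying a set of DNS domain names which are periodically queried to discover a list of targets. The DNS servers to be contacted are read from `/etc/resolv.conf`.

This service discovery method only supports basic DNS A, AAAA and SRV record queries, but not the advanced DNS-SD approach specified in RFC6763.

During the [relabeling phase](https://prometheus.io/docs/operating/configuration/#relabel_config), the meta label `__meta_dns_name` is available on each target and is set to the record name that produced the discovered target.

```
# A list of DNS domain names to be queried.
names:
  [ - <domain_name> ]

# The type of DNS query to perform: SRV, A, or AAAA.
[ type: <query_type> | default = 'SRV' ]

# The port number used if the query type is not SRV.
[ port: <number>]

# The time after which the provided names are refreshed.
[ refresh_interval: <duration> | default = 30s ]
```

`<domain_name>` must be a valid DNS domain name. `<query_type>` must be one of `SRV`, `A`, or `AAAA`.

#### `<ec2_sd_config>`

EC2 SD configurations allow retrieving scrape targets from AWS EC2 instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.

The following meta labels are available on targets during relabeling:

- `__meta_ec2_availability_zone`: the availability zone in which the instance is running
- `__meta_ec2_instance_id`: the EC2 instance ID
- `__meta_ec2_instance_state`: the state of the EC2 instance
- `__meta_ec2_instance_type`: the type of the EC2 instance
- `__meta_ec2_private_ip`: the private IP address of the instance, if present
- `__meta_ec2_public_dns_name`: the public DNS name of the instance, if available
- `__meta_ec2_public_ip`: the public IP address of the instance, if available
- `__meta_ec2_subnet_id`: the list of subnet IDs in which the instance is running, if available
- `__meta_ec2_tag_<tagkey>`: each tag value of the instance
- `__meta_ec2_vpc_id`: the ID of the VPC in which the instance is running, if available

See below for the configuration options for EC2 discovery:

```
# The information to access the EC2 API.

# The AWS region.
region: <string>

# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[ access_key: <string> ]
[ secret_key: <string> ]
# Named AWS profile used to connect to the API.
[ profile: <string> ]

# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 60s ]

# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.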
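[ port: <int> | default = 80 ]
```

#### `<file_sd_config>`

File-based service discovery provides a more generic way to configure static targets and serves as an interface to plug in custom service discovery mechanisms.

It reads a set of files containing a list of zero or more `<static_config>`s. Changes to all defined files are detected via disk watches and applied immediately. Files may be provided in YAML or JSON format. Only changes resulting in well-formed target groups are applied.

The JSON file must contain a list of static configs, using this format:

```
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
```

As a fallback, the file contents are also re-read periodically at the specified refresh interval.

Each target has a meta label `__meta_filepath` during the relabeling phase. Its value is set to the path of the file from which the target was extracted.

```
# Patterns for files from which target groups are extracted.
files:
  [ - <filename_pattern> ... ]

# Refresh interval to re-read the files.
[ refresh_interval: <duration> | default = 5m ]
```

`<filename_pattern>` may be a path ending in `.json`, `.yml` or `.yaml`. The last path segment may contain a single `*` that matches any character sequence, e.g. `my/path/tg_*.json`.

As of `v0.20`, `files:` replaces `names:`.

#### `<gce_sd_config>`

**GCE SD is in beta: breaking changes to configuration are still likely in future releases.**

GCE SD configurations allow retrieving scrape targets from GCP GCE instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.

The following meta labels are available on targets during relabeling:

- `__meta_gce_instance_name`: the name of the instance
- `__meta_gce_metadata_<name>`: each metadata item of the instance
- `__meta_gce_network`: the network of the instance
- `__meta_gce_private_ip`: the private IP address of the instance
- `__meta_gce_project`: the GCP project in which the instance is running
- `__meta_gce_public_ip`: the public IP address of the instance, if present
- `__meta_gce_subnetwork`: the subnetwork of the instance
- `__meta_gce_tags`: the list of instance tags
- `__meta_gce_zone`: the GCE zone in which the instance is running

See below for the configuration options for GCE discovery:

```
# The information to access the GCE API.

# The GCP Project
project: <string>

# The zone of the scrape targets. If you need multiple zones use multiple
# gce_sd_configs.
zone: <string>

# Filter can be used optionally to filter the instance list by other criteria
[ filter: <string> ]

# Refresh interval to re-read the instance list
[ refresh_interval: <duration> | default = 60s ]

# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]

# The tag separator is used to separate the tags on concatenation
[ tag_separator: <string> | default = , ]
```

Credentials are discovered by the Google Cloud SDK default client by looking in the following places, preferring the first location found:

1. a JSON file specified by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable
2. a JSON file in the well-known path `$HOME/.config/gcloud/application_default_credentials.json`
3. fetched from the GCE metadata server

If Prometheus is running within GCE, the service account associated with the instance it is running on should have at least read-only permissions to the compute resources. If running outside of GCE, make sure to create an appropriate service account and place the credential file in one of the expected locations.

#### `<kubernetes_sd_config>`

**Kubernetes SD is in beta: breaking changes to configuration are still likely in future releases.**

Kubernetes SD configurations allow retrieving scrape targets from Kubernetes' REST API and always stay synchronized with the cluster state.

One of the following `role` types can be configured to discover targets:

##### node

The `node` role discovers one target per cluster node, with the address defaulting to the Kubelet's HTTP port. The target address defaults to the first existing address of the Kubernetes node object in the address type order of `NodeInternalIP`, `NodeExternalIP`, `NodeLegacyHostIP`, and `NodeHostName`.

Available meta labels:

- `__meta_kubernetes_node_name`: the name of the node object
- `__meta_kubernetes_node_label_<labelname>`: each label from the node object
- `__meta_kubernetes_node_annotation_<annotationname>`: each annotation from the node object
- `__meta_kubernetes_node_address_<address_type>`: the first address for each node address type, if it exists

In addition, the `instance` label for the node is set to the node name as retrieved from the API server.

##### service

The `service` role discovers a target for each service port of each service. This is generally useful for blackbox monitoring of a service. The address is set to the Kubernetes DNS name of the service and the respective service port.

Available meta labels:

- `__meta_kubernetes_namespace`: the namespace of the service object
- `__meta_kubernetes_service_name`: the name of the service object
- `__meta_kubernetes_service_label_<labelname>`: the label of the service object
- `__meta_kubernetes_service_annotation_<annotationname>`: the annotation of the service object
- `__meta_kubernetes_service_port_name`: the name of the service port for the target
- `__meta_kubernetes_service_port_number`: the number of the service port for the target
- `__meta_kubernetes_service_port_protocol`: the protocol of the service port for the target

##### pod

The `pod` role discovers all pods and exposes their containers as targets. For each declared port of a container, a single target is generated. If a container has no specified ports, a port-free target per container is created so that a port can be added manually via relabeling.

Available meta labels:

- `__meta_kubernetes_namespace`: the namespace of the pod object
- `__meta_kubernetes_pod_name`: the name of the pod object
- `__meta_kubernetes_pod_ip`: the pod IP of the pod object
- `__meta_kubernetes_pod_label_<labelname>`: the label of the pod object
- `__meta_kubernetes_pod_annotation_<annotationname>`: the annotation of the pod object
- `__meta_kubernetes_pod_container_name`: the name of the container the target address points to
- `__meta_kubernetes_pod_container_port_name`: the name of the container port
- `__meta_kubernetes_pod_container_port_number`: the number of the container port
- `__meta_kubernetes_pod_container_port_protocol`: the protocol of the container port
- `__meta_kubernetes_pod_ready`: set to `true` or `false` for the pod's ready state
- `__meta_kubernetes_pod_node_name`: the name of the node the pod is scheduled onto
- `__meta_kubernetes_pod_host_ip`: the host IP of the pod object

##### endpoints

The `endpoints` role discovers targets from the listed endpoints of a service. For each endpoint address, one target is discovered per port. If the endpoint is backed by a pod, all additional container ports of the pod that are not bound to an endpoint port are discovered as targets as well.

Available meta labels:

- `__meta_kubernetes_namespace`: the namespace of the endpoints object
- `__meta_kubernetes_endpoints_name`: the name of the endpoints object
- For all targets discovered directly from the endpoints list, the following labels are attached:
  - `__meta_kubernetes_endpoint_ready`: set to `true` or `false` for the endpoint's ready state
  - `__meta_kubernetes_endpoint_port_name`: the name of the endpoint port
  - `__meta_kubernetes_endpoint_port_protocol`: the protocol of the endpoint port
- If the endpoints belong to a service, all labels of the `role: service` discovery are attached.
- For all targets backed by a pod, all labels of the `role: pod` discovery are attached.

See below for the configuration options for Kubernetes discovery:

```
# The information to access the Kubernetes API.

# The API server addresses. If left empty, Prometheus is assumed to run inside
# of the cluster and will discover API servers automatically and use the pod's
# CA certificate and bearer token file at /var/run/secrets/kubernetes.io/serviceaccount/.
[ api_server: <host> ]

# The Kubernetes role of entities that should be discovered.
role: <role>

# Optional authentication information used to authenticate to the API server.
# Note that `basic_auth`, `bearer_token` and `bearer_token_file` options are
# mutually exclusive.

# Optional HTTP basic authentication information.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Optional bearer token authentication information.
[ bearer_token: <string> ]

# Optional bearer token file authentication information.
[ bearer_token_file: <filename> ]

# TLS configuration.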
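tls_config:
  [ <tls_config> ]
```

`<role>` must be `endpoints`, `service`, `pod`, or `node`.

See [this example Prometheus configuration file](https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml) for a detailed example of configuring Prometheus for Kubernetes.

You may also wish to check out the third-party Prometheus Operator, which automates the Prometheus setup on top of Kubernetes.

#### `<marathon_sd_config>`

**Marathon SD is in beta: breaking changes to configuration are still likely in future releases.**

Marathon SD configurations allow retrieving scrape targets using the [Marathon](https://mesosphere.github.io/marathon/) REST API. Prometheus periodically checks the REST endpoint for currently running tasks and creates a target group for every app that has at least one healthy task.

The following meta labels are available on targets during relabeling:

- `__meta_marathon_app`: the name of the app
- `__meta_marathon_image`: the name of the Docker image used
- `__meta_marathon_task`: the ID of the Mesos task
- `__meta_marathon_app_label_<labelname>`: any Marathon labels attached to the app

See below for the configuration options for Marathon discovery:

```
# List of URLs to be used to contact Marathon servers.
# You need to provide at least one server URL, but should provide URLs for
# all masters you have running.
servers:
  - <string>

# Polling interval
[ refresh_interval: <duration> | default = 30s ]
```

By default, every app listed in Marathon is scraped by Prometheus. If not all of your services provide Prometheus metrics, you can use a Marathon label and Prometheus relabeling to control which instances are actually scraped. Also by default, all apps show up as a single job in Prometheus, which can be changed using relabeling as well.

#### `<nerve_sd_config>`

Nerve SD configurations allow retrieving scrape targets from AirBnB's Nerve, which are stored in Zookeeper.

The following meta labels are available on targets during relabeling:

- `__meta_nerve_path`: the full path to the endpoint node in Zookeeper
- `__meta_nerve_endpoint_host`: the host of the endpoint
- `__meta_nerve_endpoint_port`: the port of the endpoint
- `__meta_nerve_endpoint_name`: the name of the endpoint

```
# The Zookeeper servers.
servers:
  - <host>
# Paths can point to a single service, or the root of a tree of services.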
paths:
  - <string>
[ timeout: <duration> | default = 10s ]
```

#### `<serverset_sd_config>`

Serverset SD configurations allow retrieving scrape targets from Serversets, which are stored in Zookeeper. Serversets are commonly used by [Finagle](https://twitter.github.io/finagle/) and [Aurora](http://aurora.apache.org/).

The following meta labels are available on targets during relabeling:

- `__meta_serverset_path`: the full path to the serverset member node in Zookeeper
- `__meta_serverset_endpoint_host`: the host of the default endpoint
- `__meta_serverset_endpoint_port`: the port of the default endpoint
- `__meta_serverset_endpoint_host_<endpoint>`: the host of the given endpoint
- `__meta_serverset_endpoint_port_<endpoint>`: the port of the given endpoint
- `__meta_serverset_shard`: the shard number of the member
- `__meta_serverset_status`: the status of the member

```
# The Zookeeper servers.
servers:
  - <host>
# Paths can point to a single serverset, or the root of a tree of serversets.
paths:
  - <string>
[ timeout: <duration> | default = 10s ]
```

Serverset data must be in the JSON format; the Thrift format is not currently supported.

#### `<triton_sd_config>`

**Triton SD is in beta: breaking changes to configuration are still likely in future releases.**

[Triton](https://github.com/joyent/triton) SD configurations allow retrieving scrape targets from Container Monitor discovery endpoints.

The following meta labels are available on targets during relabeling:

- `__meta_triton_machine_id`: the UUID of the target container
- `__meta_triton_machine_alias`: the alias of the target container
- `__meta_triton_machine_image`: the target container's image type
- `__meta_triton_machine_server_id`: the server UUID for the target container

```
# The information to access the Triton discovery API.

# The account to use for discovering new target containers.
account: <string>

# The DNS suffix which should be applied to target containers.
dns_suffix: <string>

# The Triton discovery endpoint (e.g. 'cmon.us-east-3b.triton.zone'). This is
# often the same value as dns_suffix.
endpoint: <string>

# The port to use for discovery and metric scraping.
[ port: <int> | default = 9163 ]

# The interval which should be used for refreshing target containers.
[ refresh_interval: <duration> | default = 60s ]

# The Triton discovery API version.
[ version: <int> | default = 1 ]

# TLS configuration.
tls_config:
  [ <tls_config> ]
```

#### `<static_config>`

A `static_config` allows specifying a list of targets and a common label set for them. It is the canonical way to specify static targets in a scrape configuration.

```
# The targets specified by the static config.
targets:
  [ - '<host>' ]

# Labels assigned to all metrics scraped from the targets.
labels:
  [ <labelname>: <labelvalue> ... ]
```

#### `<relabel_config>`

Relabeling is a powerful tool to dynamically rewrite the label set of a target before it gets scraped. Multiple relabeling steps can be configured per scrape configuration. They are applied to the label set of each target in the order of their appearance in the configuration file.

Initially, aside from the configured per-target labels, a target's `job` label is set to the `job_name` value of the respective scrape configuration, and the `__address__` label is set to the `<host>:<port>` address of the target. After relabeling, the `instance` label is set to the value of `__address__` by default. The `__scheme__` and `__metrics_path__` labels are set to the scheme and metrics path of the target respectively. The `__param_<name>` label is set to the value of the first passed URL parameter called `<name>`.

Additional labels prefixed with `__meta_` may be available during the relabeling phase. They are set by the service discovery mechanism that provided the target.

Labels starting with `__` are removed from the label set after relabeling is completed.

If a relabeling step needs to store a label value only temporarily (as the input to a subsequent relabeling step), use the `__tmp` label name prefix. This prefix is guaranteed to never be used by Prometheus itself.

```
# The source labels select values from existing labels. Their content is concatenated
# using the configured separator and matched against the configured regular expression
# for the replace, keep, and drop actions.
[ source_labels: '[' <labelname> [, ...] ']' ]

# Separator placed between concatenated source label values.
[ separator: <string> | default = ; ]

# Label to which the resulting value is written in a replace action.
# It is mandatory for replace actions.
[ target_label: <labelname> ]

# Regular expression against which the extracted value is matched.
[ regex: <regex> | default = (.*) ]

# Modulus to take of the hash of the source label values.
[ modulus: <uint64> ]

# Replacement value against which a regex replace is performed if the
# regular expression matches.
[ replacement: <string> | default = $1 ]

# Action to perform based on regex matching.
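[ action: <relabel_action> | default = replace ]
```

`<regex>` is any valid regular expression. It is required for the `replace`, `keep`, `drop`, `labelmap`, `labeldrop` and `labelkeep` actions. The regex is anchored on both ends. To un-anchor the regex, use `.*<regex>.*`.

`<relabel_action>` determines the relabeling action to take:

- `replace`: Match `regex` against the concatenated `source_labels`. Then set `target_label` to `replacement`, with match group references (`${1}`, `${2}`, ...) in `replacement` substituted by their value. If `regex` does not match, no replacement takes place.
- `keep`: Drop targets for which `regex` does not match the concatenated `source_labels`.
- `drop`: Drop targets for which `regex` matches the concatenated `source_labels`.
- `hashmod`: Set `target_label` to the `modulus` of a hash of the concatenated `source_labels`.
- `labelmap`: Match `regex` against all label names. Then copy the values of the matching labels to label names given by `replacement`, with match group references (`${1}`, `${2}`, ...) in `replacement` substituted by their value.
- `labeldrop`: Match `regex` against all label names. Any label that matches is removed from the label set.
- `labelkeep`: Match `regex` against all label names. Only the labels that match are kept in the label set.

Care must be taken with `labeldrop` and `labelkeep` to ensure that metrics are still uniquely labeled once the labels are removed.

#### `<alert_relabel_configs>`

Alert relabeling is applied to alerts before they are sent to the Alertmanager. It has the same configuration format and actions as target relabeling. Alert relabeling is applied after external labels.

One use for this is ensuring that a HA pair of Prometheus servers with different external labels sends identical alerts.

#### `<alertmanager_config>`

**Dynamic discovery of Alertmanager instances is in alpha state. Breaking configuration changes may happen in future releases. Use static configuration via the `-alertmanager.url` flag as a stable alternative.**

An `alertmanager_config` section specifies the Alertmanager instances the Prometheus server sends alerts to. It also provides parameters to configure how to communicate with these Alertmanagers.

Alertmanagers may be statically configured via the `static_configs` parameter or dynamically discovered using one of the supported service discovery mechanisms.

Additionally, `relabel_configs` allow selecting Alertmanagers from the discovered entities and modifying the API path used, which is exposed through the `__alerts_path__` label.

```
# Per-target Alertmanager timeout when pushing alerts.
[ timeout: <duration> | default = 10s ]

# Prefix for the HTTP path alerts are pushed to.
[ path_prefix: <path> | default = / ]

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Sets the `Authorization` header on every request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.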
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured Alertmanagers.
static_configs:
  [ - <static_config> ... ]

# List of Alertmanager relabel configurations.
relabel_configs:
  [ - <relabel_config> ... ]
```

#### `<remote_write>`

**Remote write is experimental: breaking changes to configuration are likely in future releases.**

`url` is the URL of the endpoint to send samples to. `remote_timeout` specifies the timeout for requests to that URL. There are no retries at present.

`basic_auth`, `tls_config` and `proxy_url` have the same meanings as in a `scrape_config`.

`write_relabel_configs` is relabeling applied to samples before they are sent to the URL. Write relabeling is applied after external labels. This can be used to limit which samples are sent.

There is a [small demo](https://github.com/prometheus/prometheus/tree/master/documentation/examples/remote_storage) showing how to use this functionality.
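
To make the remote write and relabeling options more concrete, here is a minimal sketch of a `remote_write` section. It assumes the schema shown in this document, where `remote_write` is a single mapping; the endpoint URL and the `go_` metric name prefix are hypothetical values chosen only for illustration, so adjust them to your own setup and check the schema against the Prometheus version you run.

```
remote_write:
  # Hypothetical endpoint of a remote storage adapter.
  url: "http://localhost:9201/write"
  remote_timeout: 30s
  write_relabel_configs:
    # The metric name is carried in the __name__ label. The regex is anchored
    # on both ends, so this matches names that start with "go_" and drops
    # those samples before they are sent.
    - source_labels: [__name__]
      regex: 'go_.*'
      action: drop
```

Because write relabeling runs after external labels are attached, a rule like this could also match on an external label rather than on `__name__`.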