# How Filebeat works
> In this topic, you learn about the key building blocks of Filebeat and how they work together. Understanding these concepts will help you make informed decisions about configuring Filebeat for specific use cases.
> Filebeat consists of two main components: [inputs](https://www.elastic.co/guide/en/beats/filebeat/current/how-filebeat-works.html#input "What is an input?") and [harvesters](https://www.elastic.co/guide/en/beats/filebeat/current/how-filebeat-works.html#harvester "What is a harvester?"). These components work together to tail files and send event data to the output that you specify.
## What is a harvester?
> A harvester is responsible for reading the content of a single file.
**The harvester reads each file, line by line, and sends the content to the output.** **One harvester is started for each file.** The harvester is responsible for opening and closing the file, which means that the file descriptor remains open while the harvester is running. If a file is removed or renamed while it’s being harvested, Filebeat continues to read the file. This has the side effect that the space on your disk is reserved until the harvester closes. By default, Filebeat keeps the file open until [`close_inactive`](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-close-inactive "close_inactive") is reached.
Closing a harvester has the following consequences:
* The file handle is closed, freeing up the underlying resources if the file was deleted while the harvester was still reading it.
* Harvesting of the file starts again only after [`scan_frequency`](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-scan-frequency "scan_frequency") has elapsed.
* If the file is moved or removed while the harvester is closed, harvesting of the file will not continue.
To control when a harvester is closed, use the [`close_*`](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-close-options "close_*") configuration options.
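As a rough illustration of the `close_*` options, the following sketch sets three of them on a `log` input. The path and the values are assumptions chosen for the example, not recommendations; check the input reference for the options available in your Filebeat version.

~~~yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log          # illustrative path
    # Close the file handle after no new lines have been read for 5 minutes.
    close_inactive: 5m
    # Close the harvester when the file is removed from disk.
    close_removed: true
    # Keep reading a file after it is renamed (for example, by log rotation).
    close_renamed: false
~~~

A shorter `close_inactive` frees file handles sooner, at the cost of new lines being picked up only after the next scan.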
## What is an input?
> An input is responsible for managing the harvesters and finding all sources to read from.
If the input type is `log`, **the input finds all files on the drive that match the defined glob paths and starts a harvester for each file.** Each input runs in its own goroutine.
The following example configures Filebeat to harvest lines from all log files that match the specified glob patterns:
~~~
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log
      - /var/path2/*.log
~~~
Filebeat currently supports several `input` types. Each input type can be defined multiple times. The `log` input checks each file to see whether a harvester needs to be started, whether one is already running, or whether the file can be ignored (see [`ignore_older`](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-ignore-older "ignore_older")). New lines are only picked up if the size of the file has changed since the harvester was closed.
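The checks described above are driven by two input options, `scan_frequency` and `ignore_older`. A minimal sketch, assuming an illustrative path and illustrative durations:

~~~yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log          # illustrative path
    # How often the input looks for new or changed files under the paths.
    scan_frequency: 10s
    # Skip files that were last modified more than 48 hours ago.
    # This value should be greater than close_inactive.
    ignore_older: 48h
~~~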
## How does Filebeat keep the state of files?
Filebeat keeps the state of each file and frequently flushes the state to disk in the registry file. The state is used to remember the last offset a harvester was reading from and to ensure all log lines are sent. If the output, such as Elasticsearch or Logstash, is not reachable, Filebeat keeps track of the last lines sent and will continue reading the files as soon as the output becomes available again. While Filebeat is running, the state information is also kept in memory for each input. When Filebeat is restarted, data from the registry file is used to rebuild the state, and Filebeat continues each harvester at the last known position.
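The location of the registry and how often it is flushed to disk can be adjusted in the general Filebeat settings. The sketch below assumes the option names used by Filebeat 7.x and later (`filebeat.registry.path` and `filebeat.registry.flush`); older releases use different names, and the values shown are only examples.

~~~yaml
# Directory for the registry, relative to the Filebeat data path.
filebeat.registry.path: registry
# Batch registry writes and flush them to disk at most every 5 seconds.
filebeat.registry.flush: 5s
~~~

A longer flush interval reduces disk writes, but after an unclean shutdown any state that was not yet flushed is lost, so the affected events may be sent again.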
For each input, Filebeat keeps a state of each file it finds. Because files can be renamed or moved, the filename and path are not enough to identify a file. For each file, Filebeat stores unique identifiers to detect whether a file was harvested previously.
If your use case involves creating a large number of new files every day, you might find that the registry file grows to be too large. See [Registry file is too large](https://www.elastic.co/guide/en/beats/filebeat/current/reduce-registry-size.html "Registry file is too large") for details about configuration options that you can set to resolve this issue.
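One common way to keep the registry small is to let Filebeat drop state for files it no longer needs to track, using the `clean_inactive` and `clean_removed` input options. The following is a sketch under assumed values; the path and durations are placeholders.

~~~yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log          # illustrative path
    ignore_older: 48h
    # Remove registry state for files not updated within 72 hours.
    # This must be greater than ignore_older + scan_frequency.
    clean_inactive: 72h
    # Remove registry state for files that can no longer be found on disk.
    clean_removed: true
~~~

If `clean_inactive` is set too low, state can be removed while a file is still being tracked, and the file may then be re-read from the beginning.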
## How does Filebeat ensure at-least-once delivery?
Filebeat guarantees that events will be delivered to the configured output at least once and with no data loss. Filebeat is able to achieve this behavior because it stores the delivery state of each event in the registry file.
In situations where the defined output is blocked and has not confirmed all events, Filebeat will keep trying to send events until the output acknowledges that it has received the events.
If Filebeat shuts down while it’s in the process of sending events, it does not wait for the output to acknowledge all events before shutting down. Any events that are sent to the output, but not acknowledged before Filebeat shuts down, are sent again when Filebeat is restarted. This ensures that each event is sent at least once, but you can end up with duplicate events being sent to the output. You can configure Filebeat to wait a specific amount of time before shutting down by setting the [`shutdown_timeout`](https://www.elastic.co/guide/en/beats/filebeat/current/configuration-general-options.html#shutdown-timeout "shutdown_timeout") option.
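For example, the following setting gives Filebeat a grace period on shutdown so that in-flight events have a chance to be acknowledged. The 5-second value is an assumption for illustration; a suitable value depends on event size and output latency.

~~~yaml
# Wait up to 5 seconds for the output to acknowledge in-flight events
# before Filebeat shuts down.
filebeat.shutdown_timeout: 5s
~~~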
There is a limitation to Filebeat’s at-least-once delivery guarantee involving log rotation and the deletion of old files. If log files are written to disk and rotated faster than they can be processed by Filebeat, or if files are deleted while the output is unavailable, data might be lost. On Linux, it’s also possible for Filebeat to skip lines as the result of inode reuse. See [*Common problems*](https://www.elastic.co/guide/en/beats/filebeat/current/faq.html "Common problems") for more details about the inode reuse issue.