5.ZooKeeper之Watcher机制 · java核心知识整理

[TOC] 本文思维导图如下： ![](https://img.kancloud.cn/e6/67/e667c5297ea0167c4810d44b110c902c_1133x215.png) image ### 前言 Watcher机制是zookeeper最重要三大特性**数据节点Znode+Watcher机制+ACL权限控制**中的其中一个，它是zk很多应用场景的一个前提，比如集群管理、集群配置、发布/订阅。 Watcher机制涉及到客户端与服务器(注意，不止一个机器，一般是集群，这里先认为一个整体分析)的两者数据通信与消息通信，除此之外还涉及到客户端的watchManager。下面正式进入主题。 ### 1.watcher原理框架 ![](https://img.kancloud.cn/42/12/4212792f91d1e0a5add1bac44e317e6e_1196x667.png) 由图看出，zk的watcher由客户端，客户端WatchManager，zk服务器组成。整个过程涉及了消息通信及数据存储。 * zk客户端向zk服务器注册watcher的同时，会将watcher对象存储在客户端的watchManager。 * Zk服务器触发watcher事件后，会向客户端发送通知，客户端线程从watchManager中回调watcher执行相应的功能。注意的是server服务器端一般有多台共同一起对外提供服务的，里面涉及到zk专有的ZAB协议(分布式原子广播协议)。在这先不分析，后面会有单独一文来介绍，因为ZAB协议是zookeeper的实现精髓，有了zab协议才能使zk真正落地，真正的高可靠，数据同步，适于商用。 ![](https://img.kancloud.cn/f9/bb/f9bb6596cc5908550c4a3a73b3783bf3_679x426.png) 有木有看到小红旗？加入小红旗是一个watcher，当小红旗被创建并注册到node1节点(会有相应的API实现)后，就会监听node1+node\_a+node\_b或node\_a+node\_b。这里两种情况是因为在创建watcher注册时会有多种途径。并且watcher不能监听到孙节点。注意注意注意，watcher设置后，一旦触发一次后就会失效，如果要想一直监听，需要在process回调函数里重新注册相同的 **watcher**。 ### 2.通知状态与事件 ~~~java public class WatcherTest implements Watcher { @Override public void process(WatchedEvent event) { // TODO Auto-generated method stub WatcherTest w = new WatcherTest(); ZooKeeper zk = new ZooKeeper(wx.getZkpath(),10000, w); } public static void main(String[] args){ WatcherTest w = new WatcherTest(); ZooKeeper zk = new ZooKeeper(wx.getZkpath(), 10000, w); } } ~~~ 上面例子是把异常处理，逻辑处理等都省掉。watcher的应用很简单，主要有两步：继承 **Watcher** 接口，重写 **process** 回调函数。当然注册方式有很多，有默认和重新覆盖方式，可以一次触发失效也可以一直有效触发。这些都可以通过代码实现。 #### 2.1 KeeperStatus通知状态 KeeperStatus完整的类名是`org.apache.zookeeper.Watcher.Event.KeeperState`。 #### 2.2 EventType事件类型 EventType完整的类名是`org.apache.zookeeper.Watcher.Event.EventType`。 ![](https://img.kancloud.cn/98/77/98774274543b998e1240b50a5ab33006_1505x639.png) 此图是zookeeper常用的通知状态与对应事件类型的对应关系。除了客户端与服务器连接状态下，有多种事件的变化，其他状态的事件都是None。这也是符合逻辑的，因为没有连接服务器肯定不能获取获取到当前的状态，也就无法发送对应的事件类型了。这里重点说下几个重要而且容易迷惑的事件： * NodeDataChanged事件 * 无论节点数据发生变化还是数据版本发生变化都会触发 * 即使被更新数据与新数据一样，数据版本dataVersion都会发生变化 * NodeChildrenChanged * 新增节点或者删除节点 * AuthFailed * 重点是客户端会话没有权限而是授权失败客户端只能收到服务器发过来的相关事件通知，并不能获取到对应数据节点的原始数据及变更后的新数据。因此，如果业务需要知道变更前的数据或者变更后的新数据，需要业务保存变更前的数据(本机数据结构、文件等)和调用接口获取新的数据 ### 3.watcher注册过程 #### 3.1涉及接口创建zk客户端对象实例时注册: ~~~java ZooKeeper(String connectString, int sessionTimeout, Watcher watcher) ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, boolean canBeReadOnly) ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd) ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) ~~~ 通过这种方式注册的watcher将会作为整个zk会话期间的**默认watcher**，会一直被保存在客户端ZK **WatchManager** 的 **defaultWatcher** 中，如果这个被创建的节点在其它时候被创建watcher并注册，则这个默认的watcher会被覆盖。注意注意注意，watcher触发一次就会失效，不管是创建节点时的 **watcher** 还是以后创建的 **watcher**。其他注册watcher的API： * `getChildren(String path, Watcher watcher)` * `getChildren(String path, boolean watch)` * Boolean watch表示是否使用上下文中默认的watcher，即创建zk实例时设置的watcher * `getData(String path, boolean watch, Stat stat)` * Boolean watch表示是否使用上下文默认的watcher，即创建zk实例时设置的watcher * `getData(String path, Watcher watcher, AsyncCallback.DataCallback cb, Object ctx)` * `exists(String path, boolean watch)` * Boolean watch表示是否使用上下文中默认的watcher，即创建zk实例时设置的watcher * `exists(String path, Watcher watcher)` ## 举栗子 ![](https://img.kancloud.cn/8e/ec/8eec473949c0dc2e3dac50a9a2b762db_634x160.png) ![](https://img.kancloud.cn/e2/3f/e23f4a8e992e4960f12593e8c9e31f08_682x78.png) ![](https://img.kancloud.cn/0d/e6/0de63c73d138dc66d16693e764f40846_685x135.png) ![](https://img.kancloud.cn/27/fc/27fc85b1fd0ee39be01580c59e61e8ec_615x296.png) ![](https://img.kancloud.cn/fc/a7/fca7c36a8917c9d69dd3091c8fac199d_689x227.png) 这就是watcher的简单例子，zk的实际应用集群管理，发布订阅等复杂功能其实就在这个小例子上拓展的。 #### 3.2客户端注册 ![](https://img.kancloud.cn/42/d0/42d0daad6924439ac7790a89027026b2_687x265.png) 这里的客户端注册主要是把上面第一点的zookeeper原理框架的注册步骤展开，简单来说就是zk客户端在注册时会先向zk服务器请求注册，服务器会返回请求响应，如果响应成功则zk服务端把watcher对象放到客户端的WatchManager管理并返回响应给客户端。 #### 3.3服务器端注册 ![](https://img.kancloud.cn/1e/54/1e54c6f12db16fafef222b78bbb10194_691x280.png) ##### FinalRequestProcessor ~~~dart /** * This Request processor actually applies any transaction associated with a * request and services any queries. It is always at the end of a * RequestProcessor chain (hence the name), so it does not have a nextProcessor * member. * * This RequestProcessor counts on ZooKeeperServer to populate the * outstandingRequests member of ZooKeeperServer. */ public class FinalRequestProcessor implements RequestProcessor ~~~ 由源码注释得知，**FinalRequestProcessor**类实际是任何事务请求和任何查询的的最终处理类。也就是我们客户端对节点的set/get/delete/create/exists等操作最终都会运行到这里。以exists函数为例子： ~~~dart case OpCode.exists: { lastOp = "EXIS"; // TODO we need to figure out the security requirement for this! ExistsRequest existsRequest = new ExistsRequest(); ByteBufferInputStream.byteBuffer2Record(request.request, existsRequest); String path = existsRequest.getPath(); if (path.indexOf('\0') != -1) { throw new KeeperException.BadArgumentsException(); } Stat stat = zks.getZKDatabase().statNode(path, existsRequest .getWatch() ? cnxn : null); rsp = new ExistsResponse(stat); break; } ~~~ `existsRequest.getWatch() ? cnxn : null`此句是在调用exists API时，判断是否注册watcher，若是就返回 **cnxn**，**cnxn**是由此句代码`ServerCnxn cnxn = request.cnxn;`创建的。 ~~~dart /** * Interface to a Server connection - represents a connection from a client * to the server. */ public abstract class ServerCnxn implements Stats, Watcher ~~~ 通过`ServerCnxn`类的源码注释得知，`ServerCnxn`是维持服务器与客户端的**tcp连接**与实现了 **watcher**。总的来说，ServerCnxn类创建的对象**cnxn**即包含了连接信息又包含watcher信息。同时仔细看**ServerCnxn类**里面的源码，发现有以下这个函数，process函数正是watcher的回调函数啊。 ~~~java public abstract class ServerCnxn implements Stats, Watcher { . . public abstract void process(WatchedEvent event); Stat stat = zks.getZKDatabase().statNode(path, existsRequest.getWatch() ? cnxn : null); //getZKDatabase实际上是获取是在zookeeper运行时的数据库。请看下面 . . } ~~~ ##### ZKDatabase ~~~dart /** * This class maintains the in memory database of zookeeper * server states that includes the sessions, datatree and the * committed logs. It is booted up after reading the logs * and snapshots from the disk. */ public class ZKDatabase ~~~ 通过源码注释得知**ZKDatabase**是在zookeeper运行时的数据库，在`FinalRequestProcessor`的case exists中会把existsRequest(exists请求传递给ZKDatabase)。 ~~~kotlin /** * the datatree for this zkdatabase * @return the datatree for this zkdatabase */ public DataTree getDataTree() { return this.dataTree; } ~~~ **ZKDatabase**里面有这关键的一个函数是从zookeeper运行时展开的节点数型结构中搜索到合适的节点返回。 ##### watchManager * Zk服务器端Watcher的管理者 * 从两个维度维护watcher * watchTable从数据节点的粒度来维护 * watch2Paths从watcher的粒度来维护 * 负责watcher事件的触发 ~~~dart class WatchManager { private final Map<String, Set<Watcher>> watchTable = new HashMap<String, Set<Watcher>>(); private final Map<Watcher, Set<String>> watch2Paths = new HashMap<Watcher, Set<String>>(); Set<Watcher> triggerWatch(String path, EventType type) { return triggerWatch(path, type, null);} } ~~~ ##### watcher触发 ~~~java public Stat setData(String path, byte data[], int version, long zxid,long time) throws KeeperException.NoNodeException { Stat s = new Stat(); DataNode n = nodes.get(path); if (n == null) { throw new KeeperException.NoNodeException(); } byte lastdata[] = null; synchronized (n) { lastdata = n.data; n.data = data; n.stat.setMtime(time); n.stat.setMzxid(zxid); n.stat.setVersion(version); n.copyStat(s); } // now update if the path is in a quota subtree. String lastPrefix = getMaxPrefixWithQuota(path); if(lastPrefix != null) { this.updateBytes(lastPrefix, (data == null ? 0 : data.length) - (lastdata == null ? 0 : lastdata.length)); } dataWatches.triggerWatch(path, EventType.NodeDataChanged); //触发事件 return s; } ~~~ 客户端回调watcher步骤： * 反序列化,将孒节流转换成WatcherEvent对象。因为在Java中网络传输肯定是使用了序列化的，主要是为了节省网络IO和提高传输效率。 * 处理chrootPath。获取节点的根节点路径，然后再搜索树而已。 * 还原watchedEvent:把WatcherEvent对象转换成WatchedEvent。主要是把zk服务器那边的WatchedEvent事件变为WatcherEvent，标为已watch触发。 * 回调Watcher:把WatchedEvent对象交给EventThread线程。EventThread线程主要是负责从客户端的ZKWatchManager中取出Watcher,并放入waitingEvents队列中，然后供客户端获取。 ### 4.小结到此，zookeeper的watcher机制基本告一段落了，watcher机制主要是客户端、zk服务器和watchManager三者的协调合作完成的作者：dandan的微笑链接：https://www.jianshu.com/p/4c071e963f18 来源：简书