(9) -- Extending the Thrift Framework to Implement Attachable RPC Calls · Thrift Source Code Analysis

I have recently been working on a distributed call-tracing system that needs to add call-chain context to Thrift-based RPC calls, so that the call chain of each RPC can be reconstructed. This post explains how to extend the Thrift framework so that extra attachment information travels with an RPC call without any change to the service code.

Thrift itself provides many mechanisms that support extension, for example:

1. Extend TProtocol to implement a custom serialization class
2. Extend TTransport to implement a custom stream
3. Use the decorator pattern to wrap the Processor, so that custom logic runs before and after the service implementation method is called
4. When building the Client and Server, pass the custom Protocol, Transport, and Processor in as Args parameters, so that these custom classes handle the requests

The figure below shows the main components involved in a Thrift RPC call. RPC frameworks are much alike and share the same basic structure. The green parts are the extension points. For example, a wrapper on the client side can add distributed-cluster features such as service addressing and load balancing, and a wrapper on the server side can add server-side configuration management, monitoring, and so on.

![](https://box.kancloud.cn/2016-02-19_56c6c62c4389e.jpg)

In this simplified example, extending TProtocol and the Processor is enough to add an extra attachment to an RPC call.

TProtocol represents the serialization process of an RPC call; for more, see [Thrift Source Code Analysis (2) -- Protocols and Encoding/Decoding](http://blog.csdn.net/iter_zc/article/details/39497863). TProtocol divides serialization into several steps:

1. write/read Message: reads and writes the message header, which contains the method name, sequence id, and similar information
2. write/read Struct: the parameters or return value of the RPC method are wrapped in a struct, so reading or writing a struct means reading or writing the RPC method's parameters
3. write/read Field: every parameter is abstracted as a Field, which mainly carries the field id and type information
4. write/read Type: reads and writes the concrete data values

TBinaryProtocol is a widely used protocol based on a binary stream. It implements all of the write/read methods above.

~~~
public void writeMessageBegin(TMessage message) throws TException {
    if (strictWrite_) {
        // strict mode: version and message type first, then name and seqid
        int version = VERSION_1 | message.type;
        writeI32(version);
        writeString(message.name);
        writeI32(message.seqid);
    } else {
        writeString(message.name);
        writeByte(message.type);
        writeI32(message.seqid);
    }
}

public void writeMessageEnd() {}

public void writeStructBegin(TStruct struct) {}

public void writeStructEnd() {}

public void writeFieldBegin(TField field) throws TException {
    // a field header is one type byte followed by a 16-bit field id
    writeByte(field.type);
    writeI16(field.id);
}

public void writeFieldEnd() {}
~~~

The TBinaryProtocol methods above show that write/read Struct are empty implementations, so the Fields are written immediately after the Message header. The client generated for a concrete Thrift service contains the full structural information of each service method: a TField object is created for every parameter, the TFields are numbered from 1 upward, and a method is generated that knows how to serialize each concrete parameter. See [Thrift Source Code Analysis (3) -- IDL and Generated Code](http://blog.csdn.net/iter_zc/article/details/39522531) for details. The byte stream of an RPC call under TBinaryProtocol therefore looks roughly like this:

![](https://box.kancloud.cn/2016-02-19_56c6c62c54b67.jpg)

The code Thrift generates to read this byte stream follows roughly this flow:

~~~
readMessageBegin();
readStructBegin();
while (true) {
    field = readFieldBegin();
    if (field.type == TType.STOP) {
        break;
    }
    switch (field.id) {
        case 1:
            readXXXX();
            break;
        .....
        case n:
            readXXXX();
            break;
        default:
            // unknown field id: skip over its value
            TProtocolUtil.skip(iprotocol, field.type);
    }
    readFieldEnd();
}
readStructEnd();
readMessageEnd();
~~~

Two things stand out in this flow:

1. When reading and writing the byte stream, the generated code always dispatches on the generated TField id and then reads the value
2. Thrift provides skip and stop mechanisms for parsing Fields

We can therefore start from the TField id and add an attachment as follows:

1. Extend TBinaryProtocol and write the data we want to insert into the byte stream under a reserved field id
2. Before the byte stream is parsed normally, read out the inserted TField with that reserved id
3. Reset the byte stream and hand it to the downstream Processor

Steps 2 and 3 are both handled in the decorated Processor. The resulting byte stream looks like this:

![](https://box.kancloud.cn/2016-02-19_56c6c62c67ef8.jpg)

First, the AttachableBinaryProtocol implementation:

1. It defines a private Map-typed attachment field, which supports key-value attachments
2. It overrides writeMessageBegin: after the message header has been written, it checks whether there is an attachment and, if so, calls writeFieldZero to write the attachment into the byte stream
3. writeFieldZero writes the attachment into the byte stream as field 0. Thrift supports the Map type natively, so the attachment is written according to Thrift's own encoding rules
4. readFieldZero reads the Map-typed data with field id 0 from the byte stream and stores it in attachment
5. resetTFramedTransport resets the byte stream. NIO-style Thrift servers use TFramedTransport as the stream implementation by default. TFramedTransport is a buffer-based stream that internally uses a TMemoryInputTransport to hold the bytes it has read, and TMemoryInputTransport provides a reset method that rewinds the stream position.

~~~
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TField;
import org.apache.thrift.protocol.TMap;
import org.apache.thrift.protocol.TMessage;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.protocol.TProtocolFactory;
import org.apache.thrift.protocol.TType;
import org.apache.thrift.transport.TMemoryInputTransport;
import org.apache.thrift.transport.TTransport;

public class AttachableBinaryProtocol extends TBinaryProtocol {

    private Map<String, String> attachment;

    public AttachableBinaryProtocol(TTransport trans) {
        super(trans);
        attachment = new HashMap<String, String>();
    }

    public AttachableBinaryProtocol(TTransport trans, boolean strictRead,
            boolean strictWrite) {
        super(trans, strictRead, strictWrite);
        attachment = new HashMap<String, String>();
    }

    /**
     * Factory
     */
    public static class Factory implements TProtocolFactory {
        protected boolean strictRead_ = false;
        protected boolean strictWrite_ = true;
        protected int readLength_;

        public Factory() {
            this(false, true);
        }

        public Factory(boolean strictRead, boolean strictWrite) {
            this(strictRead, strictWrite, 0);
        }

        public Factory(boolean strictRead, boolean strictWrite, int readLength) {
            strictRead_ = strictRead;
            strictWrite_ = strictWrite;
            readLength_ = readLength;
        }

        public TProtocol getProtocol(TTransport trans) {
            AttachableBinaryProtocol proto = new AttachableBinaryProtocol(
                    trans, strictRead_, strictWrite_);
            if (readLength_ != 0) {
                proto.setReadLength(readLength_);
            }
            return proto;
        }
    }

    @Override
    public void writeMessageBegin(TMessage message) throws TException {
        super.writeMessageBegin(message);
        // the attachment goes right after the message header
        if (attachment.size() > 0) {
            writeFieldZero();
        }
    }

    // Writes the attachment as a Map<String, String> under field id 0.
    public void writeFieldZero() throws TException {
        TField RTRACE_ATTACHMENT = new TField("rtraceAttachment", TType.MAP, (short) 0);
        this.writeFieldBegin(RTRACE_ATTACHMENT);
        {
            this.writeMapBegin(new TMap(TType.STRING, TType.STRING, attachment.size()));
            for (Map.Entry<String, String> entry : attachment.entrySet()) {
                this.writeString(entry.getKey());
                this.writeString(entry.getValue());
            }
            this.writeMapEnd();
        }
        this.writeFieldEnd();
    }

    // Reads the first field and, if it is the field-0 map, loads it into attachment.
    public boolean readFieldZero() throws Exception {
        // Track whether this request carried field 0, so that an attachment
        // left over from an earlier request on the same protocol object is
        // not reported as belonging to this one.
        boolean found = false;
        TField schemeField = this.readFieldBegin();
        if (schemeField.id == 0 && schemeField.type == TType.MAP) {
            TMap _map = this.readMapBegin();
            attachment = new HashMap<String, String>(2 * _map.size);
            for (int i = 0; i < _map.size; ++i) {
                String key = this.readString();
                String value = this.readString();
                attachment.put(key, value);
            }
            this.readMapEnd();
            found = true;
        }
        this.readFieldEnd();
        return found && attachment.size() > 0;
    }

    public Map<String, String> getAttachment() {
        return attachment;
    }

    /*
     * Resets the TFramedTransport stream so that Thrift's normal flow is unaffected
     */
    public void resetTFramedTransport(TProtocol in) {
        try {
            Field readBuffer_ = TFramedTransportFieldsCache.getInstance()
                    .getTFramedTransportReadBuffer();
            Field buf_ = TFramedTransportFieldsCache.getInstance()
                    .getTMemoryInputTransportBuf();
            if (readBuffer_ == null || buf_ == null) {
                return;
            }
            TMemoryInputTransport stream = (TMemoryInputTransport) readBuffer_
                    .get(in.getTransport());
            byte[] buf = (byte[]) (buf_.get(stream));
            // rewind to the start of the buffered frame
            stream.reset(buf, 0, buf.length);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Caches the reflective Field handles so they are looked up only once.
    private static class TFramedTransportFieldsCache {
        // volatile is required for double-checked locking to be safe
        private static volatile TFramedTransportFieldsCache instance;
        private final Field readBuffer_;
        private final Field buf_;
        private final String TFramedTransport_readBuffer_ = "readBuffer_";
        private final String TMemoryInputTransport_buf_ = "buf_";

        private TFramedTransportFieldsCache() throws Exception {
            readBuffer_ = org.apache.thrift.transport.TFramedTransport.class
                    .getDeclaredField(TFramedTransport_readBuffer_);
            readBuffer_.setAccessible(true);
            buf_ = org.apache.thrift.transport.TMemoryInputTransport.class
                    .getDeclaredField(TMemoryInputTransport_buf_);
            buf_.setAccessible(true);
        }

        public static TFramedTransportFieldsCache getInstance() throws Exception {
            if (instance == null) {
                synchronized (TFramedTransportFieldsCache.class) {
                    if (instance == null) {
                        instance = new TFramedTransportFieldsCache();
                    }
                }
            }
            return instance;
        }

        public Field getTFramedTransportReadBuffer() {
            return readBuffer_;
        }

        public Field getTMemoryInputTransportBuf() {
            return buf_;
        }
    }
}
~~~

Now a closer look at resetTFramedTransport. It uses reflection to reset the byte stream of the TProtocol that is passed in. The TMemoryInputTransport is held in a private field of TFramedTransport, readBuffer_, so it can only be reached through reflection. The actual bytes are stored in buf_, a private field of TMemoryInputTransport, which requires a second reflective access. TMemoryInputTransport does expose a public reset method, so that method can be called directly.
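The peek-then-rewind idea can be tried without Thrift on the classpath. What follows is only a minimal sketch: `MemoryInputSketch` and `FramedSketch` are hypothetical stand-ins that imitate the shape of TMemoryInputTransport and TFramedTransport (a private buffer nested inside a private field) and are not the real Thrift classes. The field names `readBuffer` and `buf` are likewise assumptions made for this illustration.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for TMemoryInputTransport: a byte buffer plus a read position.
final class MemoryInputSketch {
    private byte[] buf;
    private int pos;

    MemoryInputSketch(byte[] buf) {
        reset(buf, 0);
    }

    // Public reset, analogous to the reset method the article relies on.
    void reset(byte[] newBuf, int offset) {
        buf = newBuf;
        pos = offset;
    }

    int read() {
        return pos < buf.length ? (buf[pos++] & 0xff) : -1;
    }
}

// Hypothetical stand-in for TFramedTransport: the buffer is hidden in a private field.
final class FramedSketch {
    private final MemoryInputSketch readBuffer;

    FramedSketch(byte[] frame) {
        readBuffer = new MemoryInputSketch(frame);
    }

    int read() {
        return readBuffer.read();
    }
}

final class RewindSketch {
    // Reads a private field by name via reflection.
    static Object readPrivate(Object target, String name) {
        try {
            Field f = target.getClass().getDeclaredField(name);
            f.setAccessible(true);
            return f.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read field " + name, e);
        }
    }

    // Two reflective hops (transport -> buffer object -> byte[]), then a public reset.
    static void rewind(FramedSketch transport) {
        MemoryInputSketch stream = (MemoryInputSketch) readPrivate(transport, "readBuffer");
        byte[] bytes = (byte[]) readPrivate(stream, "buf");
        stream.reset(bytes, 0);
    }

    public static void main(String[] args) {
        FramedSketch t = new FramedSketch(new byte[] {10, 20, 30});
        System.out.println(t.read()); // 10: peek at the start of the frame
        System.out.println(t.read()); // 20
        rewind(t);
        System.out.println(t.read()); // 10 again: the next reader sees the whole frame
    }
}
```

Depending on private field names is fragile: if a library renames its fields in a later release, the lookup fails at runtime, so an approach like this is best pinned to a known library version.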
The resetTFramedTransport method shows how to access an object's private fields through reflection. Field.get is thread-safe. It ultimately lands in the Unsafe class, whose getObject method takes the object and the field's offset and reads the field value at that offset directly from memory.

~~~
public void resetTFramedTransport(TProtocol in) {
    try {
        Field readBuffer_ = TFramedTransportFieldsCache.getInstance()
                .getTFramedTransportReadBuffer();
        Field buf_ = TFramedTransportFieldsCache.getInstance()
                .getTMemoryInputTransportBuf();
        if (readBuffer_ == null || buf_ == null) {
            return;
        }
        TMemoryInputTransport stream = (TMemoryInputTransport) readBuffer_
                .get(in.getTransport());
        byte[] buf = (byte[]) (buf_.get(stream));
        stream.reset(buf, 0, buf.length);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

// Excerpt from Thrift: the frame buffer is a private field
public class TFramedTransport extends TTransport {
    private TMemoryInputTransport readBuffer_ = new TMemoryInputTransport(new byte[0]);
}

// Excerpt from Thrift: the bytes are private, but reset is public
public final class TMemoryInputTransport extends TTransport {
    private byte[] buf_;
    private int pos_;
    private int endPos_;

    public TMemoryInputTransport() {
    }

    public TMemoryInputTransport(byte[] buf) {
        reset(buf);
    }

    public TMemoryInputTransport(byte[] buf, int offset, int length) {
        reset(buf, offset, length);
    }

    public void reset(byte[] buf) {
        reset(buf, 0, buf.length);
    }

    public void reset(byte[] buf, int offset, int length) {
        buf_ = buf;
        pos_ = offset;
        endPos_ = offset + length;
    }
}
~~~

Next comes the decorating Processor class, TraceProcessor. It is a typical decorator: it implements the TProcessor interface and holds a TProcessor object.

1. In the process method, the input protocol is first cast to AttachableBinaryProtocol, then the message header is read with readMessageBegin, and then readFieldZero reads the Map field with id 0.
2. resetTFramedTransport is called to rewind the input stream, which is then handed to the actual realProcessor. realProcessor eventually calls the implementation class of the Thrift service.

~~~
import java.util.Map;

import org.apache.thrift.TException;
import org.apache.thrift.TProcessor;
import org.apache.thrift.protocol.TMessage;
import org.apache.thrift.protocol.TProtocol;

public class TraceProcessor implements TProcessor {

    private TProcessor realProcessor;
    private String serviceId;
    private String serviceName;
    private int port;

    public TraceProcessor(TProcessor realProcessor, int port) {
        this(realProcessor, "", "", port);
    }

    public TraceProcessor(TProcessor realProcessor, String serviceName, int port) {
        this(realProcessor, serviceName, serviceName, port);
    }

    public TraceProcessor(TProcessor realProcessor, String serviceId,
            String serviceName, int port) {
        this.realProcessor = realProcessor;
        this.serviceId = serviceId;
        this.serviceName = serviceName;
        this.port = port;
    }

    @Override
    public boolean process(TProtocol in, TProtocol out) throws TException {
        Map<String, String> attachment = null;
        if (in instanceof AttachableBinaryProtocol) {
            AttachableBinaryProtocol inProtocol = (AttachableBinaryProtocol) in;
            // read MessageBegin first so that the attachment can be read next
            TMessage message = inProtocol.readMessageBegin();
            boolean isAttachableRequest = false;
            try {
                isAttachableRequest = inProtocol.readFieldZero();
            } catch (Exception e) {
                // a request without a readable field 0 is treated as a plain request
            }
            // rewind the stream inside TFramedTransport so that the normal
            // Thrift processing flow is unaffected
            inProtocol.resetTFramedTransport(in);
            if (isAttachableRequest) {
                attachment = inProtocol.getAttachment();
                // placeholders from the original: read whatever context values are needed
                XXXX = attachment.get(XXXX);
                XXXX = attachment.get(XXXX);
            }
        }
        boolean result = realProcessor.process(in, out);
        return result;
    }
}
~~~

Inserting a field with a reserved id into the byte stream that Thrift generates has the advantage of good compatibility. When Thrift deserializes an object, it reads by the field ids in the generated code. As soon as it meets an id that is not one of those, it skips that field and continues with the next one. Thrift's original serialization mechanism is therefore left unaffected.
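The compatibility argument can be checked on a toy encoding. The sketch below is stdlib-only and does not use Thrift. It assumes the TBinaryProtocol layout described earlier (a field header of one type byte plus a 16-bit id, length-prefixed strings, a single STOP byte) and leaves out the message header for brevity. The class `FieldZeroSketch`, its method names, and the `traceId` key are all hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldZeroSketch {
    // Type codes match Thrift's TType constants.
    static final byte STOP = 0, STRING = 11, MAP = 13;

    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(b.length);
        out.write(b);
    }

    static String readString(DataInputStream in) throws IOException {
        byte[] b = new byte[in.readInt()];
        in.readFully(b);
        return new String(b, StandardCharsets.UTF_8);
    }

    /** Encodes: optional field 0 (attachment map), field 1 (string argument), STOP. */
    static byte[] encode(Map<String, String> attachment, String arg) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (!attachment.isEmpty()) {
                out.writeByte(MAP);     // field header: type
                out.writeShort(0);      // field header: id 0
                out.writeByte(STRING);  // map header: key type
                out.writeByte(STRING);  // map header: value type
                out.writeInt(attachment.size());
                for (Map.Entry<String, String> e : attachment.entrySet()) {
                    writeString(out, e.getKey());
                    writeString(out, e.getValue());
                }
            }
            out.writeByte(STRING);
            out.writeShort(1);
            writeString(out, arg);
            out.writeByte(STOP);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Skips one value; only the two types used in this sketch are handled. */
    static void skip(DataInputStream in, byte type) throws IOException {
        if (type == STRING) {
            readString(in);
        } else if (type == MAP) {
            byte keyType = in.readByte();
            byte valueType = in.readByte();
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                skip(in, keyType);
                skip(in, valueType);
            }
        } else {
            throw new IOException("unsupported type " + type);
        }
    }

    /** Mimics generated code that knows only field 1 and skips every other id. */
    static String decodeLegacy(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            String arg = null;
            while (true) {
                byte type = in.readByte();
                if (type == STOP) {
                    break;
                }
                short id = in.readShort();
                if (id == 1 && type == STRING) {
                    arg = readString(in);
                } else {
                    skip(in, type);
                }
            }
            return arg;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads field 0 if it leads the struct; returns an empty map otherwise. */
    static Map<String, String> decodeAttachment(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            Map<String, String> result = new LinkedHashMap<>();
            if (in.readByte() == MAP && in.readShort() == 0) {
                in.readByte(); // key type
                in.readByte(); // value type
                int size = in.readInt();
                for (int i = 0; i < size; i++) {
                    String key = readString(in);
                    result.put(key, readString(in));
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> attachment = new LinkedHashMap<>();
        attachment.put("traceId", "abc123");
        byte[] wire = encode(attachment, "hello");
        System.out.println(decodeLegacy(wire));     // hello
        System.out.println(decodeAttachment(wire)); // {traceId=abc123}
    }
}
```

A reader that has never heard of field 0 still recovers its argument, while an attachment-aware reader can pick the map up first. This holds only as long as no real field in the service's IDL uses id 0.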