🔥码云GVP开源项目 12k star Uniapp+ElementUI 功能强大 支持多语言、二开方便! 广告
# Apache HBase 外部 APIs > 贡献者:[xixici](https://github.com/xixici) 本章将介绍通过非 Java 语言和自定义协议访问 Apache HBase。 有关使用本机 HBase API 的信息,请参阅[User API Reference](https://hbase.apache.org/apidocs/index.html) [HBase APIs](#hbase_apis) ## 97\. REST 表述性状态转移 Representational State Transfer (REST)于 2000 年在 HTTP 规范的主要作者之一 Roy Fielding 的博士论文中引入。 REST 本身超出了本文档的范围,但通常,REST 允许通过与 URL 本身绑定的 API 进行客户端 - 服务器交互。 本节讨论如何配置和运行 HBase 附带的 REST 服务器,该服务器将 HBase 表,行,单元和元数据公开为 URL 指定的资源。 还有一系列关于如何使用 Jesse Anderson 的 Apache HBase REST 接口的博客 [How-to: Use the Apache HBase REST Interface](http://blog.cloudera.com/blog/2013/03/how-to-use-the-apache-hbase-rest-interface-part-1/) ### 97.1\. 启动和停止 REST 服务 包含的 REST 服务器可以作为守护程序运行,该守护程序启动嵌入式 Jetty servlet 容器并将 servlet 部署到其中。 使用以下命令之一在前台或后台启动 REST 服务器。 端口是可选的,默认为 8080。 ``` # Foreground $ bin/hbase rest start -p <port> # Background, logging to a file in $HBASE_LOGS_DIR $ bin/hbase-daemon.sh start rest -p <port> ``` 要停止 REST 服务器,请在前台运行时使用 Ctrl-C,如果在后台运行则使用以下命令。 ``` $ bin/hbase-daemon.sh stop rest ``` ### 97.2\. 配置 REST 服务器和客户端 有关为 SSL 配置 REST 服务器和客户端以及为 REST 服务器配置`doAs` 模拟的信息,请参阅[配置 Thrift 网关以代表客户端进行身份验证](#security.gateway.thrift)以及 [Securing Apache HBase](#security) 。 ### 97.3\. 使用 REST 以下示例使用占位符服务器 http://example.com:8000,并且可以使用`curl`或`wget`命令运行以下命令。您可以通过不为纯文本添加头信息来请求纯文本(默认),XML 或 JSON 输出,或者为 XML 添加头信息“Accept:text / xml”,为 JSON 添加“Accept:application / json”或为协议缓冲区添加“Accept: application/x-protobuf”。 > > 除非指定,否则使用`GET`请求进行查询,`PUT`或`POST`请求进行创建或修改,`DELETE`用于删除。 | 端口 | HTTP 名词 | 描述 | 示例 | | :----------------- | :-------- | :------------------ | :----------------------------------------------------------- | | `/version/cluster` | `GET` | 集群上运行 HBase 版本 | ``` curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/version/cluster"``` | | `/status/cluster` | `GET` | 集群状态 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/status/cluster"``` | | `/` | `GET` | 列出所有非系统表格 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/"``` | | 端口 | HTTP 名词 | 描述 | 示例 | | -------------------------------- | --------- | ------------------------ | ------------------------------------------------------------ | | `/namespaces` | `GET` | 列出所有命名空间 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/"``` | | `/namespaces/_namespace_` | `GET` | 描述指定命名空间 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/special_ns"``` | | `/namespaces/_namespace_` | `POST` | 创建命名空间 | ```curl -vi -X POST \ -H "Accept: text/xml" \ "example.com:8000/namespaces/special_ns"``` | | `/namespaces/_namespace_/tables` | `GET` | 列出指定命名空间内所有表 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/special_ns/tables"``` | | `/namespaces/_namespace_` | `PUT` | 更改现有命名空间,未使用 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/special_ns``` | | `/namespaces/_namespace_` | `DELETE` | 删除命名空间.必须为空 | ```curl -vi -X DELETE \ -H "Accept: text/xml" \ "example.com:8000/namespaces/special_ns"``` | | 端口 | HTTP 名词 | 描述 | 示例 | | ------------------ | --------- | ----------------------------------------------------- | ------------------------------------------------------------ | | `/_table_/schema` | `GET` | 描述指定表结构 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/schema"``` | | `/_table_/schema` | `POST` | 更新表结构 | ```curl -vi -X POST \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" KEEP_DELETED_CELLS="true" /></TableSchema>' \ "http://example.com:8000/users/schema"``` | | `/_table_/schema` | `PUT` | 新建表, 更新已有表结构 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" /></TableSchema>' \ "http://example.com:8000/users/schema"``` | | `/_table_/schema` | `DELETE` | 删除表. 必须使用`/_table_/schema` 不仅仅 `/_table_/`. | ```curl -vi -X DELETE \ -H "Accept: text/xml" \ "http://example.com:8000/users/schema"``` | | `/_table_/regions` | `GET` | 列出表区域 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/regions``` | | 端口 | HTTP 名词 | 描述 | 示例 | | ----------------------------------------------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ | | `/_table_/_row_` | `GET` | 获取单行的所有列。值为 Base-64 编码。这需要“Accept”请求标头,其类型可以包含多个列 (like xml, json or protobuf). | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1"``` | | `/_table_/_row_/_column:qualifier_/_timestamp_` | `GET` | 获取单个列的值。值为 Base-64 编码。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a/1458586888395"``` | | `/_table_/_row_/_column:qualifier_` | `GET` | 获取单个列的值。值为 Base-64 编码。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a" 或者 curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a/"``` | | `/_table_/_row_/_column:qualifier_/?v=_number_of_versions_` | `GET` | 获取指定单元格的指定版本数. 获取单个列的值。值为 Base-64 编码。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a?v=2"``` | | 端口 | HTTP 名词 | 描述 | 示例 | | ------------------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ | | `/_table_/scanner/` | `PUT` | 获取 Scanner 对象。 所有其他扫描操作都需要。 将批处理参数调整为扫描应在批处理中返回的行数。 请参阅下一个向扫描仪添加过滤器的示例。 扫描程序端点 URL 作为 HTTP 响应中的 `Location`返回。 此表中的其他示例假定扫描仪端点为`http://example.com:8000/users/scanner/145869072824375522207`。 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<Scanner batch="1"/>' \ "http://example.com:8000/users/scanner/"``` | | `/_table_/scanner/` | `PUT` | 要向扫描程序对象提供过滤器或以任何其他方式配置扫描程序,您可以创建文本文件并将过滤器添加到文件中。 例如,要仅返回以<codeph>u123</codeph>开头的行并使用批量大小为 100 的行,过滤器文件将如下所示:[source,xml] ---- <Scanner batch="100"> <filter> { "type": "PrefixFilter", "value": "u123" } </filter> </Scanner>----将文件传递给`curl`的`-d`参数 请求。 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type:text/xml" \ -d @filter.txt \ "http://example.com:8000/users/scanner/"``` | | `/_table_/scanner/_scanner-id_` | `GET` | 从扫描仪获取下一批。 单元格值是字节编码的。 如果扫描仪已耗尽,则返回 HTTP 状态`204`。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/scanner/145869072824375522207"``` | | `_table_/scanner/_scanner-id_` | `DELETE` | 删除所有扫描并释放资源 | ```curl -vi -X DELETE \ -H "Accept: text/xml" \ "http://example.com:8000/users/scanner/145869072824375522207"``` | | 端口 | HTTP 名词 | 描述 | 示例 | | -------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ | | `/_table_/_row_key_` | `PUT` | 在表中写一行。 行,列限定符和值必须均为 Base-64 编码。 要编码字符串,请使用`base64`命令行实用程序。 要解码字符串,请使用`base64 -d`。 有效负载位于`--data`参数中,`/ users / fakerow`值是占位符。 通过将多行添加到`<CellSet>`元素来插入多行。 您还可以将要插入的数据保存到文件中,并使用`-d @ filename.txt`等语法将其传递给`-d`参数。 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93NQo="><Cell column="Y2Y6ZQo=">dmFsdWU1Cg==</Cell></Row></CellSet>' \ "http://example.com:8000/users/fakerow" 或者 curl -vi -X PUT \ -H "Accept: text/json" \ -H "Content-Type: text/json" \ -d '{"Row":[{"key":"cm93NQo=", "Cell": [{"column":"Y2Y6ZQo=", "$":"dmFsdWU1Cg=="}]}]}'' \ "example.com:8000/users/fakerow"``` | ### 97.4\. REST XML 结构 ``` <schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="RESTSchema"> <element name="Version" type="tns:Version"></element> <complexType name="Version"> <attribute name="REST" type="string"></attribute> <attribute name="JVM" type="string"></attribute> <attribute name="OS" type="string"></attribute> <attribute name="Server" type="string"></attribute> <attribute name="Jersey" type="string"></attribute> </complexType> <element name="TableList" type="tns:TableList"></element> <complexType name="TableList"> <sequence> <element name="table" type="tns:Table" maxOccurs="unbounded" minOccurs="1"></element> </sequence> </complexType> <complexType name="Table"> <sequence> <element name="name" type="string"></element> </sequence> </complexType> <element name="TableInfo" type="tns:TableInfo"></element> <complexType name="TableInfo"> <sequence> <element name="region" type="tns:TableRegion" maxOccurs="unbounded" minOccurs="1"></element> </sequence> <attribute name="name" type="string"></attribute> </complexType> <complexType name="TableRegion"> <attribute name="name" type="string"></attribute> <attribute name="id" type="int"></attribute> <attribute name="startKey" type="base64Binary"></attribute> <attribute name="endKey" type="base64Binary"></attribute> <attribute name="location" type="string"></attribute> </complexType> <element name="TableSchema" type="tns:TableSchema"></element> <complexType name="TableSchema"> <sequence> <element name="column" type="tns:ColumnSchema" maxOccurs="unbounded" minOccurs="1"></element> </sequence> <attribute name="name" type="string"></attribute> <anyAttribute></anyAttribute> </complexType> <complexType name="ColumnSchema"> <attribute name="name" type="string"></attribute> <anyAttribute></anyAttribute> </complexType> <element name="CellSet" type="tns:CellSet"></element> <complexType name="CellSet"> <sequence> <element name="row" type="tns:Row" maxOccurs="unbounded" minOccurs="1"></element> </sequence> </complexType> <element name="Row" type="tns:Row"></element> <complexType name="Row"> <sequence> <element name="key" type="base64Binary"></element> <element name="cell" type="tns:Cell" maxOccurs="unbounded" minOccurs="1"></element> </sequence> </complexType> <element name="Cell" type="tns:Cell"></element> <complexType name="Cell"> <sequence> <element name="value" maxOccurs="1" minOccurs="1"> <simpleType><restriction base="base64Binary"> </simpleType> </element> </sequence> <attribute name="column" type="base64Binary" /> <attribute name="timestamp" type="int" /> </complexType> <element name="Scanner" type="tns:Scanner"></element> <complexType name="Scanner"> <sequence> <element name="column" type="base64Binary" minOccurs="0" maxOccurs="unbounded"></element> </sequence> <sequence> <element name="filter" type="string" minOccurs="0" maxOccurs="1"></element> </sequence> <attribute name="startRow" type="base64Binary"></attribute> <attribute name="endRow" type="base64Binary"></attribute> <attribute name="batch" type="int"></attribute> <attribute name="startTime" type="int"></attribute> <attribute name="endTime" type="int"></attribute> </complexType> <element name="StorageClusterVersion" type="tns:StorageClusterVersion" /> <complexType name="StorageClusterVersion"> <attribute name="version" type="string"></attribute> </complexType> <element name="StorageClusterStatus" type="tns:StorageClusterStatus"> </element> <complexType name="StorageClusterStatus"> <sequence> <element name="liveNode" type="tns:Node" maxOccurs="unbounded" minOccurs="0"> </element> <element name="deadNode" type="string" maxOccurs="unbounded" minOccurs="0"> </element> </sequence> <attribute name="regions" type="int"></attribute> <attribute name="requests" type="int"></attribute> <attribute name="averageLoad" type="float"></attribute> </complexType> <complexType name="Node"> <sequence> <element name="region" type="tns:Region" maxOccurs="unbounded" minOccurs="0"> </element> </sequence> <attribute name="name" type="string"></attribute> <attribute name="startCode" type="int"></attribute> <attribute name="requests" type="int"></attribute> <attribute name="heapSizeMB" type="int"></attribute> <attribute name="maxHeapSizeMB" type="int"></attribute> </complexType> <complexType name="Region"> <attribute name="name" type="base64Binary"></attribute> <attribute name="stores" type="int"></attribute> <attribute name="storefiles" type="int"></attribute> <attribute name="storefileSizeMB" type="int"></attribute> <attribute name="memstoreSizeMB" type="int"></attribute> <attribute name="storefileIndexSizeMB" type="int"></attribute> </complexType> </schema> ``` ### 97.5\. REST Protobufs 结构 ``` message Version { optional string restVersion = 1; optional string jvmVersion = 2; optional string osVersion = 3; optional string serverVersion = 4; optional string jerseyVersion = 5; } message StorageClusterStatus { message Region { required bytes name = 1; optional int32 stores = 2; optional int32 storefiles = 3; optional int32 storefileSizeMB = 4; optional int32 memstoreSizeMB = 5; optional int32 storefileIndexSizeMB = 6; } message Node { required string name = 1; // name:port optional int64 startCode = 2; optional int32 requests = 3; optional int32 heapSizeMB = 4; optional int32 maxHeapSizeMB = 5; repeated Region regions = 6; } // node status repeated Node liveNodes = 1; repeated string deadNodes = 2; // summary statistics optional int32 regions = 3; optional int32 requests = 4; optional double averageLoad = 5; } message TableList { repeated string name = 1; } message TableInfo { required string name = 1; message Region { required string name = 1; optional bytes startKey = 2; optional bytes endKey = 3; optional int64 id = 4; optional string location = 5; } repeated Region regions = 2; } message TableSchema { optional string name = 1; message Attribute { required string name = 1; required string value = 2; } repeated Attribute attrs = 2; repeated ColumnSchema columns = 3; // optional helpful encodings of commonly used attributes optional bool inMemory = 4; optional bool readOnly = 5; } message ColumnSchema { optional string name = 1; message Attribute { required string name = 1; required string value = 2; } repeated Attribute attrs = 2; // optional helpful encodings of commonly used attributes optional int32 ttl = 3; optional int32 maxVersions = 4; optional string compression = 5; } message Cell { optional bytes row = 1; // unused if Cell is in a CellSet optional bytes column = 2; optional int64 timestamp = 3; optional bytes data = 4; } message CellSet { message Row { required bytes key = 1; repeated Cell values = 2; } repeated Row rows = 1; } message Scanner { optional bytes startRow = 1; optional bytes endRow = 2; repeated bytes columns = 3; optional int32 batch = 4; optional int64 startTime = 5; optional int64 endTime = 6; } ``` ## 98\. Thrift 相关文档已转移到 [Thrift API and Filter Language](#thrift). ## 99\. C/C++ Apache HBase 客户端 FB’s Chip Turner 编写了纯正 C/C++ 客户端. [Check it out](https://github.com/hinaria/native-cpp-hbase-client). C++ client 实现. 详见: [HBASE-14850](https://issues.apache.org/jira/browse/HBASE-14850). ## 100\. 将 Java Data Objects (JDO) 和 HBase 一起使用 [Java 数据对象 Java Data Objects (JDO)](https://db.apache.org/jdo/) 是一种访问数据库中持久数据的标准方法,使用普通的旧 Java 对象(POJO)来表示持久数据。 依赖 此代码示例具有以下依赖项: 1. HBase 0.90.x 或更高版本 2. commons-beanutils.jar ([https://commons.apache.org/](https://commons.apache.org/)) 3. commons-pool-1.5.5.jar ([https://commons.apache.org/](https://commons.apache.org/)) 4. transactional-tableindexed for HBase 0.90 ([https://github.com/hbase-trx/hbase-transactional-tableindexed](https://github.com/hbase-trx/hbase-transactional-tableindexed)) 下载 `hbase-jdo` 下载源码 [http://code.google.com/p/hbase-jdo/](http://code.google.com/p/hbase-jdo/). Example 26\. JDO 示例 此示例使用 JDO 创建一个表和一个索引,在表中插入行,获取行,获取列值,执行查询以及执行一些其他 HBase 操作。 ``` package com.apache.hadoop.hbase.client.jdo.examples; import java.io.File; import java.io.FileInputStream; import java.io.InputStream; import java.util.Hashtable; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hbase.client.tableindexed.IndexedTable; import com.apache.hadoop.hbase.client.jdo.AbstractHBaseDBO; import com.apache.hadoop.hbase.client.jdo.HBaseBigFile; import com.apache.hadoop.hbase.client.jdo.HBaseDBOImpl; import com.apache.hadoop.hbase.client.jdo.query.DeleteQuery; import com.apache.hadoop.hbase.client.jdo.query.HBaseOrder; import com.apache.hadoop.hbase.client.jdo.query.HBaseParam; import com.apache.hadoop.hbase.client.jdo.query.InsertQuery; import com.apache.hadoop.hbase.client.jdo.query.QSearch; import com.apache.hadoop.hbase.client.jdo.query.SelectQuery; import com.apache.hadoop.hbase.client.jdo.query.UpdateQuery; /** * Hbase JDO Example. * * dependency library. * - commons-beanutils.jar * - commons-pool-1.5.5.jar * - hbase0.90.0-transactionl.jar * * you can expand Delete,Select,Update,Insert Query classes. * */ public class HBaseExample { public static void main(String[] args) throws Exception { AbstractHBaseDBO dbo = new HBaseDBOImpl(); //*drop if table is already exist.* if(dbo.isTableExist("user")){ dbo.deleteTable("user"); } //*create table* dbo.createTableIfNotExist("user",HBaseOrder.DESC,"account"); //dbo.createTableIfNotExist("user",HBaseOrder.ASC,"account"); //create index. String[] cols={"id","name"}; dbo.addIndexExistingTable("user","account",cols); //insert InsertQuery insert = dbo.createInsertQuery("user"); UserBean bean = new UserBean(); bean.setFamily("account"); bean.setAge(20); bean.setEmail("ncanis@gmail.com"); bean.setId("ncanis"); bean.setName("ncanis"); bean.setPassword("1111"); insert.insert(bean); //select 1 row SelectQuery select = dbo.createSelectQuery("user"); UserBean resultBean = (UserBean)select.select(bean.getRow(),UserBean.class); // select column value. String value = (String)select.selectColumn(bean.getRow(),"account","id",String.class); // search with option (QSearch has EQUAL, NOT_EQUAL, LIKE) // select id,password,name,email from account where id='ncanis' limit startRow,20 HBaseParam param = new HBaseParam(); param.setPage(bean.getRow(),20); param.addColumn("id","password","name","email"); param.addSearchOption("id","ncanis",QSearch.EQUAL); select.search("account", param, UserBean.class); // search column value is existing. boolean isExist = select.existColumnValue("account","id","ncanis".getBytes()); // update password. UpdateQuery update = dbo.createUpdateQuery("user"); Hashtable<String, byte[]> colsTable = new Hashtable<String, byte[]>(); colsTable.put("password","2222".getBytes()); update.update(bean.getRow(),"account",colsTable); //delete DeleteQuery delete = dbo.createDeleteQuery("user"); delete.deleteRow(resultBean.getRow()); //////////////////////////////////// // etc // HTable pool with apache commons pool // borrow and release. HBasePoolManager(maxActive, minIdle etc..) IndexedTable table = dbo.getPool().borrow("user"); dbo.getPool().release(table); // upload bigFile by hadoop directly. HBaseBigFile bigFile = new HBaseBigFile(); File file = new File("doc/movie.avi"); FileInputStream fis = new FileInputStream(file); Path rootPath = new Path("/files/"); String filename = "movie.avi"; bigFile.uploadFile(rootPath,filename,fis,true); // receive file stream from hadoop. Path p = new Path(rootPath,filename); InputStream is = bigFile.path2Stream(p,4096); } } ``` ## 101\. Scala ### 101.1\. 设置类路径 要将 Scala 与 HBase 一起使用,您的 CLASSPATH 必须包含 HBase 的类路径以及代码所需的 Scala JAR。首先,在运行 HBase RegionServer 进程的服务器上使用以下命令,以获取 HBase 的类路径。 ``` $ ps aux |grep regionserver| awk -F 'java.library.path=' {'print $2'} | awk {'print $1'} /usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64 ``` 设置`$CLASSPATH`环境变量以包括您在上一步中找到的路径,以及项目所需的`scala-library.jar`路径和每个与 Scala 相关的其他 JAR。 ``` $ export CLASSPATH=$CLASSPATH:/usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64:/path/to/scala-library.jar ``` ### 101.2\. Scala SBT 文件 `build.sbt` 需要使用 `resolvers` 和 `libraryDependencies` . ``` resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases" resolvers += "Thrift" at "https://people.apache.org/~rawson/repo/" libraryDependencies ++= Seq( "org.apache.hadoop" % "hadoop-core" % "0.20.2", "org.apache.hbase" % "hbase" % "0.90.4" ) ``` ### 101.3\. Scala 示例 此示例列出 HBase 表,创建新表并向其添加行: ``` import org.apache.hadoop.hbase.HBaseConfiguration import org.apache.hadoop.hbase.client.{Connection,ConnectionFactory,HBaseAdmin,HTable,Put,Get} import org.apache.hadoop.hbase.util.Bytes val conf = new HBaseConfiguration() val connection = ConnectionFactory.createConnection(conf); val admin = connection.getAdmin(); // list the tables val listtables=admin.listTables() listtables.foreach(println) // let's insert some data in 'mytable' and get the row val table = new HTable(conf, "mytable") val theput= new Put(Bytes.toBytes("rowkey1")) theput.add(Bytes.toBytes("ids"),Bytes.toBytes("id1"),Bytes.toBytes("one")) table.put(theput) val theget= new Get(Bytes.toBytes("rowkey1")) val result=table.get(theget) val value=result.value() println(Bytes.toString(value)) ``` ## 102\. Jython ### 102.1\. 设置类路径 要将 Jython 与 HBase 一起使用,您的 CLASSPATH 必须包含 HBase 的类路径以及代码所需的 Jython JAR。 将路径设置为包含`Jython.jar`的目录,以及每个项目需要的附加的 Jython 相关 JAR。然后将的 HBASE_CLASSPATH 指向$JYTHON_HOME 。 ``` $ export HBASE_CLASSPATH=/directory/jython.jar ``` 在类路径中使用 HBase 和 Hadoop JAR 启动 Jython shell: $ bin/hbase org.python.util.jython ### 102.2\. Jython 示例 Example 27\. 使用 Jython 创建表,填充,获取和删除表 以下 Jython 代码示例检查表,如果存在,则删除它然后创建它。然后,它使用数据填充表并获取数据。 ``` import java.lang from org.apache.hadoop.hbase import HBaseConfiguration, HTableDescriptor, HColumnDescriptor, TableName from org.apache.hadoop.hbase.client import Admin, Connection, ConnectionFactory, Get, Put, Result, Table from org.apache.hadoop.conf import Configuration # First get a conf object. This will read in the configuration # that is out in your hbase-*.xml files such as location of the # hbase master node. conf = HBaseConfiguration.create() connection = ConnectionFactory.createConnection(conf) admin = connection.getAdmin() # Create a table named 'test' that has a column family # named 'content'. tableName = TableName.valueOf("test") table = connection.getTable(tableName) desc = HTableDescriptor(tableName) desc.addFamily(HColumnDescriptor("content")) # Drop and recreate if it exists if admin.tableExists(tableName): admin.disableTable(tableName) admin.deleteTable(tableName) admin.createTable(desc) # Add content to 'column:' on a row named 'row_x' row = 'row_x' put = Put(row) put.addColumn("content", "qual", "some content") table.put(put) # Now fetch the content just added, returns a byte[] get = Get(row) result = table.get(get) data = java.lang.String(result.getValue("content", "qual"), "UTF8") print "The fetched row contains the value '%s'" % data ``` Example 28\. 使用 Jython 进行表扫描 此示例扫描表并返回与给定族限定符匹配的结果。 ``` import java.lang from org.apache.hadoop.hbase import TableName, HBaseConfiguration from org.apache.hadoop.hbase.client import Connection, ConnectionFactory, Result, ResultScanner, Table, Admin from org.apache.hadoop.conf import Configuration conf = HBaseConfiguration.create() connection = ConnectionFactory.createConnection(conf) admin = connection.getAdmin() tableName = TableName.valueOf('wiki') table = connection.getTable(tableName) cf = "title" attr = "attr" scanner = table.getScanner(cf) while 1: result = scanner.next() if not result: break print java.lang.String(result.row), java.lang.String(result.getValue(cf, attr)) ```