# Apache HBase 外部 APIs
> 贡献者:[xixici](https://github.com/xixici)
本章将介绍通过非 Java 语言和自定义协议访问 Apache HBase。 有关使用本机 HBase API 的信息,请参阅[User API Reference](https://hbase.apache.org/apidocs/index.html) [HBase APIs](#hbase_apis)
## 97\. REST
表述性状态转移 Representational State Transfer (REST)于 2000 年在 HTTP 规范的主要作者之一 Roy Fielding 的博士论文中引入。
REST 本身超出了本文档的范围,但通常,REST 允许通过与 URL 本身绑定的 API 进行客户端 - 服务器交互。 本节讨论如何配置和运行 HBase 附带的 REST 服务器,该服务器将 HBase 表,行,单元和元数据公开为 URL 指定的资源。 还有一系列关于如何使用 Jesse Anderson 的 Apache HBase REST 接口的博客 [How-to: Use the Apache HBase REST Interface](http://blog.cloudera.com/blog/2013/03/how-to-use-the-apache-hbase-rest-interface-part-1/)
### 97.1\. 启动和停止 REST 服务
包含的 REST 服务器可以作为守护程序运行,该守护程序启动嵌入式 Jetty servlet 容器并将 servlet 部署到其中。 使用以下命令之一在前台或后台启动 REST 服务器。 端口是可选的,默认为 8080。
```
# Foreground
$ bin/hbase rest start -p <port>
# Background, logging to a file in $HBASE_LOGS_DIR
$ bin/hbase-daemon.sh start rest -p <port>
```
要停止 REST 服务器,请在前台运行时使用 Ctrl-C,如果在后台运行则使用以下命令。
```
$ bin/hbase-daemon.sh stop rest
```
### 97.2\. 配置 REST 服务器和客户端
有关为 SSL 配置 REST 服务器和客户端以及为 REST 服务器配置`doAs` 模拟的信息,请参阅[配置 Thrift 网关以代表客户端进行身份验证](#security.gateway.thrift)以及 [Securing Apache HBase](#security) 。
### 97.3\. 使用 REST
以下示例使用占位符服务器 http://example.com:8000,并且可以使用`curl`或`wget`命令运行以下命令。您可以通过不为纯文本添加头信息来请求纯文本(默认),XML 或 JSON 输出,或者为 XML 添加头信息“Accept:text / xml”,为 JSON 添加“Accept:application / json”或为协议缓冲区添加“Accept: application/x-protobuf”。
>
> 除非指定,否则使用`GET`请求进行查询,`PUT`或`POST`请求进行创建或修改,`DELETE`用于删除。
| 端口 | HTTP 名词 | 描述 | 示例 |
| :----------------- | :-------- | :------------------ | :----------------------------------------------------------- |
| `/version/cluster` | `GET` | 集群上运行 HBase 版本 | ``` curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/version/cluster"``` |
| `/status/cluster` | `GET` | 集群状态 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/status/cluster"``` |
| `/` | `GET` | 列出所有非系统表格 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/"``` |
| 端口 | HTTP 名词 | 描述 | 示例 |
| -------------------------------- | --------- | ------------------------ | ------------------------------------------------------------ |
| `/namespaces` | `GET` | 列出所有命名空间 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/"``` |
| `/namespaces/_namespace_` | `GET` | 描述指定命名空间 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/special_ns"``` |
| `/namespaces/_namespace_` | `POST` | 创建命名空间 | ```curl -vi -X POST \ -H "Accept: text/xml" \ "example.com:8000/namespaces/special_ns"``` |
| `/namespaces/_namespace_/tables` | `GET` | 列出指定命名空间内所有表 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/special_ns/tables"``` |
| `/namespaces/_namespace_` | `PUT` | 更改现有命名空间,未使用 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ "http://example.com:8000/namespaces/special_ns``` |
| `/namespaces/_namespace_` | `DELETE` | 删除命名空间.必须为空 | ```curl -vi -X DELETE \ -H "Accept: text/xml" \ "example.com:8000/namespaces/special_ns"``` |
| 端口 | HTTP 名词 | 描述 | 示例 |
| ------------------ | --------- | ----------------------------------------------------- | ------------------------------------------------------------ |
| `/_table_/schema` | `GET` | 描述指定表结构 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/schema"``` |
| `/_table_/schema` | `POST` | 更新表结构 | ```curl -vi -X POST \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" KEEP_DELETED_CELLS="true" /></TableSchema>' \ "http://example.com:8000/users/schema"``` |
| `/_table_/schema` | `PUT` | 新建表, 更新已有表结构 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" /></TableSchema>' \ "http://example.com:8000/users/schema"``` |
| `/_table_/schema` | `DELETE` | 删除表. 必须使用`/_table_/schema` 不仅仅 `/_table_/`. | ```curl -vi -X DELETE \ -H "Accept: text/xml" \ "http://example.com:8000/users/schema"``` |
| `/_table_/regions` | `GET` | 列出表区域 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/regions``` |
| 端口 | HTTP 名词 | 描述 | 示例 |
| ----------------------------------------------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| `/_table_/_row_` | `GET` | 获取单行的所有列。值为 Base-64 编码。这需要“Accept”请求标头,其类型可以包含多个列 (like xml, json or protobuf). | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1"``` |
| `/_table_/_row_/_column:qualifier_/_timestamp_` | `GET` | 获取单个列的值。值为 Base-64 编码。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a/1458586888395"``` |
| `/_table_/_row_/_column:qualifier_` | `GET` | 获取单个列的值。值为 Base-64 编码。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a" 或者 curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a/"``` |
| `/_table_/_row_/_column:qualifier_/?v=_number_of_versions_` | `GET` | 获取指定单元格的指定版本数. 获取单个列的值。值为 Base-64 编码。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/row1/cf:a?v=2"``` |
| 端口 | HTTP 名词 | 描述 | 示例 |
| ------------------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| `/_table_/scanner/` | `PUT` | 获取 Scanner 对象。 所有其他扫描操作都需要。 将批处理参数调整为扫描应在批处理中返回的行数。 请参阅下一个向扫描仪添加过滤器的示例。 扫描程序端点 URL 作为 HTTP 响应中的 `Location`返回。 此表中的其他示例假定扫描仪端点为`http://example.com:8000/users/scanner/145869072824375522207`。 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<Scanner batch="1"/>' \ "http://example.com:8000/users/scanner/"``` |
| `/_table_/scanner/` | `PUT` | 要向扫描程序对象提供过滤器或以任何其他方式配置扫描程序,您可以创建文本文件并将过滤器添加到文件中。 例如,要仅返回以<codeph>u123</codeph>开头的行并使用批量大小为 100 的行,过滤器文件将如下所示:[source,xml] ---- <Scanner batch="100"> <filter> { "type": "PrefixFilter", "value": "u123" } </filter> </Scanner>----将文件传递给`curl`的`-d`参数 请求。 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type:text/xml" \ -d @filter.txt \ "http://example.com:8000/users/scanner/"``` |
| `/_table_/scanner/_scanner-id_` | `GET` | 从扫描仪获取下一批。 单元格值是字节编码的。 如果扫描仪已耗尽,则返回 HTTP 状态`204`。 | ```curl -vi -X GET \ -H "Accept: text/xml" \ "http://example.com:8000/users/scanner/145869072824375522207"``` |
| `_table_/scanner/_scanner-id_` | `DELETE` | 删除所有扫描并释放资源 | ```curl -vi -X DELETE \ -H "Accept: text/xml" \ "http://example.com:8000/users/scanner/145869072824375522207"``` |
| 端口 | HTTP 名词 | 描述 | 示例 |
| -------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| `/_table_/_row_key_` | `PUT` | 在表中写一行。 行,列限定符和值必须均为 Base-64 编码。 要编码字符串,请使用`base64`命令行实用程序。 要解码字符串,请使用`base64 -d`。 有效负载位于`--data`参数中,`/ users / fakerow`值是占位符。 通过将多行添加到`<CellSet>`元素来插入多行。 您还可以将要插入的数据保存到文件中,并使用`-d @ filename.txt`等语法将其传递给`-d`参数。 | ```curl -vi -X PUT \ -H "Accept: text/xml" \ -H "Content-Type: text/xml" \ -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93NQo="><Cell column="Y2Y6ZQo=">dmFsdWU1Cg==</Cell></Row></CellSet>' \ "http://example.com:8000/users/fakerow" 或者 curl -vi -X PUT \ -H "Accept: text/json" \ -H "Content-Type: text/json" \ -d '{"Row":[{"key":"cm93NQo=", "Cell": [{"column":"Y2Y6ZQo=", "$":"dmFsdWU1Cg=="}]}]}'' \ "example.com:8000/users/fakerow"``` |
### 97.4\. REST XML 结构
```
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="RESTSchema">
<element name="Version" type="tns:Version"></element>
<complexType name="Version">
<attribute name="REST" type="string"></attribute>
<attribute name="JVM" type="string"></attribute>
<attribute name="OS" type="string"></attribute>
<attribute name="Server" type="string"></attribute>
<attribute name="Jersey" type="string"></attribute>
</complexType>
<element name="TableList" type="tns:TableList"></element>
<complexType name="TableList">
<sequence>
<element name="table" type="tns:Table" maxOccurs="unbounded" minOccurs="1"></element>
</sequence>
</complexType>
<complexType name="Table">
<sequence>
<element name="name" type="string"></element>
</sequence>
</complexType>
<element name="TableInfo" type="tns:TableInfo"></element>
<complexType name="TableInfo">
<sequence>
<element name="region" type="tns:TableRegion" maxOccurs="unbounded" minOccurs="1"></element>
</sequence>
<attribute name="name" type="string"></attribute>
</complexType>
<complexType name="TableRegion">
<attribute name="name" type="string"></attribute>
<attribute name="id" type="int"></attribute>
<attribute name="startKey" type="base64Binary"></attribute>
<attribute name="endKey" type="base64Binary"></attribute>
<attribute name="location" type="string"></attribute>
</complexType>
<element name="TableSchema" type="tns:TableSchema"></element>
<complexType name="TableSchema">
<sequence>
<element name="column" type="tns:ColumnSchema" maxOccurs="unbounded" minOccurs="1"></element>
</sequence>
<attribute name="name" type="string"></attribute>
<anyAttribute></anyAttribute>
</complexType>
<complexType name="ColumnSchema">
<attribute name="name" type="string"></attribute>
<anyAttribute></anyAttribute>
</complexType>
<element name="CellSet" type="tns:CellSet"></element>
<complexType name="CellSet">
<sequence>
<element name="row" type="tns:Row" maxOccurs="unbounded" minOccurs="1"></element>
</sequence>
</complexType>
<element name="Row" type="tns:Row"></element>
<complexType name="Row">
<sequence>
<element name="key" type="base64Binary"></element>
<element name="cell" type="tns:Cell" maxOccurs="unbounded" minOccurs="1"></element>
</sequence>
</complexType>
<element name="Cell" type="tns:Cell"></element>
<complexType name="Cell">
<sequence>
<element name="value" maxOccurs="1" minOccurs="1">
<simpleType><restriction base="base64Binary">
</simpleType>
</element>
</sequence>
<attribute name="column" type="base64Binary" />
<attribute name="timestamp" type="int" />
</complexType>
<element name="Scanner" type="tns:Scanner"></element>
<complexType name="Scanner">
<sequence>
<element name="column" type="base64Binary" minOccurs="0" maxOccurs="unbounded"></element>
</sequence>
<sequence>
<element name="filter" type="string" minOccurs="0" maxOccurs="1"></element>
</sequence>
<attribute name="startRow" type="base64Binary"></attribute>
<attribute name="endRow" type="base64Binary"></attribute>
<attribute name="batch" type="int"></attribute>
<attribute name="startTime" type="int"></attribute>
<attribute name="endTime" type="int"></attribute>
</complexType>
<element name="StorageClusterVersion" type="tns:StorageClusterVersion" />
<complexType name="StorageClusterVersion">
<attribute name="version" type="string"></attribute>
</complexType>
<element name="StorageClusterStatus"
type="tns:StorageClusterStatus">
</element>
<complexType name="StorageClusterStatus">
<sequence>
<element name="liveNode" type="tns:Node"
maxOccurs="unbounded" minOccurs="0">
</element>
<element name="deadNode" type="string" maxOccurs="unbounded"
minOccurs="0">
</element>
</sequence>
<attribute name="regions" type="int"></attribute>
<attribute name="requests" type="int"></attribute>
<attribute name="averageLoad" type="float"></attribute>
</complexType>
<complexType name="Node">
<sequence>
<element name="region" type="tns:Region"
maxOccurs="unbounded" minOccurs="0">
</element>
</sequence>
<attribute name="name" type="string"></attribute>
<attribute name="startCode" type="int"></attribute>
<attribute name="requests" type="int"></attribute>
<attribute name="heapSizeMB" type="int"></attribute>
<attribute name="maxHeapSizeMB" type="int"></attribute>
</complexType>
<complexType name="Region">
<attribute name="name" type="base64Binary"></attribute>
<attribute name="stores" type="int"></attribute>
<attribute name="storefiles" type="int"></attribute>
<attribute name="storefileSizeMB" type="int"></attribute>
<attribute name="memstoreSizeMB" type="int"></attribute>
<attribute name="storefileIndexSizeMB" type="int"></attribute>
</complexType>
</schema>
```
### 97.5\. REST Protobufs 结构
```
message Version {
optional string restVersion = 1;
optional string jvmVersion = 2;
optional string osVersion = 3;
optional string serverVersion = 4;
optional string jerseyVersion = 5;
}
message StorageClusterStatus {
message Region {
required bytes name = 1;
optional int32 stores = 2;
optional int32 storefiles = 3;
optional int32 storefileSizeMB = 4;
optional int32 memstoreSizeMB = 5;
optional int32 storefileIndexSizeMB = 6;
}
message Node {
required string name = 1; // name:port
optional int64 startCode = 2;
optional int32 requests = 3;
optional int32 heapSizeMB = 4;
optional int32 maxHeapSizeMB = 5;
repeated Region regions = 6;
}
// node status
repeated Node liveNodes = 1;
repeated string deadNodes = 2;
// summary statistics
optional int32 regions = 3;
optional int32 requests = 4;
optional double averageLoad = 5;
}
message TableList {
repeated string name = 1;
}
message TableInfo {
required string name = 1;
message Region {
required string name = 1;
optional bytes startKey = 2;
optional bytes endKey = 3;
optional int64 id = 4;
optional string location = 5;
}
repeated Region regions = 2;
}
message TableSchema {
optional string name = 1;
message Attribute {
required string name = 1;
required string value = 2;
}
repeated Attribute attrs = 2;
repeated ColumnSchema columns = 3;
// optional helpful encodings of commonly used attributes
optional bool inMemory = 4;
optional bool readOnly = 5;
}
message ColumnSchema {
optional string name = 1;
message Attribute {
required string name = 1;
required string value = 2;
}
repeated Attribute attrs = 2;
// optional helpful encodings of commonly used attributes
optional int32 ttl = 3;
optional int32 maxVersions = 4;
optional string compression = 5;
}
message Cell {
optional bytes row = 1; // unused if Cell is in a CellSet
optional bytes column = 2;
optional int64 timestamp = 3;
optional bytes data = 4;
}
message CellSet {
message Row {
required bytes key = 1;
repeated Cell values = 2;
}
repeated Row rows = 1;
}
message Scanner {
optional bytes startRow = 1;
optional bytes endRow = 2;
repeated bytes columns = 3;
optional int32 batch = 4;
optional int64 startTime = 5;
optional int64 endTime = 6;
}
```
## 98\. Thrift
相关文档已转移到 [Thrift API and Filter Language](#thrift).
## 99\. C/C++ Apache HBase 客户端
FB’s Chip Turner 编写了纯正 C/C++ 客户端. [Check it out](https://github.com/hinaria/native-cpp-hbase-client).
C++ client 实现. 详见: [HBASE-14850](https://issues.apache.org/jira/browse/HBASE-14850).
## 100\. 将 Java Data Objects (JDO) 和 HBase 一起使用
[Java 数据对象 Java Data Objects (JDO)](https://db.apache.org/jdo/) 是一种访问数据库中持久数据的标准方法,使用普通的旧 Java 对象(POJO)来表示持久数据。
依赖
此代码示例具有以下依赖项:
1. HBase 0.90.x 或更高版本
2. commons-beanutils.jar ([https://commons.apache.org/](https://commons.apache.org/))
3. commons-pool-1.5.5.jar ([https://commons.apache.org/](https://commons.apache.org/))
4. transactional-tableindexed for HBase 0.90 ([https://github.com/hbase-trx/hbase-transactional-tableindexed](https://github.com/hbase-trx/hbase-transactional-tableindexed))
下载 `hbase-jdo`
下载源码 [http://code.google.com/p/hbase-jdo/](http://code.google.com/p/hbase-jdo/).
Example 26\. JDO 示例
此示例使用 JDO 创建一个表和一个索引,在表中插入行,获取行,获取列值,执行查询以及执行一些其他 HBase 操作。
```
package com.apache.hadoop.hbase.client.jdo.examples;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Hashtable;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.tableindexed.IndexedTable;
import com.apache.hadoop.hbase.client.jdo.AbstractHBaseDBO;
import com.apache.hadoop.hbase.client.jdo.HBaseBigFile;
import com.apache.hadoop.hbase.client.jdo.HBaseDBOImpl;
import com.apache.hadoop.hbase.client.jdo.query.DeleteQuery;
import com.apache.hadoop.hbase.client.jdo.query.HBaseOrder;
import com.apache.hadoop.hbase.client.jdo.query.HBaseParam;
import com.apache.hadoop.hbase.client.jdo.query.InsertQuery;
import com.apache.hadoop.hbase.client.jdo.query.QSearch;
import com.apache.hadoop.hbase.client.jdo.query.SelectQuery;
import com.apache.hadoop.hbase.client.jdo.query.UpdateQuery;
/**
* Hbase JDO Example.
*
* dependency library.
* - commons-beanutils.jar
* - commons-pool-1.5.5.jar
* - hbase0.90.0-transactionl.jar
*
* you can expand Delete,Select,Update,Insert Query classes.
*
*/
public class HBaseExample {
public static void main(String[] args) throws Exception {
AbstractHBaseDBO dbo = new HBaseDBOImpl();
//*drop if table is already exist.*
if(dbo.isTableExist("user")){
dbo.deleteTable("user");
}
//*create table*
dbo.createTableIfNotExist("user",HBaseOrder.DESC,"account");
//dbo.createTableIfNotExist("user",HBaseOrder.ASC,"account");
//create index.
String[] cols={"id","name"};
dbo.addIndexExistingTable("user","account",cols);
//insert
InsertQuery insert = dbo.createInsertQuery("user");
UserBean bean = new UserBean();
bean.setFamily("account");
bean.setAge(20);
bean.setEmail("ncanis@gmail.com");
bean.setId("ncanis");
bean.setName("ncanis");
bean.setPassword("1111");
insert.insert(bean);
//select 1 row
SelectQuery select = dbo.createSelectQuery("user");
UserBean resultBean = (UserBean)select.select(bean.getRow(),UserBean.class);
// select column value.
String value = (String)select.selectColumn(bean.getRow(),"account","id",String.class);
// search with option (QSearch has EQUAL, NOT_EQUAL, LIKE)
// select id,password,name,email from account where id='ncanis' limit startRow,20
HBaseParam param = new HBaseParam();
param.setPage(bean.getRow(),20);
param.addColumn("id","password","name","email");
param.addSearchOption("id","ncanis",QSearch.EQUAL);
select.search("account", param, UserBean.class);
// search column value is existing.
boolean isExist = select.existColumnValue("account","id","ncanis".getBytes());
// update password.
UpdateQuery update = dbo.createUpdateQuery("user");
Hashtable<String, byte[]> colsTable = new Hashtable<String, byte[]>();
colsTable.put("password","2222".getBytes());
update.update(bean.getRow(),"account",colsTable);
//delete
DeleteQuery delete = dbo.createDeleteQuery("user");
delete.deleteRow(resultBean.getRow());
////////////////////////////////////
// etc
// HTable pool with apache commons pool
// borrow and release. HBasePoolManager(maxActive, minIdle etc..)
IndexedTable table = dbo.getPool().borrow("user");
dbo.getPool().release(table);
// upload bigFile by hadoop directly.
HBaseBigFile bigFile = new HBaseBigFile();
File file = new File("doc/movie.avi");
FileInputStream fis = new FileInputStream(file);
Path rootPath = new Path("/files/");
String filename = "movie.avi";
bigFile.uploadFile(rootPath,filename,fis,true);
// receive file stream from hadoop.
Path p = new Path(rootPath,filename);
InputStream is = bigFile.path2Stream(p,4096);
}
}
```
## 101\. Scala
### 101.1\. 设置类路径
要将 Scala 与 HBase 一起使用,您的 CLASSPATH 必须包含 HBase 的类路径以及代码所需的 Scala JAR。首先,在运行 HBase RegionServer 进程的服务器上使用以下命令,以获取 HBase 的类路径。
```
$ ps aux |grep regionserver| awk -F 'java.library.path=' {'print $2'} | awk {'print $1'}
/usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64
```
设置`$CLASSPATH`环境变量以包括您在上一步中找到的路径,以及项目所需的`scala-library.jar`路径和每个与 Scala 相关的其他 JAR。
```
$ export CLASSPATH=$CLASSPATH:/usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64:/path/to/scala-library.jar
```
### 101.2\. Scala SBT 文件
`build.sbt` 需要使用 `resolvers` 和 `libraryDependencies` .
```
resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases"
resolvers += "Thrift" at "https://people.apache.org/~rawson/repo/"
libraryDependencies ++= Seq(
"org.apache.hadoop" % "hadoop-core" % "0.20.2",
"org.apache.hbase" % "hbase" % "0.90.4"
)
```
### 101.3\. Scala 示例
此示例列出 HBase 表,创建新表并向其添加行:
```
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{Connection,ConnectionFactory,HBaseAdmin,HTable,Put,Get}
import org.apache.hadoop.hbase.util.Bytes
val conf = new HBaseConfiguration()
val connection = ConnectionFactory.createConnection(conf);
val admin = connection.getAdmin();
// list the tables
val listtables=admin.listTables()
listtables.foreach(println)
// let's insert some data in 'mytable' and get the row
val table = new HTable(conf, "mytable")
val theput= new Put(Bytes.toBytes("rowkey1"))
theput.add(Bytes.toBytes("ids"),Bytes.toBytes("id1"),Bytes.toBytes("one"))
table.put(theput)
val theget= new Get(Bytes.toBytes("rowkey1"))
val result=table.get(theget)
val value=result.value()
println(Bytes.toString(value))
```
## 102\. Jython
### 102.1\. 设置类路径
要将 Jython 与 HBase 一起使用,您的 CLASSPATH 必须包含 HBase 的类路径以及代码所需的 Jython JAR。
将路径设置为包含`Jython.jar`的目录,以及每个项目需要的附加的 Jython 相关 JAR。然后将的 HBASE_CLASSPATH 指向$JYTHON_HOME 。
```
$ export HBASE_CLASSPATH=/directory/jython.jar
```
在类路径中使用 HBase 和 Hadoop JAR 启动 Jython shell: $ bin/hbase org.python.util.jython
### 102.2\. Jython 示例
Example 27\. 使用 Jython 创建表,填充,获取和删除表
以下 Jython 代码示例检查表,如果存在,则删除它然后创建它。然后,它使用数据填充表并获取数据。
```
import java.lang
from org.apache.hadoop.hbase import HBaseConfiguration, HTableDescriptor, HColumnDescriptor, TableName
from org.apache.hadoop.hbase.client import Admin, Connection, ConnectionFactory, Get, Put, Result, Table
from org.apache.hadoop.conf import Configuration
# First get a conf object. This will read in the configuration
# that is out in your hbase-*.xml files such as location of the
# hbase master node.
conf = HBaseConfiguration.create()
connection = ConnectionFactory.createConnection(conf)
admin = connection.getAdmin()
# Create a table named 'test' that has a column family
# named 'content'.
tableName = TableName.valueOf("test")
table = connection.getTable(tableName)
desc = HTableDescriptor(tableName)
desc.addFamily(HColumnDescriptor("content"))
# Drop and recreate if it exists
if admin.tableExists(tableName):
admin.disableTable(tableName)
admin.deleteTable(tableName)
admin.createTable(desc)
# Add content to 'column:' on a row named 'row_x'
row = 'row_x'
put = Put(row)
put.addColumn("content", "qual", "some content")
table.put(put)
# Now fetch the content just added, returns a byte[]
get = Get(row)
result = table.get(get)
data = java.lang.String(result.getValue("content", "qual"), "UTF8")
print "The fetched row contains the value '%s'" % data
```
Example 28\. 使用 Jython 进行表扫描
此示例扫描表并返回与给定族限定符匹配的结果。
```
import java.lang
from org.apache.hadoop.hbase import TableName, HBaseConfiguration
from org.apache.hadoop.hbase.client import Connection, ConnectionFactory, Result, ResultScanner, Table, Admin
from org.apache.hadoop.conf import Configuration
conf = HBaseConfiguration.create()
connection = ConnectionFactory.createConnection(conf)
admin = connection.getAdmin()
tableName = TableName.valueOf('wiki')
table = connection.getTable(tableName)
cf = "title"
attr = "attr"
scanner = table.getScanner(cf)
while 1:
result = scanner.next()
if not result:
break
print java.lang.String(result.row), java.lang.String(result.getValue(cf, attr))
```
- HBase™ 中文参考指南 3.0
- Preface
- Getting Started
- Apache HBase Configuration
- Upgrading
- The Apache HBase Shell
- Data Model
- HBase and Schema Design
- RegionServer Sizing Rules of Thumb
- HBase and MapReduce
- Securing Apache HBase
- Architecture
- In-memory Compaction
- Backup and Restore
- Synchronous Replication
- Apache HBase APIs
- Apache HBase External APIs
- Thrift API and Filter Language
- HBase and Spark
- Apache HBase Coprocessors
- Apache HBase Performance Tuning
- Troubleshooting and Debugging Apache HBase
- Apache HBase Case Studies
- Apache HBase Operational Management
- Building and Developing Apache HBase
- Unit Testing HBase Applications
- Protobuf in HBase
- Procedure Framework (Pv2): HBASE-12439
- AMv2 Description for Devs
- ZooKeeper
- Community
- Appendix