## Background
Commercial databases usually ship with very complete backup functionality.
Oracle's RMAN, for example, is extremely powerful: it has supported full, incremental, and archive-log backup modes, as well as compression, for many years.
It can also store its metadata in a database, which makes management convenient: how many archive logs to retain, how backup sets are managed, which point in time to recover to, purging the backups taken before that point, and so on.
Among open-source databases, few offer a feature set as rich as the commercial products; PostgreSQL is one of the most complete.
As the most advanced open-source database, PostgreSQL's backup capabilities have caught up with the commercial databases.
PostgreSQL now has tooling comparable to Oracle's RMAN: full, incremental, and archive-log backup modes, compression, backup-set management, and more.
With block-level incremental backup, backing up a very large database is no longer as painful as it is when only full and archive-log backups are available.
## How does PostgreSQL make block-level incremental backup possible?
The layout of a data page looks like this:
~~~
* +----------------+---------------------------------+
* | PageHeaderData | linp1 linp2 linp3 ... |
* +-----------+----+---------------------------------+
* | ... linpN | |
* +-----------+--------------------------------------+
* | ^ pd_lower |
* | |
* | v pd_upper |
* +-------------+------------------------------------+
* | | tupleN ... |
* +-------------+------------------+-----------------+
* | ... tuple3 tuple2 tuple1 | "special space" |
* +--------------------------------+-----------------+
~~~
The page header data structure:
~~~
typedef struct PageHeaderData
{
/* XXX LSN is member of *any* block, not only page-organized ones */
PageXLogRecPtr pd_lsn; /* LSN: next byte after last byte of xlog
* record for last change to this page */
uint16 pd_checksum; /* checksum */
uint16 pd_flags; /* flag bits, see below */
LocationIndex pd_lower; /* offset to start of free space */
LocationIndex pd_upper; /* offset to end of free space */
LocationIndex pd_special; /* offset to start of special space */
uint16 pd_pagesize_version;
TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
ItemIdData pd_linp[FLEXIBLE_ARRAY_MEMBER]; /* line pointer array */
} PageHeaderData;
~~~
If a relation is persistent (logged), every change to it is recorded in the WAL (REDO). The pd_lsn in the page header is the position of the first byte past the end of the WAL record produced by the last change to the page.
In other words, once the WAL has been flushed to a position greater than or equal to the page's pd_lsn, the change to this page can be considered safely on disk.
~~~
* pd_lsn - identifies xlog record for last change to this page.
* pd_checksum - page checksum, if set.
* pd_flags - flag bits.
* pd_lower - offset to start of free space.
* pd_upper - offset to end of free space.
* pd_special - offset to start of special space.
* pd_pagesize_version - size in bytes and page layout version number.
* pd_prune_xid - oldest XID among potentially prunable tuples on page.
~~~
Since every change to a block also updates its LSN, we can compare each page's LSN against the global LSN recorded when the previous backup started to decide whether the page has been modified since then.
Pages that changed are backed up; pages that did not are skipped. That is exactly how block-level incremental backup works.
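The decision itself is tiny. Below is a minimal, self-contained C sketch (not pg_rman's actual code) of that comparison: read an 8 KB page, rebuild the 64-bit LSN from the two 32-bit halves that open the PageHeaderData shown above, and keep the page only if it is newer than the previous backup's start LSN. The names page_lsn and page_needs_backup are made up for illustration.
~~~
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192

/* Mirrors PageXLogRecPtr, the first field of PageHeaderData: {xlogid, xrecoff}. */
typedef struct
{
	uint32_t	xlogid;		/* high 32 bits of the LSN */
	uint32_t	xrecoff;	/* low 32 bits of the LSN */
} PageXLogRecPtr;

/* Rebuild the 64-bit LSN from the first 8 bytes of a page. */
static uint64_t
page_lsn(const char *page)
{
	PageXLogRecPtr pd_lsn;

	memcpy(&pd_lsn, page, sizeof(pd_lsn));
	return ((uint64_t) pd_lsn.xlogid << 32) | pd_lsn.xrecoff;
}

/* A page belongs in the incremental backup only if it changed after the previous backup started. */
static int
page_needs_backup(const char *page, uint64_t prev_start_lsn)
{
	return page_lsn(page) > prev_start_lsn;
}

int
main(void)
{
	char		page[BLCKSZ] = {0};
	PageXLogRecPtr pd_lsn = {0x46, 0xdf000210};	/* pretend the last change ended at 46/DF000210 */
	uint64_t	prev_start_lsn = ((uint64_t) 0x46 << 32) | 0xdf000108;	/* previous backup's START_LSN=46/df000108 */

	memcpy(page, &pd_lsn, sizeof(pd_lsn));
	printf("page needs backup: %d\n", page_needs_backup(page, prev_start_lsn));
	return 0;
}
~~~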
## Introducing pg_rman
pg_rman is an [open-source PostgreSQL backup management tool](https://github.com/ossc-db/pg_rman), similar to Oracle's RMAN.
See the [documentation](http://ossc-db.github.io/pg_rman/index.html).
pg_rman uses the pg_start_backup(), copy, pg_stop_backup() style of backup.
It does not speak the streaming replication protocol; it copies files, so pg_rman must run on the same host as the database.
When run on a standby node, pg_rman connects to the primary over the network to execute pg_start_backup and pg_stop_backup.
pg_rman is very simple to use and supports the following run modes.
| Mode | Description |
|------|-------------|
| init | Initialize a backup catalog. |
| backup | Take an online backup. |
| restore | Perform a restore. |
| show | Show the backup history. The detail option shows additional information for each backup. |
| validate | Validate backup files. Backups that have not been validated cannot be used for restore or as the base of an incremental backup. |
| delete | Delete backup files. |
| purge | Remove deleted backups from the backup catalog. |
## Prerequisites for using pg_rman
- WAL archiving must be enabled.
- csvlog must be configured.
A recommended configuration:
~~~
postgres=# show log_destination ;
log_destination
-----------------
csvlog
(1 row)
postgres=# SHOW log_directory ;
log_directory
---------------
pg_log
(1 row)
postgres=# SHOW archive_command ;
archive_command
--------------------------------------------
cp %p /data04/digoal/arc_log/%f
(1 row)
~~~
## Initializing the pg_rman backup catalog
First, initialize a backup catalog. This is simply a directory that will hold the backup files.
The same directory also stores some metadata: the backup configuration file, the database system identifier, timeline history files, and so on.
The init command takes two parameters: the backup target directory and the database's $PGDATA.
~~~
$ mkdir /data05/digoal/pgbbk
$ /home/digoal/pgsql9.5/bin/pg_rman init -B /data05/digoal/pgbbk -D /data04/digoal/pg_root
INFO: ARCLOG_PATH is set to '/data04/digoal/arc_log'
INFO: SRVLOG_PATH is set to '/data04/digoal/pg_root/pg_log'
~~~
The backup metadata it generates looks like this:
~~~
[digoal@iZ28tqoemgtZ ~]$ cd /data05/digoal/pgbbk/
[digoal@iZ28tqoemgtZ pgbbk]$ ll
total 16
drwx------ 4 digoal digoal 4096 Aug 26 19:29 backup
-rw-rw-r-- 1 digoal digoal 82 Aug 26 19:29 pg_rman.ini
-rw-rw-r-- 1 digoal digoal 40 Aug 26 19:29 system_identifier
drwx------ 2 digoal digoal 4096 Aug 26 19:29 timeline_history
~~~
The generated configuration file:
~~~
$ cat pg_rman.ini
ARCLOG_PATH='/data04/digoal/arc_log'
SRVLOG_PATH='/data04/digoal/pg_root/pg_log'
~~~
Options you intend to use later can be written into this configuration file or passed on the pg_rman command line.
In the tests below I pass them on the command line.
The generated database system identifier is used to tell whether backups belong to the same database, so that one database's backups are not clobbered by another's.
~~~
$ cat system_identifier
SYSTEM_IDENTIFIER='6318621837015461309'
~~~
It matches the system identifier stored in the control file.
Note:
pg_rman reads the values of log_directory and archive_command only from postgresql.conf.
If your PostgreSQL configuration is spread across include files or set in postgresql.auto.conf, these two values will not be picked up correctly.
It is therefore recommended to set these parameters in postgresql.conf itself rather than in any other configuration file.
## pg_rman command-line usage
~~~
pg_rman manage backup/recovery of PostgreSQL database.
Usage:
pg_rman OPTION init
pg_rman OPTION backup
pg_rman OPTION restore
pg_rman OPTION show [DATE]
pg_rman OPTION show detail [DATE]
pg_rman OPTION validate [DATE]
pg_rman OPTION delete DATE
pg_rman OPTION purge
Common Options:
-D, --pgdata=PATH location of the database storage area
-A, --arclog-path=PATH location of archive WAL storage area
-S, --srvlog-path=PATH location of server log storage area
-B, --backup-path=PATH location of the backup storage area
-c, --check show what would have been done
-v, --verbose show what detail messages
-P, --progress show progress of processed files
Backup options:
-b, --backup-mode=MODE full, incremental, or archive
-s, --with-serverlog also backup server log files
-Z, --compress-data compress data backup with zlib
-C, --smooth-checkpoint do smooth checkpoint before backup
-F, --full-backup-on-error switch to full backup mode
if pg_rman cannot find validate full backup
on current timeline
NOTE: this option is only used in --backup-mode=incremental or archive.
--keep-data-generations=NUM keep NUM generations of full data backup
--keep-data-days=NUM keep enough data backup to recover to N days ago
--keep-arclog-files=NUM keep NUM of archived WAL
--keep-arclog-days=DAY keep archived WAL modified in DAY days
--keep-srvlog-files=NUM keep NUM of serverlogs
--keep-srvlog-days=DAY keep serverlog modified in DAY days
--standby-host=HOSTNAME standby host when taking backup from standby
--standby-port=PORT standby port when taking backup from standby
Restore options:
--recovery-target-time time stamp up to which recovery will proceed
--recovery-target-xid transaction ID up to which recovery will proceed
--recovery-target-inclusive whether we stop just after the recovery target
--recovery-target-timeline recovering into a particular timeline
--hard-copy copying archivelog not symbolic link
Catalog options:
-a, --show-all show deleted backup too
Delete options:
-f, --force forcibly delete backup older than given DATE
Connection options:
-d, --dbname=DBNAME database to connect
-h, --host=HOSTNAME database server host or socket directory
-p, --port=PORT database server port
-U, --username=USERNAME user name to connect as
-w, --no-password never prompt for password
-W, --password force password prompt
Generic options:
-q, --quiet don't show any INFO or DEBUG messages
--debug show DEBUG messages
--help show this help, then exit
--version output version information, then exit
Read the website for details. <http://github.com/ossc-db/pg_rman>
Report bugs to <http://github.com/ossc-db/pg_rman/issues>.
~~~
## Full backup
Supply the necessary parameters and options:
~~~
$ export PGPASSWORD=postgres
$ /home/digoal/pgsql9.5/bin/pg_rman backup \
-B /data05/digoal/pgbbk \
-D /data04/digoal/pg_root \
-b full \
-s \
-Z \
-C \
--keep-data-days=10 \
--keep-arclog-files=15 \
--keep-arclog-days=10 \
--keep-srvlog-files=10 \
--keep-srvlog-days=15 \
-h 127.0.0.1 -p 1921 -U postgres -d postgres
~~~
Output:
~~~
INFO: copying database files
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
INFO: copying archived WAL files
INFO: copying server log files
INFO: backup complete
HINT: Please execute 'pg_rman validate' to verify the files are correctly copied.
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 15, keep days = 10)
INFO: the threshold timestamp calculated by keep days is "2016-08-16 00:00:00"
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10, keep days = 15)
INFO: the threshold timestamp calculated by keep days is "2016-08-11 00:00:00"
INFO: start deleting old backup (keep after = 2016-08-16 00:00:00)
INFO: does not include the backup just taken
WARNING: backup "2016-08-26 19:39:32" is not taken int account
DETAIL: This is not valid backup.
~~~
## Validating the backup set
During a backup, pg_rman records a CRC for every backed-up file so that validate can verify it later.
For example, in one backup set:
~~~
$ less /data05/digoal/pgbbk/20160826/195809/file_database.txt
.s.PGSQL.1921 ? 0 0 0777 2016-08-26 19:27:05
.s.PGSQL.1921.lock f 55 590164837 0600 2016-08-26 19:27:05
PG_VERSION f 12 3872055064 0600 2016-07-28 10:03:42
backup_label f 167 2985542389 0600 2016-08-26 19:58:42
backup_label.old f 155 4273989468 0600 2016-08-23 19:43:32
base d 0 0 0700 2016-08-23 10:28:32
base/1 d 0 0 0700 2016-08-24 16:17:02
base/1/112 f 57 1147028285 0600 2016-07-28 10:03:42
base/1/113 f 57 1147028285 0600 2016-07-28 10:03:42
base/1/1247 F 8178 1875285513 0600 2016-07-29 13:51:29
base/1/1247_fsm f 139 3668812536 0600 2016-07-28 10:03:43
~~~
Column meanings:
path, file type, size, CRC checksum, permissions, mtime; the fourth column is the CRC.
After each backup you must run a validation; otherwise the backup set cannot be used for restore, and incremental backups will not use it as their comparison base.
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman validate -B /data05/digoal/pgbbk
INFO: validate: "2016-08-26 19:39:50" backup, archive log files and server log files by CRC
INFO: backup "2016-08-26 19:39:50" is valid
~~~
Every backup set also contains a backup status file, like this:
~~~
cat /data05/digoal/pgbbk/20160826/201955/backup.ini
# configuration
BACKUP_MODE=INCREMENTAL
FULL_BACKUP_ON_ERROR=false
WITH_SERVERLOG=true
COMPRESS_DATA=true
# result
TIMELINEID=1
START_LSN=46/df000108
STOP_LSN=46/df000210
START_TIME='2016-08-26 20:19:55'
END_TIME='2016-08-26 20:20:48'
RECOVERY_XID=3896508593
RECOVERY_TIME='2016-08-26 20:20:47'
TOTAL_DATA_BYTES=6196524307
READ_DATA_BYTES=3199287520
READ_ARCLOG_BYTES=33554754
READ_SRVLOG_BYTES=0
WRITE_BYTES=125955
BLOCK_SIZE=8192
XLOG_BLOCK_SIZE=8192
STATUS=OK
~~~
This file contains very important information, in particular the LSNs: during a later incremental backup, the start LSN recorded here is compared against each data block's LSN to decide whether the block has changed and therefore needs to be backed up.
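START_LSN and STOP_LSN are stored in PostgreSQL's usual %X/%X text form (high 32 bits / low 32 bits, in hex). Here is a minimal sketch of turning that text back into the 64-bit position that gets compared against each page's pd_lsn; the parse_lsn helper is hypothetical, not part of pg_rman:
~~~
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Parse an LSN written as "hi/lo" in hex, e.g. the START_LSN=46/df000108 above. */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
	unsigned int hi;
	unsigned int lo;

	if (sscanf(text, "%X/%X", &hi, &lo) != 2)
		return 0;
	*lsn = ((uint64_t) hi << 32) | lo;
	return 1;
}

int
main(void)
{
	uint64_t	start_lsn;

	if (parse_lsn("46/df000108", &start_lsn))
		printf("START_LSN = %" PRIX64 " (%" PRIu64 " bytes into the WAL stream)\n",
			   start_lsn, start_lsn);
	return 0;
}
~~~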
## Incremental backup
~~~
$ export PGPASSWORD=postgres
$ /home/digoal/pgsql9.5/bin/pg_rman backup \
-B /data05/digoal/pgbbk \
-D /data04/digoal/pg_root \
-b incremental \
-s \
-Z \
-C \
--keep-data-days=10 \
--keep-arclog-files=15 \
--keep-arclog-days=10 \
--keep-srvlog-files=10 \
--keep-srvlog-days=15 \
-h 127.0.0.1 -p 1921 -U postgres -d postgres
~~~
Incremental backup output:
~~~
INFO: copying database files
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
INFO: copying archived WAL files
INFO: copying server log files
INFO: backup complete
HINT: Please execute 'pg_rman validate' to verify the files are correctly copied.
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 15, keep days = 10)
INFO: the threshold timestamp calculated by keep days is "2016-08-16 00:00:00"
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10, keep days = 15)
INFO: the threshold timestamp calculated by keep days is "2016-08-11 00:00:00"
INFO: start deleting old backup (keep after = 2016-08-16 00:00:00)
INFO: does not include the backup just taken
INFO: backup "2016-08-26 19:39:50" should be kept
DETAIL: This is taken after "2016-08-16 00:00:00".
WARNING: backup "2016-08-26 19:39:32" is not taken int account
DETAIL: This is not valid backup.
~~~
## Validating the backup set
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman validate -B /data05/digoal/pgbbk
INFO: validate: "2016-08-26 19:43:20" backup, archive log files and server log files by CRC
INFO: backup "2016-08-26 19:43:20" is valid
~~~
## Listing backup sets
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman show -B /data05/digoal/pgbbk
==========================================================
StartTime Mode Duration Size TLI Status
==========================================================
2016-08-26 19:43:20 INCR 0m 54kB 1 OK
2016-08-26 19:39:50 FULL 1m 245MB 1 OK
~~~
As you can see, the incremental backup is tiny, because very few blocks changed.
Next, update a few rows of a large table and look again.
~~~
postgres=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+----------+-------+----------+---------+-------------
public | hll_test | table | postgres | 208 kB |
public | t | table | postgres | 3050 MB |
public | tbl1 | table | postgres | 226 MB |
public | tbl2 | table | postgres | 63 MB |
public | test | table | postgres | 120 MB |
(5 rows)
postgres=# with t1 as (select id from t where id between 1 and 1000 limit 10) update t set info='new' where id in (select * from t1);
UPDATE 10
~~~
After the update, take another incremental backup:
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman backup -B /data05/digoal/pgbbk -D /data04/digoal/pg_root -b incremental -s -Z -C --keep-data-days=10 --keep-arclog-files=15 --keep-arclog-days=10 --keep-srvlog-files=10 --keep-srvlog-days=15 -h 127.0.0.1 -p 1921 -U postgres -d postgres
INFO: copying database files
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
INFO: copying archived WAL files
INFO: copying server log files
INFO: backup complete
HINT: Please execute 'pg_rman validate' to verify the files are correctly copied.
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 15, keep days = 10)
INFO: the threshold timestamp calculated by keep days is "2016-08-16 00:00:00"
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10, keep days = 15)
INFO: the threshold timestamp calculated by keep days is "2016-08-11 00:00:00"
INFO: start deleting old backup (keep after = 2016-08-16 00:00:00)
INFO: does not include the backup just taken
INFO: backup "2016-08-26 19:58:09" should be kept
DETAIL: This is taken after "2016-08-16 00:00:00".
WARNING: backup "2016-08-26 19:56:54" is not taken int account
DETAIL: This is not valid backup.
INFO: backup "2016-08-26 19:43:20" should be kept
DETAIL: This is taken after "2016-08-16 00:00:00".
INFO: backup "2016-08-26 19:39:50" should be kept
DETAIL: This is taken after "2016-08-16 00:00:00".
WARNING: backup "2016-08-26 19:39:32" is not taken int account
DETAIL: This is not valid backup.
~~~
Validate the backup set:
~~~
[digoal@iZ28tqoemgtZ pg_rman]$ /home/digoal/pgsql9.5/bin/pg_rman validate -B /data05/digoal/pgbbk
INFO: validate: "2016-08-26 20:19:55" backup, archive log files and server log files by CRC
INFO: backup "2016-08-26 20:19:55" is valid
~~~
List the current backups:
~~~
[digoal@iZ28tqoemgtZ pg_rman]$ /home/digoal/pgsql9.5/bin/pg_rman show -B /data05/digoal/pgbbk
==========================================================
StartTime Mode Duration Size TLI Status
==========================================================
2016-08-26 20:19:55 INCR 0m 125kB 1 OK
2016-08-26 19:58:09 FULL 11m 3094MB 1 OK
2016-08-26 19:56:54 FULL 1m 0B 0 ERROR
2016-08-26 19:43:20 INCR 0m 54kB 1 OK
2016-08-26 19:39:50 FULL 1m 245MB 1 OK
2016-08-26 19:39:32 FULL 0m 0B 0 ERROR
~~~
The incremental backup files are very small, because only a few data blocks changed.
## Deleting backup sets from the catalog by timestamp
Suppose I only need my backups to be able to recover to 2016-08-26 19:59:00; every earlier backup that is not needed to reach that point can be deleted.
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman delete "2016-08-26 19:59:00" -B /data05/digoal/pgbbk
WARNING: cannot delete backup with start time "2016-08-26 19:58:09"
DETAIL: This is the latest full backup necessary for successful recovery.
INFO: delete the backup with start time: "2016-08-26 19:56:54"
INFO: delete the backup with start time: "2016-08-26 19:43:20"
INFO: delete the backup with start time: "2016-08-26 19:39:50"
INFO: delete the backup with start time: "2016-08-26 19:39:32"
~~~
The remaining backup sets can still recover the database to 2016-08-26 19:59:00:
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman show -B /data05/digoal/pgbbk
==========================================================
StartTime Mode Duration Size TLI Status
==========================================================
2016-08-26 20:19:55 INCR 0m 125kB 1 OK
2016-08-26 19:58:09 FULL 11m 3094MB 1 OK
~~~
## Physically removing backup sets already deleted from the catalog
~~~
$ /home/digoal/pgsql9.5/bin/pg_rman purge -B /data05/digoal/pgbbk
INFO: DELETED backup "2016-08-26 19:56:54" is purged
INFO: DELETED backup "2016-08-26 19:43:20" is purged
INFO: DELETED backup "2016-08-26 19:39:50" is purged
INFO: DELETED backup "2016-08-26 19:39:32" is purged
~~~
## Restore
Restoring with pg_rman needs two things:
1\. a (new) $PGDATA
2\. the backup catalog directory
The command options are also simple; you can even run restore without specifying any option at all.
~~~
Restore options:
The parameters which start with --recovery are the same as the corresponding parameters in recovery.conf. See also "Recovery Configuration" for details.
~~~
The options below map directly onto recovery.conf settings.
If no timeline is specified, the one in $PGDATA/global/pg_control is used; if $PGDATA/global/pg_control does not exist, the timeline of the latest full backup is used.
~~~
--recovery-target-timeline TIMELINE
Specifies recovering into a particular timeline. If not specified, the current timeline from ($PGDATA/global/pg_control) is used.
~~~
If not specified, recovery proceeds to the latest time:
~~~
--recovery-target-time TIMESTAMP
This parameter specifies the time stamp up to which recovery will proceed. If not specified, continue recovery to the latest time.
~~~
If not specified, recovery proceeds to the latest XID:
~~~
--recovery-target-xid XID
This parameter specifies the transaction ID up to which recovery will proceed. If not specified, continue recovery to the latest xid.
~~~
If not specified, the default is true: recovery stops just after the recovery target, i.e. the commit record of the target XID, or the first commit record whose timestamp is greater than the specified timestamp, is applied.
With false, recovery stops just before the target: the commit record of the target XID, or the first commit record whose timestamp is greater than or equal to the specified timestamp, is not applied.
~~~
--recovery-target-inclusive
Specifies whether we stop just after the specified recovery target (true), or just before the recovery target (false). Default is true.
~~~
Whether to physically copy the archived WAL into the archive area instead of creating symbolic links back into the backup catalog:
~~~
The following parameter determines the behavior of restore.
--hard-copy
The archived WAL files are copied into the archive WAL storage area. If this option is not specified, pg_rman creates symbolic links pointing at the archived WAL inside the backup catalog directory.
~~~
## Example
When restoring, you can either restore in place (overwriting) or restore into a new $PGDATA.
Either way, note that if you restore on the same machine, pg_rman may overwrite the existing data files and the files in the archive and pg_xlog directories; if you want to keep the original data, rename the original directories first.
1\. Restore in place
2\. Restore into a new $PGDATA
An example of restoring on the same machine:
1\. Stop the database
~~~
pg_ctl stop -m fast -D /data04/digoal/pg_root_1922
~~~
2\. Rename the original data-related directories
~~~
$PGDATA
mv /data04/digoal/pg_root_1922 /data04/digoal/old_pg_root_1922
PG_XLOG
mv /data05/digoal/pg_xlog_1922 /data05/digoal/old_pg_xlog_1922
tablespace
mv /data02/digoal/tbs1_1922 /data02/digoal/old_tbs1_1922
archive directory: besides renaming it, recreate an empty directory at the original path
mv /data04/digoal/arc_log1922 /data04/digoal/old_arc_log1922
mkdir /data04/digoal/arc_log1922
...
~~~
3\. pg_rman restore
~~~
pg_rman restore -B /data05/digoal/pgstdbak -D /data04/digoal/pg_restore_root
WARNING: could not open pg_controldata file "/data04/digoal/pg_restore_root/global/pg_control": No such file or directory
WARNING: could not open pg_controldata file "/data04/digoal/pg_restore_root/global/pg_control": No such file or directory
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2016-08-29 15:05:32"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2016-08-29 15:05:32" backup, archive log files and server log files by SIZE
INFO: backup "2016-08-29 15:05:32" is valid
INFO: restoring database files from the full mode backup "2016-08-29 15:05:32"
INFO: searching incremental backup to be restored
INFO: validate: "2016-08-29 15:13:10" backup, archive log files and server log files by SIZE
INFO: backup "2016-08-29 15:13:10" is valid
INFO: restoring database files from the incremental mode backup "2016-08-29 15:13:10"
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2016-08-29 15:13:10" is valid
INFO: restoring WAL files from backup "2016-08-29 15:13:10"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
~~~
4\. Adjust postgresql.conf and recovery.conf
Adjust them to your environment; in this example nothing is changed.
~~~
cd /data04/digoal/pg_restore_root/
digoal@iZ28tqoemgtZ-> ll
total 124K
-rw------- 1 digoal digoal 193 Aug 29 17:05 backup_label.old
drwx------ 7 digoal digoal 4.0K Aug 29 17:05 base
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 global
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_clog
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_commit_ts
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_dynshmem
-rw------- 1 digoal digoal 4.4K Aug 29 17:05 pg_hba.conf
-rw------- 1 digoal digoal 1.6K Aug 29 17:05 pg_ident.conf
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_log
drwx------ 4 digoal digoal 4.0K Aug 29 17:05 pg_logical
drwx------ 4 digoal digoal 4.0K Aug 29 17:05 pg_multixact
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_notify
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_replslot
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_serial
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_snapshots
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_stat
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_stat_tmp
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_subtrans
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_tblspc
drwx------ 2 digoal digoal 4.0K Aug 29 17:05 pg_twophase
-rw------- 1 digoal digoal 4 Aug 29 17:05 PG_VERSION
lrwxrwxrwx 1 digoal digoal 27 Aug 29 17:05 pg_xlog -> /data05/digoal/pg_xlog_1922
-rw------- 1 digoal digoal 88 Aug 29 17:05 postgresql.auto.conf
-rw------- 1 digoal digoal 22K Aug 29 17:05 postgresql.conf
-rw------- 1 digoal digoal 44 Aug 29 17:05 postmaster.opts
-rw-r--r-- 1 digoal digoal 130 Aug 29 17:06 recovery.conf
digoal@iZ28tqoemgtZ-> cd pg_tblspc/
digoal@iZ28tqoemgtZ-> ll
total 0
lrwxrwxrwx 1 digoal digoal 24 Aug 29 17:05 16719 -> /data02/digoal/tbs1_1922
cd ..
vi postgresql.conf
archive_command = 'cp %p /data04/digoal/arc_log1922/%f'
vi recovery.conf
# recovery.conf generated by pg_rman 1.3.2
restore_command = 'cp /data04/digoal/arc_log1922/%f %p'
recovery_target_timeline = '1'
~~~
5\. If the backup set's timeline has changed, first copy the timeline history file into the archive directory manually, then run restore
~~~
digoal@iZ28tqoemgtZ-> pg_rman restore -B /data05/digoal/pgstdbak -D /data04/digoal/pg_restore_root
WARNING: could not open pg_controldata file "/data04/digoal/pg_restore_root/global/pg_control": No such file or directory
WARNING: could not open pg_controldata file "/data04/digoal/pg_restore_root/global/pg_control": No such file or directory
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 2
INFO: calculating timeline branches to be used to recovery target point
ERROR: could not open destination file "/data04/digoal/arc_log1922/00000002.history": No such file or directory
digoal@iZ28tqoemgtZ-> mkdir /data04/digoal/arc_log1922
digoal@iZ28tqoemgtZ-> cp /data05/digoal/pgstdbak/timeline_history/00000002.history /data04/digoal/arc_log1922/
digoal@iZ28tqoemgtZ-> pg_rman restore -B /data05/digoal/pgstdbak -D /data04/digoal/pg_restore_root
WARNING: could not open pg_controldata file "/data04/digoal/pg_restore_root/global/pg_control": No such file or directory
WARNING: could not open pg_controldata file "/data04/digoal/pg_restore_root/global/pg_control": No such file or directory
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 2
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2016-08-29 17:14:20"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2016-08-29 17:14:20" backup and archive log files by SIZE
INFO: backup "2016-08-29 17:14:20" is valid
INFO: restoring database files from the full mode backup "2016-08-29 17:14:20"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2016-08-29 17:14:20" is valid
INFO: restoring WAL files from backup "2016-08-29 17:14:20"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
digoal@iZ28tqoemgtZ-> cd /data04/digoal/arc_log1922
digoal@iZ28tqoemgtZ-> ll
total 16K
lrwxrwxrwx 1 digoal digoal 71 Aug 29 17:18 00000002000000470000000C -> /data05/digoal/pgstdbak/20160829/171420/arclog/00000002000000470000000C
lrwxrwxrwx 1 digoal digoal 71 Aug 29 17:18 00000002000000470000000D -> /data05/digoal/pgstdbak/20160829/171420/arclog/00000002000000470000000D
lrwxrwxrwx 1 digoal digoal 87 Aug 29 17:18 00000002000000470000000D.00000028.backup -> /data05/digoal/pgstdbak/20160829/171420/arclog/00000002000000470000000D.00000028.backup
-rw------- 1 digoal digoal 42 Aug 29 17:18 00000002.history
~~~
6\. Start the restored database
~~~
digoal@iZ28tqoemgtZ-> pg_ctl start -D /data04/digoal/pg_restore_root
server starting
digoal@iZ28tqoemgtZ-> LOG: 00000: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
LOCATION: SysLogger_Start, syslogger.c:622
digoal@iZ28tqoemgtZ-> psql -h 127.0.0.1 -p 1922
psql (9.5.3)
Type "help" for help.
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
postgres=# \l+
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges | Size | Tablespace | Description
--------------------+----------+----------+---------+-------+-----------------------+---------+------------+--------------------------------------------
contrib_regression | postgres | UTF8 | C | C | | 7137 kB | pg_default |
db1 | postgres | UTF8 | C | C | | 111 MB | tbs1 |
postgres | postgres | UTF8 | C | C | | 797 MB | pg_default | default administrative connection database
template0 | postgres | UTF8 | C | C | =c/postgres +| 7137 kB | pg_default | unmodifiable empty database
| | | | | postgres=CTc/postgres | | |
template1 | postgres | UTF8 | C | C | =c/postgres +| 7137 kB | pg_default | default template for new databases
| | | | | postgres=CTc/postgres | | |
(5 rows)
~~~
## Caveats
1. Be careful: existing files can be overwritten.
2. Rename the original directories first, or restore on a different machine.
3. Symbolic links are restored to their targets and the links are recreated, so the directory layout must stay the same as it was at backup time.
4. You can point the restore at a new $PGDATA, but new locations cannot be given for the archive directory, tablespaces, or pg_xlog; when restoring in place, be aware that these directories may be overwritten, so renaming them first is the safer approach.
## Where pg_rman could be improved
1. During restore, check whether the CRC of each block to be restored matches the block already in the target; if they match, skip the copy and save the write, as sketched below.
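A minimal sketch of that idea, not pg_rman code; block_needs_write is a made-up name, and a plain byte comparison stands in for the CRC comparison proposed above:
~~~
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192

/* Returns 1 if the target block must be rewritten, 0 if the write can be skipped. */
static int
block_needs_write(const unsigned char *backup_block,
				  const unsigned char *target_block)
{
	/*
	 * A CRC comparison (as proposed above) also works when the two blocks
	 * live on different machines; locally a plain memcmp() is simpler and
	 * has no risk of CRC collisions.
	 */
	return memcmp(backup_block, target_block, BLCKSZ) != 0;
}

int
main(void)
{
	unsigned char backup_block[BLCKSZ] = {0};
	unsigned char target_block[BLCKSZ] = {0};

	printf("identical blocks -> write needed: %d\n",
		   block_needs_write(backup_block, target_block));
	target_block[100] = 0xff;	/* simulate a divergent target block */
	printf("divergent blocks -> write needed: %d\n",
		   block_needs_write(backup_block, target_block));
	return 0;
}
~~~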
## pg_rman limitations
### General usage restrictions
pg_rman has the following restrictions.
1. It needs read access to the database cluster directory and write access to the backup catalog directory.
For example, you may need to mount the disk holding the backup catalog on the database server via NFS.
In practice this is not strictly required: if the original $PGDATA is not specified, the metadata in the backup set is used.
2. The block sizes of pg_rman and the server must match; BLCKSZ and XLOG_BLCKSZ must both match.
When compiling pg_rman, it is best to use the pg_config of the PostgreSQL installation that runs the cluster, so the block sizes are guaranteed to match.
Block checksumming and reading the page LSN both depend on the block size; see the sketch after this list.
3. If there are unreadable files or directories in the database cluster directory, the WAL directory, or the archived WAL directory, the backup or restore will fail.
4. When taking an incremental backup, pg_rman checks whether the timeline ID of the target database matches that of the full backup in the backup list,
but it does not check whether the data itself is the same as that full backup.
So it is possible to take an incremental backup on top of a full backup of a different database that happens to have the same timeline ID.
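backup.ini records BLOCK_SIZE and XLOG_BLOCK_SIZE precisely so that this compatibility can be checked. A minimal sketch of such a check, assuming BLCKSZ and XLOG_BLCKSZ come from the local build's pg_config.h; check_block_sizes is a made-up name, not pg_rman's code:
~~~
#include <stdio.h>

/* Values a real build would take from pg_config.h of the local PostgreSQL. */
#define BLCKSZ		8192
#define XLOG_BLCKSZ	8192

/* Compare the block sizes recorded in backup.ini against the local build. */
static int
check_block_sizes(unsigned backup_blcksz, unsigned backup_xlog_blcksz)
{
	if (backup_blcksz != BLCKSZ)
	{
		fprintf(stderr, "data block size mismatch: backup=%u, local=%u\n",
				backup_blcksz, (unsigned) BLCKSZ);
		return 0;
	}
	if (backup_xlog_blcksz != XLOG_BLCKSZ)
	{
		fprintf(stderr, "WAL block size mismatch: backup=%u, local=%u\n",
				backup_xlog_blcksz, (unsigned) XLOG_BLCKSZ);
		return 0;
	}
	return 1;
}

int
main(void)
{
	/* BLOCK_SIZE / XLOG_BLOCK_SIZE as read from the backup.ini shown earlier */
	return check_block_sizes(8192, 8192) ? 0 : 1;
}
~~~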
### Restrictions when backing up from a standby
When taking a backup from a standby, pg_rman has the following additional restrictions.
1. Replication must be set up correctly, or the backup will not finish.
2. You cannot take backups on the master and a standby at the same time.
This is because pg_rman uses an exclusive backup (pg_start_backup), and only one exclusive pg_start_backup can run at a time.
pg_basebackup, by contrast, uses a non-exclusive backup, so several can run concurrently.
3. Nor can you take backups on multiple standbys at the same time.
The reason is the same as above.
4. Basically, a backup taken from a standby is meant to be restored on the MASTER;
pg_rman does not automatically prepare it for restoring on a standby.
5. If you want to restore the backup on a STANDBY, you have to manage the archived WAL yourself.
Because a standby does not archive WAL, the archive-backup problem has to be solved separately when backing up from a standby.
I have covered this, including possible approaches, in earlier articles.
### Restrictions when using storage snapshots
When using storage snapshots, pg_rman has the following additional restrictions.
1. If the snapshot does not preserve file modification times, an incremental backup is no different from a full backup.
2. pg_rman decides between a full and an incremental backup based on file modification times;
if the files carry no modification time because of how the storage snapshot works, pg_rman performs a full backup every time.
3. You cannot back up storage that is running one-sided under a split-mirror snapshot.
4. Before running pg_rman, perform a storage RESYNC.
5. After pg_rman finishes a backup with a split-mirror snapshot, the storage is left SPLIT (running on one side):
pg_rman issues the SPLIT command to take the snapshot but does not issue RESYNC.
6. You cannot take snapshots from storage of different vendors at the same time.
7. You cannot mix vendor storage products whose snapshot commands differ.
8. The script and commands used to take the storage snapshot must be executable.
9. Taking snapshots and mounting volumes usually requires root privileges,
so the user running pg_rman must be allowed to execute the commands in the script.
10. With LVM (Logical Volume Manager), root privileges are needed for the mount, umount, lvcreate, lvremove, and lvscan commands;
grant them via sudo with passwordless execution.
## A quick look at the pg_rman source
1. Incremental backup code
It checks whether a data block's LSN has changed since the last backup; if it has not changed since the previous backup's start_lsn, the block is not backed up.
For example:
~~~
else
{
pgBackupGetPath(prev_backup, prev_file_txt, lengthof(prev_file_txt),
DATABASE_FILE_LIST);
prev_files = dir_read_file_list(pgdata, prev_file_txt);
/*
* Do backup only pages having larger LSN than previous backup.
*/
lsn = &prev_backup->start_lsn;
xlogid = (uint32) (*lsn >> 32);
xrecoff = (uint32) *lsn;
elog(DEBUG, _("backup only the page updated after LSN(%X/%08X)"),
xlogid, xrecoff);
}
...
/* backup files from non-snapshot */
pgBackupGetPath(&current, path, lengthof(path), DATABASE_DIR);
backup_files(pgdata, path, files, prev_files, lsn, current.compress_data, NULL);
~~~
1. Code around the backup.ini result file
~~~
# configuration
BACKUP_MODE=FULL
FULL_BACKUP_ON_ERROR=false
WITH_SERVERLOG=true
COMPRESS_DATA=true
# result
TIMELINEID=1
START_LSN=43/d5000028
STOP_LSN=43/d5000168
START_TIME='2016-08-26 15:43:39'
END_TIME='2016-08-26 15:44:27'
RECOVERY_XID=3896508572
RECOVERY_TIME='2016-08-26 15:44:18'
TOTAL_DATA_BYTES=823571731
READ_DATA_BYTES=823571731
READ_ARCLOG_BYTES=234881668
READ_SRVLOG_BYTES=218248
WRITE_BYTES=206009921
BLOCK_SIZE=8192
XLOG_BLOCK_SIZE=8192
STATUS=OK
~~~
The corresponding data structure:
~~~
/*
* pg_rman takes backup into the directory $BACKUP_PATH/<date>/<time>.
*
* status == -1 indicates the pgBackup is invalid.
*/
typedef struct pgBackup
{
/* Backup Level */
BackupMode backup_mode;
bool with_serverlog;
bool compress_data;
bool full_backup_on_error;
/* Status - one of BACKUP_STATUS_xxx */
BackupStatus status;
/* Timestamp, etc. */
TimeLineID tli;
XLogRecPtr start_lsn;
XLogRecPtr stop_lsn;
time_t start_time;
time_t end_time;
time_t recovery_time;
uint32 recovery_xid;
/* Size (-1 means not-backup'ed) */
int64 total_data_bytes;
int64 read_data_bytes;
int64 read_arclog_bytes;
int64 read_srvlog_bytes;
int64 write_bytes;
/* data/wal block size for compatibility check */
uint32 block_size;
uint32 wal_block_size;
/* if backup from standby or not */
bool is_from_standby;
} pgBackup;
~~~
When the backup starts, the LSN returned by the pg_start_backup call is recorded into backup->start_lsn:
~~~
/*
* Notify start of backup to PostgreSQL server.
*/
static void
pg_start_backup(const char *label, bool smooth, pgBackup *backup)
{
PGresult *res;
const char *params[2];
params[0] = label;
elog(DEBUG, "executing pg_start_backup()");
reconnect();
/* Assumes PG version >= 8.4 */
/* 2nd argument is 'fast' (IOW, !smooth) */
params[1] = smooth ? "false" : "true";
res = execute("SELECT * from pg_xlogfile_name_offset(pg_start_backup($1, $2))", 2, params);
if (backup != NULL)
get_lsn(res, &backup->tli, &backup->start_lsn);
elog(DEBUG, "backup start point is (WAL file: %s, xrecoff: %s)",
PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
PQclear(res);
disconnect();
}
~~~
When the backup ends, pg_stop_backup is called and the LSN taken from its result is written into backup->stop_lsn:
~~~
/*
* Notify end of backup to PostgreSQL server.
*/
static void
pg_stop_backup(pgBackup *backup)
{
elog(DEBUG, "executing pg_stop_backup()");
wait_for_archive(backup,
"SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup())");
}
static void
wait_for_archive(pgBackup *backup, const char *sql)
{
PGresult *res;
char ready_path[MAXPGPATH];
int try_count;
reconnect();
res = execute(sql, 0, NULL);
if (backup != NULL)
{
get_lsn(res, &backup->tli, &backup->stop_lsn);
elog(DEBUG, "backup end point is (WAL file: %s, xrecoff: %s)",
PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
}
/* get filename from the result of pg_xlogfile_name_offset() */
elog(DEBUG, "waiting for %s is archived", PQgetvalue(res, 0, 0));
snprintf(ready_path, lengthof(ready_path),
"%s/pg_xlog/archive_status/%s.ready", pgdata, PQgetvalue(res, 0, 0));
PQclear(res);
res = execute(TXID_CURRENT_SQL, 0, NULL);
if(backup != NULL)
{
get_xid(res, &backup->recovery_xid);
backup->recovery_time = time(NULL);
}
disconnect();
/* wait until switched WAL is archived */
try_count = 0;
while (fileExists(ready_path))
{
sleep(1);
if (interrupted)
ereport(FATAL,
(errcode(ERROR_INTERRUPTED),
errmsg("interrupted during waiting for WAL archiving")));
try_count++;
if (try_count > TIMEOUT_ARCHIVE)
ereport(ERROR,
(errcode(ERROR_ARCHIVE_FAILED),
errmsg("switched WAL could not be archived in %d seconds",
TIMEOUT_ARCHIVE)));
}
elog(DEBUG, "WAL file contains backup end point is archived after %d seconds waiting",
try_count);
}
~~~
During validate, the STATUS field of backup.ini is updated:
~~~
validate.c
for (i = 0; i < parray_num(backup_list); i++)
{
pgBackup *backup = (pgBackup *)parray_get(backup_list, i);
/* clean extra backups (switch STATUS to ERROR) */
if(!another_pg_rman &&
(backup->status == BACKUP_STATUS_RUNNING ||
backup->status == BACKUP_STATUS_DELETING))
{
backup->status = BACKUP_STATUS_ERROR;
pgBackupWriteIni(backup);
}
/* Validate completed backups only. */
if (backup->status != BACKUP_STATUS_DONE)
continue;
/* validate with CRC value and update status to OK */
pgBackupValidate(backup, false, false, (HAVE_DATABASE(backup)));
}
...
/* update status to OK */
if (corrupted)
backup->status = BACKUP_STATUS_CORRUPT;
else
backup->status = BACKUP_STATUS_OK;
pgBackupWriteIni(backup);
~~~
## Notes
1. The -C backup option requests a smooth (spread) checkpoint, so the backup may be slow to start, depending on checkpoint_completion_target and the checkpoint segment settings; for example, with checkpoint_timeout = 30min and checkpoint_completion_target = 0.9, the spread checkpoint may pace itself over roughly 27 minutes before the copy phase begins.
If pg_rman seems very slow to start, drop -C and it will start quickly, at the cost of an immediate checkpoint that may cause an I/O spike during peak hours.
It is best not to run backups during peak hours anyway.
2. Bug
When the Unix-domain socket lives in $PGDATA, validate reports an error:
~~~
pg_rman validate
INFO: validate: "2016-08-26 16:19:25" backup, archive log files and server log files by CRC
ERROR: invalid type '?' found in "/data05/digoal/pgbak/20160826/161925/file_database.txt"
vi /data05/digoal/pgbak/20160826/161925/file_database.txt
.s.PGSQL.1921 ? 0 0 0777 2016-08-26 15:35:05
~~~
A small change to dir.c fixes this problem; the modification is as follows:
~~~
if (strncmp(path, ".s.PGSQL", 7) != 0 && type != 'f' && type != 'F' && type != 'd' && type != 'l')
ereport(ERROR,
(errcode(ERROR_CORRUPTED),
errmsg("invalid type '%c' found in \"%s\"", type, file_txt)));
~~~
Have fun, and you are always welcome to drop by Alibaba Cloud for an in-depth chat about your business needs.
Keep it up, fellow Alibaba Cloud folks: let's keep polishing the kernel and the service and build the most down-to-earth cloud database.