
文章目录文档用途详细信息文档用途学习和使用数据库挖掘工具XlogMiner/WalMiner。详细信息1、介绍WalMiner是从PostgreSQL的WAL(write ahead logs)日志中解析出执行的SQL语句的工具并能生成出对应的undo SQL语句。referenceXlogMiner renamed to WalMinerXlogMiner Enhancements Released and Renamed to WalMinerPosted on 2019-02-22 by Highgo Softwarereference XLogMinerreference WalMiner注意walminer是Highgo开源的一款软件。2、walminer安装1配置要求需要将数据库日志级别配置需要大于minimal创建归档路径mkdir /home/hgdb565/archive/ -p必须设置如下三个参数据库日志级别配置需要大于minimalwal_level minimal, archive, hot_standby, or logical 若想做最完整的日志挖掘建议设置为logical。alter system set wal_level ‘logical’;alter system set archive_mode on;alter system set archive_directory ‘/home/hgdb565/archive’;修改后重启数据库生效。2版本查看查看本机环境数据库版本select kernel_version();PG版本支持PG9.5.0之前的版本没有做过测试PG9.5.0~PG10.X版本使用WalMiner_10_XPG11之后的版本使用WalMiner_11_X3编译安装下载软件读README.md参考安装和使用步骤。下载地址WalMinerunzip movead-XLogMiner-WalMiner_10_0_1.zipcp -rpi ./XLogMiner/walminer/ ./postgresql-10.6/contrib/cd ./postgresql-10.6/contrib/walminermake make installmake install/bin/mkdir -p ‘/opt/pg106/lib/postgresql’/bin/mkdir -p ‘/opt/pg106/share/postgresql/extension’/bin/mkdir -p ‘/opt/pg106/share/postgresql/extension’/bin/install -c -m 755 walminer.so ‘/opt/pg106/lib/postgresql/walminer.so’/bin/install -c -m 644 ./walminer.control ‘/opt/pg106/share/postgresql/extension/’/bin/install -c -m 644 ./walminer–1.0.sql ‘/opt/pg106/share/postgresql/extension/’注意将walminer目录放置到编译通过的PG源码的…/contrib/目录下hgdb无源码编译目录。可在对应版本的pg中编译好再将文件拷贝到hgdb的目录下。cd /opt/pg106/share/postgresql/extension/scp wal* hgdb565x.x.6.10:/opt/HighGo5.6.5/share/postgresql/extensioncd /opt/pg106/lib/postgresqlscp walminer.so hgdb565x.x.6.10:/opt/HighGo5.6.5/lib/postgresql/4创建扩展创建walminer的extensioncreate extension walminer;注意HGDBV5之前为 create extension xlogminer;2、使用方法场景一从WAL日志产生的数据库中直接执行解析1.创建walminer的extensioncreate extension walminer;注意已创建可忽略2 Add wal日志文件– 增加wal文件select walminer_wal_add(‘/opt/HighGo5.6.5/data/pg_wal’);– 注参数可以为目录或者文件3 Remove wal日志文件– 移除wal文件select walminer_wal_remove(‘/opt/HighGo5.6.5/data/pg_wal’);– 注参数可以为目录或者文件4 List wal日志文件– 列出wal文件select walminer_wal_list();5 执行解析select walminer_start(’START_TIMSTAMP’,’STOP_TIMESTAMP’,’START_XID’,’STOP_XID’)—如果分析全部日志select walminer_start(‘null’,‘null’,0,0);—将系统表修改结果输出到$PGDATA/walminer/temp下select walminer_start(‘null’,‘null’,0,0,true);START_TIMESTAMP指定输出结果中最早的记录条目即从该时间开始输出分析数据若该参数值为空则以分析日志列表中最早数据开始输出若该参数值指定时间没有包含在所分析xlog列表中即通过分析发现全部早于该参数指定时间则返回空值。STOP_TIMESTAMP指定数据结果中最晚的记录条目即输出结果如果大于该时间则停止分析不需要继续输出如果该参数值为空则从START_TIMESTAMP开始的所有日志都进行分析和输出。START_XID作用与START_TIMESTAMP相同指定开始的XID值STOP_XID作用与STOP_TIMESTAMP相同指定结束的XID值注意两组参数只能有一组为有效输入否则报错。解析结果查看select * from walminer_contents;7 结束walminer操作该函数作用为释放内存结束日志分析该函数没有参数。select walminer_stop();场景二从非WAL产生的数据库中执行WAL日志解析注意要求执行解析的PostgreSQL数据库和被解析的为同一版本于生产数据库1.创建walminer的extensioncreate extension walminer;2.生成数据字典select walminer_build_dictionary(‘/opt/HighGo5.6.5/store_dictionary’);– 注参数可以为目录或者文件于测试数据库1.创建walminer的extensioncreate extension walminer;将字典文件和日志传输到测试数据库将父目录创建出来mkdir -p /opt/pg106/data/archive_hgdb565cd /opt/pg106/scp pg10x.x.6.12:/opt/HighGo5.6.5/store_dictionary .copy wal or archivecd /opt/pg106/data/archive_hgdb565scp pg10x.x.6.12:/opt/HighGo5.6.5/data/pg_wal/0* .load数据字典select walminer_load_dictionary(‘/opt/pg106/store_dictionary’);– 注参数可以为目录或者文件如出现一下错误将父目录创建出来即可postgres# select walminer_load_dictionary(‘/opt/pg106/store_dictionary’);ERROR: It is failed to create dictionary “/opt/pg106/data/walminer/temp”.将父目录创建出来mkdir -p /opt/pg106/data/walmineradd wal日志文件– 增加wal文件select walminer_wal_add(‘/opt/pg106/data/archive_hgdb565’);– 注参数可以为目录或者文件remove wal日志文件– 移除wal文件select walminer_wal_remove(‘/opt/test/wal’);– 注参数可以为目录或者文件list wal日志文件– 列出wal文件select walminer_wal_list();– 注参数可以为目录或者文件执行解析select walminer_start(‘null’,‘null’,0,0);7. 解析结果查看select * from walminer_contents;select timestamptz,record_user,op_type,op_text,op_undo from walminer_contents where timestamptz ‘2020-05-12 14:10:00’::timestamp and timestamptz ‘2020-05-12 14:20:00’::timestamp and op_type‘DELETE’;timestamptz | record_user | op_type | op_text | op_undo-------------------------------±------------±--------±-------------------------------------------------------------------------------------------±-----------------------------------------------------------2020-05-12 14:14:49.18915608 | highgo | DELETE | DELETE FROM “public”.“test_t” WHERE “id”1 AND “name”‘4e4ff1685131a4cdc6643b523620e10f’; | INSERT INTO “public”.“test_t”(“id”, “name”) VALUES(1, 4e4ff1685131a4cdc6643b523620e10f’);2020-05-12 14:14:49.18915608 | highgo | DELETE | DELETE FROM “public”.“test_t” WHERE “id”2 AND “name”‘126ce4b149a49d1a68f6c2a0af4b2bdd’; | INSERT INTO “public”.“test_t”(“id”, “name”) VALUES(2, 126ce4b149a49d1a68f6c2a0af4b2bdd’);2020-05-12 14:14:49.18915608 | highgo | DELETE | DELETE FROM “public”.“test_t” WHERE “id”3 AND “name”‘dc65893ca95dee5e6547854d3fd4f505’; | INSERT INTO “public”.“test_t”(“id”, “name”) VALUES(3, dc65893ca95dee5e6547854d3fd4f505’);2020-05-12 14:14:49.18915608 | highgo | DELETE | DELETE FROM “public”.“test_t” WHERE “id”4 AND “name”‘f04daec7b38fe12ced141b1c6a01346a’; | INSERT INTO “public”.“test_t”(“id”, “name”) VALUES(4, f04daec7b38fe12ced141b1c6a01346a’);2020-05-12 14:14:49.18915608 | highgo | DELETE | DELETE FROM “public”.“test_t” WHERE “id”5 AND “name”‘05bba4c686505ec88bd65ddc8ff42498’; | INSERT INTO “public”.“test_t”(“id”, “name”) VALUES(5, 05bba4c686505ec88bd65ddc8ff42498’);2020-05-12 14:14:49.18915608 | highgo | DELETE | DELETE FROM “public”.“test_t” WHERE “id”6 AND “name”‘a8fa80d0188d37629e057aa6a52fe8d8’; | INSERT INTO “public”.“test_t”(“id”, “name”) VALUES(6, a8fa80d0188d37629e057aa6a52fe8d8’);…8.结束walminer操作该函数作用为释放内存结束日志分析该函数没有参数。select walminer_stop();注意walminer_contents是walminer自动生成的临时表因此当session断开再重新进入或其他session中解析数据不可见。这么做主要是基于安全考虑。如果希望保留解析结果可利用create xxx as select * from walminer_contents;写入普通表中。使用限制本版本只解析DML语句不处理DDL语句执行了删除表、truncate表、更改表的表空间、更改表字段的类型、vacuum full这样的DDL语句后发生DDL语句之前的此表相关的DML语句不会再被解析。解析结果依赖于数据字典。举例创建表t1,所有者为user1但是中间将所有者改为user2。那解析结果中所有t1相关操作所有者都将标示为user2wal日志如果发生缺失在缺失的wal日志中发生提交的数据都不会在解析结果中出现解析结果中undo字段的ctid属性是发生变更“当时”的值如果因为vacuum等操作导致ctid发生变更这个值将不准确。对于有可能存在重复行的数据我们需要通过这个值确定undo对应的tuple条数不代表可以直接执行该undo语句。执行了表字段drop的DDL语句后,发生DDL语句之前的这个字段相关的值都会被解析为encode‘AD976BC56F’hex的形式另外自定义类型也会解析为这种形式只能解析与数据字典时间线一致的wal文件不建议使用walminer解析大宗copy语句在同一个事务中插入大量数据行产生的wal日志,这会导致解析过程中的效率低下和内存占用过高