TFA: a Log Collection and Analysis Tool for RAC

2023-12-05 05:58


TFA is a tool introduced in version 11.2 for collecting diagnostic logs in Grid Infrastructure/RAC environments. With very simple commands it helps users gather the logs of a RAC cluster. The following sections introduce it from several angles:

1. Trace File Analyzer: a convenient log collection and analysis tool

When customers work with support engineers on GI (RAC) problems, one of the biggest challenges is collecting the problem-related logs and diagnostic data from every node in a timely manner, especially since the data to collect spans multiple nodes. In addition, the trace/log files in RAC are rotated, so if logs are not collected promptly after a problem occurs they get overwritten. For single-instance environments, ADR (Automatic Diagnostic Repository) largely avoids this problem by packaging the files generated after a fault, but ADR cannot collect RAC logs. For cluster log collection we used to rely on the diagcollection.pl script, but its drawback is that it does not filter log contents: it collects every RAC log from beginning to end. If you have ever used diagcollection.pl, you will know that the output it collects is very large, and the script must be run separately on each node as the root user, which is inconvenient.
TFA largely overcomes these problems. It runs a Java virtual machine on each node to decide when to start collecting and compressing logs, and to determine which logs are necessary for solving the problem. TFA is a product that lives outside GI and the RDBMS, so it is largely independent of the version and platform in use.
So, when troubleshooting Oracle GI/RAC problems, TFA can collect all the required logs with a single command, while filtering out the logs that are not needed.
Some customers worry that running TFA will affect the system. As the description above shows, it is only a log collection tool: it makes no changes to the system, and its load on the OS is lightweight.

2. TFA's process and convenient ways to collect logs:
TFA consists of a TFA daemon plus a command-line interface (CLI), and it can be installed and deployed in any environment. The TFA daemon is a Java process, as shown below:

On node 1:

[grid@ogg01 ~]$ ps -ef |grep java

root 3335 1 2 Feb26 ? 00:55:28 /u01/app/11.2.0.4/grid/jdk/jre/bin/java
-Xms128m -Xmx512m -classpath
/u01/app/11.2.0.4/grid/tfa/nascds10/tfa_home/jlib/RATFA.jar:/u01/app/11.2.0.4/grid/tfa/nascds10/tfa_home/jlib/je-5.0.84.jar:/u01/app/11.2.0.4/grid/tfa/nascds10/tfa_home/jlib/ojdbc5.jar:/u01/app/11.2.0.4/grid/tfa/nascds10/tfa_home/jlib/commons-io-2.1.jar
oracle.rat.tfa.TFAMain /u01/app/11.2.0.4/grid/tfa/nascds10/tfa_home

On node 2:

[grid@ogg02 ~]$ ps -ef |grep TFA

root 3295 1 0 Feb25 ? 00:19:26
/u01/app/11.2.0.4/grid/jdk/jre/bin/java -Xms64m -Xmx256m -classpath
/u01/app/11.2.0.4/grid/tfa/nascds11/tfa_home/jar/RATFA.jar:/u01/app/11.2.0.4/grid/tfa/nascds11/tfa_home/jar/je-4.0.103.jar:/u01/app/11.2.0.4/grid/tfa/nascds11/tfa_home/jar/ojdbc6.jar
oracle.rat.tfa.TFAMain /u01/app/11.2.0.4/grid/tfa/nascds11/tfa_home

[grid@nascds11 ~]$

As shown above, TFAMain is started by the root user. It is a multi-threaded process that automatically handles inter-node tasks as well as driving the CLI interface. The TFAMain processes on the nodes listen for one another over secure sockets and exchange tasks.
Like ohasd in RAC, its startup is configured in /etc/init.d, e.g.:
/etc/init.d/init.tfa
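As a sketch of day-to-day service control, the commands below assume the standard init-script subcommands (start/stop/restart); verify the exact options with `/etc/init.d/init.tfa -h` on your own system:

```shell
# Hypothetical sketch: common TFA daemon operations via the init script
# (run as root; subcommand names assumed from typical init-script conventions).
INIT_TFA=/etc/init.d/init.tfa
for op in start stop restart; do
  echo "$INIT_TFA $op"
done
```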
For installing, upgrading, uninstalling, and otherwise managing TFA in various environments, refer to the following document:
TFA Collector- The Preferred Tool for Automatic or ADHOC Diagnostic Gathering Across All Cluster Nodes [ID 1513912.2]
and my earlier blog post: Introduction to TFA, the Oracle GI Log Collection Tool

Here we introduce how to use TFA more conveniently for log filtering and collection, along with its newly added features.
2.1 First, look at the nodes managed by TFA and their current status:
[root@ogg01 tmp]# tfactl print hosts
Host Name : ogg01
Host Name : ogg02
[root@ogg01 tmp]# tfactl print status
.---------------------------------------------------------------------------------------------.
| Host  | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+-------+---------------+-------+------+------------+----------------------+------------------+
| ogg01 | RUNNING       | 18686 | 5000 | 12.1.2.6.3 | 12126320190104141621 | COMPLETE         |
| ogg02 | RUNNING       | 18030 | 5000 | 12.1.2.6.3 | 12126320190104141621 | COMPLETE         |
'-------+---------------+-------+------+------------+----------------------+------------------'

2.2 If we have installed other tools that collect logs and want TFA to manage them as well, we can add the corresponding directory directly. For the syntax, query with the following command:
[root@ogg01 tmp]# tfactl directory -h

For example:
/u01/app/11.2.0/grid/bin/tfactl directory add /nmon/log/
[root@ogg01 oswbb]# mkdir -p /nmon/log
[root@ogg01 oswbb]# /u01/app/11.2.0/grid/bin/tfactl directory add /nmon/log
Unable to determine component for directory: /nmon/log
Please choose a component for this Directory [RDBMS|CRS|ASM|INSTALL|OS|CFGTOOLS|TNS|DBWLM|ACFS|ALL] : OS
Do you wish to assign more components to this Directory ? [Y/y/N/n] [N] n
Running Inventory ...
Successfully added directory to TFA
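To confirm the new directory is tracked afterwards, a small sketch; the tfactl path is this environment's Grid home, and the `print directories` subcommand is an assumption about your TFA version (check `tfactl print -h`):

```shell
# Sketch: add a custom directory, then list what TFA tracks (prints the commands).
# TFACTL path is the Grid home used in this article; adjust for your install.
# "print directories" is assumed to be available; check "tfactl print -h".
TFACTL=/u01/app/11.2.0/grid/bin/tfactl
echo "$TFACTL directory add /nmon/log"
echo "$TFACTL print directories"
```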

2.3 Customers are often asked by support engineers to collect OS Watcher data. The latest TFA versions bundle OSW during installation and start it in the default configuration, e.g.:
[root@ogg01 tmp]# ps -ef |grep osw
grid 19047 1 0 12:20 ? 00:00:00 /bin/sh ./OSWatcher.sh 30 48 NONE /u01/app/grid/tfa/repository/suptools/ogg01/oswbb/grid/archive
grid 20199 19047 0 12:20 ? 00:00:00 /bin/sh ./OSWatcherFM.sh 48 /u01/app/grid/tfa/repository/suptools/ogg01/oswbb/grid/archive

However, for this feature to take full effect you still need to rename Exampleprivate.net to private.net and edit it with your private-interconnect and platform information; only then is private-network data sampled.
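The rename itself is a one-liner. The sketch below only prints the intended commands, and the oswbb home path is an assumption derived from the archive directory in the ps output above:

```shell
# Sketch: enable OSWatcher private-network sampling (prints the commands only).
# OSWBB_HOME is assumed from the archive path shown by ps above; adjust it.
OSWBB_HOME=/u01/app/grid/tfa/repository/suptools/ogg01/oswbb
echo "cp $OSWBB_HOME/Exampleprivate.net $OSWBB_HOME/private.net"
echo "vi $OSWBB_HOME/private.net   # one traceroute entry per private IP, per platform"
```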

2.4 Collecting logs according to our own customized rules
The log-collection syntax can be looked up with the following command:
[root@ogg01 oswbb]# tfactl diagcollect -h
Here we explain several common usages:
2.4.1 Collect all TFA-managed logs from the past 2 hours:
# tfactl diagcollect -all -since 2h
2.4.2 Collect all TFA-managed logs from the past day and compress them locally with the suffix foo:
# tfactl diagcollect -since 1d -z foo
[root@ogg01 oswbb]# tfactl diagcollect -since 1d -z foo
Collecting data for all nodes
Collection Id : 20190228124457ogg01
Repository Location in ogg01 : /u01/app/grid/tfa/repository
Collection monitor will wait up to 30 seconds for collections to start
2019/02/28 12:45:01 CST : Collection Name : tfa_foo.zip
2019/02/28 12:45:01 CST : Sending diagcollect request to host : ogg02
2019/02/28 12:45:01 CST : Scanning of files for Collection in progress...
2019/02/28 12:45:01 CST : Collecting extra files...
2019/02/28 12:45:06 CST : Getting list of files satisfying time range [02/27/2019 12:45:01 CST, 02/28/2019 12:45:06 CST]
2019/02/28 12:45:06 CST : Starting Thread to identify stored files to collect
2019/02/28 12:45:06 CST : Getting List of Files to Collect
2019/02/28 12:45:07 CST : Trimming file : ogg01/u01/app/11.2.0/grid/log/ogg01/client/olsnodes.log with original file size : 2.7MB
2019/02/28 12:45:07 CST : Finished Getting List of Files to Collect
2019/02/28 12:45:07 CST : Collecting ADR incident files...
2019/02/28 12:45:07 CST : Waiting for collection of extra files

Logs are being collected to: /u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_44_57_CST_2019_node_all
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_44_57_CST_2019_node_all/ogg01.tfa_foo.zip
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_44_57_CST_2019_node_all/ogg02.tfa_foo.zip
2.4.3 Collect database-related logs from all nodes for the past hour and compress them locally with the suffix test:
tfactl diagcollect -database orcl -since 1h -z test
[root@ogg01 oswbb]# tfactl diagcollect -database orcl -since 1h -z test
Collecting data for all nodes
Collection Id : 20190228124936ogg01
Repository Location in ogg01 : /u01/app/grid/tfa/repository
Collection monitor will wait up to 30 seconds for collections to start
2019/02/28 12:49:39 CST : Collection Name : tfa_test.zip
2019/02/28 12:49:39 CST : Sending diagcollect request to host : ogg02
2019/02/28 12:49:40 CST : Scanning of files for Collection in progress...
……
2019/02/28 12:50:01 CST : Total time taken : 22s
2019/02/28 12:50:01 CST : Remote Collection in Progress...
2019/02/28 12:50:20 CST : ogg02:Completed Collection
2019/02/28 12:50:20 CST : Completed collection of zip files.
Logs are being collected to: /u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_49_36_CST_2019_node_all
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_49_36_CST_2019_node_all/ogg01.tfa_test.zip
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_49_36_CST_2019_node_all/ogg02.tfa_test.zip
2.4.4 Collect the past hour's logs from node ogg01:
tfactl diagcollect -node ogg01 -since 1h
[root@ogg01 oswbb]# tfactl diagcollect -node ogg01 -since 1h
Collecting data for ogg01 node(s)
Collection Id : 20190228125644ogg01
Repository Location in ogg01 : /u01/app/grid/tfa/repository
Collection monitor will wait up to 30 seconds for collections to start
....
2019/02/28 12:56:48 CST : Collection Name : tfa_Sun_Feb_28_12_56_44_CST_2019.zip
2019/02/28 12:56:48 CST : Scanning of files for Collection in progress...
2019/02/28 12:56:48 CST : Collecting extra files...
.....
Logs are being collected to: /u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_56_44_CST_2019_node_ogg01
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_56_44_CST_2019_node_ogg01/ogg01.tfa_Sun_Feb_28_12_56_44_CST_2019.zip

2.4.5 Collect logs generated on "Feb/28/2019" from all nodes:
tfactl diagcollect -for "Feb/28/2019"
[root@ogg01 oswbb]# tfactl diagcollect -for "Feb/28/2019"
Collecting data for all nodes
Scanning files for Feb/28/2019 00:00:00
Collection Id : 20190228125814ogg01
Repository Location in ogg01 : /u01/app/grid/tfa/repository
Collection monitor will wait up to 30 seconds for collections to start
2019/02/28 12:58:20 CST : Collection Name : tfa_Sun_Feb_28_12_58_14_CST_2019.zip
2019/02/28 12:58:20 CST : Sending diagcollect request to host : ogg02
2019/02/28 12:58:20 CST : Scanning of files for Collection in progress...
2019/02/28 12:58:20 CST : Collecting extra files...
.....
Logs are being collected to: /u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_58_14_CST_2019_node_all
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_12_58_14_CST_2019_node_all/ogg01.tfa_Sun_Feb_28_12_58_14_CST_2019.zip

2.4.6 Collect ASM logs from node ogg01 for a specified time range:

tfactl diagcollect -asm -node ogg01 -from "Feb/27/2019" -to "Feb/28/2019 01:00:00"
[root@ogg01 oswbb]# tfactl diagcollect -asm -node ogg01 -from "Feb/27/2019" -to "Feb/28/2019 01:00:00"
Collecting data for ogg01 node(s)
Scanning files from Feb/27/2019 00:00:00 to Feb/28/2019 01:00:00
Collection Id : 20190228130124ogg01
Repository Location in ogg01 : /u01/app/grid/tfa/repository
Collection monitor will wait up to 30 seconds for collections to start
2019/02/28 13:01:28 CST : Collection Name : tfa_Sun_Feb_28_13_01_24_CST_2019.zip
2019/02/28 13:01:28 CST : Scanning of files for Collection in progress...
2019/02/28 13:01:28 CST : Collecting extra files...
Logs are being collected to: /u01/app/grid/tfa/repository/collection_Sun_Feb_28_13_01_24_CST_2019_node_ogg01
/u01/app/grid/tfa/repository/collection_Sun_Feb_28_13_01_24_CST_2019_node_ogg01/ogg01.tfa_Sun_Feb_28_13_01_24_CST_2019.zip
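The -from/-to values above follow a Mon/DD/YYYY[ HH:MM:SS] pattern. As a convenience sketch, they can be generated rather than typed by hand (GNU date assumed; BSD date uses -v-1d instead of -d):

```shell
# Sketch: build "-from"/"-to" timestamps in the Mon/DD/YYYY[ HH:MM:SS] pattern
# used in the example above (GNU date assumed; BSD date needs different flags).
from_ts=$(date -d '1 day ago' '+%b/%d/%Y')
to_ts=$(date '+%b/%d/%Y %H:%M:%S')
echo "tfactl diagcollect -asm -node ogg01 -from \"$from_ts\" -to \"$to_ts\""
```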


These examples show that by specifying just the parameters we need, we can filter and collect exactly the logs required, avoiding the redundant, useless files gathered by diagcollection.pl; this makes communication more efficient and analysis simpler.
Besides what has been covered above, TFA offers other features as well, such as AWR collection, automatic log collection, and permission control for non-root users.

3. Newly bundled features
Starting with version 12.1.2.3.0, TFA bundles many existing Oracle diagnostic tools, including ORAchk, EXAchk, OSWatcher, Procwatcher, ORATOP, SQLT, DARDA, alertsummary, and more. All of them can be invoked through the tfactl interface, and we can check the bundled tools and their status as follows:
tfactl> toolstatus
.------------------------------------.
| External Support Tools |
+-------+--------------+-------------+
| Host | Tool | Status |
+-------+--------------+-------------+
| ogg01 | alertsummary | DEPLOYED |
| ogg01 | exachk | DEPLOYED |
| ogg01 | ls | DEPLOYED |
| ogg01 | pstack | DEPLOYED |
| ogg01 | orachk | DEPLOYED |
| ogg01 | sqlt | DEPLOYED |
| ogg01 | grep | DEPLOYED |
| ogg01 | summary | DEPLOYED |
| ogg01 | prw | NOT RUNNING |
| ogg01 | vi | DEPLOYED |
| ogg01 | tail | DEPLOYED |
| ogg01 | param | DEPLOYED |
| ogg01 | dbglevel | DEPLOYED |
| ogg01 | darda | DEPLOYED |
| ogg01 | history | DEPLOYED |
| ogg01 | oratop | DEPLOYED |
| ogg01 | oswbb | RUNNING |
| ogg01 | changes | DEPLOYED |
| ogg01 | events | DEPLOYED |
| ogg01 | ps | DEPLOYED |
'-------+--------------+-------------'

Here are a few common ways to invoke them:
3.1 Invoking orachk
[root@ogg01 oswbb]# tfactl
tfactl> orachk
This computer is for [S]ingle instance database or part of a [C]luster to run RAC database [S|C] [C]:C
Unable to determine nodes in cluster. Do you want to enter manually.[y/n][y]y
Enter cluster node names delimited by comma.by defalut localhost will be printed. (eg. node2,node3,node4)
ogg01,ogg01,ogg02
Checking ssh user equivalency settings on all nodes in cluster
Node ogg02 is configured for ssh user equivalency for root user
CRS binaries found at /u01/app/11.2.0/grid. Do you want to set CRS_HOME to /u01/app/11.2.0/grid?[y/n][y]
Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS

3.2 Invoking oswbb (the OSWatcher Black Box analyzer; its parsing output and graph menu follow):
Parsing file ogg01_iostat_16.02.28.1200.dat ...
Parsing file ogg01_iostat_16.02.28.1300.dat ...
Parsing file ogg01_vmstat_16.02.28.1200.dat ...
Parsing file ogg01_vmstat_16.02.28.1300.dat ...
Parsing file ogg01_netstat_16.02.28.1200.dat ...
Parsing file ogg01_netstat_16.02.28.1300.dat ...
Parsing file ogg01_top_16.02.28.1200.dat ...
Parsing file ogg01_top_16.02.28.1300.dat ...
Parsing file ogg01_ps_16.02.28.1200.dat ...
Parsing file ogg01_ps_16.02.28.1300.dat ...
Parsing Completed.
Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs
Enter 5 to Display Disk IO Graphs
Enter 6 to Generate All CPU Gif Files
Enter 7 to Generate All Memory Gif Files
Enter 8 to Generate All Disk Gif Files
Enter L to Specify Alternate Location of Gif Directory
Enter T to Alter Graph Time Scale Only (Does not change analysis dataset)
Enter D to Return to Default Graph Time Scale
Enter R to Remove Currently Displayed Graphs
Enter A to Analyze Data
Enter S to Analyze Subset of Data(Changes analysis dataset including graph time scale)
Enter P to Generate A Profile
Enter X to Export Parsed Data to File
Enter Q to Quit Program
Please Select an Option:1

3.3 Invoking Procwatcher
tfactl> prw deploy
Sun Feb 28 13:26:15 CST 2019: Building default prwinit.ora at /u01/app/grid/tfa/repository/suptools/prw/root/prwinit.ora
Clusterware must be running with adequate permissions to deploy, exiting
tfactl> prw start
Sun Feb 28 13:27:00 CST 2019: Starting Procwatcher as user root
Sun Feb 28 13:27:00 CST 2019: Thank you for using Procwatcher.
Sun Feb 28 13:27:00 CST 2019: Please add a comment to Oracle Support Note 459694.1
Sun Feb 28 13:27:00 CST 2019: if you have any comments, suggestions, or issues with this tool.
Procwatcher files will be written to: /u01/app/grid/tfa/repository/suptools/prw/root
Sun Feb 28 13:27:00 CST 2019: Started Procwatcher
tfactl> prw stop
Sun Feb 28 13:27:20 CST 2019: Stopping Procwatcher
Sun Feb 28 13:27:20 CST 2019: Checking for stray debugging sessions...(waiting 1 second)
Sun Feb 28 13:27:21 CST 2019: No debugging sessions found, all good, exiting...
Sun Feb 28 13:27:21 CST 2019: Thank you for using Procwatcher.
Sun Feb 28 13:27:21 CST 2019: Please add a comment to Oracle Support Note 459694.1
Sun Feb 28 13:27:21 CST 2019: if you have any comments, suggestions, or issues with this tool.
Sun Feb 28 13:27:21 CST 2019: Procwatcher Stopped

Because many tools are bundled, and Oracle keeps adding more, we cannot list them all here, but all of these embedded tools can be invoked simply through the tfactl interface.
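As a final sketch, a bundled tool can also be reached without entering the interactive shell; the one-shot `tfactl run <tool>` form is an assumption about newer TFA releases, so verify it with `tfactl -h` first:

```shell
# Sketch (assumed syntax): invoke a bundled tool one-shot from the OS prompt
# instead of the interactive "tfactl>" shell shown above; verify with tfactl -h.
TOOL=oratop
echo "tfactl toolstatus"     # check tool states first
echo "tfactl run $TOOL"      # one-shot invocation (assumed form)
```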






《Olingo分析和实践之OData框架核心组件初始化(关键步骤)》ODataSpringBootService通过初始化OData实例和服务元数据,构建框架核心能力与数据模型结构,实现序列化、URI... 目录概述第一步:OData实例创建1.1 OData.newInstance() 详细分析1.1.1