30–hs_err_pid
1、介绍
1、 当jvm出现致命错误时,会生成一个错误文件hs_err_pid.log;
2、 hs_err_pid.log文件默认会生成到工作目录下;
3、 hs_err_pid.log包括了导致jvm崩溃的重要信息,可以通过分析该文件定位到导致崩溃的根源,从而改善以保证系统稳定;
1.1、设置 hs_err_pid.log 文件的保存位置
-XX:ErrorFile=./hs_err_pid<pid>.log
1.2、 hs_err_pid.log 包含的 几类关键信息
1、 日志头文件;
2、 导致崩溃的线程信息;
3、 所有线程信息;
4、 安全点和锁信息;
5、 堆信息;
6、 本地代码缓存;
7、 编译事件;
8、 gc相关记录;
9、 jvm内存映射;
10、 jvm启动参数;
11、 服务器信息;
2、日志头文件
2.1、介绍
1、 日志头文件包含概要信息,简述了导致崩溃的原因;
2、 导致崩溃的原因很多,常见的原因有jvm自身的bug,应用程序错误,jvm参数配置不当,服务器资源不足,jni调用错误等;
2.2、字段解析
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fc23dd5dae6, pid=1, tid=0x00007fc2009fdb20
#
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.15.0
# Distribution: Custom build (Wed Jan 29 10:43:50 UTC 2020)
# Problematic frame:
# J 77871 C2 com.hd.rcugrc.security.user.db.CachedUserInfoManagerImpl.getUser(Ljava/lang/String;Ljava/lang/String;)Lcom/hd/rcugrc/security/user/User; (104 bytes) @ 0x00007fc23dd5dae6 [0x00007fc23dd5da60+0x86]
#
# Core dump written. Default location: /opt/app/core or core.1
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# https://icedtea.classpath.org/bugzilla
#
2.2.1、内容
SIGSEGV (0xb) at pc=0x00007fc23dd5dae6, pid=1, tid=0x00007fc2009fdb20
SIGSEGV
1、 信号名称;
2、 除了"SIGSEGV(0xb)"以外,常见的描述有以下几种;
1、 “EXCEPTION_ACCESS_VIOLATION”:该描述表示jvm崩溃时正在执行jvm自身的代码,这往往是因为jvm的bug导致的崩溃;
2、 “EXCEPTION_STACK_OVERFLOW”:该描述表示这是个栈溢出导致的错误,这往往是应用程序中存在深层递归导致的;
0xb
信号码
pc=0x00007fc23dd5dae6
程序计数器的值
pid=1
进程ID
tid=0x00007fc2009fdb20
线程号
2.2.2、内容
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 compressed oops)
1、 JRE和JVM的版本信息;
2、 mixedmode:运行在mixed模式下;
2.2.3、内容
# Problematic frame:
# J 77871 C2 com.hd.rcugrc.security.user.db.CachedUserInfoManagerImpl.getUser(Ljava/lang/String;Ljava/lang/String;)Lcom/hd/rcugrc/security/user/User; (104 bytes) @ 0x00007fc23dd5dae6 [0x00007fc23dd5da60+0x86]
1、 这是问题帧的信息,这表示出现崩溃时jvm正在执行的代码,这里的"J"表示正在执行java代码,后面的表示执行的方法栈;
2、 "J"属于帧的一种类型;
帧的类型,包括如下
1、 C:本地C帧;
2、 j:解释的Java帧;
3、 V:虚拟机帧;
4、 v:虚拟机生成的存根栈帧;
5、 J:其他帧类型,包括编译后的Java帧;
3、导致 崩溃 的线程信息
Current thread (0x000055bf34361800): JavaThread "http-nio-8888-exec-34" daemon [_thread_in_Java, id=535, stack(0x00007fc2008fd000,0x00007fc2009fdae0)]
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000000c
上面是导致崩溃 的线程信息和该线程栈信息
Current thread (0x000055bf34361800): JavaThread
1、 导致出错的线程是0x000055bf34361800(指针),线程类型是java线程;
2、 线程类型;
1、 JavaThread:表示执行的是java线程;
2、 VMThread:jvm的内部线程;
3、 CompilerThread:;
1. 用来调用JITing,实时编译装卸class。
2. 通常,jvm会启动多个线程来处理这部分工作,线程名称后面的数字也会累加,例如:CompilerThread1
4、 GCTaskThread:执行gc的线程;
5、 WatcherThread:;
1. jvm周期性任务调度的线程,是一个单例对象。
2. 该线程在JVM内使用得比较频繁,比如:定期的内存监控、JVM运行状况监控,还有我们经常需要去执行一些jstat 这类命令查看gc的情况
6、 ConcurrentMarkSweepThread:;
1. jvm在进行CMS GC的时候,会创建一个该线程去进行GC,该线程被创建的同时会创建一个SurrogateLockerThread(简称SLT)线程并且启动它,SLT启动之后,处于等待阶段。CMST开始GC时,会发一个消息给SLT让它去获取Java层Reference对象的全局锁:Lock
“http-nio-8888-exec-34”
表示线程名
daemon
1、 表示该线程为守护线程,再后面的"[_thread_in_Java"表示线程正在执行解释或者编译后的Java代码;
2、 daemon后面的值有以下几种;
1、 _thread_in_native:线程当前状态;
2、 _thread_uninitialized:线程还没有创建,它只在内存原因崩溃的时候才出现;
3、 _thread_new:线程已经被创建,但是还没有启动;
4、 _thread_in_native:线程正在执行本地代码,一般这种情况很可能是本地代码有问题;
5、 _thread_in_vm:线程正在执行虚拟机代码;
6、 _thread_in_Java:线程正在执行解释或者编译后的Java代码;
7、 _thread_blocked:线程处于阻塞状态;
8、 …_trans:以_trans结尾,线程正处于要切换到其它状态的中间状态;
id=535
表示线程ID
stack(0x00007fc2008fd000,0x00007fc2009fdae0)]
表示栈区间
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000000c
这部分是导致虚拟机终止的非预期的信号信息:其中si_signo和si_code是Linux下用来鉴别异常的,Windows下是一个ExceptionCode
4、所有线程信息
Java Threads: ( => current thread )
0x000055bf37336800 JavaThread "ForkJoinPool.commonPool-worker-1" daemon [_thread_blocked, id=20558, stack(0x00007fc204bf8000,0x00007fc204cf8ae0)]
0x000055bf3269b800 JavaThread "boundedElastic-171" daemon [_thread_blocked, id=20553, stack(0x00007fc1fde9a000,0x00007fc1fdf9aae0)]
0x000055bf36df9000 JavaThread "pool-11-thread-286" [_thread_blocked, id=20544, stack(0x00007fc202e52000,0x00007fc202f52ae0)]
0x000055bf35cdf000 JavaThread "boundedElastic-170" daemon [_thread_blocked, id=20114, stack(0x00007fc1fd595000,0x00007fc1fd695ae0)]
0x000055bf36a11000 JavaThread "Timer-94" [_thread_blocked, id=20110, stack(0x00007fc1feca6000,0x00007fc1feda6ae0)]
0x000055bf358bc800 JavaThread "Keep-Alive-Timer" daemon [_thread_blocked, id=19503, stack(0x00007fc204dfa000,0x00007fc204efaae0)]
0x000055bf36342800 JavaThread "http-nio-8888-exec-174" daemon [_thread_blocked, id=15760, stack(0x00007fc1fad44000,0x00007fc1fae44ae0)]
0x000055bf2eb5b800 JavaThread "http-nio-8888-exec-173" daemon [_thread_blocked, id=15759, stack(0x00007fc1faf46000,0x00007fc1fb046ae0)]
0x000055bf3778f000 JavaThread "http-nio-8888-exec-172" daemon [_thread_blocked, id=15758, stack(0x00007fc1fb1d7000,0x00007fc1fb2d7ae0)]
0x000055bf2e2fc000 JavaThread "http-nio-8888-exec-171" daemon [_thread_blocked, id=15757, stack(0x00007fc1fb2d8000,0x00007fc1fb3d8ae0)]
... ...(省略)
Other Threads:
0x000055bf2635a800 VMThread [stack: 0x00007fc237f4b000,0x00007fc23804bae0] [id=14]
0x000055bf27b82000 WatcherThread [stack: 0x00007fc232b9b000,0x00007fc232c9bae0] [id=33]
内容和第3点 内容的类似
5、安全点和锁信息
# 安全线信息为正常运行
VM state:not at safepoint (normal execution)
# 锁信息为未被线程持有
VM Mutex/Monitor currently owned by a thread: None
VM state:安全线信息
1、 notatasafepoint:正常运行状态;
2、 atsafepoint:所有线程都因为虚拟机等待状态而阻塞,等待一个虚拟机操作完成;
3、 synchronizing:一个特殊的虚拟机操作,要求虚拟机内的其它线程保持等待状态;
VM Mutex/Monitor:锁信息
1、 Mutex是虚拟机内部的锁;
2、 Monitor是synchronized锁或者其它关联到的Java对象;
6、堆信息
Heap:
def new generation total 1850624K, used 1161138K [0x0000000647800000, 0x00000006c5000000, 0x00000006c5000000)
eden space 1645056K, 64% used [0x0000000647800000, 0x000000068842c580, 0x00000006abe80000)
from space 205568K, 48% used [0x00000006b8740000, 0x00000006be9004b8, 0x00000006c5000000)
to space 205568K, 0% used [0x00000006abe80000, 0x00000006abe80000, 0x00000006b8740000)
tenured generation total 4112384K, used 1768079K [0x00000006c5000000, 0x00000007c0000000, 0x00000007c0000000)
the space 4112384K, 42% used [0x00000006c5000000, 0x0000000730ea3f30, 0x0000000730ea4000, 0x00000007c0000000)
Metaspace used 384636K, capacity 428345K, committed 428504K, reserved 1431552K
class space used 39929K, capacity 46997K, committed 47064K, reserved 1048576K
Card table byte_map: [0x00007fc239314000,0x00007fc239ed9000] byte_map_base: 0x00007fc2360d8000
1、 堆信息包括:新生代、老生代、信息;
2、 "Cardtable"表示一种卡表,是jvm维护的一种数据结构,用于记录更改对象时的引用,以便gc时遍历更少的table和root;
7、本地代码缓存
CodeCache: size=245760Kb used=150371Kb max_used=181162Kb free=95388Kb
bounds [0x00007fc23a299000, 0x00007fc2456b9000, 0x00007fc249299000]
total_blobs=35255 nmethods=34107 adapters=1057
compilation: enabled
1、 这是一块用于编译和保存本地代码的内存;
2、 注意是本地代码;
8、编译事件
Compilation events (10 events):
Event: 88196.068 Thread 0x000055bf27b58000 111971 3 java.util.Set::spliterator (6 bytes)
Event: 88196.069 Thread 0x000055bf27b58000 nmethod 111971 0x00007fc23ca56650 code [0x00007fc23ca567e0, 0x00007fc23ca56b78]
Event: 88196.071 Thread 0x000055bf27b53000 111972 4 sun.util.calendar.ZoneInfoFile::getZoneInfo (30 bytes)
Event: 88196.074 Thread 0x000055bf27b53000 nmethod 111972 0x00007fc23af41410 code [0x00007fc23af41580, 0x00007fc23af416c8]
Event: 88203.481 Thread 0x000055bf27b53000 111973 4 com.hd.bpm.engine.db.ReversePolishNotation::infixToRPN (242 bytes)
Event: 88203.688 Thread 0x000055bf27b53000 nmethod 111973 0x00007fc2416aae10 code [0x00007fc2416ab2e0, 0x00007fc2416af2b8]
Event: 88206.608 Thread 0x000055bf27b58000 111974 3 org.hibernate.action.internal.BulkOperationCleanupAction::getAfterTransactionCompletionProcess (9 bytes)
Event: 88206.608 Thread 0x000055bf27b58000 nmethod 111974 0x00007fc23c8b9790 code [0x00007fc23c8b9900, 0x00007fc23c8b9b70]
Event: 88206.614 Thread 0x000055bf27b58000 111975 3 org.hibernate.action.internal.BulkOperationCleanupAction$1::<init> (10 bytes)
Event: 88206.614 Thread 0x000055bf27b58000 nmethod 111975 0x00007fc23c2e6390 code [0x00007fc23c2e6500, 0x00007fc23c2e66d0]
1、 一共编译了10次;
2、 举例:可以看到com.hd.bpm.engine.db.ReversePolishNotation被编译了1次;
9、gc相关记录
GC Heap History (10 events):
Event: 87508.136 GC heap before
{Heap before GC invocations=498 (full 8):
def new generation total 1850624K, used 1750484K [0x0000000647800000, 0x00000006c5000000, 0x00000006c5000000)
eden space 1645056K, 100% used [0x0000000647800000, 0x00000006abe80000, 0x00000006abe80000)
from space 205568K, 51% used [0x00000006abe80000, 0x00000006b2575308, 0x00000006b8740000)
to space 205568K, 0% used [0x00000006b8740000, 0x00000006b8740000, 0x00000006c5000000)
tenured generation total 4112384K, used 1750262K [0x00000006c5000000, 0x00000007c0000000, 0x00000007c0000000)
the space 4112384K, 42% used [0x00000006c5000000, 0x000000072fd3da70, 0x000000072fd3dc00, 0x00000007c0000000)
Metaspace used 384571K, capacity 428291K, committed 428504K, reserved 1431552K
class space used 39924K, capacity 46988K, committed 47064K, reserved 1048576K
Event: 87508.423 GC heap after
Heap after GC invocations=499 (full 8):
def new generation total 1850624K, used 97090K [0x0000000647800000, 0x00000006c5000000, 0x00000006c5000000)
eden space 1645056K, 0% used [0x0000000647800000, 0x0000000647800000, 0x00000006abe80000)
from space 205568K, 47% used [0x00000006b8740000, 0x00000006be610a10, 0x00000006c5000000)
to space 205568K, 0% used [0x00000006abe80000, 0x00000006abe80000, 0x00000006b8740000)
tenured generation total 4112384K, used 1755185K [0x00000006c5000000, 0x00000007c0000000, 0x00000007c0000000)
the space 4112384K, 42% used [0x00000006c5000000, 0x000000073020c6b0, 0x000000073020c800, 0x00000007c0000000)
Metaspace used 384571K, capacity 428291K, committed 428504K, reserved 1431552K
class space used 39924K, capacity 46988K, committed 47064K, reserved 1048576K
}
... ...(省略)
1、 可以看到gc次数为10次(fullgc);
2、 这里描述了每次gc前后的内存信息,这里没有内存不足等问题;
10、jvm内存映射
1、 下面是jvm加载的库信息,这些信息是虚拟机崩溃时的虚拟内存列表区域;
2、 它可以告诉你崩溃原因时哪些类库正在被使用,位置在哪里,还有堆栈和守护页信息;
Dynamic libraries:
00400000-00401000 r-xp 00000000 08:02 39454583 /home/service/jdk1.7.0_55/bin/java
00600000-00601000 rw-p 00000000 08:02 39454583 /home/service/jdk1.7.0_55/bin/java
013cd000-013ee000 rw-p 00000000 00:00 0 [heap]
6f0000000-800000000 rw-p 00000000 00:00 0
3056400000-3056416000 r-xp 00000000 08:02 57409539 /lib64/libgcc_s-4.4.7-20120601.so.1
3056416000-3056615000 ---p 00016000 08:02 57409539 /lib64/libgcc_s-4.4.7-20120601.so.1
3056615000-3056616000 rw-p 00015000 08:02 57409539 /lib64/libgcc_s-4.4.7-20120601.so.1
353be00000-353be20000 r-xp 00000000 08:02 57409933 /lib64/ld-2.12.so
353c01f000-353c020000 r--p 0001f000 08:02 57409933 /lib64/ld-2.12.so
353c020000-353c021000 rw-p 00020000 08:02 57409933 /lib64/ld-2.12.so
... ...(省略)
00400000-00401000
内存区域
r-xp
1、 权限;
2、 r/w/x/p/s分别表示读/写/执行/私有/共享;
00000000
文件内的偏移量
08:02
1、 文件位置的majorID和minorID;
39454583
索引节点号
/home/service/jdk1.7.0_55/bin/java
文件位置
11、jvm启动参数
# jvm参数
VM Arguments:
jvm_args: -Xms6024m -Xms6024m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintHeapAtGC -XX:+PrintTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:ErrorFile=/var/log/xcoa/hs_err_pid_%p.log -Xloggc:/var/log/xcoa/gc-2022-12-02_20-06-51-100.100.17.7.log -XX:HeapDumpPath=/var/log/xcoa/oom-2022-12-02_20-06-51-100.100.17.7.dump -Dspring.profiles.active=redis_sentinel,workflow_save_mq,security_oauth,search_es_service -Dspring.cloud.consul.host=crld-consul01 -Dspring.cloud.consul.config.prefix=xcoa/dev -javaagent:/pt-agent/pinpoint-bootstrap.jar -Dpinpoint.agentId=100.100.17.7 -Dpinpoint.applicationName=crld-oa-server-prod -Duser.timezone=GMT+8 -Dlogging.file.name=/var/log/xcoa/hd-2022-12-02_20-06-51-100.100.17.7.log -Dlogging.file.path=/var/log/xcoa/
java_command: org.springframework.boot.loader.WarLauncher
java_class_path (initial): /opt/app:/pt-agent/pinpoint-bootstrap.jar
Launcher Type: SUN_STANDARD
# 系统的环境配置。
Environment Variables:
JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk
CLASSPATH=
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-1.8-openjdk/bin
LD_LIBRARY_PATH=/usr/lib/jvm/java-1.8-openjdk/jre/lib/amd64/server:/usr/lib/jvm/java-1.8-openjdk/jre/lib/amd64:/usr/lib/jvm/java-1.8-openjdk/jre/../lib/amd64
12、服务器信息
# 内存信息,主要关注下swap信息,看看有没有使用虚拟内存
/proc/meminfo:
MemTotal: 65916492 kB
MemFree: 14593468 kB
Buffers: 222452 kB
Cached: 28502452 kB
SwapTotal: 0 kB
SwapFree: 0 kB
... ...(省略)
# cpu信息
/proc/cpuinfo:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
stepping : 4
... ...(省略)
13、寄存器上下文
Registers:
RAX=0x00000006c5005228, RBX=0x00000000dcbec61f, RCX=0x000000067b88ff08, RDX=0x00000006c5005228
RSP=0x00007fc2009fa4a0, RBP=0x00000006c5005228, RSI=0x00000006ca8055c0, RDI=0x00000000974b19c3
R8 =0x0000000000000006, R9 =0x00000006c5005228, R10=0x000000000000000a, R11=0x0000000000000000
R12=0x0000000000000000, R13=0x000000067b88ff08, R14=0x0000000000000003, R15=0x000055bf34361800
RIP=0x00007fc23dd5dae6, EFLAGS=0x0000000000010202, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
TRAPNO=0x000000000000000e
14、栈顶程序计数器旁的操作码
Top of Stack: (sp=0x00007fc2009fa4a0)
0x00007fc2009fa4a0: 00000006ca8055c0 00007fc23cc5ddb4
0x00007fc2009fa4b0: 00000006883ed7d0 00000006883edb20
0x00007fc2009fa4c0: 00000006ca8056d0 00007fc23c755ab4
0x00007fc2009fa4d0: 00007fc2009fa540 00007fc23a2a0d80
0x00007fc2009fa4e0: 00000006ca8056d0 00007fc242aeacd4
0x00007fc2009fa4f0: 00007fc23a2a0d80 00000006883ed7d0
0x00007fc2009fa500: 00000000d9500ab8 00007fc242ffabf0
0x00007fc2009fa510: 00000006d1982fd8 00007fc2cf711fe1
0x00007fc2009fa520: 00007fc100000000 00000006883cb590
0x00007fc2009fa530: 0000000000000000 00007fc2009fa580
0x00007fc2009fa540: 00000000d1079670 00007fc23bdab304
0x00007fc2009fa550: deaddeaf00000003 00000006883ed7d0
Instructions: (pc=0x00007fc23dd5dae6)
0x00007fc23dd5dac6: 45 85 d2 0f 8c 9a 0c 00 00 48 85 c9 0f 85 b9 0c
0x00007fc23dd5dad6: 00 00 49 bd 10 3b 00 c5 06 00 00 00 45 8b 5d 0c
0x00007fc23dd5dae6: 47 8b 54 dc 0c 45 03 d0 41 8b d2 83 c2 04 85 d2
0x00007fc23dd5daf6: 0f 8c 6d 0c 00 00 81 fa 00 00 08 00 0f 87 59 0c
栈顶程序计数器旁的操作码,可以被反汇编成系统崩溃前执行的指令。
15、寄存器和内存映射信息
Register to memory mapping:
RAX=0x00000006c5005228 is an oop
java.lang.String
- klass: 'java/lang/String'
RBX=0x00000000dcbec61f is an unknown value
RCX=
[error occurred during error reporting (printing register info), id 0xb]
16、信号句柄
对于Linux 下的信号机制
Signal Handlers:
SIGSEGV: [libjvm.so+0x5fcb50], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGBUS: [libjvm.so+0x5fcb50], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGFPE: [libjvm.so+0x535b51], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGPIPE: SIG_IGN, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGXFSZ: [libjvm.so+0x535b51], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGILL: [libjvm.so+0x535b51], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGUSR1: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGUSR2: [libjvm.so+0x535cf9], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO
SIGHUP: [libjvm.so+0x537d11], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGINT: [libjvm.so+0x537d11], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGTERM: [libjvm.so+0x537d11], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGQUIT: [libjvm.so+0x537d11], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO