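I. ANR overview and causes
1.1 Introduction
ANR stands for Application Not Responding.
1.2 Causes
In Android, ActivityManagerService (AMS) and WindowManagerService (WMS) monitor how long an app takes to respond. If the app cannot respond to a screen touch or key input event within a given time, or does not finish handling certain events in time, an ANR is raised.
Any one of the following four conditions can trigger an ANR: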
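- InputDispatching Timeout: a screen touch or key input event is not handled within 5 seconds.
- BroadcastQueue Timeout: onReceive() of a BroadcastReceiver does not finish within 10 seconds for a foreground broadcast, or within 60 seconds for a background broadcast.
- Service Timeout: a foreground service does not finish within 20 seconds, or a background service within 200 seconds.
- ContentProvider Timeout: the ContentProvider publish does not complete within 10 seconds.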
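1.3 Avoidance
Avoid doing time-consuming work on the main (UI) thread wherever possible.
Move such work to a worker thread instead.
For background on multithreading, see the article "Android多线程:理解和简单使用总结" (Android multithreading: concepts and basic usage).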
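As a minimal sketch of the advice above, the plain-Java example below hands a slow task to a worker thread so that the calling thread is not blocked while it runs. The names OffMainThread, slowWork, and runInBackground are hypothetical and chosen only for illustration. On Android the result would also have to be posted back to the UI thread, which is not shown here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffMainThread {
    // Hypothetical stand-in for slow work such as disk or network I/O.
    static int slowWork(int input) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return input * 2;
    }

    // Schedules the slow work on the given worker and returns immediately.
    static CompletableFuture<Integer> runInBackground(ExecutorService worker, int input) {
        return CompletableFuture.supplyAsync(() -> slowWork(input), worker);
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Integer> pending = runInBackground(worker, 21);
            System.out.println("caller is free while the worker runs");
            // join() blocks only here, where the result is actually needed.
            System.out.println("result = " + pending.join());
        } finally {
            worker.shutdown();
        }
    }
}
```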
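II. General ANR analysis steps
1. First, find the approximate time at which the process hit the ANR in main.log, for example by searching the log for "ANR in".
2. Using the name of the process that hit the ANR, locate traces.txt in the anr folder. From this file, first decide whether the ANR comes from the app itself or from the system. If it is an app problem, fix it based on the corresponding call stack.
3. If it is not an app problem, determine the ANR type from main.log. For a key dispatch timeout, for example, go back 5 seconds from the exact ANR time and check what the process was doing at that point (the exact ANR time can be found by searching for anr in event.log).
4. If the trace shows nothing obviously wrong, consider the following:
a. Whether I/O or database work pushed CPU usage so high that other app processes could not get CPU time slices.
b. Whether low memory caused the ANR (under low memory, system.log shows processes being killed, with lines reporting that a given process has died).
c. Whether the input method interaction was handled incorrectly, so that a call could not return.
d. Whether the ANR was caused by waiting on a lock or by a deadlock.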
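Steps 1 and 3 can be sketched in plain Java as follows: scan log lines for "ANR in", extract the timestamp and process name, and compute the point 5 seconds earlier at which to start reading. AnrLogScanner, findAnrs, and windowStart are hypothetical names, and the regular expression assumes the logcat layout shown in the examples of this article rather than every possible log format.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnrLogScanner {
    // Assumed layout: "MM-dd HH:mm:ss.SSS ... ANR in <process>"
    private static final Pattern ANR_LINE = Pattern.compile(
            "(\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}).*?ANR in\\s+([\\w.]+)");

    // Returns "timestamp process" for every line that reports an ANR.
    static List<String> findAnrs(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = ANR_LINE.matcher(line);
            if (m.find()) {
                hits.add(m.group(1) + " " + m.group(2));
            }
        }
        return hits;
    }

    // Returns the time-of-day that lies secondsBack before the ANR time.
    static String windowStart(String anrTime, long secondsBack) {
        return LocalTime.parse(anrTime).minusSeconds(secondsBack).toString();
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "08-25 18:11:48.114 I/ActivityManager( 4271): Process com.example.app (pid 20663) has died",
                "08-25 18:11:51.547 E/ActivityManager( 4271): ANR in com.example.app (com.example.app/.MainActivity)");
        for (String hit : findAnrs(log)) {
            System.out.println(hit);
        }
        System.out.println("read the log from " + windowStart("18:11:51.547", 5));
    }
}
```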
III. Worked examples
Example 1
Example 2
Example 3
Look at the /data/anr/traces.txt file.
Go straight to the main thread:
----- pid 20663 at 2020-08-25 18:11:47 -----
Cmd line: com.xxxxxing.xxxxxsettings
Build fingerprint: 'Android/xxx/xxx:7.1.2/NHG47L/xxx:user/test-keys'
ABI: 'arm'
Build type: optimized
Zygote loaded classes=4381 post zygote classes=88
Intern table: 41765 strong; 330 weak
JNI: CheckJNI is off; globals=458 (plus 132 weak)
// ...
"main" prio=5 tid=1 Sleeping
| group="main" sCount=1 dsCount=0 obj=0x74751f58 self=0xf4785400
| sysTid=20663 nice=-10 cgrp=default sched=0/0 handle=0xf7437534
| state=S schedstat=( 179097380 5860498 155 ) utm=9 stm=8 core=2 HZ=100
| stack=0xff77c000-0xff77e000 stackSize=8MB
| held mutexes=
at java.lang.Thread.sleep!(Native method)
- sleeping on <0x09d961ef> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:371)
- locked <0x09d961ef> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:313)
at com.xxxxxing.xxxxxsettings.CrashHandler.uncaughtException(CrashHandler.java:81)
at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1068)
at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1063)
This log makes the problem fairly easy to spot: the app stops responding because Thread.sleep is called on the main thread, from this frame:
at com.xxxxxing.xxxxxsettings.CrashHandler.uncaughtException(CrashHandler.java:81)
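The manual reading above (find the "main" thread, note its state, then find the first frame that belongs to the app) can be sketched as a small plain-Java helper. MainThreadSummary and its methods are hypothetical names, and the parsing assumes the trace layout shown above, where each thread starts with a quoted name and the state is the last word on that line.

```java
import java.util.Arrays;
import java.util.List;

public class MainThreadSummary {
    // Returns the state word of the "main" thread header, or null if absent.
    static String mainThreadState(List<String> traceLines) {
        for (String line : traceLines) {
            String t = line.trim();
            if (t.startsWith("\"main\" ")) {
                return t.substring(t.lastIndexOf(' ') + 1);
            }
        }
        return null;
    }

    // Returns the first main-thread frame inside packagePrefix, or null.
    static String firstAppFrame(List<String> traceLines, String packagePrefix) {
        boolean inMain = false;
        for (String line : traceLines) {
            String t = line.trim();
            if (t.startsWith("\"main\" ")) {
                inMain = true;
                continue;
            }
            if (inMain && t.startsWith("\"")) {
                break; // header of the next thread: stop searching
            }
            if (inMain && t.startsWith("at " + packagePrefix)) {
                return t.substring(3); // drop the leading "at "
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList(
                "\"main\" prio=5 tid=1 Sleeping",
                "  at java.lang.Thread.sleep(Thread.java:313)",
                "  at com.example.app.CrashHandler.uncaughtException(CrashHandler.java:81)");
        System.out.println("state: " + mainThreadState(trace));
        System.out.println("frame: " + firstAppFrame(trace, "com.example."));
    }
}
```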
Next, look at logcat and search the log for the keyword ANR:
[2020/8/25 18:13:42] (20663): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:42] 08-25 18:11:48.197 I/art ( 4713): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:42] 08-25 18:11:48.339 I/art ( 4701): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.411 I/art ( 4688): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.467 I/art ( 4668): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.530 I/art ( 4653): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.588 I/art ( 4638): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.701 I/art ( 4461): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.878 I/art ( 4379): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] o=700386 scontext=u:r:untrusted_app:s0:c512,c768 tcontext=u:object_r:anr_data_file:s0 tclass=file permissive=1
[2020/8/25 18:13:43] mcblk0p14" ino=700386 scontext=u:r:untrusted_app:s0:c512,c768 tcontext=u:object_r:anr_data_file:s0 tclass=file permissive=1
[2020/8/25 18:13:44] 08-25 18:11:51.547 E/ActivityManager( 4271): ANR in com.xxxxxing.xxxxxsettings(com.xxxxxing.xxxxxsettings/.ConnectionInfoActivity)
Look at the log lines around the ANR:
[2020/8/25 18:13:44] 08-25 18:11:51.547 E/ActivityManager( 4271): PID: 20663
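Search for what process 20663 was doing. The log below shows that it was killed by signal 9. For more on signals, see the FreeBSD Manual Pages.
Note also that the ANR stack dump for pid 20663 (Wrote stack traces to '/data/anr/traces.txt') is logged before the process is killed, which indicates that the ANR occurred first and the process was killed afterwards.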
Line 1637: [2020/8/25 18:13:42] 08-25 18:11:47.492 I/art (20663):
Line 1642: [2020/8/25 18:13:42] (20663): Wrote stack traces to '/data/anr/traces.txt'
Line 1651: [2020/8/25 18:13:42] 08-25 18:11:47.844 I/Process (20663): Sending signal. PID: 20663 SIG: 9
Line 1663: [2020/8/25 18:13:42] 08-25 18:11:48.114 I/ActivityManager( 4271): Process com.xxxxxing.xxxxxsettings (pid 20663) has died
Line 1664: [2020/8/25 18:13:42] 08-25 18:11:48.115 D/ActivityManager( 4271): cleanUpApplicationRecord -- 20663
Line 1866: [2020/8/25 18:13:44] 08-25 18:11:51.547 E/ActivityManager( 4271): PID: 20663
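Combining the trace, logcat, and the source code, the conclusions are:
1. The app wraps its own exception-handler class. Once an exception is received, execution enters the overridden method.
For usage, see the article on using UncaughtException.
2. The overridden method calls sleep, which leads to the ANR, and then kills the process.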
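@Override
public void uncaughtException(Thread thread, Throwable ex) {
    if (!handleException(thread) && mDefaultHandler != null) {
        // Not handled here: hand over to the system default exception handler
        mDefaultHandler.uncaughtException(thread, ex);
    } else {
        try {
            // Blocks the crashing thread; when that is the main thread, this is the sleep seen in the trace
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        android.os.Process.killProcess(android.os.Process.myPid());
        System.exit(1);
    }
}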
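One possible direction, not necessarily the fix this app adopted, is to keep the handler short: record the exception and delegate, without sleeping on the crashing thread. The plain-Java sketch below illustrates the idea. RecordingCrashHandler and its recorder callback are hypothetical, and the Android-specific process kill is left out so the example runs on a plain JVM.

```java
import java.util.function.Consumer;

public class RecordingCrashHandler implements Thread.UncaughtExceptionHandler {
    // Hypothetical hook, for example one that writes a short crash report.
    private final Consumer<Throwable> recorder;
    // Handler that was installed before this one; may be null.
    private final Thread.UncaughtExceptionHandler previous;

    public RecordingCrashHandler(Consumer<Throwable> recorder,
                                 Thread.UncaughtExceptionHandler previous) {
        this.recorder = recorder;
        this.previous = previous;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable ex) {
        // Keep this quick: no sleeping or waiting on the crashing thread.
        recorder.accept(ex);
        if (previous != null) {
            previous.uncaughtException(thread, ex);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        });
        worker.setUncaughtExceptionHandler(new RecordingCrashHandler(
                ex -> System.out.println("recorded: " + ex.getMessage()), null));
        worker.start();
        worker.join();
    }
}
```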
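Example 4: an Android ANR caused by a Handler
Background: during media playback a seek hung, and pressing other keys then caused an ANR.
There are two questions to analyze here: why the seek hangs, and why a hung seek leads to an ANR when it should not.
First, the main thread from the ANR trace: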
"main" prio=5 tid=1 Native
| group="main" sCount=1 dsCount=0 flags=1 obj=0x74d45000 self=0xed1d1000
| sysTid=13857 nice=-10 cgrp=default sched=0/0 handle=0xf13ee494
| state=S schedstat=( 0 0 0 ) utm=156 stm=22 core=1 HZ=100
| stack=0xff34a000-0xff34c000 stackSize=8MB
| held mutexes=
kernel: __switch_to+0xa4/0xc4
kernel: binder_thread_read+0x3e0/0x1104
kernel: binder_ioctl+0x8e0/0xac4
kernel: compat_SyS_ioctl+0xd4/0xed8
kernel: el0_svc_naked+0x34/0x38
native: #00 pc 00053b8c /system/lib/libc.so (__ioctl+8)
native: #01 pc 00021b63 /system/lib/libc.so (ioctl+30)
native: #02 pc 0003d3f5 /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+204)
native: #03 pc 0003dde3 /system/lib/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+26)
native: #04 pc 0003713d /system/lib/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+36)
native: #05 pc 00039f19 /system/lib/libmedia.so (android::BpMediaPlayer::seekTo(int, android::MediaTrack::ReadOptions::SeekMode)+84)
native: #06 pc 00033025 /system/lib/libmedia.so (android::MediaPlayer::seekTo_l(int, android::MediaTrack::ReadOptions::SeekMode)+184)
native: #07 pc 00033081 /system/lib/libmedia.so (android::MediaPlayer::seekTo(int, android::MediaTrack::ReadOptions::SeekMode)+40)
native: #08 pc 0003049b /system/lib/libmedia_jni.so (android_media_MediaPlayer_seekTo(_JNIEnv*, _jobject*, long long, int)+82)
at android.media.MediaPlayer._seekTo(Native method)
at android.media.MediaPlayer.seekTo(MediaPlayer.java:1961)
at android.media.MediaPlayer.seekTo(MediaPlayer.java:1973)
at com.xxx.media.xxxAndroidMediaPlayer.seekTo(xxxAndroidMediaPlayer.kt:310)
at com.xxx.media.xxx.seekTo(xxxVideoView.kt:883)
at com.xxx.media.MediaController$5.handleMessage(MediaController.java:587)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loop(Looper.java:193)
at android.app.ActivityThread.main(ActivityThread.java:6669)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:866)
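The APK code was written by someone else, and at first glance nothing looked wrong: when a key event arrives, it is handled by sending a message through a Handler.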
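// Created on the main thread without a Looper argument, so messages are handled on the main Looper
private Handler mHandler = new Handler() {
    @Override
    public void handleMessage(Message msg) {
        super.handleMessage(msg);
        switch (msg.what) {
            case SEEK_MSG:
                if (mPlayer != null) {
                    Log.d(LOG_TAG, "msg.arg1==" + msg.arg1);
                    mDragging = false;
                    seekSave = -1;
                    mPlayer.seekTo(msg.arg1); // player seek; a blocking Binder call according to the trace
                }
                break;
        }
    }
};

@Override
public boolean dispatchKeyEvent(KeyEvent event) {
    if (event.getAction() == KeyEvent.ACTION_DOWN) {
        Log.d(LOG_TAG, "dispatchKeyEvent: SEEK_MSG");
        // Drop any pending seek and schedule a new one 1 second later
        mHandler.removeMessages(SEEK_MSG);
        mHandler.sendMessageDelayed(mHandler.obtainMessage(SEEK_MSG, mNewPosition, 0), 1000);
    }
    // Assumption: the original return statement is not shown in the excerpt
    return super.dispatchKeyEvent(event);
}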
Reproducing the problem is simple: add a 30-second sleep just before mPlayer.seekTo(msg.arg1), trigger a seek, and then press keys.
This reproduces the problem, and the stack is largely the same:
"main" prio=5 tid=1 Sleeping
| group="main" sCount=1 dsCount=0 flags=1 obj=0x742ee000 self=0xeffd1000
| sysTid=9639 nice=-10 cgrp=default sched=0/0 handle=0xf41e2494
| state=S schedstat=( 0 0 0 ) utm=38 stm=12 core=3 HZ=100
| stack=0xff1b0000-0xff1b2000 stackSize=8MB
| held mutexes=
at java.lang.Thread.sleep(Native method)
- sleeping on <0x06bc3a7d> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:373)
- locked <0x06bc3a7d> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:314)
at com.xxx.media.MediaController$5.handleMessage(MediaController.java:591)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loop(Looper.java:193)
at android.app.ActivityThread.main(ActivityThread.java:6669)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:866)
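Looking at it again more carefully, the real issue was my own incomplete understanding of the Android Handler. The APK code above is at fault: the Handler does process the seek message, but it runs on the main thread's Looper, so the time-consuming work is in fact done on the main thread, and key events that arrive in the meantime cannot be handled.
For more on Handler, see https://www.tqwba.com/x_d/jishu/269458.html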
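One way to keep a blocking seek away from the event thread is to run it on a dedicated worker thread. The plain-Java sketch below illustrates that idea under stated assumptions: SeekWorker and BlockingPlayer are hypothetical names, and the ticket counter is a simple stand-in for the removeMessages call, so that only the newest request is applied. On Android the same idea is commonly expressed by giving the Handler the Looper of a HandlerThread, which is not shown here because it needs the Android runtime.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SeekWorker {
    /** Hypothetical stand-in for a player whose seekTo may block for a long time. */
    public interface BlockingPlayer {
        void seekTo(int positionMs);
    }

    private final ExecutorService seekThread = Executors.newSingleThreadExecutor();
    private final AtomicInteger latestRequest = new AtomicInteger();
    private final BlockingPlayer player;

    public SeekWorker(BlockingPlayer player) {
        this.player = player;
    }

    // Called from the event thread: queues the request and returns at once.
    public void requestSeek(int positionMs) {
        int ticket = latestRequest.incrementAndGet();
        seekThread.execute(() -> {
            // Skip requests that were superseded while waiting in the queue.
            if (ticket == latestRequest.get()) {
                player.seekTo(positionMs);
            }
        });
    }

    // Stops accepting requests and waits for queued seeks to finish.
    public boolean shutdownAndWait(long timeoutMs) {
        seekThread.shutdown();
        try {
            return seekThread.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        SeekWorker worker = new SeekWorker(pos -> {
            try {
                Thread.sleep(500); // simulate a seek that blocks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("seek applied: " + pos);
        });
        worker.requestSeek(30_000);
        System.out.println("event thread keeps running while the seek blocks");
        worker.shutdownAndWait(5_000);
    }
}
```

With this structure a hung seek stalls only the worker thread, so the thread that dispatches key events stays free to handle them.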