初次在SF发求助帖,希望各位多多支持,具体问题内容如下:
项目过程中遇到一个很棘手的问题,详细如下:
手机进行自动开关机压力测试(高通8937)400-1000次左右大概会有一次卡住导致不关机(并没有kernel panic),现象就是卡在关机动画,adb还可以连,uart log也正常输出,我的做法就是把kernel 关机流程全部加上printk,然后弄10台机器进行压力测试,等到出现问题后,触发ramdump,再解开ramdump,查看kmsg中加的trace走到哪一步进一步分析卡在哪里。
但是这个情况很特殊!!!!加了kernel log后,浮现概率大大降低,就算进行2000次测试也不一定可以出现问题(进一步怀疑是某一个dev时序问题卡住,但是不知道凶手是谁..),好不容易抓到一次,大致定位到了出问题就是卡在函数就是 device_shutdown函数,可惜log太少没有找出卡在具体哪一个device,现在还在继续压力测试中,我们分析极有可能是卡在pm_runtime_barrier 函数中,但是这个内核函数有点复杂,不太懂这块。集思广益,希望大家不吝给出各种调试分析手段!
void device_shutdown(void)
{
struct device *dev, *parent;
spin_lock(&devices_kset->list_lock);
/*
* Walk the devices list backward, shutting down each in turn.
* Beware that device unplug events may also start pulling
* devices offline, even as the system is shutting down.
*/
while (!list_empty(&devices_kset->list)) {
dev = list_entry(devices_kset->list.prev, struct device,
kobj.entry);
**dev_p = (char *)dev->kobj.name;
step1++;**
/*
* hold reference count of device's parent to
* prevent it from being freed because parent's
* lock is to be held
*/
parent = get_device(dev->parent);
get_device(dev);
/*
* Make sure the device is off the kset list, in the
* event that dev->*->shutdown() doesn't remove it.
*/
list_del_init(&dev->kobj.entry);
spin_unlock(&devices_kset->list_lock);
/* hold lock to avoid race with probe/release */
if (parent)
device_lock(parent);
device_lock(dev);
/* Don't allow any more runtime suspends */
pm_runtime_get_noresume(dev);
pm_runtime_barrier(dev);
**step2++;**
if (dev->bus && dev->bus->shutdown) {
if (initcall_debug)
dev_info(dev, "shutdown\n");
dev->bus->shutdown(dev);
} else if (dev->driver && dev->driver->shutdown) {
if (initcall_debug)
dev_info(dev, "shutdown\n");
dev->driver->shutdown(dev);
}
device_unlock(dev);
if (parent)
device_unlock(parent);
put_device(dev);
put_device(parent);
spin_lock(&devices_kset->list_lock);
}
spin_unlock(&devices_kset->list_lock);
}
上面的 dev_p = (char *)dev->kobj.name; 和step1,step2是我加的计数器,用于记录出问题时候跑了哪一步,是哪个dev卡住,可惜目前还没出结果,只能从代码层级深入挖掘o(╯□╰)o sad
后面经过反复大量的压力测试,浮现了一次,dev_p = (char *)dev->kobj.name;
用adb拉出来了,发现卡住的dev是 :bam_dmux_ch_0.2 , 但是不了解这个东西是干什么用的.
暂时陷入分析僵局,无法进一步分析了..
希望有经验的同学帮忙分析下,多谢!
欢迎选择我的课程,让我们一起见证您的进步~~