-----Original Message-----
From: Smith, Stan
Sent: Thursday, February 02, 2012 10:55 AM
To: Leonid Keller; Hefty, Sean; Tzachi Dar
Cc: Uri Habusha; ofw_list; Irena Gannon
Subject: RE: opensm stuck upon kill

Leo,
What exactly do you mean by 'opensm stuck on kill'? More info on the kill,
please.

Was OpenSM running as a service, and you stopped it via service control?
Was OpenSM running as a console application ('--console local'), and you typed the
'exit' command?
Was OpenSM running, and you just killed the process?

Killed how?

Thanks,

Stan.

-----Original Message-----
Sent: Thursday, February 02, 2012 6:42 AM
To: Leonid Keller; Hefty, Sean; Tzachi Dar; Smith, Stan
Cc: Uri Habusha; ofw_list; Irena Gannon
Subject: opensm stuck upon kill
Hi guys,
opensm got stuck when killed.
I'll try to keep the full dump and will send it to you if you are interested.
The hang happens in IBAL while releasing the PD.
nt!DbgBreakPoint
ibbus!sync_destroy_obj+0xa61
ibbus!destroy_obj+0x8ad
ibbus!async_destroy_obj+0xa4
ibbus!ib_dealloc_pd+0x2b6
winmad!WmRegRemoveHandler+0xae
...
// from ibbus!sync_destroy_obj
1: kd> ?? p_obj
struct _al_obj * 0xa970fbbc
...
+0x080 ref_cnt : 1
...
+0x0a4 type : 3 //it's AV
+0x0a8 state : 3 ( CL_DESTROYING )
...
There are 227 children (AVs), which, as far as I understand, are created and
attached to the PD upon send_mad.
Several applications were running at the time of the hang;
opensm was one of them.
[cda39020 opensm.exe]
83c.0003a8 9af686f0 0000002 RUNNING nt!DbgBreakPoint
ibbus!sync_destroy_obj+0xa61
ibbus!destroy_obj+0x8ad
ibbus!async_destroy_obj+0xa4
ibbus!ib_dealloc_pd+0x2b6
winmad!WmRegRemoveHandler+0xae
winmad!WmRegFree+0xe
winmad!WmProviderCleanup+0x24
winmad!WmFileCleanup+0x3a
Wdf01000!FxFileObjectFileCleanup::Invoke+0x24
Wdf01000!FxPkgGeneral::OnCleanup+0x57
Wdf01000!FxPkgGeneral::Dispatch+=
Wdf01000!FxDevice::Dispatch+0x7f
nt!IovCallDriver+0x23f
nt!IofCallDriver+0x1b
nt!IopCloseFile+0x387
nt!ObpDecrementHandleCount+0x146
nt!ObpCloseHandleTableEntry+0x23=
nt!ExSweepHandleTable+0x5f
nt!ObKillProcess+0x54
nt!PspExitThread+0x5b6
nt!PsExitSpecialApc+0x22
nt!KiDeliverApc+0x1dc
nt!KiServiceExit+0x56
ntdll!KiFastSystemCallRet
ntdll!ZwWaitForWorkViaWorkerFactory+0xc
ntdll!TppWorkerThread+0x1f6
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x23
ntdll!_RtlUserThreadStart+
WmProviderDeregister(pRegistration->pProvider, pRegistration);
pRegistration->pDevice->IbInterface.destroy_qp(pRegistration->hQp, NULL);
pRegistration->pDevice->IbInterface.dealloc_pd(pRegistration->hPd, NULL);
pRegistration->pDevice->IbInterface.close_ca(pRegistration->hCa, NULL);
Could you suggest any ideas?
Thank you.
-----Original Message-----
From: Leonid Keller
Sent: Tuesday, January 31, 2012 1:15 PM
To: 'Hefty, Sean'; Tzachi Dar; Smith, Stan
Cc: Uri Habusha; ofw_list; Irena Gannon
Subject: RE: Opensm & WinMad: a race, cauing BSOD722
Thank you, Sean.
Some comments.
We do not think that this additional validation is necessary.
It's hard to believe - unless you saw that - that Windows can call
close(handle) after open(&handle) has failed.
As to the patch to winverbs - it causes a crash, because WvProviderGet is
called at DISPATCH level.
ATTEMPTED_SWITCH_FROM_DPC (b8)
A wait operation, attach process, or yield was attempted from a DPC routine.
This is an illegal operation and the stack track will lead to the offending
code and original DPC routine.
nt!KiSwapContext+0x7f
nt!KiSwapThread+0x2fa
nt!KeWaitForGate+0x22a
nt!KiAcquireGuardedMutex+0x35
nt!KeAcquireGuardedMutex+0x39
winverbs!WvProviderGet+0x1d
winverbs!WvEpCompleteDisconnect+0x113
winverbs!WvEpIbCmHandler+0x26a
ibbus!cm_cep_handler+0x99
ibbus!__process_cep+0x10f
ibbus!__drep_handler+0x6ea
ibbus!__cep_mad_recv_cb+0x246
ibbus!__mad_svc_recv_done+0xb58
ibbus!mad_disp_recv_done+0x1650
ibbus!process_mad_recv+0x3bf
ibbus!spl_qp_comp+0x3d2
ibbus!spl_qp_recv_dpc_cb+0x112
nt!KiRetireDpcList+0x117
nt!KyRetireDpcList+0x5
nt!KiDispatchInterruptContinue
I've replaced the mutex with a spinlock - see below.
I did it also for WinMad, although it has no asynchronous callbacks like
WinVerbs.
The main reason is to keep it similar to WinVerbs as it is today.
A minor, mostly theoretical one: there are other functions that use the
provider mutex today. It seems worthwhile to me to keep open for
them the possibility of calling the low-level WvProviderGet function.
What's your opinion?
Index: B:/users/leonid/svn/winib/trunk/core/winverbs/kernel/wv_provider.c
==========================================
--- B:/users/leonid/svn/winib/trunk/core/winverbs/kernel/wv_provider.c (revision 9686)
+++ B:/users/leonid/svn/winib/trunk/core/winverbs/kernel/wv_provider.c (revision 9687)
@@ -44,14 +44,15 @@
 LONG WvProviderGet(WV_PROVIDER *pProvider)
 {
     LONG val;
+    KIRQL irql;
-    KeAcquireGuardedMutex(&pProvider->Lock);
+    KeAcquireSpinLock(&pProvider->SpinLock, &irql);
     val = InterlockedIncrement(&pProvider->Ref);
     if (val == 1) {
         pProvider->Ref = 0;
         val = 0;
     }
-    KeReleaseGuardedMutex(&pProvider->Lock);
+    KeReleaseSpinLock(&pProvider->SpinLock, irql);
     return val;
 }
@@ -119,6 +120,7 @@
     KeInitializeEvent(&pProvider->SharedEvent, NotificationEvent, FALSE);
     pProvider->Exclusive = 0;
     KeInitializeEvent(&pProvider->ExclusiveEvent, SynchronizationEvent, FALSE);
+    KeInitializeSpinLock(&pProvider->SpinLock);
     return STATUS_SUCCESS;
 }
Index: B:/users/leonid/svn/winib/trunk/core/winverbs/kernel/wv_provider.h
==========================================
--- B:/users/leonid/svn/winib/trunk/core/winverbs/kernel/wv_provider.h (revision 9686)
+++ B:/users/leonid/svn/winib/trunk/core/winverbs/kernel/wv_provider.h (revision 9687)
@@ -80,6 +80,7 @@
     KEVENT ExclusiveEvent;
     WORK_QUEUE WorkQueue;
+    KSPIN_LOCK SpinLock;
 } WV_PROVIDER;
Index: B:/users/leonid/svn/winib/trunk/core/winmad/kernel/wm_provider.h
==========================================
--- B:/users/leonid/svn/winib/trunk/core/winmad/kernel/wm_provider.h (revision 9687)
+++ B:/users/leonid/svn/winib/trunk/core/winmad/kernel/wm_provider.h (revision 9688)
@@ -57,6 +57,7 @@
     KEVENT SharedEvent;
     LONG Exclusive;
     KEVENT ExclusiveEvent;
+    KSPIN_LOCK SpinLock;
 } WM_PROVIDER;
Index: B:/users/leonid/svn/winib/trunk/core/winmad/kernel/wm_provider.c
==========================================
--- B:/users/leonid/svn/winib/trunk/core/winmad/kernel/wm_provider.c (revision 9687)
+++ B:/users/leonid/svn/winib/trunk/core/winmad/kernel/wm_provider.c (revision 9688)
@@ -36,14 +36,15 @@
 LONG WmProviderGet(WM_PROVIDER *pProvider)
 {
     LONG val;
+    KIRQL irql;
-    KeAcquireGuardedMutex(&pProvider->Lock);
+    KeAcquireSpinLock(&pProvider->SpinLock, &irql);
     val = InterlockedIncrement(&pProvider->Ref);
     if (val == 1) {
         pProvider->Ref = 0;
         val = 0;
     }
-    KeReleaseGuardedMutex(&pProvider->Lock);
+    KeReleaseSpinLock(&pProvider->SpinLock, irql);
     return val;
 }
@@ -72,6 +73,7 @@
     KeInitializeEvent(&pProvider->SharedEvent, NotificationEvent, FALSE);
     pProvider->Exclusive = 0;
     KeInitializeEvent(&pProvider->ExclusiveEvent, SynchronizationEvent, FALSE);
+    KeInitializeSpinLock(&pProvider->SpinLock);
     ASSERT(ControlDevice != NULL);
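
For comparison, here is a minimal user-mode sketch of the check that WvProviderGet and WmProviderGet perform in the patch above (take a reference unless the count is already zero), written as a compare-and-swap loop instead of a lock. It assumes C11 atomics. The names provider_t, provider_get, and provider_put are hypothetical and are not part of the WinVerbs or WinMad code. In a driver the analogous primitive would presumably be InterlockedCompareExchange. Whether the rest of the provider code could do without the lock is not something this sketch addresses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for WV_PROVIDER / WM_PROVIDER: only the reference count. */
typedef struct {
    atomic_long ref;
} provider_t;

/*
 * Take a reference only if the count is still nonzero.
 * Returns the new count, or 0 if the count was already zero
 * (the provider is assumed to be going away in that case).
 */
static long provider_get(provider_t *p)
{
    long old;

    assert(p != NULL);
    old = atomic_load(&p->ref);
    while (old != 0) {
        /* On failure 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&p->ref, &old, old + 1))
            return old + 1;
    }
    return 0;
}

/* Drop a reference; returns the new count. */
static long provider_put(provider_t *p)
{
    assert(p != NULL);
    return atomic_fetch_sub(&p->ref, 1) - 1;
}
```

Unlike the increment-then-reset sequence in the patch, the count here is never bumped from zero, so there is no intermediate state for a second caller to observe.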
-----Original Message-----
Sent: Tuesday, January 31, 2012 12:08 AM
To: Leonid Keller; Tzachi Dar; Smith, Stan
Cc: Uri Habusha; ofw_list; Irena Gannon
Subject: RE: Opensm & WinMad: a race, cauing BSOD722
WmProviderInit() is called without checking the return status. Is there a
reason?
Seems like a similar patch is needed for WvIoDeviceControl().
I can't tell whether IOCTLs suffer from the same problem or not. But since
Windows is stupid, I went ahead and added the same protection
to winverbs, plus some additional validation in case we get a cleanup event
for a file that we failed to create.
- Sean