Monday, October 11, 2010

How to debug VBScript (.vbs)

To debug a .vbs file, pass the "//D //X" options to the script host. Both the Windows-based script host (wscript.exe) and the command-line script host (cscript.exe) support the same options. //D enables debugging and //X runs the script in the debugger, so when both are specified, the Visual Studio debugger is launched and starts debugging the VB script. wscript displays output in windows such as message boxes, while cscript writes output to the console.

Either of the following commands launches the VS debugger for the test.vbs script file.

C:\Temp>wscript //D //X test.vbs
or
C:\Temp>cscript //D //X test.vbs


Once the debugger breaks in, press [F10] to execute the VB script line by line.

Sunday, October 10, 2010

Brute force object search using C++ vtable

In C++, the memory layout of an object depends on the compiler; unfortunately, the standard does not prescribe a strict layout. We can, however, look into the details by using a debugger.

Assuming we use VC++, let's look at some examples. Two classes are defined below. Each class has one data member, and the Dog class inherits from the Animal class. What is the object layout of the Dog class? In windbg, we can display the object layout with the dt command (dt <classname>). As you can see below, the Dog object has 2 string objects in its layout. The data members of the base class come first, followed by the data members of the derived class, each in the order in which the fields are declared in the class.

class Animal
{
public:
       Animal() {}  
protected:
       string name;
};

class Dog : public Animal
{
public:
       Dog() {}
protected:
       string petOwner;
};

0:000:x86> dt Dog
MyTest!Dog
   +0x000 name             : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x020 petOwner         : std::basic_string<char,std::char_traits<char>,std::allocator<char> >

That one is easy. What if the class has virtual functions?
Do they affect the object layout? Let's look at another example.
The example below has 2 virtual functions in the base class, both overridden by the subclass.

#include <iostream>
#include <string>
#include <tchar.h>   // _tmain, _TCHAR
using namespace std;
class Animal
{
public:
       Animal() { name = "Animal"; }    
       virtual void DisplayInfo()
       {
              cout << name << endl;
       }
       virtual void Run() {}
protected:
       string name;
};

class Dog : public Animal
{
public:
       Dog() { name = "Dog"; petOwner = "N/A"; }
       void DisplayInfo()
       {
              cout << name << ":" << petOwner << endl;
       }
       void Run()
       {
              cout << "Run" << endl;
       }
protected:
       string petOwner;
};

int _tmain(int argc, _TCHAR* argv[])
{
       Dog* pDog = new Dog();
       Dog* pDog2 = new Dog();

       Animal* pA = pDog; // <== breakpoint
       pA->DisplayInfo();
       pA = pDog2;
       pA->Run();

       return 0;
}

If we check the Dog object layout, we can see a 4-byte vtable pointer
in the first position.

0:000:x86> dt Dog
MyTest!Dog
   +0x000 __VFN_table : Ptr32 
   +0x004 name             : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x024 petOwner         : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
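
The hidden pointer can also be observed without a debugger by comparing sizeof for a class with and without virtual functions. The sketch below is only an illustration: PlainAnimal and VirtualAnimal are hypothetical classes (not the Animal/Dog classes above), and the exact sizes are compiler- and platform-specific, since the C++ standard does not promise them.

```cpp
#include <cstddef>

// Hypothetical classes for illustration only.
struct PlainAnimal {
    int legs;
};

struct VirtualAnimal {
    virtual ~VirtualAnimal() {}
    virtual void Run() {}
    int legs;
};

// Extra bytes an object pays for having virtual functions: the vtable
// pointer plus any padding the compiler adds. On mainstream compilers
// (VC++, GCC, Clang) this is at least sizeof(void*).
constexpr std::size_t VirtualOverheadBytes() {
    return sizeof(VirtualAnimal) - sizeof(PlainAnimal);
}
```

On a 32-bit VC++ build one would expect the overhead to be 4 bytes, which matches the way name moved from +0x000 to +0x004 in the dt output above.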

To investigate a little more, I set a breakpoint in main().
When the debugger hit the breakpoint, the two Dog objects could be found using the dv command.
This is the easiest way to find the Dog objects in the current process.

0:000:x86> dv /i
prv param             argc = 0n1
prv param             argv = 0x00585170
prv local               pA = 0xcccccccc
prv local            pDog2 = 0x00585288
prv local             pDog = 0x00589f78

But what if the application is very complex and we're in the middle of nowhere
but want to find all Dog objects in memory? Well, one thing we can try is to search the whole
memory for the vtable address. Since the vtable pointer comes first in the object layout, every match
is a clue to an object instance. It is a brute force search, but it can sometimes be useful.

To do that, we first find the address of the Dog vftable by examining the symbols of the Dog class (x command).

0:000:x86> x MyTest!Dog::*
012729b0          MyTest!Dog::Dog (void)
01271a90          MyTest!Dog::DisplayInfo (void)
012720a0          MyTest!Dog::Run (void)
01271d50          MyTest!Dog::~Dog = <no type information>
01279680          MyTest!Dog::'RTTI Base Class Array' = <no type information>
01279670          MyTest!Dog::'RTTI Class Hierarchy Descriptor' = <no type information>
01279658          MyTest!Dog::'RTTI Complete Object Locator' = <no type information>
01278854          MyTest!Dog::'vftable' = <no type information>
01279690          MyTest!Dog::'RTTI Base Class Descriptor at (0,-1,0,64)' = <no type information>

Then search memory (s command) for that vftable address.

0:000:x86> s -d 0 L?0xffffffff 01278854
00585288  01278854 00585308 00676f44 cd006c61  T.'..SX.Dog.al..
00589f78  01278854 005851f8 00676f44 cd006c61  T.'..QX.Dog.al..

The result above shows that 2 Dog objects were found. Now we can examine the Dog objects
by using the dt command. The second command, dt /b, shows the content of the name field in the Dog object.

0:000:x86> dt 00585288 Dog
MyTest!Dog
   +0x000 __VFN_table : 0x01278854
   +0x004 name             : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x024 petOwner         : std::basic_string<char,std::char_traits<char>,std::allocator<char> >

0:000:x86> dt /b 00585288+4 std::basic_string<char,std::char_traits<char>,std::allocator<char> >
MyTest!std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x000 _Myproxy         : 0x00585308
   +0x004 _Bx              : std::_String_val<char,std::allocator<char> >::_Bxty
      +0x000 _Buf             :  "Dog"
       [00] 68 'D'
       [01] 111 'o'
       [02] 103 'g'
       [03] 0 ''
       [04] 97 'a'
       [05] 108 'l'
       [06] 0 ''
       [07] -51 ''
       [08] -51 ''
       [09] -51 ''
       [10] -51 ''
       [11] -51 ''
       [12] -51 ''
       [13] -51 ''
       [14] -51 ''
       [15] -51 ''
      +0x000 _Ptr             : 0x00676f44  "--- memory read error at address 0x00676f44 ---"
      +0x000 _Alias           :  "Dog"
       [00] 68 'D'
       [01] 111 'o'
       [02] 103 'g'
       [03] 0 ''
       [04] 97 'a'
       [05] 108 'l'
       [06] 0 ''
       [07] -51 ''
       [08] -51 ''
       [09] -51 ''
       [10] -51 ''
       [11] -51 ''
       [12] -51 ''
       [13] -51 ''
       [14] -51 ''
       [15] -51 ''
   +0x014 _Mysize          : 3
   +0x018 _Myres           : 0xf
   +0x01c _Alval           : std::allocator<char>
   =012788f4 npos             : 0xffffffff
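
The same brute force idea can be sketched in plain C++, as a rough model of what the s command does. Everything here is an assumption made for illustration: the Animal/Dog/Cat stand-ins, the FirstWord and FindWord helpers, and above all the premise that the vtable pointer is the first pointer-sized word of the object. That holds for classes like these on VC++, GCC and Clang, but it is an ABI detail rather than a language guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical stand-ins for the classes in this post.
struct Animal {
    virtual ~Animal() {}
    virtual void Run() {}
    int id = 0;
};
struct Dog : Animal { void Run() override {} };
struct Cat : Animal { void Run() override {} };

// Reads the first pointer-sized word of an object. On common ABIs this is
// where the vtable pointer of a class like Animal lives.
std::uintptr_t FirstWord(const void* object) {
    std::uintptr_t word = 0;
    std::memcpy(&word, object, sizeof(word));
    return word;
}

// Scans [begin, begin + bytes) in pointer-sized steps and returns every
// address whose word equals `value`.
std::vector<const void*> FindWord(const unsigned char* begin, std::size_t bytes,
                                  std::uintptr_t value) {
    std::vector<const void*> hits;
    for (std::size_t off = 0; off + sizeof(value) <= bytes; off += sizeof(value)) {
        if (FirstWord(begin + off) == value) hits.push_back(begin + off);
    }
    return hits;
}

// Builds two Dogs and a Cat in a zeroed buffer, then finds the Dogs by
// searching for the vtable value. Returns the number of Dogs found, or 0 if
// the hits do not line up with the Dog objects.
std::size_t CountDogsInDemoBuffer() {
    alignas(std::max_align_t) unsigned char buffer[256] = {};
    Dog* d1 = new (buffer) Dog();
    Cat* c  = new (buffer + 64) Cat();
    Dog* d2 = new (buffer + 128) Dog();

    Dog probe;  // any live Dog tells us which vtable value to look for
    std::vector<const void*> hits =
        FindWord(buffer, sizeof(buffer), FirstWord(&probe));

    bool ok = hits.size() == 2 && hits[0] == d1 && hits[1] == d2;
    d2->~Dog();
    c->~Cat();
    d1->~Dog();
    return ok ? hits.size() : 0;
}
```

The Cat in the middle of the buffer is skipped because it carries a different vtable pointer, which is exactly why the search value identifies the class.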

 

Wednesday, September 22, 2010

SeDebugPrivilege and Integrity Level

The Windows integrity mechanism was introduced in Vista/Win 2008. With this feature, the operating system assigns a so-called integrity level to each process or thread. There are 5 integrity levels: Untrusted (0x0000), Low (0x1000), Medium (0x2000), High (0x3000), and System (0x4000). An interesting point is that a process or thread with a lower integrity level cannot access a process or thread with a higher integrity level.

When an administrator runs an application in normal mode, its integrity level is Medium. If the administrator runs the application in elevated mode, its integrity level becomes High. Can the elevated application then access any process that has System integrity level? The answer is no. An attempt to access a System integrity level process or thread fails with an Access Denied error.

So if we cannot access a System integrity level process even when the current user is an administrator, how can we solve this problem? After all, there are various user scenarios that need to access system processes.

There is a way. If SeDebugPrivilege is enabled in the elevated process, the process/thread can access a System integrity level process or thread. The code below shows one way of enabling (or disabling) SeDebugPrivilege in the access token.

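#include <windows.h>

BOOL EnableDebugPrivilege(BOOL bEnable)
{
    HANDLE hToken = NULL;

    // Open the current process token with the right to adjust privileges.
    if (!OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &hToken))
        return FALSE;

    LUID luid;
    if (!LookupPrivilegeValue(NULL, SE_DEBUG_NAME, &luid))
    {
        CloseHandle(hToken);
        return FALSE;
    }

    TOKEN_PRIVILEGES tokenPriv;
    tokenPriv.PrivilegeCount = 1;
    tokenPriv.Privileges[0].Luid = luid;
    tokenPriv.Privileges[0].Attributes = bEnable ? SE_PRIVILEGE_ENABLED : 0;

    // AdjustTokenPrivileges can return TRUE without granting the privilege,
    // so also check that the last error is not ERROR_NOT_ALL_ASSIGNED.
    BOOL bOk = AdjustTokenPrivileges(hToken, FALSE, &tokenPriv, sizeof(TOKEN_PRIVILEGES), NULL, NULL)
               && GetLastError() == ERROR_SUCCESS;

    CloseHandle(hToken);
    return bOk;
}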

Once the code is executed, we can check the SeDebugPrivilege
by running !token in the debugger.

0:011> !token -n
Privs:
00 0x000000005 SeIncreaseQuotaPrivilege Attributes -
01 0x000000007 SeTcbPrivilege Attributes -
02 0x000000008 SeSecurityPrivilege Attributes -
03 0x000000009 SeTakeOwnershipPrivilege Attributes -
04 0x00000000a SeLoadDriverPrivilege Attributes -
05 0x00000000b SeSystemProfilePrivilege Attributes -
06 0x00000000c SeSystemtimePrivilege Attributes -
07 0x00000000d SeProfileSingleProcessPrivilege Attributes -
08 0x00000000e SeIncreaseBasePriorityPrivilege Attributes -
09 0x00000000f SeCreatePagefilePrivilege Attributes -
10 0x000000011 SeBackupPrivilege Attributes -
11 0x000000012 SeRestorePrivilege Attributes -
12 0x000000013 SeShutdownPrivilege Attributes -
13 0x000000014 SeDebugPrivilege Attributes - Enabled
14 0x000000016 SeSystemEnvironmentPrivilege Attributes -
15 0x000000017 SeChangeNotifyPrivilege Attributes - Enabled Default
16 0x000000018 SeRemoteShutdownPrivilege Attributes -
17 0x000000019 SeUndockPrivilege Attributes -
18 0x00000001c SeManageVolumePrivilege Attributes -
19 0x00000001d SeImpersonatePrivilege Attributes - Enabled Default
20 0x00000001e SeCreateGlobalPrivilege Attributes - Enabled Default
21 0x000000021 SeIncreaseWorkingSetPrivilege Attributes -
22 0x000000022 SeTimeZonePrivilege Attributes -
23 0x000000023 SeCreateSymbolicLinkPrivilege Attributes -

Auth ID: 0:19c236
Impersonation Level: Impersonation
TokenType: Impersonation
Is restricted token: no.

One more thing. If a process enables the debug privilege and calls a function in another process through impersonation, the debug privilege can propagate to the called process. For instance, if a process with the debug privilege calls a method in a WMI process, the WMI thread serving the call will have the same privilege.

Wednesday, September 1, 2010

STA Reentrancy

I recently observed a stack overflow caused by STA reentrancy. It occurred when a slew of COM clients called an STA object concurrently and the STA object in question made an outgoing cross-apartment (or cross-process) call. When a call is made to an STA object, it is sent to a hidden window owned by the STA and translated into a window message. When an STA thread makes an out-of-apartment call, COM spins a modal message loop while waiting for the call to return. Calls that have arrived in the message queue can be dispatched from that loop, causing reentrancy. Because calls kept arriving at the STA object, each nested dispatch added more frames to the STA thread's stack until it hit the stack size limit. Hence the stack overflow.

Looking at the debugger, I observed that many threads showed the same pattern as shown below.

46 Id: 2444.1a18 Suspend: 1 Teb: 000007ff`fff5e000 Unfrozen
Child-SP RetAddr Call Site
00000000`03c3dce8 00000000`76f3c0b0 ntdll!ZwWaitForSingleObject+0xa
00000000`03c3dcf0 000007fe`fdca86b2 kernel32!WaitForSingleObjectEx+0x9c
00000000`03c3ddb0 000007fe`fddc9d80 ole32!GetToSTA+0x8a
00000000`03c3de00 000007fe`fddc9375 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x100
00000000`03c3de50 000007fe`fdc9f436 ole32!CRpcChannelBuffer::SendReceive2+0xf1
00000000`03c3e010 000007fe`fdc9f398 ole32!CAptRpcChnl::SendReceive+0x52
00000000`03c3e0d0 000007fe`fd7d603a ole32!CCtxComChnl::SendReceive+0x6c
00000000`03c3e180 000007fe`fd7cef90 RPCRT4!NdrProxySendReceive+0x4a
00000000`03c3e1b0 000007fe`fd7d6157 RPCRT4!NdrpClientCall3+0x246
00000000`03c3e400 000007fe`fd728772 RPCRT4!ObjectStublessClient+0xa7
00000000`03c3e770 000007fe`f99a80e8 RPCRT4!ObjectStubless+0x42
00000000`03c3e7c0 00000000`ffb329cc FastProx!CWbemSvcWrapper::XWbemServices::ExecQueryAsync+0xd4
00000000`03c3e830 00000000`ffb265ba wmiprvse!CServerObject_StaThread::ExecQueryAsync+0xd4
00000000`03c3e8a0 00000000`ffb268a8 wmiprvse!CInterceptor_IWbemSyncProvider::Helper_ExecQueryAsync+0x54a
00000000`03c3e950 000007fe`fd735ec5 wmiprvse!CInterceptor_IWbemSyncProvider::ExecQueryAsync+0x138
00000000`03c3e9f0 000007fe`fd711f46 RPCRT4!Invoke+0x65
00000000`03c3ea60 000007fe`fd7d5cae RPCRT4!NdrStubCall2+0x348
00000000`03c3f040 000007fe`f998412d RPCRT4!CStdStubBuffer_Invoke+0x66
00000000`03c3f070 000007fe`fddc89b9 FastProx!CBaseStublet::Invoke+0x19
00000000`03c3f0a0 000007fe`fddc892b ole32!SyncStubInvoke+0x5d
00000000`03c3f110 000007fe`fdc9d633 ole32!StubInvoke+0xdf
00000000`03c3f1b0 000007fe`fddc87c6 ole32!CCtxComChnl::ContextInvoke+0x19f
00000000`03c3f330 000007fe`fddc855f ole32!AppInvoke+0xc2
00000000`03c3f3a0 000007fe`fddc7314 ole32!ComInvokeWithLockAndIPID+0x407
00000000`03c3f520 000007fe`fd7368d4 ole32!ThreadInvoke+0x1f0
00000000`03c3f5d0 000007fe`fd7369f0 RPCRT4!DispatchToStubInCNoAvrf+0x14
00000000`03c3f600 000007fe`fd70b042 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x100
00000000`03c3f6f0 000007fe`fd70afbb RPCRT4!RPC_INTERFACE::DispatchToStub+0x62
00000000`03c3f730 000007fe`fd70af4a RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0x5b
00000000`03c3f7b0 000007fe`fd737080 RPCRT4!LRPC_SCALL::DispatchRequest+0x436
00000000`03c3f820 000007fe`fd7362bb RPCRT4!LRPC_SCALL::HandleRequest+0x200
00000000`03c3f940 000007fe`fd735e1a RPCRT4!LRPC_ADDRESS::ProcessIO+0x44a
00000000`03c3fa60 000007fe`fd717769 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0x24a
00000000`03c3fb10 000007fe`fd717714 RPCRT4!ProcessIOEventsWrapper+0x9
00000000`03c3fb40 000007fe`fd7177a4 RPCRT4!BaseCachedThreadRoutine+0x94
00000000`03c3fb80 00000000`76f2be3d RPCRT4!ThreadStartRoutine+0x24
00000000`03c3fbb0 00000000`77136a51 kernel32!BaseThreadInitThunk+0xd
00000000`03c3fbe0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Checking GetToSTA, I found that the COM calls were being made to the STA thread where the WMI component resided. I learned that the client had created many async threads and sent OLE requests simultaneously. All those COM calls entered the STA thread's message loop.

Looking at the STA thread, the sequence ole32!ModalLoop -> PeekRPCAndDDEMessage -> DispatchMessageWorker -> ThreadWndProc -> ThreadDispatch -> ComInvoke (in call order) repeats in the call stack below. That is, an incoming call is processed by the STA thread -> the STA thread calls another component and waits for the reply -> the STA thread peeks at the message queue to see if there is any other incoming message. If there is one, the STA thread processes that message. This is STA reentrancy, and it is a COM feature.

…..
00000000`013da270 000007fe`fdca80c9 ole32!ComInvoke+0x85
00000000`013da2a0 000007fe`fdca7eae ole32!ThreadDispatch+0x29
00000000`013da2d0 00000000`7705d53e ole32!ThreadWndProc+0xaa
00000000`013da350 00000000`7705d7c6 USER32!UserCallWinProcCheckWow+0x1ad
00000000`013da410 000007fe`fdd31433 USER32!DispatchMessageWorker+0x389
00000000`013da490 000007fe`fdcb11e0 ole32!CCliModalLoop::PeekRPCAndDDEMessage+0x73
00000000`013da500 000007fe`fdc7b093 ole32!CCliModalLoop::BlockFn+0x36100
00000000`013da540 000007fe`fddc7689 ole32!ModalLoop+0x6f
…..
00000000`013dc360 000007fe`f4d50263 framedyn!Provider::CreateInstanceEnum+0x34
….
00000000`013dd5d0 000007fe`fdca80c9 ole32!ComInvoke+0x85
00000000`013dd600 000007fe`fdca7eae ole32!ThreadDispatch+0x29
00000000`013dd630 00000000`7705d53e ole32!ThreadWndProc+0xaa
00000000`013dd6b0 00000000`7705d7c6 USER32!UserCallWinProcCheckWow+0x1ad
00000000`013dd770 000007fe`fdd31433 USER32!DispatchMessageWorker+0x389
00000000`013dd7f0 000007fe`fdcb11e0 ole32!CCliModalLoop::PeekRPCAndDDEMessage+0x73
00000000`013dd860 000007fe`fdc7b093 ole32!CCliModalLoop::BlockFn+0x36100
00000000`013dd8a0 000007fe`fddb4cb0 ole32!ModalLoop+0x6f
00000000`013dd8f0 000007fe`fddcb946 ole32!SwitchSTA+0x20
00000000`013dd920 000007fe`fddc9375 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x24a6
00000000`013dd970 000007fe`fdc7b3be ole32!CRpcChannelBuffer::SendReceive2+0xf1
….
00000000`013de310 000007fe`f4d50d1a FastProx!CWbemSvcWrapper::XWbemServices::GetObjectW+0x95
00000000`013de370 000007fe`f4d527d5 framedyn!Provider::GetClassObjectInterface+0xda
00000000`013de420 000007fe`fd735ec5 framedyn!CWbemProviderGlue::ExecQueryAsync+0x2ad
….
00000000`013df620 000007fe`fdca80c9 ole32!ComInvoke+0x85
00000000`013df650 000007fe`fdca7eae ole32!ThreadDispatch+0x29
00000000`013df680 00000000`7705d53e ole32!ThreadWndProc+0xaa
00000000`013df700 00000000`7705d7c6 USER32!UserCallWinProcCheckWow+0x1ad
00000000`013df7c0 00000000`ffb133f3 USER32!DispatchMessageWorker+0x389
00000000`013df840 00000000`ffb12eb8 wmiprvse!WmiThread<unsigned long>::ThreadWait+0x11b
00000000`013dfac0 00000000`ffb11aa8 wmiprvse!WmiThread<unsigned long>::ThreadDispatch+0xf4
00000000`013dfb20 00000000`76f2be3d wmiprvse!WmiThread<unsigned long>::ThreadProc+0x30
00000000`013dfb50 00000000`77136a51 kernel32!BaseThreadInitThunk+0xd
00000000`013dfb80 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
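
To make the stack growth concrete, here is a toy, single-threaded model of the pattern. It is a sketch and not how ole32 is implemented: StaThreadModel and its members are made-up names, and the model assumes the worst case, where the reply to an outgoing call arrives only after every pending incoming call has been dispatched.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Toy model of an STA thread; all names here are made up for illustration.
class StaThreadModel {
public:
    // Queue an incoming call (in real COM: a message for the hidden window).
    void PostIncomingCall(int callId) { queue_.push_back(callId); }

    // Top-level message loop: dispatch until the queue is empty.
    void PumpAll() {
        while (!queue_.empty()) DispatchNext();
    }

    std::size_t MaxNestingDepth() const { return maxDepth_; }

private:
    void DispatchNext() {
        queue_.pop_front();
        ++depth_;
        maxDepth_ = std::max(maxDepth_, depth_);
        OutgoingCallWithModalLoop();  // the object calls out of its apartment
        --depth_;
    }

    // While "waiting" for the reply, pending calls are dispatched on the same
    // stack, on top of the call that is still in progress.
    void OutgoingCallWithModalLoop() {
        while (!queue_.empty()) DispatchNext();
    }

    std::deque<int> queue_;
    std::size_t depth_ = 0;
    std::size_t maxDepth_ = 0;
};

// Returns how deeply calls nest when `pendingCalls` calls are already queued.
std::size_t MaxNestingDepthFor(int pendingCalls) {
    StaThreadModel sta;
    for (int i = 0; i < pendingCalls; ++i) sta.PostIncomingCall(i);
    sta.PumpAll();
    return sta.MaxNestingDepth();
}
```

In this model the nesting depth equals the number of queued calls, so the stack use grows linearly with the number of concurrent clients. That is the shape of the repeating frames in the stack above.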

Tuesday, July 6, 2010

COM: GetToSTA

When a COM call is made to an STA (single-threaded apartment) COM object, ole32!GetToSTA is called to send the call to the target STA thread. The example below shows a thread that was created for an RPC server call (SCALL) and whose call was dispatched to a WMI provider, which is an STA COM component. Once the COM call has been handed to the STA thread, the calling thread (tid=4) waits for an event that the STA thread will set when the COM call is finished.
0:004> kL
ChildEBP RetAddr 
00c9e5b0 77175e6c ntdll!KiFastSystemCallRet
00c9e5b4 7532179c ntdll!ZwWaitForSingleObject+0xc
00c9e620 75a0f003 KERNELBASE!WaitForSingleObjectEx+0x98
00c9e638 75a0efb2 kernel32!WaitForSingleObjectExImplementation+0x75
00c9e64c 75ca88df kernel32!WaitForSingleObject+0x12
00c9e670 75dca819 ole32!GetToSTA+0xad
00c9e6a0 75dcc05f ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x140
00c9e780 75cbd0e5 ole32!CRpcChannelBuffer::SendReceive2+0xef
00c9e7fc 75cbcb09 ole32!CAptRpcChnl::SendReceive+0xaf
00c9e850 75dcbf75 ole32!CCtxComChnl::SendReceive+0x1c5
00c9e86c 76c5178b ole32!NdrExtpProxySendReceive+0x49
00c9e878 76cc5744 RPCRT4!NdrpProxySendReceive+0xe
00c9ec90 75dcba02 RPCRT4!NdrClientCall2+0x1a6
00c9ecb0 75cbc8b5 ole32!ObjectStublessClient+0xa2
00c9ecc0 6b23f6a3 ole32!ObjectStubless+0xf
00c9ecf8 0027d9d9 FastProx!CWbemSvcWrapper::XWbemServices::CreateInstanceEnumAsync+0x6e
00c9ed28 00263778 wmiprvse!CServerObject_StaThread::CreateInstanceEnumAsync+0x92
00c9ed68 002635dc wmiprvse!CInterceptor_IWbemSyncProvider::Helper_CreateInstanceEnumAsync+0x159
00c9edac 76c5fc8f wmiprvse!CInterceptor_IWbemSyncProvider::CreateInstanceEnumAsync+0xf4
00c9edd4 76cc4c53 RPCRT4!Invoke+0x2a
00c9f1dc 75dcd936 RPCRT4!NdrStubCall2+0x2d6
00c9f224 6b234f55 ole32!CStdStubBuffer_Invoke+0xb6
00c9f238 75dcd9c6 FastProx!CBaseStublet::Invoke+0x29
00c9f280 75dcdf1f ole32!SyncStubInvoke+0x3c
00c9f2cc 75ce213c ole32!StubInvoke+0xb9
00c9f3a8 75ce2031 ole32!CCtxComChnl::ContextInvoke+0xfa
00c9f3c4 75dca754 ole32!MTAInvoke+0x1a
00c9f3f4 75dcdcbb ole32!AppInvoke+0xab
00c9f4d4 75dca773 ole32!ComInvokeWithLockAndIPID+0x372
00c9f520 76c5f34a ole32!ThreadInvoke+0x302
00c9f55c 76c5f4da RPCRT4!DispatchToStubInCNoAvrf+0x4a
00c9f5b4 76c5f3c6 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x16c
00c9f5dc 76c60cef RPCRT4!RPC_INTERFACE::DispatchToStub+0x8b
00c9f614 76c5f882 RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0xb2
00c9f660 76c5f7a4 RPCRT4!LRPC_SCALL::DispatchRequest+0x23b
00c9f680 76c5f763 RPCRT4!LRPC_SCALL::QueueOrDispatchCall+0xbd
00c9f69c 76c5f5ff RPCRT4!LRPC_SCALL::HandleRequest+0x34f
00c9f6d0 76c5f573 RPCRT4!LRPC_SASSOCIATION::HandleRequest+0x144
00c9f708 76c5ee4f RPCRT4!LRPC_ADDRESS::HandleRequest+0xbd
00c9f780 76c5ece7 RPCRT4!LRPC_ADDRESS::ProcessIO+0x50a
00c9f78c 76c61357 RPCRT4!LrpcServerIoHandler+0x16
00c9f79c 7715d3a3 RPCRT4!LrpcIoComplete+0x16
00c9f7c4 77160748 ntdll!TppAlpcpExecuteCallback+0x1c5
00c9f92c 75a11194 ntdll!TppWorkerThread+0x5a4
00c9f938 7718b3f5 kernel32!BaseThreadInitThunk+0xe
00c9f978 7718b3c8 ntdll!__RtlUserThreadStart+0x70
00c9f990 00000000 ntdll!_RtlUserThreadStart+0x1b
0:004> kpL
ChildEBP RetAddr 
00c9e5b0 77175e6c ntdll!KiFastSystemCallRet(void)
00c9e5b4 7532179c ntdll!ZwWaitForSingleObject(void)+0xc
.......
00c9e670 75dca819 ole32!GetToSTA(class OXIDEntry * pOXIDEntry = 0x00333918, class CMessageCall * pCall = 0x0036af18)+0xad
.......
As you can see above, GetToSTA takes 2 parameters: an OXIDEntry object pointer and a CMessageCall object pointer. The first parameter provides some useful information.
0:004> dt OXIDEntry 0x00333918
ole32!OXIDEntry
   +0x000 _pNext           : 0x75dd68f8 OXIDEntry
   +0x004 _pPrev           : 0x00333898 OXIDEntry
   +0x008 _dwPid           : 0x204
   +0x00c _dwTid           : 0x16bc
   +0x010 _moxid           : _GUID {9feeca0e-6667-eb70-3925-cbaa316f4a29}
   +0x020 _mid             : 0x294a6f31`aacb2539
   +0x028 _ipidRundown     : _GUID {0000380d-0204-16bc-3099-d83e8ae297f6}
   +0x038 _dwFlags         : 0x303
   +0x03c _hServerSTA      : 0x0c4400f0 HWND__
   +0x040 _pParentApt      : 0x003498e0 CComApartment
   +0x044 _pRpc            : (null)
   +0x048 _pAuthId         : (null)
   +0x04c _pBinding        : (null)
   +0x050 _dwAuthnHint     : 1
   +0x054 _dwAuthnSvc      : 0xffffffff
   +0x058 _pMIDEntry       : 0x003336f0 MIDEntry
   +0x05c _pRUSTA          : 0x0035255c IRemUnknown
   +0x060 _cRefs           : 0n13
   +0x064 _hComplete       : (null)
   +0x068 _cCalls          : 0n1
   +0x06c _cResolverRef    : 0n0
   +0x070 _dwExpiredTime   : 0
   ......
_dwPid and _dwTid indicate the process ID and thread ID to which the COM call is made. In this case, thread 4 is making a COM call to thread 7 (PID 0x204, TID 0x16bc), where the STA COM object resides.
0:007> ~
.......

.  7  Id: 204.16bc Suspend: 1 Teb: 7ffd8000 Unfrozen
.......
When a COM call is made to an STA, it is translated into a window message and put into the message queue of the STA's hidden window. The STA thread processes the message from the queue and sets the event when the COM method is done. This releases the wait function in the caller thread (id=4 in the case above).

Sunday, June 13, 2010

64bit Calling Convention

64-bit calling convention: what it means for debugging.

While 32-bit (x86) has multiple calling conventions such as cdecl, stdcall, fastcall, and thiscall, 64-bit (x64) has only a single calling convention, which has some unique characteristics. Some important ones are:
  • The 64-bit calling convention passes the first 4 (integer or pointer) parameters in 4 registers (RCX, RDX, R8, R9) and any additional parameters on the stack (similar to the fastcall calling convention). Even if there are fewer than 4 parameters, stack space for 4 parameters is always reserved (this area is called the home space or home area). (Note: fastcall calling conventions pass one or more parameters in registers to make function calls faster. The x86 fastcall calling convention passes the first 2 parameters in the ECX and EDX registers.)
  • The stack is kept 16-byte aligned to aid performance. This means that if there are 5 parameters, 48 bytes are reserved for parameters (5 params x 8 bytes + 8 bytes for alignment).
  • The stack pointer (rsp) typically does not change within a given function. The stack size a function needs is pre-calculated, so the stack pointer does not change once the prolog is done.
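
The register and home space rules above can be turned into a little arithmetic. The two helpers below are hypothetical and only follow the description in this post: they cover integer and pointer arguments, and the stack offsets are the ones the caller uses just before the call instruction.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Where the argument with the given 0-based index is passed, as seen by the
// caller right before the call instruction (integer/pointer arguments only).
std::string ArgumentLocation(std::size_t index) {
    static const char* const kRegisters[] = {"RCX", "RDX", "R8", "R9"};
    if (index < 4) return kRegisters[index];

    // Stack arguments start right above the 32-byte (20h) home space.
    char buf[32];
    std::snprintf(buf, sizeof(buf), "[rsp+%zXh]", 0x20 + 8 * (index - 4));
    return buf;
}

// Bytes reserved for parameters: at least 4 slots of 8 bytes each,
// rounded up to a multiple of 16.
std::size_t ParameterAreaBytes(std::size_t parameterCount) {
    std::size_t slots = parameterCount < 4 ? 4 : parameterCount;
    std::size_t bytes = slots * 8;
    return (bytes + 15) / 16 * 16;
}
```

For example, ArgumentLocation(4) gives "[rsp+20h]" for the 5th parameter, and ParameterAreaBytes(5) gives 48, the same figure as in the alignment bullet.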
Understanding the 64-bit calling convention is important for debugging because, depending on whether you have an optimized or a non-optimized build, the parameters shown in the call stack can be useless or even misleading. In a non-optimized build (e.g. when compiled with the /Od option in C++), the called function copies, in its prolog code, all 4 parameters held in registers (RCX, RDX, R8, R9) to the stack home area. Parameter inspection through the dv and kP debugger commands therefore displays correct parameter values. An optimized build, however, does not save those register parameters to the stack and, even worse, reuses the home area for other purposes. This behavior can easily mislead developers into reading wrong parameter values. So you shouldn't trust call stack parameter values (kP) or display variables (dv) results when debugging a 64-bit optimized build.

Let's look at a small sample.
#include <stdio.h>
#include <tchar.h>   // _tmain, _TCHAR

int Calc(int a, int b, int c, int d, int e)
{                              // <= breakpoint 1
    int result = 0;            // <= breakpoint 2
    for(int i=0; i<10; i++)
    {
       result += a*i + b - c + d * 2 + e;
       printf("%d : %d\n", i, result);
    }
    result += a - b + c -d + e;
    return result;
}

int _tmain(int argc, _TCHAR* argv[])
{
    int s1,s2,s3,s4,s5;
    scanf("%d %d %d %d %d", &s1, &s2, &s3, &s4, &s5);

    int result = Calc(s1,s2,s3,s4,s5); // <= breakpoint 0
    printf("Result = %d", result);

    return 0;
}
I set 3 breakpoints as marked above.
0:000> bl
 0 e 00000001`3f5e10ed     0001 (0001)  0:**** Simple!wmain+0x3d
 1 e 00000001`3f5e1000     0001 (0001)  0:**** Simple!Calc
 2 e 00000001`3f5e1016     0001 (0001)  0:**** Simple!Calc+0x16
Right before the function call at breakpoint 0, we can inspect the assembly code to see how the parameters are passed. It passes the first 4 parameters (I entered 1,2,3,4,5 for scanf()) in the ECX, EDX, R8D, and R9D registers. (Since the parameters are int32, ECX is used instead of RCX, and likewise for the other registers.) The 5th parameter is passed on the stack (rsp+20h).
0:000> u .
Simple!wmain+0x3d [c:\temp\simple\simple.cpp @ 18]:
00000001`3fd910ed 8b442434        mov     eax,dword ptr [rsp+34h]
00000001`3fd910f1 89442420        mov     dword ptr [rsp+20h],eax  //5th param: 5
00000001`3fd910f5 448b4c2440      mov     r9d,dword ptr [rsp+40h]  // 4
00000001`3fd910fa 448b442430      mov     r8d,dword ptr [rsp+30h]  // 3
00000001`3fd910ff 8b542438        mov     edx,dword ptr [rsp+38h]  // 2
00000001`3fd91103 8b4c243c        mov     ecx,dword ptr [rsp+3Ch]  //1st param: 1
00000001`3fd91107 e8f4feffff      call    Simple!Calc (00000001`3fd91000)
Now let's continue to breakpoint 1 at the beginning of the Calc() function. This is the point where we can check the prolog assembly code of the function. For a non-optimized build, you can see here that the parameter registers are copied to the stack home area.
0:000> uf .
Simple!Calc [c:\temp\simple\simple.cpp @ 4]:
    4 00000001`3f5d1000 44894c2420      mov     dword ptr [rsp+20h],r9d
    4 00000001`3f5d1005 4489442418      mov     dword ptr [rsp+18h],r8d
    4 00000001`3f5d100a 89542410        mov     dword ptr [rsp+10h],edx
    4 00000001`3f5d100e 894c2408        mov     dword ptr [rsp+8],ecx
Once the function prolog has executed, that is, when we reach breakpoint 2, the stack holds all 5 correct parameters, and thus the kP call stack command and the dv command display correct parameter values. Below we can see the 5 parameters at stack addresses 00000000`0026feb0 ~ 00000000`0026fed0. Stack slot 00000000`0026fed8 holds a garbage value; it is there only for 16-byte alignment.
0:000> p
Breakpoint 2 hit
Simple!Calc+0x16:
00000001`3f5d1016 c744242000000000 mov     dword ptr [rsp+20h],0
0:000> dq /c 1 @rsp
00000000`0026fe70  00000000`00000000
00000000`0026fe78  00000000`5fca10b1
00000000`0026fe80  00000000`00000001
00000000`0026fe88  00000000`00000000
00000000`0026fe90  00000000`00000000
00000000`0026fe98  00000001`3f5d11ac
00000000`0026fea0  00000001`3f5d2150
00000000`0026fea8  00000001`3f5d110c //return address
00000000`0026feb0  00000001`00000001 //param 1
00000000`0026feb8  00000000`00000002
00000000`0026fec0  00000000`00000003
00000000`0026fec8  00000000`00000004
00000000`0026fed0  00000000`00000005 //param 5
00000000`0026fed8  00000000`0026fee4 //for alignment
And here is what I got when running the kP and dv commands.
0:000> kP
Child-SP          RetAddr           Call Site
00000000`0026fe70 00000001`3f5d110c Simple!Calc(
   int a = 0n1,
   int b = 0n2,
   int c = 0n3,
   int d = 0n4,
   int e = 0n5)+0x16 [c:\temp\simple\simple.cpp @ 5]
0:000> dv /i /V
prv param  00000000`0026feb0 @rsp+0x0040                     a = 0n1
prv param  00000000`0026feb8 @rsp+0x0048                     b = 0n2
prv param  00000000`0026fec0 @rsp+0x0050                     c = 0n3
prv param  00000000`0026fec8 @rsp+0x0058                     d = 0n4
prv param  00000000`0026fed0 @rsp+0x0060                     e = 0n5
prv local  00000000`0026fe90 @rsp+0x0020                result = 0n0
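
The slot addresses in the dumps above follow a simple rule, which helps when checking the home area by hand: parameter n sits 8*n bytes above the slot that holds the return address. A minimal sketch, with ParameterSlotAddress as a hypothetical helper name:

```cpp
#include <cstdint>

// Address of the stack slot for the n-th parameter (1-based), given the
// address of the slot that holds the return address of the current frame.
std::uint64_t ParameterSlotAddress(std::uint64_t returnAddressSlot, unsigned n) {
    return returnAddressSlot + 8ull * n;
}
```

With the return address slot at 0x26fea8, this gives 0x26feb0 for parameter 1 and 0x26fed0 for parameter 5, as shown by dq and dv /V.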
Now what if we have an optimized build? I recompiled the source code with Maximize Speed optimization (/O2). For the optimized build, the prolog of the Calc() function starts like this.
0:000> uf Simple!Calc
Simple!Calc [c:\temp\simple\simple.cpp @ 4]:
    4 00000001`3ff51000 48895c2408      mov     qword ptr [rsp+8],rbx
    4 00000001`3ff51005 48896c2410      mov     qword ptr [rsp+10h],rbp
    4 00000001`3ff5100a 4889742418      mov     qword ptr [rsp+18h],rsi
    4 00000001`3ff5100f 57              push    rdi
    4 00000001`3ff51010 4154            push    r12
    4 00000001`3ff51012 4155            push    r13
    4 00000001`3ff51014 4156            push    r14
    4 00000001`3ff51016 4157            push    r15
    4 00000001`3ff51018 4883ec20        sub     rsp,20h
As you can see here, there is no mov instruction that copies the parameters. By the time I reached breakpoint 2, where the prolog had fully executed, the first 4 parameter values had not been copied to the stack at all, and only the registers held them.
0:000> p
Breakpoint 2 hit
Simple!Calc+0x1c:
00000001`3ff5101c 448b6c2470      mov     r13d,dword ptr [rsp+70h] ss:00000000`0022f8f0=00000005
0:000> kP L1
Child-SP          RetAddr           Call Site
00000000`0022f880 00000001`3ff510e1 Simple!Calc(
   int a = 0n1,
   int b = 0n0,
   int c = 0n0,
   int d = 0n2291968,
   int e = 0n5)+0x1c [c:\temp\simple\simple.cpp @ 5]
0:000> dv /i
prv param                a = 0n1
prv param                b = 0n0
prv param                c = 0n0
prv param                d = 0n2291968
prv param                e = 0n5
0:000> r rcx
rcx=0000000000000001
0:000> r rdx
rdx=0000000000000002
0:000> r r8
r8=0000000000000003
0:000> r r9
r9=0000000000000004
As you might have noticed already, this behavior of optimized builds can cause a lot of headaches in 64-bit debugging. It means that the call stack parameter information in a 64-bit optimized build cannot be trusted. It is even more painful when we need to analyze a regular dump file or a Watson dump file, which carries less debugging information.

So how can we find the correct parameter values? We know from the inspection above that only the registers hold those 4 parameter values. Starting from this point, we have to trace which parameter values were passed in from the previous call frame. When a caller calls a function, it puts the first 4 parameters in registers. Since this is visible in the assembly code, we can unassemble the caller and track down the parameter values. But what if the caller doesn't pass a constant value as a parameter? Well, then the investigation gets much more tedious, since we have to dig into the history of the registers or the stack area. In unfortunate cases, we might need to inspect many call stack frames and their assembly code to figure out how the parameters were passed all the way up to the current stack frame.

Thursday, June 10, 2010

How To Dump

How to dump user process [101]

There are many ways to dump a user process. Here I introduce some commonly used methods.

A. Using CDB

CDB is a console-based, general-purpose debugging tool, and it is also a good tool for dumping a process. When dumping a process, we normally want to be "non-invasive", which means we don't want to disturb the process and just take a snapshot of it. This can be done by specifying the -pv option. If the process name is unique, you can use the -pn option with the exe file name. But if several processes have the same name, we typically look up the PID of the process of interest and use the -p option. The -c option below is the debugger command that CDB is going to run. The .dump command writes the process dump to the specified file.
C> cdb -pv –pn myApp.exe -c ".dump /ma /u c:\tmp\myApp.dmp;q"   
  C> cdb -pv –p 500 -c ".dump /ma c:\tmp\myApp.dmp;q"   
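
If you take dumps often, you may want to generate that command line from code. The following is only a sketch: build_cdb_dump_command is a hypothetical helper, it uses nothing beyond the cdb options shown above, and it assumes the dump path contains no spaces or quotes.

```cpp
#include <cassert>
#include <string>

// Builds the command line for a non-invasive (-pv) full dump (.dump /ma)
// of the process with the given PID, followed by q to quit the debugger.
// Hypothetical helper; assumes dump_path needs no extra quoting.
std::string build_cdb_dump_command(unsigned long pid, const std::string& dump_path)
{
    return "cdb -pv -p " + std::to_string(pid) +
           " -c \".dump /ma " + dump_path + ";q\"";
}
```

Calling build_cdb_dump_command(500, "c:\\tmp\\myApp.dmp") produces the second command line shown above (without the C> prompt).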

B. Using ADPLUS

ADPLUS is the tool that Microsoft CSS often uses to take a dump. There are 2 dump modes in this tool - one for hang and the other for crash dump.

HANG : to capture a hang dump, run ADPLUS with the -hang option after the hang has occurred. It takes a dump and leaves the process intact (a non-invasive dump). You need to specify -p with the PID and -o with the output folder.

C:\Debuggers> adplus -hang -p 433 -o c:\Test (PID=433)

Logs and memory dumps will be placed in c:\Test\20100127_111336_Hang_Mode

CRASH : the other ADPLUS mode is crash mode, which takes a dump when the process crashes. Since we never know when the crash will occur, the ADPLUS command must of course be started before the crash. If you're using a remote desktop connection (mstsc.exe), you should use /console. Crash mode is very handy since adplus waits until the crash occurs.

C:\Debuggers> adplus -crash -pn App.exe -o c:\test

Logs and memory dumps will be placed in c:\test\20100127_111828_Crash_Mode

Note: adplus was originally written in VBScript, but recent versions ship as an exe. By the way, adplus internally uses CDB to capture the dump.

C. Using Task Manager

Since Windows Vista, Task Manager has a new context menu item called "Create Dump File". To create a dump for a specific process, select the process, right-click it and choose "Create Dump File". Here is an example from Windows 7 Task Manager.


Create Dump File From Task Manager
After dumping is done, it shows the dump file location in a message box.

Friday, April 2, 2010

Thread Stack

Each thread has two stacks – one stack for kernel mode and the other for user mode. Where can we find those stacks? Well, let’s quickly take a look.

First, run Calc.exe and attach a debugger (my favorite is Windbg) to the Calc process. Once the debugger is attached, switch to thread 0 and run !teb to display the Thread Environment Block (TEB). By looking at the TEB, we can figure out the user mode stack area.
0:004> ~0s
eax=0012ed84 ebx=00000000 ecx=0012ed84 edx=779764f4 esi=0012ed84 edi=77399442
eip=779764f4 esp=0012ec80 ebp=0012ec9c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
779764f4 c3              ret
0:000> !teb
TEB at 7ffdf000
ExceptionList:        0012fa98
    StackBase:            00130000
    StackLimit:           0012a000
SubSystemTib:         00000000
FiberData:            00001e00
ArbitraryUserPointer: 00000000
Self:                 7ffdf000
EnvironmentPointer:   00000000
……
In the TEB above, we see the user mode stack base and the stack limit. That is, the stack currently spans 0x0012a000 to 0x00130000; because the stack grows from high to low memory, StackBase (0x00130000) is where it starts and StackLimit (0x0012a000) is its low end.
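
To make the arithmetic concrete, here is a minimal sketch that computes the size of that range and checks whether an address such as ESP falls inside it. The StackRange type and both helper functions are hypothetical names; the values come from the !teb and register output above.

```cpp
#include <cassert>
#include <cstdint>

// Address range taken from the TEB fields StackBase and StackLimit.
struct StackRange {
    std::uint32_t base;   // StackBase: high end (exclusive), where the stack starts
    std::uint32_t limit;  // StackLimit: low end (inclusive)
};

// Number of bytes between StackLimit and StackBase.
std::uint32_t stack_size(const StackRange& s)
{
    return s.base - s.limit;
}

// True if addr (for example ESP) lies inside the range.
bool in_stack(const StackRange& s, std::uint32_t addr)
{
    return addr >= s.limit && addr < s.base;
}
```

With base 0x00130000 and limit 0x0012a000, stack_size gives 0x6000 bytes (24 KB), and in_stack confirms that esp=0012ec80 from the register dump lies inside the user mode stack.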

Secondly, how can we find the kernel mode stack? As you might guess, we have to use a (local) kernel debugger to find that out, so let’s start one. To get the thread object address, the !process command is used as follows.
lkd> !process 0 4 calc.exe
PROCESS 86ac47e8  SessionId: 1  Cid: 0558    Peb: 7ff
DirBase: ce18ea00  ObjectTable: a3b59140  HandleC
Image: calc.exe
THREAD 85b92938  Cid 0558.1db8  Teb: 7ffdf000
THREAD 85cf19b0  Cid 0558.1768  Teb: 7ffdd000
THREAD 86b3f128  Cid 0558.1a3c  Teb: 7ffdc000
THREAD 880b1d48  Cid 0558.1bf0  Teb: 7ffdb000
THREAD 871c05c8  Cid 0558.1e90  Teb: 7ffda000


Once the thread object (85b92938) is found, the kernel thread block (_KTHREAD) can be displayed with the dt command. The InitialStack, StackLimit and KernelStack fields show the initial stack, the stack limit and the current stack position (KernelStack at 0x30).
lkd> dt nt!_KTHREAD 85b92938
+0x000 Header           : _DISPATCHER_HEADER
+0x010 CycleTime        : 0x14b1bef8
+0x018 HighCycleTime    : 0
+0x020 QuantumTarget    : 0x174fbc90
   +0x028 InitialStack     : 0x8f7f0fd0 Void
   +0x02c StackLimit       : 0x8f7ee000 Void
   +0x030 KernelStack      : 0x8f7f09b0 Void
+0x034 ThreadLock       : 0
……
+0x086 SpecialApcDisable : 0n0
+0x084 CombinedApcDisable : 0
   +0x088 Teb              : 0x7ffdf000 Void
+0x090 Timer            : _KTIMER
……


KTHREAD also includes TEB pointer information, so we can query the TEB with its address.
lkd> !teb 0x7ffdf000
TEB at 7ffdf000
ExceptionList:        00078914
 StackBase:            00080000
    StackLimit:           00069000
SubSystemTib:         00000000
FiberData:            00001e00
ArbitraryUserPointer: 00000000
Self:                 7ffdf000
EnvironmentPointer:   00000000
ClientId:             00000ea8 . 0000131c
RpcHandle:            00000000
Tls Storage:          7ffdf02c
PEB Address:          7ffd6000
LastErrorValue:       0
LastStatusValue:      c0000139
Count Owned Locks:    0
HardErrorMode:        0


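If we compare this TEB output with the TEB data we saw in user mode, the values do not match (StackBase, StackLimit and ClientId are all different). This is because the TEB is located in user address space, not kernel address space, so the address 7ffdf000 is being read in the wrong process context. To retrieve the correct thread data, we first have to set the thread context with the .thread command; /p makes the debugger translate the owning process's page table entries to physical addresses before access, and /r reloads user symbols.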
lkd> .thread /p /r 85b92938
Implicit thread is now 85b92938
Implicit process is now 86ac47e8
Loading User Symbols
................................
lkd> !teb 0x7ffdf000
TEB at 7ffdf000
ExceptionList:        0012fa98
 StackBase:            00130000
    StackLimit:           0012a000
SubSystemTib:         00000000
FiberData:            00001e00
ArbitraryUserPointer: 00000000
Self:                 7ffdf000
EnvironmentPointer:   00000000
ClientId:             00000558 . 00001db8
RpcHandle:            00000000
Tls Storage:          7ffdf02c
PEB Address:          7ffde000
LastErrorValue:       0
LastStatusValue:      c0150008
Count Owned Locks:    0
HardErrorMode:        0
Please note that the thread object (ex: 85b92938) points to an executive thread block (ETHREAD), which includes the kernel thread block (KTHREAD) as the first member of the ETHREAD structure. KTHREAD in turn contains the TEB pointer.



By the way, another easy way to discover the kernel thread stack is to simply use the !thread command. In the middle of the output below, the line starting with "Stack Init" carries the kernel stack information. And as seen below, the ChildEBP addresses are all within the kernel stack range.
lkd> !thread 85b92938
THREAD 85b92938  Cid 0558.1db8  Teb: 7ffdf000 Win32Thread: fe5a34f8 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
85b92b00  Semaphore Limit 0x2
Not impersonating
DeviceMap                 bf992858
Owning Process            86ac47e8       Image:         calc.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      13525527       Ticks: 79995 (0:00:20:47.929)
Context Switch Count      1580
UserTime                  00:00:00.031
KernelTime                00:00:00.078
Win32 Start Address 0x00959768
Stack Init 8f7f0fd0 Current 8f7f09b0 Base 8f7f1000 Limit 8f7ee000 Call 0
Priority 11 BasePriority 8 UnusualBoost 0 ForegroundBoost 2 IoPriority 2 PagePriority 5
ChildEBP RetAddr  Args to Child
8f7f09c8 82a71c15 85b92938 00000000 807c8120 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
8f7f0a00 82a704f3 85b929f8 85b92938 85b92b00 nt!KiSwapThread+0x266 (CONV: fastcall)
8f7f0a28 82a6a3cf 85b92938 85b929f8 00000000 nt!KiCommitThreadWait+0x1df (CONV: stdcall)
8f7f0aa4 82aad0d6 85b92b00 00000005 00000000 nt!KeWaitForSingleObject+0x393 (CONV: stdcall)
8f7f0abc 82aab117 00000000 00000000 00000000 nt!KiSuspendThread+0x18 (FPO: [3,0,0]) (CONV: stdcall)
8f7f0b04 82a71bfd 00000000 00000000 00000000 nt!KiDeliverApc+0x17f (CONV: stdcall)
8f7f0b48 82a704f3 85b929f8 85b92938 87965ff0 nt!KiSwapThread+0x24e (CONV: fastcall)
8f7f0b70 82a6a3cf 85b92938 85b929f8 00000000 nt!KiCommitThreadWait+0x1df (CONV: stdcall)
8f7f0be8 9af10d75 87965ff0 0000000d 00000001 nt!KeWaitForSingleObject+0x393 (CONV: stdcall)
WARNING: Frame IP not in any known module. Following frames may be wrong.
8f7f0d1c 82a4647a 0012ed84 00000000 00000000 0x9af10d75
8f7f0d1c 00000000 0012ed84 00000000 00000000 nt!KiFastCallEntry+0x12a (FPO: [0,3] TrapFrame @ 8f7f0c60)
8f7f0ce8 00000000 9af152a2 001b0b8e 0000000f 0x0
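
The claim that every ChildEBP lies within the kernel stack can be checked mechanically. The sketch below parses the "Stack Init ..." line of the !thread output and tests addresses against the Limit/Base range. KernelStack, parse_stack_line and in_kernel_stack are hypothetical names, and the parser assumes the line has exactly the shape shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Values from the "Stack Init .. Current .. Base .. Limit .." line of !thread.
struct KernelStack {
    std::uint64_t init = 0, current = 0, base = 0, limit = 0;
};

// Parses a line such as
//   "Stack Init 8f7f0fd0 Current 8f7f09b0 Base 8f7f1000 Limit 8f7ee000 Call 0"
// Returns false if the line does not have that shape.
bool parse_stack_line(const std::string& line, KernelStack& out)
{
    std::istringstream in(line);
    std::string stack, init, current, base, limit;
    in >> stack >> init >> std::hex >> out.init
       >> current >> out.current
       >> base >> out.base
       >> limit >> out.limit;
    return !in.fail() && stack == "Stack" && init == "Init" &&
           current == "Current" && base == "Base" && limit == "Limit";
}

// True if addr lies inside [Limit, Base).
bool in_kernel_stack(const KernelStack& s, std::uint64_t addr)
{
    return addr >= s.limit && addr < s.base;
}
```

For the output above, Init minus Current is 0x620, so about 1.5 KB of kernel stack is in use, and ChildEBP values such as 8f7f09c8 and 8f7f0d1c pass the range check while a user mode address such as 0012ed84 does not.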
More topics to come:
  • Calling convention and stack
  • 64bit calling convention and stack

Tuesday, March 23, 2010

Understanding Impersonation


Process Token

Whenever a process is created, its process access token is also created in the kernel. This process token is used whenever an access permission check is required. Roughly speaking, the process token is the user identity of the process. For example, if the process tries to access a system resource such as the registry or a file, the process presents its process token and the operating system checks access permission against the security descriptor (SD) of that resource. The SD contains the complete list of who is allowed and who is denied. Generally, all threads in the process use the process token, “unless” a thread is impersonating.

Let’s take an example. I ran SQL Configuration Manager, attached the debugger and picked one of the threads (thread #1). To inspect the thread token, switch to thread #1 and run !token -n.
0:000>  ~1s
0:001>  !token -n
Thread is not impersonating. Using process token...
TS Session ID: 0x1
User: S-1-5-21-2127521184-1604012920-1887927527-570548 (User: TDomain\yongslee)
Groups:
00 S-1-5-21-2127521184-1604012920-1887927527-513 (Group: TDomain\Domain Users)
Attributes - Mandatory Default Enabled
01 S-1-1-0
Attributes - Mandatory Default Enabled
.....
.....   
Impersonation Level: Anonymous
TokenType: Primary
Is restricted token: no.
As the first line of the output and the TokenType field say, the thread is not impersonating anyone and is using the primary access token, which is the process token.

Impersonation and Thread Token

Sometimes a thread needs to impersonate another user. This basically means that the thread does not use the process access token but a different user’s token instead. This scenario often occurs when a client accesses server resources: the server code impersonates (acts as) the client identity and performs the resource access with it. If the client user doesn’t have permission to access the server resource, the access fails with access denied.

To see how it works, let’s run SQL Configuration Manager and invoke the SQL WMI provider. The SQL WMI provider runs under the Network Service account in the wmiprvse.exe process, so its process token represents Network Service. WMI providers typically use impersonation, so we expect a worker thread in the WMI provider to use the client account, not Network Service. If the worker thread acted as Network Service instead, it could be a big security hole.

(1) Run SQL Configuration Manager (SQLCM)

=> Whenever SQLCM is launched, a new SQL WMI provider process (wmiprvse) is created (if one doesn’t already exist)

=> Run tlist.exe to find SQL WMI provider

C> tlist -m sqlmgmprovider.dll



(2) Attach to SQL WMI provider by using windbg

C> windbg -p 2202 (ex: 2202 = pid of wmiprvse.exe)

(3) Set breakpoint in one of SQL WMI classes. Let’s try SqlServiceAdvancedProperty.

To break in when Advanced properties is clicked, set bp against this method and keep debugger going.
0:011> bp sqlmgmprovider!SqlServiceAdvancedProperty::EnumerateInstances
0:011> g
(4) In SQLCM, select SQL Server Services -> double-click SQL Server (MSSQLSERVER) to bring up the Properties page -> click the Advanced tab to display advanced properties. (This calls the SqlServiceAdvancedProperty::EnumerateInstances method in the SQL WMI provider)

Now the breakpoint is hit and we can check the thread token by using the !token command.
Breakpoint 0 hit
eax=541c1d58 ebx=80041024 ecx=54214588 edx=6d599bc9 esi=54214588 edi=0095e3c8
eip=541def70 esp=00deefb8 ebp=00deefc8 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
sqlmgmprovider!SqlServiceAdvancedProperty::EnumerateInstances:
541def70 8bff            mov     edi,edi
0:007> !token -n
TS Session ID: 0x1
User: S-1-5-21-2127521184-1604012920-1887927527-570548 (User: TDomain\yongslee)
Groups:
00 S-1-5-21-2127521184-1604012920-1887927527-513 (Group: TDomain\Domain Users)
Attributes - Mandatory Default Enabled
....
Primary Group: S-1-5-21-2127521184-1604012920-1887927527-513 (Group: TDomain\Domain Users)
Privs:
15 0x000000017 SeChangeNotifyPrivilege           Attributes - Enabled Default
19 0x00000001d SeImpersonatePrivilege            Attributes - Enabled Default
20 0x00000001e SeCreateGlobalPrivilege           Attributes - Enabled Default
Auth ID: 0:5eeba
Impersonation Level: Impersonation
TokenType: Impersonation
Is restricted token: no.
The thread here is impersonating and acts as TDomain\yongslee rather than Network Service. Note that this is the same user who launched the SQLCM process. So even though the SQL WMI provider process runs as Network Service, the actual worker thread uses the client user principal that made the WMI request. If the client application runs under a low-privilege account that cannot access a system resource such as the registry, a WMI request that accesses that registry resource won’t succeed.
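
The rule that decides which identity is used can be written down as a tiny model. This is only a conceptual sketch with hypothetical types (Token, Thread) and a hypothetical function (effective_token); it does not call any Windows security API, it just mirrors what !token reports.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for an access token.
struct Token {
    std::string user;
    bool is_impersonation;  // false = primary (process) token
};

// Simplified stand-in for a thread.
struct Thread {
    const Token* impersonation_token;  // null while the thread is not impersonating
};

// Returns the token an access check would use for this thread: the thread's
// impersonation token if it has one, otherwise the process token.
const Token& effective_token(const Token& process_token, const Thread& thread)
{
    return thread.impersonation_token ? *thread.impersonation_token
                                      : process_token;
}
```

In this model, a worker thread in a process running as NT AUTHORITY\NETWORK SERVICE resolves to that process token until its impersonation token is set to the client (TDomain\yongslee here); from then on, access checks are made against the client identity.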

Impersonation in SQL WMI

Then how does the worker thread in the SQL WMI provider impersonate the client principal? Internally it calls the WbemCoImpersonateClient function of the WMI provider framework (framedyn). Before this function is called, the worker thread uses the process token (Network Service). Once WbemCoImpersonateClient has executed, the thread holds an impersonation token.
0:010> bp framedyn!WbemCoImpersonateClient
0:010> g
framedyn!WbemCoImpersonateClient:
0:007> !token -n    // Check token before impersonation
Thread is not impersonating. Using process token...
TS Session ID: 0
User: S-1-5-20 (Well Known Group: NT AUTHORITY\NETWORK SERVICE)
.....
Impersonation Level: Anonymous
TokenType: Primary
Is restricted token: no.
0:007> gu            // Execute WbemCoImpersonateClient()
framedyn!CWbemProviderGlue::CheckImpersonationLevel+0x39:
0:007> !token -n     // Check token after impersonation
TS Session ID: 0x1
User: S-1-5-21-2127521184-1604012920-1887927527-570548 (User: TDomain\yongslee)
.....
Impersonation Level: Impersonation
TokenType: Impersonation