Sunday, October 10, 2010

Brute force object search using C++ vtable

In C++, the in-memory layout of an object depends largely on the compiler; unfortunately, the standard does not define one strict layout. We can still look into the details by using a debugger.

Assuming we use VC++, let's look at some examples. Two classes are defined below. Each class has one data member, and the Dog class inherits from the Animal class. What is the object layout of the Dog class? In WinDbg, we can display it with the dt command (dt <classname>). As you can see below, a Dog object contains two string objects. The data members of the base class come first, followed by the data members of the derived class, each in the order in which they are declared in the class.

class Animal
{
public:
       Animal() {}  
protected:
       string name;
};

class Dog : public Animal
{
public:
       Dog() {}
protected:
       string petOwner;
};

0:000:x86> dt Dog
MyTest!Dog
   +0x000 name             : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x020 petOwner         : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
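
The same ordering can be checked from inside the program, without a debugger. The following is only a minimal sketch: the NameOffset, PetOwnerOffset, OffsetOf and PrintDogLayout helpers are hypothetical names added for illustration, and the exact offsets printed depend on the compiler and on its std::string implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

class Animal
{
public:
    Animal() {}
protected:
    std::string name;
};

class Dog : public Animal
{
public:
    Dog() {}

    // Byte offset of the inherited member from the start of this object.
    std::ptrdiff_t NameOffset() const { return OffsetOf(&name); }
    // Byte offset of the member declared in Dog itself.
    std::ptrdiff_t PetOwnerOffset() const { return OffsetOf(&petOwner); }

protected:
    std::string petOwner;

private:
    // Hypothetical helper: distance in bytes between a member and 'this'.
    std::ptrdiff_t OffsetOf(const void* member) const
    {
        return static_cast<const char*>(member) - reinterpret_cast<const char*>(this);
    }
};

// Hypothetical helper: prints both offsets in the style of the dt output.
// The assert encodes the ordering described above (base members first);
// it is an observation about common compilers, not a language guarantee.
void PrintDogLayout()
{
    Dog dog;
    assert(dog.NameOffset() < dog.PetOwnerOffset());
    std::printf("+0x%03x name\n+0x%03x petOwner\n",
                static_cast<unsigned>(dog.NameOffset()),
                static_cast<unsigned>(dog.PetOwnerOffset()));
}
```

Calling PrintDogLayout() from main() should print name at the lower offset and petOwner after it, although the numbers will only match the dt output above when the build uses the same compiler and settings.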

That one is easy. What if the class has virtual functions?
Does that affect the object layout? Let's look at another example.
In the example below, the base class declares two virtual functions and the subclass overrides both.

#include <iostream>
#include <string>
#include <tchar.h> // _tmain and _TCHAR (VC++ specific)
using namespace std;
class Animal
{
public:
       Animal() { name = "Animal"; }    
       virtual void DisplayInfo()
       {
              cout << name << endl;
       }
       virtual void Run() {}
protected:
       string name;
};

class Dog : public Animal
{
public:
       Dog() { name = "Dog"; petOwner = "N/A"; }
       void DisplayInfo()
       {
              cout << name << ":" << petOwner << endl;
       }
       void Run()
       {
              cout << "Run" << endl;
       }
protected:
       string petOwner;
};

int _tmain(int argc, _TCHAR* argv[])
{
       Dog* pDog = new Dog();
       Dog* pDog2 = new Dog();

       Animal* pA = pDog; // <== breakpoint
       pA->DisplayInfo();
       pA = pDog2;
       pA->Run();

       return 0;
}

If we check the Dog object layout now, we can see a 4-byte vtable pointer
in the first position (offset 0) of this 32-bit build.

0:000:x86> dt Dog
MyTest!Dog
   +0x000 __VFN_table : Ptr32 
   +0x004 name             : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x024 petOwner         : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
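
The effect of that leading pointer can also be observed in code. The sketch below assumes, as the dt output above shows for VC++, that the vtable pointer is stored in the first pointer-sized word of the object; other compilers are free to place it elsewhere. FirstWord, SameVtable and CheckVtablePointers are hypothetical helper names, and the classes are trimmed versions of the ones in the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

class Animal
{
public:
    Animal() { name = "Animal"; }
    virtual void Run() {}
protected:
    std::string name;
};

class Dog : public Animal
{
public:
    Dog() { name = "Dog"; }
    void Run() override {}
};

// Copies out the first pointer-sized word of the object.
// ASSUMPTION: the compiler stores the vtable pointer there.
std::uintptr_t FirstWord(const Animal& obj)
{
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&obj), sizeof(word));
    return word;
}

// Two objects of the same dynamic type are expected to share one vtable.
bool SameVtable(const Animal& a, const Animal& b)
{
    return FirstWord(a) == FirstWord(b);
}

// Hypothetical self-check of the two properties the search relies on.
void CheckVtablePointers()
{
    Dog dog1, dog2;
    Animal animal;
    assert(SameVtable(dog1, dog2));    // every Dog carries the same value
    assert(!SameVtable(dog1, animal)); // an Animal carries a different one
}
```

These two properties (one value shared by all Dog objects, and a different value for every other class) are what make the vtable pointer usable as a signature for a type.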

To investigate a little further, I set a breakpoint in main().
When the debugger breaks there, the two Dog objects can be found with the dv command, which lists the local variables.
This is the easiest way to find Dog objects, as long as they are reachable from locals in the current frame.

0:000:x86> dv /i
prv param             argc = 0n1
prv param             argv = 0x00585170
prv local               pA = 0xcccccccc
prv local            pDog2 = 0x00585288
prv local             pDog = 0x00589f78

But what if the application is very complex, we have stopped in the middle of nowhere,
and we want to find all Dog objects in memory? One thing we can try is to search the whole
address space for the vtable pointer. Since the vtable pointer comes first in the object layout,
every match is a candidate object instance. It is a brute-force search, but it can sometimes be useful.
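
To make the idea concrete, here is a minimal in-process sketch of the same search. It does not scan the whole address space; it only scans a small local buffer in which a few objects are constructed, and it again assumes that the vtable pointer is the first word of each object. FirstWord, FindWord and FindDogsInArena are hypothetical names, and the 256-byte slot size is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <string>
#include <vector>

class Animal
{
public:
    Animal() { name = "Animal"; }
    virtual void Run() {}
protected:
    std::string name;
};

class Dog : public Animal
{
public:
    Dog() { name = "Dog"; petOwner = "N/A"; }
    void Run() override {}
protected:
    std::string petOwner;
};

// ASSUMPTION: the vtable pointer is the first pointer-sized word of the object.
std::uintptr_t FirstWord(const Animal& obj)
{
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&obj), sizeof(word));
    return word;
}

// Scans [begin, begin + size) in pointer-sized steps and returns the byte
// offsets of every word equal to 'target'.
std::vector<std::size_t> FindWord(const unsigned char* begin, std::size_t size,
                                  std::uintptr_t target)
{
    assert(reinterpret_cast<std::uintptr_t>(begin) % sizeof(target) == 0);
    std::vector<std::size_t> hits;
    for (std::size_t off = 0; off + sizeof(target) <= size; off += sizeof(target))
    {
        std::uintptr_t word = 0;
        std::memcpy(&word, begin + off, sizeof(word));
        if (word == target)
            hits.push_back(off);
    }
    return hits;
}

// Builds two Dogs and one Animal in a buffer, then searches the buffer for
// the Dog vtable pointer taken from a separate reference object.
std::vector<std::size_t> FindDogsInArena()
{
    alignas(std::max_align_t) unsigned char arena[1024] = {};
    static_assert(sizeof(Dog) <= 256, "each 256-byte slot must hold one object");

    Dog* dog1 = new (arena) Dog();
    Animal* animal = new (arena + 256) Animal();
    Dog* dog2 = new (arena + 512) Dog();

    Dog reference; // lives outside the arena, only used to obtain the target value
    std::vector<std::size_t> hits = FindWord(arena, sizeof(arena), FirstWord(reference));

    dog2->~Dog();
    animal->~Animal();
    dog1->~Dog();
    return hits;
}
```

In this sketch the Animal object in the middle slot is skipped because its first word holds a different vtable pointer. In a real process a match is still only a candidate, since freed memory or unrelated data may hold the same value.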

To do that, we first find the address of the vftable by listing the symbols of the Dog class with the x command.

0:000:x86> x MyTest!Dog::*
012729b0          MyTest!Dog::Dog (void)
01271a90          MyTest!Dog::DisplayInfo (void)
012720a0          MyTest!Dog::Run (void)
01271d50          MyTest!Dog::~Dog = <no type information>
01279680          MyTest!Dog::'RTTI Base Class Array' = <no type information>
01279670          MyTest!Dog::'RTTI Class Hierarchy Descriptor' = <no type information>
01279658          MyTest!Dog::'RTTI Complete Object Locator' = <no type information>
01278854          MyTest!Dog::'vftable' = <no type information>
01279690          MyTest!Dog::'RTTI Base Class Descriptor at (0,-1,0,64)' = <no type information>

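Then we search the address space for that vftable value with the s command. The -d option searches for DWORD values, and L? allows a range larger than the default limit.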

0:000:x86> s -d 0 L?0xffffffff 01278854
00585288  01278854 00585308 00676f44 cd006c61  T.'..SX.Dog.al..
00589f78  01278854 005851f8 00676f44 cd006c61  T.'..QX.Dog.al..
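
The text column on the right of each match is the same 16 bytes shown as characters, and it can be reproduced by reading each DWORD in little-endian byte order. The sketch below is only an illustration: DwordToAscii and CheckAgainstDump are hypothetical helpers, and treating 0x20 to 0x7E as the printable range is an assumption that happens to agree with the output above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Renders a DWORD as four characters, lowest byte first (little-endian),
// replacing anything outside printable ASCII with '.'.
std::string DwordToAscii(std::uint32_t value)
{
    std::string text;
    for (int i = 0; i < 4; ++i)
    {
        unsigned char byte = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
        text += (byte >= 0x20 && byte <= 0x7E) ? static_cast<char>(byte) : '.';
    }
    return text;
}

// Checks the helper against the first match in the search output.
void CheckAgainstDump()
{
    assert(DwordToAscii(0x01278854) == "T.'."); // vftable address
    assert(DwordToAscii(0x00585308) == ".SX.");
    assert(DwordToAscii(0x00676f44) == "Dog."); // start of the name text
    assert(DwordToAscii(0xcd006c61) == "al..");
}
```

Read this way, the third DWORD of each match (00676f44) is simply the characters 'D', 'o', 'g' followed by a terminating zero, which is why "Dog" is visible in the text column.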

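The search finds two Dog objects, at 00585288 and 00589f78, which match the pDog2 and pDog values shown by dv. Now we can examine the Dog objects
with the dt command. The second command, dt /b, shows the content of the name field in the Dog object. Note that _Buf, _Ptr and _Alias are all listed at offset +0x000: they share the same storage, so the _Ptr value is just the bytes of "Dog" read as an address, which explains the memory read error.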

0:000:x86> dt 00585288 Dog
MyTest!Dog
   +0x000 __VFN_table : 0x01278854
   +0x004 name             : std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x024 petOwner         : std::basic_string<char,std::char_traits<char>,std::allocator<char> >

0:000:x86> dt /b 00585288+4 std::basic_string<char,std::char_traits<char>,std::allocator<char> >
MyTest!std::basic_string<char,std::char_traits<char>,std::allocator<char> >
   +0x000 _Myproxy         : 0x00585308
   +0x004 _Bx              : std::_String_val<char,std::allocator<char> >::_Bxty
      +0x000 _Buf             :  "Dog"
       [00] 68 'D'
       [01] 111 'o'
       [02] 103 'g'
       [03] 0 ''
       [04] 97 'a'
       [05] 108 'l'
       [06] 0 ''
       [07] -51 ''
       [08] -51 ''
       [09] -51 ''
       [10] -51 ''
       [11] -51 ''
       [12] -51 ''
       [13] -51 ''
       [14] -51 ''
       [15] -51 ''
      +0x000 _Ptr             : 0x00676f44  "--- memory read error at address 0x00676f44 ---"
      +0x000 _Alias           :  "Dog"
       [00] 68 'D'
       [01] 111 'o'
       [02] 103 'g'
       [03] 0 ''
       [04] 97 'a'
       [05] 108 'l'
       [06] 0 ''
       [07] -51 ''
       [08] -51 ''
       [09] -51 ''
       [10] -51 ''
       [11] -51 ''
       [12] -51 ''
       [13] -51 ''
       [14] -51 ''
       [15] -51 ''
   +0x014 _Mysize          : 3
   +0x018 _Myres           : 0xf
   +0x01c _Alval           : std::allocator<char>
   =012788f4 npos             : 0xffffffff

 
