Back in June 2006 I wrote about caching issues when PhysicalMemory is used for memory dumping. PhysicalMemory is not the only option for dumping memory from user mode, though. There is a system call named NtSystemDebugControl which has a few well-known control codes as well as a few lesser-known ones. One of the lesser-known control codes is number 10, which is used to copy contents from physical memory. I believe this technique was first implemented in a dumping tool called kntlist, written by George M. Garner Jr. some years ago. Here I will once again investigate possible caching issues, but this time for NtSystemDebugControl control code 10.
Using NtSystemDebugControl for dumping
For control code 10 we pass a struct as the input buffer. It has three members: PhysicalAddress, Buffer, and Length.
Next we can, for example, malloc a page-sized buffer and point Buffer to it, write 4096 to Length, and set PhysicalAddress to the address of the page whose contents we wish to copy. Then we call NtSystemDebugControl, and if all goes well our buffer now contains a copy of the data in the physical page. Quite trivial actually.
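As a minimal sketch of the setup described above, the request could be prepared as follows. The member names come from the text, but the member types, their order, and the names MEMCOPY_REQUEST and prepare_page_request are assumptions made for illustration, not a documented layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u

/* Hypothetical layout: member names are from the text,
 * types and ordering are assumed for this sketch. */
typedef struct MEMCOPY_REQUEST {
    uint64_t PhysicalAddress;  /* physical address of the page to copy */
    void    *Buffer;           /* user-mode destination buffer */
    uint32_t Length;           /* number of bytes to copy */
} MEMCOPY_REQUEST;

/* Allocate a zeroed page-sized buffer and fill in a request for one page.
 * Returns 0 on success, -1 if the allocation fails.
 * The caller owns req->Buffer and must free() it. */
int prepare_page_request(MEMCOPY_REQUEST *req, uint64_t physical_address)
{
    assert(req != NULL);
    assert(physical_address % PAGE_SIZE_BYTES == 0); /* page aligned */

    void *page = malloc(PAGE_SIZE_BYTES);
    if (page == NULL)
        return -1;
    memset(page, 0, PAGE_SIZE_BYTES);

    req->PhysicalAddress = physical_address;
    req->Buffer = page;
    req->Length = PAGE_SIZE_BYTES;
    return 0;
}
```

The filled-in struct is what would be handed to NtSystemDebugControl as the input buffer together with control code 10. The call itself is left out here, since the routine is undocumented and its prototype would have to be taken from reverse-engineered headers.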
To understand whether, and if so how, NtSystemDebugControl handles caching issues, we have to take a dive into the kernel. The kernel exports the routine NtSystemDebugControl, which contains a switch statement that dispatches to the correct functionality based on the control code. In the following I will not describe every tiny detail of the code. First, many of the details are not relevant to what we are looking at. Second, I have not decompiled the code, but only followed the relevant parts of the disassembly (from a Windows Server 2003 SP0 kernel).
The code handling control code 10 calls the undocumented routine _ExLockUserBuffer, which makes the pages in our buffer (*Buffer) resident, locks them in memory, and returns (in a pointer passed as a parameter) a system space virtual address pointing to them. Then another undocumented function named _KdpCopyMemoryChunks is called. The parameters passed to it include the system space virtual address pointing to our buffer, the number of bytes to copy (Length), and the address to copy from (PhysicalAddress).
Now yet another undocumented function is called: _MmDbgCopyMemory. This function performs the actual copying of the data. But before it can do any copying it needs to have a virtual address of the source instead of a physical address. Therefore it calls our last undocumented function: _MiDbgTranslatePhysicalAddress. Now we are really closing in on the answer to our question.
_MiDbgTranslatePhysicalAddress receives our physical address as a parameter. It then uses a variable called _ValidKernelPte, which serves as a kind of template for kernel PTEs. The flags from the template are combined with the physical address to form a PTE pointing to the page we want to copy. Then the PFN database is indexed to find the entry corresponding to our source page. In this entry the kernel looks at the CacheAttribute flags of the u3.e1 member. The CacheAttribute flags are then used to set the PAT index in the PTE. Now we have our answer really, but I will continue a tiny bit further anyway. The PTE the kernel has built is now copied to the location pointed to by the kernel variable _MmDebugPte. Finally, the function returns the virtual address that will use the PTE for mapping. Now _MmDbgCopyMemory is free to copy the data from the physical source page to our buffer using virtual addresses only.
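The PTE construction described above can be illustrated with a small user-mode simulation. This is only a sketch and not the kernel's code: it assumes 32-bit non-PAE PTEs for 4 KB pages, and the template value, the CACHE_ATTRIBUTE encoding, the PFN_ENTRY struct, and the mapping from cache attribute to PWT/PCD/PAT bits are all hypothetical. Only the bit positions of PWT, PCD, and PAT are taken from the x86 architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural x86 PTE bits for 4 KB pages. */
#define PTE_PWT        (1u << 3)   /* page-level write-through */
#define PTE_PCD        (1u << 4)   /* page-level cache disable */
#define PTE_PAT        (1u << 7)   /* PAT index high bit */
#define PTE_FRAME_MASK 0xFFFFF000u /* page frame address bits */
#define PTE_CACHE_MASK (PTE_PWT | PTE_PCD | PTE_PAT)

/* Hypothetical stand-in for _ValidKernelPte:
 * present | writable | accessed | dirty. */
#define VALID_KERNEL_PTE_TEMPLATE 0x063u

/* Hypothetical cache attribute encoding for the simulated PFN database. */
typedef enum {
    CACHE_NONCACHED = 0,
    CACHE_CACHED = 1,
    CACHE_WRITECOMBINED = 2
} CACHE_ATTRIBUTE;

typedef struct { CACHE_ATTRIBUTE CacheAttribute; } PFN_ENTRY;

/* Assumed mapping from cache attribute to PTE cache bits. */
static uint32_t cache_bits_for(CACHE_ATTRIBUTE attr)
{
    switch (attr) {
    case CACHE_NONCACHED:     return PTE_PCD | PTE_PWT;
    case CACHE_WRITECOMBINED: return PTE_PWT;
    case CACHE_CACHED:
    default:                  return 0;
    }
}

/* Build a PTE for the page containing physical_address: take the flags
 * from the template, add the page frame, then set the cache bits from
 * the matching entry in the simulated PFN database. */
uint32_t build_debug_pte(uint32_t physical_address,
                         const PFN_ENTRY *pfn_database, size_t pfn_count)
{
    uint32_t pfn = physical_address >> 12;
    assert(pfn_database != NULL && pfn < pfn_count);

    uint32_t pte = VALID_KERNEL_PTE_TEMPLATE
                   & ~(PTE_FRAME_MASK | PTE_CACHE_MASK);
    pte |= physical_address & PTE_FRAME_MASK;
    pte |= cache_bits_for(pfn_database[pfn].CacheAttribute);
    return pte;
}
```

The point of the sketch is the order of operations: the cache bits are never chosen by the caller, they are derived from whatever the PFN entry for the source page says.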
Other versions of Windows
The above reversing was performed on a kernel from Windows Server 2003 SP0. I have also tested NtSystemDebugControl with control code 10 on Windows Server 2003 SP1. There the call always fails with the error code STATUS_NOT_IMPLEMENTED. On Windows XP SP2, however, it works fine. _MiDbgTranslatePhysicalAddress in Windows XP SP2 handles cache types in the same way as described above. On Windows 2000 the call always fails with the error code STATUS_INVALID_INFO_CLASS.
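A dumping tool has to cope with these differences at run time. The sketch below shows one way to classify the status returned by the call. The numeric values are the standard NTSTATUS codes from ntstatus.h, while COPY_OUTCOME and classify_copy_status are hypothetical names introduced only for this example.

```c
#include <stdint.h>

/* NTSTATUS values as defined in ntstatus.h. */
#define STATUS_SUCCESS            0x00000000u
#define STATUS_NOT_IMPLEMENTED    0xC0000002u
#define STATUS_INVALID_INFO_CLASS 0xC0000003u

typedef enum {
    COPY_OK,           /* the page was copied */
    COPY_UNSUPPORTED,  /* control code 10 is not available on this system */
    COPY_FAILED        /* any other failure */
} COPY_OUTCOME;

/* Classify the status returned for control code 10. */
COPY_OUTCOME classify_copy_status(uint32_t status)
{
    switch (status) {
    case STATUS_SUCCESS:
        return COPY_OK;
    case STATUS_NOT_IMPLEMENTED:    /* seen on Windows Server 2003 SP1 */
    case STATUS_INVALID_INFO_CLASS: /* seen on Windows 2000 */
        return COPY_UNSUPPORTED;
    default:
        return COPY_FAILED;
    }
}
```

With a helper like this, a tool could stop after the first COPY_UNSUPPORTED result and fall back to another dumping method instead of retrying every page.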
We now know that NtSystemDebugControl maps the memory with a cache type that is the same as the one found in the PFN database. If we allocate memory of a particular cache type, the PFN database entries corresponding to that memory will have their CacheAttribute flags set to match the cache type we asked for. So, we can conclude that NtSystemDebugControl automatically uses the correct cache type to avoid cache incoherence and other undefined processor behaviour.