Historical memory dumping techniques: Memory dumping with NtSystemDebugControl

By Arne Vidstrom

May 15, 2019

In another blog post, I wrote about caching issues when PhysicalMemory is used for memory dumping. Memory can also be dumped with a system call named NtSystemDebugControl. I believe this technique was first implemented in a dumping tool called kntlist, coded by George M. Garner Jr. some years ago. Here, I will once again investigate possible caching issues, but this time for NtSystemDebugControl control code 10. Note that modern versions of Windows no longer allow this kind of access; the typical method now is to use a custom kernel module.

Using NtSystemDebugControl for dumping

For control code 10 we use a struct with the following layout as input buffer:

DWORD PhysicalAddress;
DWORD Reserved1;
void *Buffer;
DWORD Length;

We can, for example, allocate a page-sized buffer with malloc, point Buffer at it, set Length to 4096, and set PhysicalAddress to the physical address of the page we want to copy. Then we call NtSystemDebugControl, and if all goes well, our buffer contains a copy of the data from the physical page.
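The request described above can be sketched in portable C as follows. This is only an illustration of how the fields are filled in: the type name ReadPhysicalRequest and the helper make_page_request are hypothetical, uint32_t stands in for DWORD, and setting Reserved1 to zero is an assumption. The actual call to NtSystemDebugControl is left out, since its prototype is undocumented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Input buffer for control code 10, mirroring the layout given above.
   The type name is hypothetical; uint32_t stands in for DWORD. */
typedef struct {
    uint32_t PhysicalAddress;  /* physical address of the source page */
    uint32_t Reserved1;        /* assumed to be zero in this sketch */
    void    *Buffer;           /* destination buffer in our process */
    uint32_t Length;           /* number of bytes to copy */
} ReadPhysicalRequest;

/* Fill in a request that copies one whole page, identified by its page
   frame number, into `buffer`, which must hold PAGE_SIZE bytes. */
static ReadPhysicalRequest make_page_request(uint32_t page_frame_number,
                                             void *buffer)
{
    ReadPhysicalRequest req;

    assert(buffer != NULL);
    /* Keep the resulting physical address within 32 bits. */
    assert(page_frame_number < (1u << 20));

    req.PhysicalAddress = page_frame_number * PAGE_SIZE;
    req.Reserved1 = 0;
    req.Buffer = buffer;
    req.Length = PAGE_SIZE;
    return req;
}
```

A dumping tool would build one such request per page, pass it as the input buffer for control code 10, and write the filled buffer to the dump file before moving on to the next page.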

Inside NtSystemDebugControl

To understand whether NtSystemDebugControl handles caching issues, and if so how, we have to take a dive into the kernel. The kernel exports the routine NtSystemDebugControl, which contains a switch statement that dispatches to the correct functionality based on the control code. I won't describe every detail of the code, for two reasons. First, many details are not relevant to what we are looking at. Second, I haven't decompiled the code, but only followed the relevant parts of the disassembly (from a Windows Server 2003 SP0 kernel).

The code handling control code 10 calls the undocumented routine _ExLockUserBuffer, which makes the pages in our buffer (*Buffer) resident, locks them in memory, and returns (in a pointer passed as a parameter) a system space virtual address pointing to the buffer. Then, another undocumented function named _KdpCopyMemoryChunks is called. The parameters passed to it include the system space virtual address pointing to our buffer, the number of bytes to copy (Length), and the address to copy from (PhysicalAddress).

Now, yet another undocumented function is called: _MmDbgCopyMemory. This function performs the actual copying of the data. Before it can do any copying, it needs to have a virtual address of the source instead of a physical address. Therefore it calls our last undocumented function: _MiDbgTranslatePhysicalAddress. Now we are closing in on the answer to our question.

_MiDbgTranslatePhysicalAddress receives our physical address as a parameter. It then uses a variable called _ValidKernelPte, which serves as a kind of template for kernel PTEs. The flags from the template are combined with the physical address to form a PTE pointing to the page we want to copy. Then, the PFN database is indexed to find the entry corresponding to our source page. In this entry, the kernel looks at the CacheAttribute flags of the u3.e1 member. The CacheAttribute flags are then used to set the PAT index in the PTE. Now we have our answer, but I'll continue a bit further. The PTE the kernel has built is copied to the location pointed to by the kernel variable _MmDebugPte. Finally, the function returns the virtual address that is mapped through this PTE. Now _MmDbgCopyMemory is free to copy the data from the physical source page to our buffer using virtual addresses only.
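The PTE construction can be modelled roughly like this. It is a simplified sketch, not the kernel's code: the PWT and PCD bit positions are the architectural x86 ones for 4 KB pages without PAE, but the template value, the PfnEntry type, the CacheAttribute enum, and in particular the mapping from cache attribute to PTE bits are assumptions made for illustration. Treating pages outside the PFN database as cached is also just a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural x86 PTE bits (4 KB pages, non-PAE). */
#define PTE_PWT        (1u << 3)    /* page-level write-through */
#define PTE_PCD        (1u << 4)    /* page-level cache disable */
#define PTE_FRAME_MASK 0xFFFFF000u  /* physical page frame address */

/* Hypothetical stand-in for the _ValidKernelPte template:
   present, writable, accessed, dirty. */
#define TEMPLATE_PTE 0x063u

/* Hypothetical model of the CacheAttribute flags in a PFN entry. */
typedef enum {
    CACHE_NONCACHED,
    CACHE_CACHED,
    CACHE_WRITECOMBINED
} CacheAttribute;

typedef struct {
    CacheAttribute cache_attribute;
} PfnEntry;

/* Assumed mapping from cache attribute to the cache-control bits. */
static uint32_t cache_bits(CacheAttribute attr)
{
    switch (attr) {
    case CACHE_NONCACHED:     return PTE_PCD | PTE_PWT;
    case CACHE_WRITECOMBINED: return PTE_PWT;  /* assumes a PAT slot set to WC */
    case CACHE_CACHED:
    default:                  return 0;
    }
}

/* Combine the template flags with the page frame address, then apply the
   cache type recorded for that page in the (modelled) PFN database. */
static uint32_t build_debug_pte(uint32_t physical_address,
                                const PfnEntry *pfn_database,
                                size_t pfn_count)
{
    uint32_t pfn = physical_address >> 12;
    uint32_t pte = TEMPLATE_PTE | (physical_address & PTE_FRAME_MASK);

    assert(pfn_database != NULL);
    if (pfn < pfn_count) {
        pte |= cache_bits(pfn_database[pfn].cache_attribute);
    }
    return pte;
}
```

The point of the sketch is the order of operations: the cache type is not chosen by the caller, but looked up per page, so the temporary mapping agrees with whatever mapping type the page already has.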

Other versions of Windows

I did the reverse engineering above on a kernel from Windows Server 2003 SP0. I have also tested NtSystemDebugControl with control code 10 on Windows Server 2003 SP1. There, the call always fails with the error code STATUS_NOT_IMPLEMENTED. In Windows XP SP2 it works fine, however. _MiDbgTranslatePhysicalAddress in Windows XP SP2 handles cache types in the same way as described above. In Windows 2000, the call always fails with the error code STATUS_INVALID_INFO_CLASS.
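A dumping tool could use these status codes to decide early whether the technique is available at all. The sketch below is one possible way to classify the result of a first probe call. The numeric values are the standard NTSTATUS definitions, while the DumpSupport enum and the classify_status helper are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* Standard NTSTATUS values, defined locally to keep the sketch portable. */
#define DBG_STATUS_SUCCESS            0x00000000u
#define DBG_STATUS_NOT_IMPLEMENTED    0xC0000002u
#define DBG_STATUS_INVALID_INFO_CLASS 0xC0000003u

typedef enum {
    DUMP_SUPPORTED,    /* the call succeeded */
    DUMP_UNSUPPORTED,  /* this Windows version rejects control code 10 */
    DUMP_OTHER_ERROR   /* some other failure, e.g. for a single page */
} DumpSupport;

/* Classify the status returned by a probe call with control code 10. */
static DumpSupport classify_status(uint32_t status)
{
    switch (status) {
    case DBG_STATUS_SUCCESS:
        return DUMP_SUPPORTED;
    case DBG_STATUS_NOT_IMPLEMENTED:     /* seen on Windows Server 2003 SP1 */
    case DBG_STATUS_INVALID_INFO_CLASS:  /* seen on Windows 2000 */
        return DUMP_UNSUPPORTED;
    default:
        return DUMP_OTHER_ERROR;
    }
}
```

On DUMP_UNSUPPORTED, the tool would stop and fall back to another dumping method instead of retrying for every page.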

Conclusions

We now know that NtSystemDebugControl maps the memory with a cache type that is the same as the one found in the PFN database. If we allocate memory of a particular cache type, the PFN database entries corresponding to that memory will have their CacheAttribute flags set to match the cache type we asked for. So, we can conclude that NtSystemDebugControl automatically uses the correct cache type to avoid cache incoherence and other undefined processor behavior.