The Portable Executable File Format


Abstract

The Windows NT™ version 3.1 operating system introduces a new executable file format called the Portable Executable (PE) file format.

Introduction

The recent addition of the Microsoft® Windows NT™ operating system to the family of Windows™ operating systems brought many changes to the development environment and more than a few changes to applications themselves. One of the more significant changes is the introduction of the Portable Executable (PE) file format. The new PE file format draws primarily from the COFF (Common Object File Format) specification that is common to UNIX® operating systems. Yet, to remain compatible with previous versions of the MS-DOS® and Windows operating systems, the PE file format also retains the old familiar MZ header from MS-DOS.

This article discusses each of the components of the file as they occur when you traverse the file's contents, starting at the top and working your way down through the file.

Much of the definition of individual file components comes from the file WINNT.H, a file included in the Microsoft Win32™ Software Development Kit (SDK) for Windows NT. In it you will find structure type definitions for each of the file headers and data directories used to represent various components in the file. In other places in the file, WINNT.H lacks sufficient definition of the file structure.

The PE file format for Windows NT introduces a completely new structure to developers familiar with the Windows and MS-DOS environments. Yet developers familiar with the UNIX environment will find that the PE file format is similar to, if not based on, the COFF specification.

The entire format consists of an MS-DOS MZ header, followed by a real-mode stub program, the PE file signature, the PE file header, the PE optional header, all of the section headers, and finally, all of the section bodies.

The optional header ends with an array of data directory entries that are relative virtual addresses to data directories contained within section bodies. Each data directory indicates how a specific section body's data is structured.

The PE file format has eleven predefined sections, as is common to applications for Windows NT, but each application can define its own unique sections for code and data.

The .debug predefined section also has the capability of being stripped from the file into a separate debug file. If so, a special debug header is used to parse the debug file, and a flag is specified in the PE file header to indicate that the debug data has been stripped.

Structure of PE Files

The PE file format is organized as a linear stream of data. It begins with an MS-DOS header, a real-mode program stub, and a PE file signature. Immediately following is a PE file header and optional header. Beyond that, all the section headers appear, followed by all of the section bodies. Closing out the file are a few other regions of miscellaneous information, including relocation information, symbol table information, line number information, and string table data. All of this is more easily absorbed by looking at it graphically, as shown in Figure 1.

Figure 1. Structure of a Portable Executable file image

Starting with the MS-DOS file header structure, each of the components in the PE file format is discussed below in the order in which it occurs in the file.

MS-DOS/Real-Mode Header

As mentioned above, the first component in the PE file format is the MS-DOS header. The MS-DOS header is not new for the PE file format. It is the same MS-DOS header that has been around since version 2 of the MS-DOS operating system. The main reason for keeping the same structure intact at the beginning of the PE file format is so that, when you attempt to load a PE file under Windows version 3.1 or earlier, or MS-DOS version 2.0 or later, the operating system can read the file and understand that it is not compatible. In other words, when you attempt to run a Windows NT executable on MS-DOS version 6.0, you get this message: "This program cannot be run in DOS mode." If the MS-DOS header were not included as the first part of the PE file format, the operating system would simply fail the attempt to load the file and offer something completely useless, such as: "The name specified is not recognized as an internal or external command, operable program or batch file."

The MS-DOS header occupies the first 64 bytes of the PE file. A structure representing its content is described below:

WINNT.H

typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    USHORT e_magic;         // Magic number
    USHORT e_cblp;          // Bytes on last page of file
    USHORT e_cp;            // Pages in file
    USHORT e_crlc;          // Relocations
    USHORT e_cparhdr;       // Size of header in paragraphs
    USHORT e_minalloc;      // Minimum extra paragraphs needed
    USHORT e_maxalloc;      // Maximum extra paragraphs needed
    USHORT e_ss;            // Initial (relative) SS value
    USHORT e_sp;            // Initial SP value
    USHORT e_csum;          // Checksum
    USHORT e_ip;            // Initial IP value
    USHORT e_cs;            // Initial (relative) CS value
    USHORT e_lfarlc;        // File address of relocation table
    USHORT e_ovno;          // Overlay number
    USHORT e_res[4];        // Reserved words
    USHORT e_oemid;         // OEM identifier (for e_oeminfo)
    USHORT e_oeminfo;       // OEM information; e_oemid specific
    USHORT e_res2[10];      // Reserved words
    LONG   e_lfanew;        // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The first field, e_magic, is the so-called magic number. This field is used to identify an MS-DOS-compatible file type. All MS-DOS-compatible executable files set this value to 0x5A4D (IMAGE_DOS_SIGNATURE in WINNT.H), which represents the ASCII characters MZ. MS-DOS headers are sometimes referred to as MZ headers for this reason. Many other fields are important to MS-DOS operating systems, but for Windows NT, there is really one more important field in this structure. The final field, e_lfanew, is a 4-byte offset into the file where the PE file header is located. It is necessary to use this offset to locate the PE header in the file. For PE files in Windows NT, the PE file header occurs soon after the MS-DOS header with only the real-mode stub program between them.
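As a minimal sketch of the two fields that matter here, the following portable C reads e_magic and e_lfanew directly from a byte buffer instead of going through the WINNT.H structure. The helper names (read_le16, read_le32, pe_signature_offset) are hypothetical and are not part of WINNT.H; the offsets 0 and 60 are assumptions derived from the 64-byte structure layout shown above, in which e_lfanew occupies the last four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values byte by byte so the code works on any host. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

#define DOS_HEADER_SIZE  64u      /* size of IMAGE_DOS_HEADER */
#define DOS_MAGIC_MZ     0x5A4Du  /* bytes 'M' (0x4D), 'Z' (0x5A) in the file */
#define E_LFANEW_OFFSET  60u      /* last field of the 64-byte header */

/* Returns 1 and stores e_lfanew in *offset when buf starts with an
   MS-DOS header whose e_lfanew points inside the buffer; 0 otherwise. */
static int pe_signature_offset(const uint8_t *buf, size_t len,
                               uint32_t *offset)
{
    assert(buf != NULL && offset != NULL);
    if (len < DOS_HEADER_SIZE)
        return 0;
    if (read_le16(buf) != DOS_MAGIC_MZ)
        return 0;
    *offset = read_le32(buf + E_LFANEW_OFFSET);
    return *offset < len;
}
```

Reading byte by byte avoids alignment and endianness assumptions, at the cost of not using the structure definition directly.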

Real-Mode Stub Program

The real-mode stub program is an actual program run by MS-DOS when the executable is loaded. For an actual MS-DOS executable image file, the application begins executing here. For successive operating systems, including Windows, OS/2®, and Windows NT, an MS-DOS stub program is placed here that runs instead of the actual application. The programs typically do no more than output a line of text, such as: "This program requires Microsoft Windows v3.1 or greater." Of course, whoever creates the application is able to place any stub they like here, meaning you may often see such things as: "You can't run a Windows NT application on OS/2, it's simply not possible."

When building an application for Windows version 3.1, the linker links a default stub program called WINSTUB.EXE into your executable. You can override the default linker behavior by substituting your own valid MS-DOS-based program in place of WINSTUB and indicating this to the linker with the STUB module definition statement. Applications developed for Windows NT can do the same thing by using the -STUB: linker option when linking the executable file.

PE File Header and Signature

The PE file header is located by indexing the e_lfanew field of the MS-DOS header. The e_lfanew field simply gives the offset in the file, so add the file's memory-mapped base address to determine the actual memory-mapped address, as the following macro does:

#define NTSIGNATURE(a) ((LPVOID)((BYTE *)a +    \
                        ((PIMAGE_DOS_HEADER)a)->e_lfanew))

When manipulating PE file information, I found that there were several locations in the file that I needed to refer to often. Since these locations are merely offsets into the file, it is easier to implement these locations as macros because they provide much better performance than functions do.

Notice that instead of retrieving the offset of the PE file header, this macro retrieves the location of the PE file signature. Starting with Windows and OS/2 executables, .EXE files were given file signatures to specify the intended target operating system. For the PE file format in Windows NT, this signature occurs immediately before the PE file header structure. In versions of Windows and OS/2, the signature is the first word of the file header. Also, for the PE file format, Windows NT uses a DWORD for the signature.

The macro presented above returns the offset of where the file signature appears, regardless of which type of executable file it is. So depending on whether it's a Windows NT file signature or not, the file header exists either after the signature DWORD or at the signature WORD.

DWORD  WINAPI ImageFileType (
    LPVOID    lpFile)
{
    /* DOS file signature comes first. */
    if (*(USHORT *)lpFile == IMAGE_DOS_SIGNATURE)
    {
        /* Determine location of PE File header from DOS header. */
        if (LOWORD (*(DWORD *)NTSIGNATURE (lpFile)) ==
                IMAGE_OS2_SIGNATURE ||
            LOWORD (*(DWORD *)NTSIGNATURE (lpFile)) ==
                IMAGE_OS2_SIGNATURE_LE)
            return (DWORD)LOWORD(*(DWORD *)NTSIGNATURE (lpFile));

        else if (*(DWORD *)NTSIGNATURE (lpFile) ==
                     IMAGE_NT_SIGNATURE)
            return IMAGE_NT_SIGNATURE;

        else
            return IMAGE_DOS_SIGNATURE;
    }
    else
        /* unknown file type */
        return 0;
}

The code listed above quickly shows how useful the NTSIGNATURE macro becomes. The macro makes it easy to compare the different file types and return the appropriate one for a given type of file. The four different file types defined in WINNT.H are:

WINNT.H

#define IMAGE_DOS_SIGNATURE             0x5A4D      // MZ
#define IMAGE_OS2_SIGNATURE             0x454E      // NE
#define IMAGE_OS2_SIGNATURE_LE          0x454C      // LE
#define IMAGE_NT_SIGNATURE              0x00004550  // PE00
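The same classification can be sketched without windows.h and with explicit bounds checks, which the memory-mapped version above leaves to the caller. This is an illustrative variant under stated assumptions, not the article's code: classify_image, le32_at, and the SIG_* names are hypothetical, and the offset 60 for e_lfanew is assumed from the 64-byte MS-DOS header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIG_DOS    0x5A4Du      /* "MZ" */
#define SIG_OS2    0x454Eu      /* "NE" */
#define SIG_OS2_LE 0x454Cu      /* "LE" */
#define SIG_NT     0x00004550u  /* "PE\0\0" */

static uint32_t le32_at(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns one of the SIG_* values, or 0 for an unknown file type. */
static uint32_t classify_image(const uint8_t *buf, size_t len)
{
    uint32_t lfanew, sig;

    assert(buf != NULL);
    if (len < 64 || (le32_at(buf) & 0xFFFFu) != SIG_DOS)
        return 0;

    lfanew = le32_at(buf + 60);
    /* No room for a 4-byte signature: treat as a plain MS-DOS program. */
    if (lfanew > len || len - lfanew < 4)
        return SIG_DOS;

    sig = le32_at(buf + lfanew);
    if (sig == SIG_NT)
        return SIG_NT;
    if ((sig & 0xFFFFu) == SIG_OS2 || (sig & 0xFFFFu) == SIG_OS2_LE)
        return sig & 0xFFFFu;
    return SIG_DOS;
}
```

Falling back to the MS-DOS signature when e_lfanew is out of range is a design choice of this sketch; a stricter tool might report such a file as malformed instead.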

At first it seems curious that Windows executable file types do not appear on this list. But then, after a little investigation, the reason becomes clear: There really is no difference between Windows executables and OS/2 executables other than the operating system version specification. Both operating systems share the same executable file structure.

Turning our attention back to the Windows NT PE file format, we find that once we have the location of the file signature, the PE file header follows four bytes later. The next macro identifies the PE file header:

#define PEFHDROFFSET(a) ((LPVOID)((BYTE *)a +  \
    ((PIMAGE_DOS_HEADER)a)->e_lfanew + SIZE_OF_NT_SIGNATURE))

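The only difference between this and the previous macro is that this one adds in the constant SIZE_OF_NT_SIGNATURE. Sad to say, this constant is not defined in WINNT.H; you must define it yourself as the size of a DWORD (four bytes), matching the signature described above.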

Now that we know the location of the PE file header, we can examine the data in the header simply by assigning this location to a structure, as in the following example:

PIMAGE_FILE_HEADER   pfh;

pfh = (PIMAGE_FILE_HEADER)PEFHDROFFSET (lpFile);

In this example, lpFile represents a pointer to the base of the memory-mapped executable file, and therein lies the convenience of memory-mapped files. No file I/O needs to be performed; simply dereference the pointer pfh to access information in the file. The PE file header structure is defined as:

WINNT.H

typedef struct _IMAGE_FILE_HEADER {
    USHORT  Machine;
    USHORT  NumberOfSections;
    ULONG   TimeDateStamp;
    ULONG   PointerToSymbolTable;
    ULONG   NumberOfSymbols;
    USHORT  SizeOfOptionalHeader;
    USHORT  Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

#define IMAGE_SIZEOF_FILE_HEADER             20

Notice that the size of the file header structure is conveniently defined in the include file...

The information in the PE file header is basically high-level information that is used by the system or applications to determine how to treat the file. The first field is used to indicate what type of machine the executable was built for, such as the DEC® Alpha, MIPS R4000, Intel® x86, or some other processor. The system uses this information to quickly determine how to treat the file before going any further into the rest of the file data.

The Characteristics field identifies specific characteristics about the file. For example, consider how separate debug files are managed for an executable. It is possible to strip debug information from a PE file and store it in a debug file (.DBG) for use by debuggers. To do this, a debugger needs to know whether to find the debug information in a separate file or not and whether the information has been stripped from the file or not. A debugger could find out by drilling down into the executable file looking for debug information. To save the debugger from having to search the file, a file characteristic that indicates that the file has been stripped (IMAGE_FILE_DEBUG_STRIPPED) was invented. Debuggers can look in the PE file header to quickly determine whether the debug information is present in the file or not.

WINNT.H defines several other flags that indicate file header information much the way the example described above does. I'll leave it as an exercise for the reader to look up the flags to see if any of them are interesting or not. They are located in WINNT.H immediately after the IMAGE_FILE_HEADER structure described above.

One other useful entry in the PE file header structure is the NumberOfSections field. It turns out that you need to know how many sections--more specifically, how many section headers and section bodies--are in the file in order to extract the information easily. Each section header and section body is laid out sequentially in the file, so the number of sections is necessary to determine where the section headers and bodies end. The following function extracts the number of sections from the PE file header:

int   WINAPI NumOfSections (
    LPVOID    lpFile)
{
    /* Number of sections is indicated in file header. */
    return (int)((PIMAGE_FILE_HEADER)
                  PEFHDROFFSET (lpFile))->NumberOfSections;
}

As you can see, the PEFHDROFFSET and the other macros are pretty handy to have around.

PE Optional Header

The next 224 bytes in the executable file make up the PE optional header. Though its name is "optional header," rest assured that this is not an optional entry in PE executable files. A pointer to the optional header is obtained with the OPTHDROFFSET macro:

#define OPTHDROFFSET(a) ((LPVOID)((BYTE *)a                 + \
    ((PIMAGE_DOS_HEADER)a)->e_lfanew + SIZE_OF_NT_SIGNATURE + \
    sizeof (IMAGE_FILE_HEADER)))

The optional header contains most of the meaningful information about the executable image, such as initial stack size, program entry point location, preferred base address, operating system version, section alignment information, and so forth. The IMAGE_OPTIONAL_HEADER structure represents the optional header as follows:

WINNT.H

typedef struct _IMAGE_OPTIONAL_HEADER {
    //
    // Standard fields.
    //
    USHORT  Magic;
    UCHAR   MajorLinkerVersion;
    UCHAR   MinorLinkerVersion;
    ULONG   SizeOfCode;
    ULONG   SizeOfInitializedData;
    ULONG   SizeOfUninitializedData;
    ULONG   AddressOfEntryPoint;
    ULONG   BaseOfCode;
    ULONG   BaseOfData;
    //
    // NT additional fields.
    //
    ULONG   ImageBase;
    ULONG   SectionAlignment;
    ULONG   FileAlignment;
    USHORT  MajorOperatingSystemVersion;
    USHORT  MinorOperatingSystemVersion;
    USHORT  MajorImageVersion;
    USHORT  MinorImageVersion;
    USHORT  MajorSubsystemVersion;
    USHORT  MinorSubsystemVersion;
    ULONG   Reserved1;
    ULONG   SizeOfImage;
    ULONG   SizeOfHeaders;
    ULONG   CheckSum;
    USHORT  Subsystem;
    USHORT  DllCharacteristics;
    ULONG   SizeOfStackReserve;
    ULONG   SizeOfStackCommit;
    ULONG   SizeOfHeapReserve;
    ULONG   SizeOfHeapCommit;
    ULONG   LoaderFlags;
    ULONG   NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;

As you can see, the list of fields in this structure is rather lengthy. Rather than bore you with descriptions of all of these fields, I'll simply discuss the useful ones--that is, useful in the context of exploring the PE file format.

Standard Fields

First, note that the structure is divided into "Standard fields" and "NT additional fields." The standard fields are those common to the Common Object File Format (COFF), which most UNIX executable files use. Though the standard fields retain the names defined in COFF, Windows NT actually uses some of them for different purposes that would be better described with other names.

  • Magic. I was unable to track down what this field is used for.
  • MajorLinkerVersion, MinorLinkerVersion. Indicates version of the linker that linked this image.
  • SizeOfCode. Size of executable code.
  • SizeOfInitializedData. Size of initialized data.
  • SizeOfUninitializedData. Size of uninitialized data.
  • AddressOfEntryPoint. Of the standard fields, the AddressOfEntryPoint field is the most interesting for the PE file format. This field indicates the location of the entry point for the application and, perhaps more importantly to system hackers, the location of the end of the Import Address Table (IAT). The following function demonstrates how to retrieve the entry point of a Windows NT executable image from the optional header.
     
    LPVOID  WINAPI GetModuleEntryPoint (
        LPVOID    lpFile)
    {
        PIMAGE_OPTIONAL_HEADER   poh;

        poh = (PIMAGE_OPTIONAL_HEADER)OPTHDROFFSET (lpFile);

        if (poh != NULL)
            return (LPVOID)poh->AddressOfEntryPoint;
        else
            return NULL;
    }
  • BaseOfCode. Relative offset of code (".text" section) in loaded image.
  • BaseOfData. Relative offset of uninitialized data (".bss" section) in loaded image.

Windows NT Additional Fields

The additional fields added to the Windows NT PE file format provide loader support for much of the Windows NT-specific process behavior. Following is a summary of these fields.

  • ImageBase. Preferred base address in the address space of a process to map the executable image to. The linker defaults to 0x00400000, but you can override the default with the -BASE: linker switch.
  • SectionAlignment. Each section is loaded into the address space of a process sequentially, beginning at ImageBase. SectionAlignment dictates the minimum amount of space a section can occupy when loaded--that is, sections are aligned on SectionAlignment boundaries.

    Section alignment can be no less than the page size (currently 4096 bytes on the x86 platform) and must be a multiple of the page size as dictated by the behavior of Windows NT's virtual memory manager. 4096 bytes is the x86 linker default, but this can be set using the -ALIGN: linker switch.

  • FileAlignment. Minimum granularity of chunks of information within the image file prior to loading. For example, the linker zero-pads a section body (raw data for a section) up to the nearest FileAlignment boundary in the file. This value is constrained to be a power of 2 between 512 and 65,536.
  • MajorOperatingSystemVersion. Indicates the major version of the Windows NT operating system.
  • MinorOperatingSystemVersion. Indicates the minor version of the Windows NT operating system.
  • MajorImageVersion. Used to indicate the major version number of the application.
  • MinorImageVersion. Used to indicate the minor version number of the application.
  • MajorSubsystemVersion. Indicates the Windows NT Win32 subsystem major version number.
  • MinorSubsystemVersion. Indicates the Windows NT Win32 subsystem minor version number.
  • Reserved1. Unknown purpose, currently not used by the system and set to zero by the linker.
  • SizeOfImage. Indicates the amount of address space to reserve in the address space for the loaded executable image. This number is influenced greatly by SectionAlignment. For example, consider a system having a fixed page size of 4096 bytes. If you have an executable with 11 sections, each less than 4096 bytes, aligned on a 65,536-byte boundary, the SizeOfImage field would be set to 11 * 65,536 = 720,896 (176 pages). The same file linked with 4096-byte alignment would result in 11 * 4096 = 45,056 (11 pages) for the SizeOfImage field. This is a simple example in which each section requires less than a page of memory. In reality, the linker determines the exact SizeOfImage by figuring each section individually. It first determines how many bytes the section requires, then it rounds up to the nearest page boundary, and finally it rounds page count to the nearest SectionAlignment boundary. The total is then the sum of each section's individual requirement.
  • SizeOfHeaders. This field indicates how much space in the file is used for representing all the file headers, including the MS-DOS header, PE file header, PE optional header, and PE section headers. The section bodies begin at this location in the file.
  • CheckSum. A checksum value is used to validate the executable file at load time. The value is set and verified by the linker. The algorithm used for creating these checksum values is proprietary information and will not be published.
  • Subsystem. Field used to identify the target subsystem for this executable. Each of the possible subsystem values is listed in the WINNT.H file immediately after the IMAGE_OPTIONAL_HEADER structure.
  • DllCharacteristics. Flags used to indicate if a DLL image includes entry points for process and thread initialization and termination.
  • SizeOfStackReserve, SizeOfStackCommit, SizeOfHeapReserve, SizeOfHeapCommit. These fields control the amount of address space to reserve and commit for the stack and default heap. Both the stack and heap have default values of 1 page committed and 16 pages reserved. These values are set with the linker switches -STACKSIZE: and -HEAPSIZE:.
  • LoaderFlags. Tells the loader whether to break on load, debug on load, or the default, which is to let things run normally.
  • NumberOfRvaAndSizes. This field identifies the length of the DataDirectory array that follows. It is important to note that this field is used to identify the size of the array, not the number of valid entries in the array.
  • DataDirectory. The data directory indicates where to find other important components of executable information in the file. It is really nothing more than an array of IMAGE_DATA_DIRECTORY structures that are located at the end of the optional header structure. The current PE file format defines 16 possible data directories, 11 of which are now being used.
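The SizeOfImage arithmetic described in the list above can be sketched in a few lines. This is only an illustration of the per-section rounding the text describes: the function names are hypothetical, and the sketch ignores the space taken by the headers, just as the worked example in the text does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Sum each section's size after rounding it to a page boundary and
   then to a SectionAlignment boundary. Headers are not counted. */
static uint32_t estimate_size_of_image(const uint32_t *section_sizes,
                                       size_t count,
                                       uint32_t page_size,
                                       uint32_t section_alignment)
{
    uint32_t total = 0;
    size_t i;

    assert(section_sizes != NULL || count == 0);
    for (i = 0; i < count; i++) {
        uint32_t paged = align_up(section_sizes[i], page_size);
        total += align_up(paged, section_alignment);
    }
    return total;
}
```

With eleven sections of 1,000 bytes each, a 4096-byte page, and 65,536-byte alignment, the function yields 720,896, matching the figure in the text; with 4096-byte alignment it yields 45,056.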

Data Directories

As defined in WINNT.H, the data directories are:

WINNT.H

// Directory Entries

// Export Directory
#define IMAGE_DIRECTORY_ENTRY_EXPORT         0
// Import Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT         1
// Resource Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE       2
// Exception Directory
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION      3
// Security Directory
#define IMAGE_DIRECTORY_ENTRY_SECURITY       4
// Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_BASERELOC      5
// Debug Directory
#define IMAGE_DIRECTORY_ENTRY_DEBUG          6
// Description String
#define IMAGE_DIRECTORY_ENTRY_COPYRIGHT      7
// Machine Value (MIPS GP)
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR      8
// TLS Directory
#define IMAGE_DIRECTORY_ENTRY_TLS            9
// Load Configuration Directory
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG    10

Each data directory is basically a structure defined as an IMAGE_DATA_DIRECTORY. And although data directory entries themselves are the same, each specific directory type is entirely unique. The definition of each defined data directory is described in "Predefined Sections" later in this article.

WINNT.H

typedef struct _IMAGE_DATA_DIRECTORY {
    ULONG   VirtualAddress;
    ULONG   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

Each data directory entry specifies the size and relative virtual address of the directory. To locate a particular directory, you determine the relative address from the data directory array in the optional header. Then use the virtual address to determine which section the directory is in. Once you determine which section contains the directory, the section header for that section is then used to find the exact file offset location of the data directory.

So to get a data directory, you first need to know about sections, which are described next. An example of how to locate data directories immediately follows this discussion.

PE File Sections

The PE file specification consists of the headers defined so far and a generic object called a section. Sections contain the content of the file, including code, data, resources, and other executable information. Each section has a header and a body (the raw data). Section headers are described below, but section bodies lack a rigid file structure. They can be organized in almost any way a linker wishes to organize them, as long as the header is filled with enough information to be able to decipher the data.

Section Headers

Section headers are located sequentially right after the optional header in the PE file format. Each section header is 40 bytes with no padding between them. Section headers are defined as in the following structure:

WINNT.H

#define IMAGE_SIZEOF_SHORT_NAME              8

typedef struct _IMAGE_SECTION_HEADER {
    UCHAR   Name[IMAGE_SIZEOF_SHORT_NAME];
    union {
        ULONG   PhysicalAddress;
        ULONG   VirtualSize;
    } Misc;
    ULONG   VirtualAddress;
    ULONG   SizeOfRawData;
    ULONG   PointerToRawData;
    ULONG   PointerToRelocations;
    ULONG   PointerToLinenumbers;
    USHORT  NumberOfRelocations;
    USHORT  NumberOfLinenumbers;
    ULONG   Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
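Because the section headers are fixed-size records that follow the optional header, the file offset of any one of them is simple arithmetic. The sketch below shows that calculation on a raw byte buffer. The function and macro names are hypothetical, and it assumes the optional header's length is taken from the SizeOfOptionalHeader field of the file header rather than from a compile-time sizeof; the definition of the article's own SECHDROFFSET macro is not shown in this section and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NT_SIGNATURE_SIZE     4u  /* DWORD "PE\0\0" */
#define FILE_HEADER_SIZE     20u  /* IMAGE_SIZEOF_FILE_HEADER */
#define SECTION_HEADER_SIZE  40u  /* one IMAGE_SECTION_HEADER */

static uint16_t le16_at(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32_at(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* File offset of section header `index`, or 0 when the index is out of
   range or the buffer is too small to hold that header. */
static size_t section_header_offset(const uint8_t *buf, size_t len,
                                    unsigned index)
{
    uint32_t lfanew;
    size_t file_hdr, off;
    uint16_t nsections, opt_size;

    assert(buf != NULL);
    if (len < 64)
        return 0;
    lfanew = le32_at(buf + 60);                  /* e_lfanew */
    if (lfanew > len ||
        len - lfanew < NT_SIGNATURE_SIZE + FILE_HEADER_SIZE)
        return 0;

    file_hdr  = (size_t)lfanew + NT_SIGNATURE_SIZE;
    nsections = le16_at(buf + file_hdr + 2);     /* NumberOfSections */
    opt_size  = le16_at(buf + file_hdr + 16);    /* SizeOfOptionalHeader */
    if (index >= nsections)
        return 0;

    off = file_hdr + FILE_HEADER_SIZE + opt_size +
          (size_t)index * SECTION_HEADER_SIZE;
    if (off > len || len - off < SECTION_HEADER_SIZE)
        return 0;
    return off;
}
```

For an image with e_lfanew of 0x40 and a 224-byte optional header, the first section header lands at offset 312 and each subsequent one 40 bytes later.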

How do you go about getting section header information for a particular section? Since section headers are organized sequentially in no specific order, section headers must be located by name. The following function shows how to retrieve a section header from a PE image file given the name of the section:

BOOL    WINAPI GetSectionHdrByName (
    LPVOID                   lpFile,
    IMAGE_SECTION_HEADER     *sh,
    char                     *szSection)
{
    PIMAGE_SECTION_HEADER    psh;
    int                      nSections = NumOfSections (lpFile);
    int                      i;

    if ((psh = (PIMAGE_SECTION_HEADER)SECHDROFFSET (lpFile)) != NULL)
    {
        /* find the section by name */
        for (i=0; i<nSections; i++)
        {
            /* Name has no terminating NUL when it is exactly eight
               characters long, so bound the comparison. */
            if (!strncmp ((char *)psh->Name, szSection,
                          IMAGE_SIZEOF_SHORT_NAME))
            {
                /* copy data to header */
                CopyMemory ((LPVOID)sh,
                            (LPVOID)psh,
                            sizeof (IMAGE_SECTION_HEADER));
                return TRUE;
            }
            else
                psh++;
        }
    }

    return FALSE;
}

The function simply locates the first section header via the SECHDROFFSET macro. Then the function loops through each section, comparing each section's name with the name of the section it's looking for, until it finds the right one. When the section is found, the function copies the data from the memory-mapped file to the structure passed in to the function. The fields of the IMAGE_SECTION_HEADER structure can then be accessed directly from the structure.

Section Header Fields

  • Name. Each section header has a name field up to eight characters long, for which the first character must be a period.
  • PhysicalAddress or VirtualSize. The second field is a union field that is not currently used.
  • VirtualAddress. This field identifies the virtual address in the process's address space to which to load the section. The actual address is created by taking the value of this field and adding it to the ImageBase virtual address in the optional header structure. Keep in mind, though, that if this image file represents a DLL, there is no guarantee that the DLL will be loaded to the ImageBase location requested. So once the file is loaded into a process, the actual ImageBase value should be verified programmatically using GetModuleHandle.
  • SizeOfRawData. This field indicates the FileAlignment-relative size of the section body. The actual size of the section body will be less than or equal to a multiple of FileAlignment in the file. Once the image is loaded into a process's address space, the size of the section body becomes less than or equal to a multiple of SectionAlignment.
  • PointerToRawData. This is an offset to the location of the section body in the file.
  • PointerToRelocations, PointerToLinenumbers, NumberOfRelocations, NumberOfLinenumbers. None of these fields are used in the PE file format.
  • Characteristics. Defines the section characteristics. These values are found both in WINNT.H and in the Portable Executable Format specification located on this CD.
    Value        Definition
    0x00000020   Code section
    0x00000040   Initialized data section
    0x00000080   Uninitialized data section
    0x04000000   Section cannot be cached
    0x08000000   Section is not pageable
    0x10000000   Section is shared
    0x20000000   Executable section
    0x40000000   Readable section
    0x80000000   Writable section
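The three access-related flags in the table above are independent bits, so a section's permissions can be summarized by testing each one. The following is a small sketch along those lines; section_access_string and the SCN_* names are hypothetical, while the flag values come from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCN_EXECUTE  0x20000000u  /* Executable section */
#define SCN_READ     0x40000000u  /* Readable section */
#define SCN_WRITE    0x80000000u  /* Writable section */

/* Writes a three-character "rwx"-style summary plus a terminating NUL
   into out, which must have room for four characters. */
static void section_access_string(uint32_t characteristics, char out[4])
{
    assert(out != NULL);
    out[0] = (characteristics & SCN_READ)    ? 'r' : '-';
    out[1] = (characteristics & SCN_WRITE)   ? 'w' : '-';
    out[2] = (characteristics & SCN_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}
```

For example, a Characteristics value of 0x60000020 (code, executable, readable) produces "r-x", and 0xC0000040 (initialized data, readable, writable) produces "rw-".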

Locating Data Directories

Data directories exist within the body of their corresponding data section. Typically, data directories are the first structure within the section body, but not out of necessity. For that reason, you need to retrieve information from both the section header and optional header to locate a specific data directory.

To make this process easier, the following function was written to locate the data directory for any of the directories defined in WINNT.H:

LPVOID  WINAPI ImageDirectoryOffset (
    LPVOID    lpFile,
    DWORD     dwIMAGE_DIRECTORY)
{
    PIMAGE_OPTIONAL_HEADER   poh;
    PIMAGE_SECTION_HEADER    psh;
    int                      nSections = NumOfSections (lpFile);
    int                      i = 0;
    LPVOID                   VAImageDir;

    /* Retrieve offsets to optional and section headers.
       poh must be set before it is dereferenced below. */
    poh = (PIMAGE_OPTIONAL_HEADER)OPTHDROFFSET (lpFile);
    psh = (PIMAGE_SECTION_HEADER)SECHDROFFSET (lpFile);

    /* Must be 0 thru (NumberOfRvaAndSizes-1). */
    if (dwIMAGE_DIRECTORY >= poh->NumberOfRvaAndSizes)
        return NULL;

    /* Locate image directory's relative virtual address. */
    VAImageDir = (LPVOID)poh->DataDirectory
                     [dwIMAGE_DIRECTORY].VirtualAddress;

    /* Locate section containing image directory. */
    while (i++<nSections)
    {
        if (psh->VirtualAddress <= (DWORD)VAImageDir &&
            psh->VirtualAddress +
                psh->SizeOfRawData > (DWORD)VAImageDir)
            break;
        psh++;
    }

    if (i > nSections)
        return NULL;

    /* Return image import directory offset. */
    return (LPVOID)(((int)lpFile +
                     (int)VAImageDir - psh->VirtualAddress) +
                    (int)psh->PointerToRawData);
}

The function validates the requested data directory entry number and retrieves pointers to the optional header and first section header. From the optional header, the function determines the data directory's virtual address, and it uses this value to determine within which section body the data directory is located. Once the appropriate section body has been identified, the specific location of the data directory is found by translating the relative virtual address of the data directory to a specific address into the file.
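The translation step at the heart of that function, from a relative virtual address to a file offset, can be isolated as follows. This is a sketch under simplifying assumptions: struct section_range is a hypothetical stand-in that carries only the three IMAGE_SECTION_HEADER fields the calculation needs, and rva_to_file_offset is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the relevant IMAGE_SECTION_HEADER fields. */
struct section_range {
    uint32_t virtual_address;      /* VirtualAddress (an RVA) */
    uint32_t size_of_raw_data;     /* SizeOfRawData */
    uint32_t pointer_to_raw_data;  /* PointerToRawData (file offset) */
};

/* Returns 1 and stores the file offset in *file_offset when rva falls
   inside the raw data of one of the sections; returns 0 otherwise. */
static int rva_to_file_offset(const struct section_range *sections,
                              size_t count, uint32_t rva,
                              uint32_t *file_offset)
{
    size_t i;

    assert(file_offset != NULL);
    assert(sections != NULL || count == 0);
    for (i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;
        if (rva >= start && rva - start < sections[i].size_of_raw_data) {
            *file_offset = rva - start + sections[i].pointer_to_raw_data;
            return 1;
        }
    }
    return 0;
}
```

Given a section with VirtualAddress 0x2000 and PointerToRawData 0x0A00, the RVA 0x2010 maps to file offset 0x0A10. Comparing `rva - start` against the size avoids the overflow that `start + size` could produce on malformed headers.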

Predefined Sections

An application for Windows NT typically has the nine predefined sections named .text, .bss, .rdata, .data, .rsrc, .edata, .idata, .pdata, and .debug. Some applications do not need all of these sections, while others may define still more sections to suit their specific needs. This behavior is similar to code and data segments in MS-DOS and Windows version 3.1. In fact, the way an application defines a unique section is by using the standard compiler directives for naming code and data segments or by using the name segment compiler option -NT--exactly the same way in which applications defined unique code and data segments in Windows version 3.1.

The following is a discussion of some of the more interesting sections common to typical Windows NT PE files.

Executable code section, .text

One difference between Windows version 3.1 and Windows NT is that the default behavior combines all code segments (as they are referred to in Windows version 3.1) into a single section called ".text" in Windows NT. Since Windows NT uses a page-based virtual memory management system, there is no advantage to separating code into distinct code segments. Consequently, having one large code section is easier to manage for both the operating system and the application developer.

The .text section also contains the entry point mentioned earlier. The IAT also lives in the .text section immediately before the module entry point. (The IAT's presence in the .text section makes sense because the table is really a series of jump instructions, for which the specific location to jump to is the fixed-up address.) When Windows NT executable images are loaded into a process's address space, the IAT is fixed up with the location of each imported function's physical address. In order to find the IAT in the .text section, the loader simply locates the module entry point and relies on the fact that the IAT occurs immediately before the entry point. And since each entry is the same size, it is easy to walk backward in the table to find its beginning.

Data sections, .bss, .rdata, .data

The .bss section represents uninitialized data for the application, including all variables declared as static within a function or source module.

The .rdata section represents read-only data, such as literal strings, constants, and debug directory information.

All other variables (except automatic variables, which appear on the stack) are stored in the .data section. Basically, these are application or module global variables.

Resources section, .rsrc

The .rsrc section containsresource information for a module. It begins with a resourcedirectory structure like most other sections, but this section'sdata is further structured into a resource tree. The IMAGE_RESOURCE_DIRECTORY,shown below, forms the root and nodes of the tree.

WINNT.H

typedef struct _IMAGE_RESOURCE_DIRECTORY {
    ULONG   Characteristics;
    ULONG   TimeDateStamp;
    USHORT  MajorVersion;
    USHORT  MinorVersion;
    USHORT  NumberOfNamedEntries;
    USHORT  NumberOfIdEntries;
} IMAGE_RESOURCE_DIRECTORY, *PIMAGE_RESOURCE_DIRECTORY;

Looking at the directory structure, you won't find any pointer to the next nodes. Instead, there are two fields, NumberOfNamedEntries and NumberOfIdEntries, used to indicate how many entries are attached to the directory. By attached, I mean the directory entries follow immediately after the directory in the section data. The named entries appear first in ascending alphabetical order, followed by the ID entries in ascending numerical order.
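
As a minimal sketch of this layout, the following portable C assumes the resource section has already been read into a byte buffer. The helper names (rd16, res_entry_count, res_entry) are hypothetical and do not come from WINNT.H; the sizes follow from the structure definitions, 16 bytes for the directory header and 8 bytes for each entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RES_DIR_SIZE = 16, RES_ENTRY_SIZE = 8 };

/* Read a little-endian 16-bit value; PE files are little-endian. */
static unsigned rd16(const uint8_t *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Number of entries attached to a directory:
   NumberOfNamedEntries (offset 12) plus NumberOfIdEntries (offset 14). */
static unsigned res_entry_count(const uint8_t *dir)
{
    assert(dir != NULL);
    return rd16(dir + 12) + rd16(dir + 14);
}

/* Entry i (named entries first, then ID entries) follows the header. */
static const uint8_t *res_entry(const uint8_t *dir, unsigned i)
{
    assert(dir != NULL);
    return dir + RES_DIR_SIZE + (size_t)i * RES_ENTRY_SIZE;
}
```

Reading the fields byte by byte, rather than casting the buffer to a structure, keeps the sketch independent of the host's byte order and alignment rules.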

A directory entry consists of two fields, as described in the following IMAGE_RESOURCE_DIRECTORY_ENTRY structure:

WINNT.H

typedef struct _IMAGE_RESOURCE_DIRECTORY_ENTRY {
    ULONG   Name;
    ULONG   OffsetToData;
} IMAGE_RESOURCE_DIRECTORY_ENTRY, *PIMAGE_RESOURCE_DIRECTORY_ENTRY;

The two fields are used for different things depending on the level of the tree. The Name field is used to identify either a type of resource, a resource name, or a resource's language ID. The OffsetToData field is always used to point to a sibling in the tree, either a directory node or a leaf node.

Leaf nodes are the lowest node in the resource tree. They define the size and location of the actual resource data. Each leaf node is represented using the following IMAGE_RESOURCE_DATA_ENTRY structure:

WINNT.H

typedef struct _IMAGE_RESOURCE_DATA_ENTRY {
    ULONG   OffsetToData;
    ULONG   Size;
    ULONG   CodePage;
    ULONG   Reserved;
} IMAGE_RESOURCE_DATA_ENTRY, *PIMAGE_RESOURCE_DATA_ENTRY;

The two fields OffsetToData and Size indicate the location and size of the actual resource data. Since this information is used primarily by functions once the application has been loaded, it makes more sense to make the OffsetToData field a relative virtual address. This is precisely the case. Interestingly enough, all other offsets, such as pointers from directory entries to other directories, are offsets relative to the location of the root node.

To make all of this a little clearer, consider Figure 2.

Figure 2. A simple resource tree structure

Figure 2 depicts a very simple resource tree containing only two resource objects, a menu, and a string table. Further, the menu and string table have only one item each. Yet, you can see how complicated the resource tree becomes, even with as few resources as this.

At the root of the tree, the first directory has one entry for each type of resource the file contains, no matter how many of each type there are. In Figure 2, there are two entries identified by the root, one for the menu and one for the string table. If there had been one or more dialog resources included in the file, the root node would have had one more entry and, consequently, another branch for the dialog resources.

The basic resource types are identified in the file WINUSER.H and are listed below:

WINUSER.H

/*
 * Predefined Resource Types
 */
#define RT_CURSOR           MAKEINTRESOURCE(1)
#define RT_BITMAP           MAKEINTRESOURCE(2)
#define RT_ICON             MAKEINTRESOURCE(3)
#define RT_MENU             MAKEINTRESOURCE(4)
#define RT_DIALOG           MAKEINTRESOURCE(5)
#define RT_STRING           MAKEINTRESOURCE(6)
#define RT_FONTDIR          MAKEINTRESOURCE(7)
#define RT_FONT             MAKEINTRESOURCE(8)
#define RT_ACCELERATOR      MAKEINTRESOURCE(9)
#define RT_RCDATA           MAKEINTRESOURCE(10)
#define RT_MESSAGETABLE     MAKEINTRESOURCE(11)

At the top level of the tree, the MAKEINTRESOURCE values listed above are placed in the Name field of each type entry, identifying the different resources by type.

Each of the entries in the root directory points to a sibling node in the second level of the tree. These nodes are directories, too, each having their own entries. At this level, the directories are used to identify the name of each resource within a given type. If you had multiple menus defined in your application, there would be an entry for each one here at the second level of the tree.

As you are probably already aware, resources can be identified by name or by integer. They are distinguished in this level of the tree via the Name field in the directory structure. If the most significant bit of the Name field is set, the other 31 bits are used as an offset to an IMAGE_RESOURCE_DIR_STRING_U structure.

WINNT.H

typedef struct _IMAGE_RESOURCE_DIR_STRING_U {
    USHORT  Length;
    WCHAR   NameString[ 1 ];
} IMAGE_RESOURCE_DIR_STRING_U, *PIMAGE_RESOURCE_DIR_STRING_U;

This structure is simply a 2-byte Length field followed by Length UNICODE characters.
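
Because the string is length-prefixed rather than null-terminated, it has to be copied out before it can be used with ordinary C string functions. The following is a small sketch under stated assumptions: the function name res_name_to_ascii is hypothetical, the characters are taken to be 16-bit little-endian, and anything outside the ASCII range is replaced with '?' purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy a length-prefixed 16-bit resource name into a char buffer.
   Non-ASCII characters become '?' (a simplification for this sketch).
   Returns the number of chars written, not counting the final NUL. */
static size_t res_name_to_ascii(const uint8_t *str, char *out, size_t cap)
{
    assert(str != NULL && out != NULL && cap > 0);
    size_t len = (size_t)str[0] | ((size_t)str[1] << 8);   /* Length */
    size_t n = 0;
    for (size_t i = 0; i < len && n + 1 < cap; i++) {
        unsigned ch = (unsigned)str[2 + 2 * i] |
                      ((unsigned)str[3 + 2 * i] << 8);
        out[n++] = (ch < 0x80u) ? (char)ch : '?';
    }
    out[n] = '\0';
    return n;
}
```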

On the other hand, if the most significant bit of the Name field is clear, the lower 31 bits are used to represent the integer ID of the resource. Figure 2 shows the menu resource as a named resource and the string table as an ID resource.

If there were two menu resources, one identified by name and one by integer ID, they would both have entries immediately after the menu resource directory. The named resource entry would appear first, followed by the integer-identified resource. The directory fields NumberOfNamedEntries and NumberOfIdEntries would each contain the value 1, indicating the presence of one entry.

Below level two, the resource tree does not branch out any further. Level one branches into directories representing each type of resource, and level two branches into directories representing each resource by identifier. Level three maps a one-to-one correspondence between the individually identified resources and their respective language IDs. To indicate the language ID of a resource, the Name field of the directory entry structure is used to indicate both the primary language and sublanguage ID for the resource. For the value 0x0409, the low 10 bits (0x09) represent the primary language, LANG_ENGLISH, and the upper 6 bits (0x01) represent the sublanguage, SUBLANG_ENGLISH_US. The entire set of language IDs is defined in the file WINNT.H.

Since the language ID node is the last directory node in the tree, the OffsetToData field in the entry structure is an offset to a leaf node, the IMAGE_RESOURCE_DATA_ENTRY structure mentioned earlier.

Referring back to Figure 2, you can see one data entry node for each language directory entry. This node simply indicates the size of the resource data and the relative virtual address where the resource data is located.
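
Putting the three levels together, a walk over the whole tree might look like the following sketch, which counts the leaf nodes in a resource section held in a byte buffer. It leans on one detail not spelled out above: in WINNT.H the high bit of OffsetToData (IMAGE_RESOURCE_DATA_IS_DIRECTORY) marks an entry that leads to another directory rather than to a data entry. The names rd16, rd32, and count_leaves are hypothetical, and the depth limit of three simply mirrors the type, name, and language levels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rd16(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count the data entries reachable from the directory at dir_off.
   All offsets are relative to the start of the section (the root).
   Call with dir_off = 0 and depth = 1 for the root directory. */
static unsigned count_leaves(const uint8_t *rsrc, size_t size,
                             uint32_t dir_off, int depth)
{
    assert(rsrc != NULL);
    if (depth > 3 || (size_t)dir_off + 16 > size)
        return 0;                           /* too deep or truncated */

    uint32_t n = rd16(rsrc + dir_off + 12) + rd16(rsrc + dir_off + 14);
    unsigned leaves = 0;
    for (uint32_t i = 0; i < n; i++) {
        size_t ent = (size_t)dir_off + 16 + (size_t)i * 8;
        if (ent + 8 > size)
            break;
        uint32_t target = rd32(rsrc + ent + 4);     /* OffsetToData */
        if (target & 0x80000000u)                   /* another directory */
            leaves += count_leaves(rsrc, size,
                                   target & 0x7FFFFFFFu, depth + 1);
        else                                        /* data entry */
            leaves++;
    }
    return leaves;
}
```

The bounds checks are there because a tool that inspects arbitrary files cannot assume the tree is well formed.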

One advantage to having so much structure to the resource data section, .rsrc, is that you can glean a great deal of information from the section without accessing the resources themselves. For example, you can find out how many there are of each type of resource, what resources, if any, use a particular language ID, whether a particular resource exists or not, and the size of individual types of resources. To demonstrate how to make use of this information, the following function shows how to determine the different types of resources a file includes:

int     WINAPI GetListOfResourceTypes (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszResTypes)
{
    PIMAGE_RESOURCE_DIRECTORY          prdRoot;
    PIMAGE_RESOURCE_DIRECTORY_ENTRY    prde;
    char                               *pMem;
    int                                nCnt, i;

    /* Get root directory of resource tree. */
    if ((prdRoot = (PIMAGE_RESOURCE_DIRECTORY)ImageDirectoryOffset
            (lpFile, IMAGE_DIRECTORY_ENTRY_RESOURCE)) == NULL)
        return 0;

    /* Allocate enough space from heap to cover all types. */
    nCnt = prdRoot->NumberOfIdEntries * (MAXRESOURCENAME + 1);
    *pszResTypes = (char *)HeapAlloc (hHeap,
                                      HEAP_ZERO_MEMORY,
                                      nCnt);
    if ((pMem = *pszResTypes) == NULL)
        return 0;

    /* Set pointer to first resource type entry. */
    prde = (PIMAGE_RESOURCE_DIRECTORY_ENTRY)((DWORD)prdRoot +
               sizeof (IMAGE_RESOURCE_DIRECTORY));

    /* Loop through all resource directory entry types.
       hDll is assumed to be a module handle, declared elsewhere,
       whose string table holds the resource type names. */
    for (i=0; i<prdRoot->NumberOfIdEntries; i++)
        {
        if (LoadString (hDll, prde->Name, pMem, MAXRESOURCENAME))
            pMem += strlen (pMem) + 1;

        prde++;
        }

    return nCnt;
}

This function returns a list of resource type names in the string identified by pszResTypes. Notice that, at the heart of this function, LoadString is called using the Name field of each resource type directory entry as the string ID. This relies on a table of resource type strings whose IDs are defined the same as the type specifiers in the directory entries. It would be rather easy to expand on these functions or write new functions that extracted other information from this section.

Export data section, .edata

The .edata section contains export data for an application or DLL. When present, this section contains an export directory for getting to the export information.

WINNT.H

typedef struct _IMAGE_EXPORT_DIRECTORY {
    ULONG   Characteristics;
    ULONG   TimeDateStamp;
    USHORT  MajorVersion;
    USHORT  MinorVersion;
    ULONG   Name;
    ULONG   Base;
    ULONG   NumberOfFunctions;
    ULONG   NumberOfNames;
    PULONG  *AddressOfFunctions;
    PULONG  *AddressOfNames;
    PUSHORT *AddressOfNameOrdinals;
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

The Name field in the export directory identifies the name of the executable module. The NumberOfFunctions and NumberOfNames fields indicate how many functions and function names are being exported from the module.

The AddressOfFunctions field is an offset to a list of exported function entry points. The AddressOfNames field is the address of an offset to the beginning of a null-separated list of exported function names. AddressOfNameOrdinals is an offset to a list of ordinal values (each 2 bytes long) for the same exported functions.
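
One way to picture how the three lists work together is a lookup by name. The sketch below assumes the lists have already been translated into ordinary in-memory arrays, and it relies on a detail beyond the description above: each 2-byte value in the ordinal list is an index into the list of entry points. The function name find_export and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the entry-point RVA for an exported name, or 0 if not found.
   names[i] and ordinals[i] are parallel lists of length n_names;
   ordinals[i] is treated as an index into functions[]. */
static uint32_t find_export(const char *const *names,
                            const uint16_t *ordinals, size_t n_names,
                            const uint32_t *functions, size_t n_functions,
                            const char *wanted)
{
    assert(names != NULL && ordinals != NULL);
    assert(functions != NULL && wanted != NULL);
    for (size_t i = 0; i < n_names; i++) {
        if (strcmp(names[i], wanted) == 0) {
            uint16_t idx = ordinals[i];
            return idx < n_functions ? functions[idx] : 0;
        }
    }
    return 0;
}
```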

The three AddressOf... fields are relative virtual addresses into the address space of a process once the module has been loaded. Once the module is loaded, the relative virtual address should be added to the module base address to get the exact location in the address space of the process. Before the file is loaded, however, the address can be determined by subtracting the section header virtual address (VirtualAddress) from the given field address, adding the section body offset (PointerToRawData) to the result, and then using this value as an offset into the image file. The following example illustrates this technique:

int  WINAPI GetExportFunctionNames (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszFunctions)
{
    IMAGE_SECTION_HEADER       sh;
    PIMAGE_EXPORT_DIRECTORY    ped;
    char                       *pNames, *pCnt;
    int                        i, nCnt;

    /* Get section header and pointer to data directory
       for .edata section. */
    if ((ped = (PIMAGE_EXPORT_DIRECTORY)ImageDirectoryOffset
            (lpFile, IMAGE_DIRECTORY_ENTRY_EXPORT)) == NULL)
        return 0;
    GetSectionHdrByName (lpFile, &sh, ".edata");

    /* Determine the offset of the export function names. */
    pNames = (char *)(*(int *)((int)ped->AddressOfNames -
                               (int)sh.VirtualAddress   +
                               (int)sh.PointerToRawData +
                               (int)lpFile)    -
                      (int)sh.VirtualAddress   +
                      (int)sh.PointerToRawData +
                      (int)lpFile);

    /* Figure out how much memory to allocate for all strings. */
    pCnt = pNames;
    for (i=0; i<(int)ped->NumberOfNames; i++)
        while (*pCnt++);
    nCnt = (int)(pCnt - pNames);

    /* Allocate memory off heap for function names. */
    *pszFunctions = HeapAlloc (hHeap, HEAP_ZERO_MEMORY, nCnt);
    if (*pszFunctions == NULL)
        return 0;

    /* Copy all strings to buffer. */
    CopyMemory ((LPVOID)*pszFunctions, (LPVOID)pNames, nCnt);

    return nCnt;
}

Notice that in this function the variable pNames is assigned by determining first the address of the offset and then the actual offset location. Both the address of the offset and the offset itself are relative virtual addresses and must be translated before being used, as the function demonstrates. You could write a similar function to determine the ordinal values or entry points of the functions.

Import data section, .idata

The .idata section is import data, including the import directory and import address name table. Although an IMAGE_DIRECTORY_ENTRY_IMPORT directory is defined, no corresponding import directory structure is included in the file WINNT.H. Instead, there are several other structures called IMAGE_IMPORT_BY_NAME, IMAGE_THUNK_DATA, and IMAGE_IMPORT_DESCRIPTOR. Personally, I couldn't make heads or tails of how these structures are supposed to correlate to the .idata section, so I spent several hours deciphering the .idata section body and came up with a much simpler structure. I named this structure IMAGE_IMPORT_MODULE_DIRECTORY.

typedef struct tagImportDirectory
{
    DWORD    dwRVAFunctionNameList;
    DWORD    dwUseless1;
    DWORD    dwUseless2;
    DWORD    dwRVAModuleName;
    DWORD    dwRVAFunctionAddressList;
} IMAGE_IMPORT_MODULE_DIRECTORY, * PIMAGE_IMPORT_MODULE_DIRECTORY;

Unlike the data directories of other sections, this one repeats one after another for each imported module in the file. Think of it as an entry in a list of module data directories, rather than a data directory to the entire section of data. Each entry is a directory to the import information for a specific module.

One of the fields in the IMAGE_IMPORT_MODULE_DIRECTORY structure is dwRVAModuleName, a relative virtual address pointing to the name of the module. There are also two dwUseless parameters in the structure that serve as padding to keep the structure aligned properly within the section. The PE file format specification mentions something about import flags, a time/date stamp, and major/minor versions, but these two fields remained empty throughout my experimentation, so I still consider them useless.

Based on the definition of this structure, you can retrieve the names of modules and all functions in each module that are imported by an executable file. The following function demonstrates how to retrieve all the module names imported by a particular PE file:

int  WINAPI GetImportModuleNames (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszModules)
{
    PIMAGE_IMPORT_MODULE_DIRECTORY  pid;
    IMAGE_SECTION_HEADER            idsh;
    BYTE                            *pData;
    int                             nCnt = 0, nSize = 0, i;
    char                            *pModule[1024];
    char                            *psz;

    pid = (PIMAGE_IMPORT_MODULE_DIRECTORY)ImageDirectoryOffset
              (lpFile, IMAGE_DIRECTORY_ENTRY_IMPORT);
    pData = (BYTE *)pid;

    /* Locate section header for ".idata" section. */
    if (!GetSectionHdrByName (lpFile, &idsh, ".idata"))
        return 0;

    /* Extract all import modules. */
    while (pid->dwRVAModuleName)
        {
        /* Allocate buffer for absolute string offsets. */
        pModule[nCnt] = (char *)(pData +
            (pid->dwRVAModuleName-idsh.VirtualAddress));
        nSize += strlen (pModule[nCnt]) + 1;

        /* Increment to the next import directory entry. */
        pid++;
        nCnt++;
        }

    /* Copy all strings to one chunk of heap memory. */
    *pszModules = HeapAlloc (hHeap, HEAP_ZERO_MEMORY, nSize);
    psz = *pszModules;
    /* The listing breaks off at "for (i=0; i"; the remainder of this
       loop is reconstructed from the declarations above. */
    for (i=0; i<nCnt; i++)
        {
        strcpy (psz, pModule[i]);
        psz += strlen (psz) + 1;
        }

    return nCnt;
}

The function is pretty straightforward. However, one thing is worth pointing out: notice the while loop. This loop is terminated when pid->dwRVAModuleName is 0. Implied here is that at the end of the list of IMAGE_IMPORT_MODULE_DIRECTORY structures is a null structure that has a value of 0 for at least the dwRVAModuleName field. This is the behavior I observed in my experimentation with the file and later confirmed in the PE file format specification.
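
The terminating-entry convention can be isolated in a few lines. This sketch counts the entries in a raw byte buffer using the 20-byte layout of the structure above, with dwRVAModuleName at offset 12. The name count_import_modules is hypothetical, and the size check is an added safeguard so that a file with a missing terminator cannot run the loop past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { IMPORT_ENTRY_SIZE = 20, MODULE_NAME_FIELD = 12 };

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count import directory entries up to the terminating entry whose
   module-name RVA is 0, never reading past the end of the buffer. */
static size_t count_import_modules(const uint8_t *dir, size_t size)
{
    assert(dir != NULL);
    size_t n = 0;
    while ((n + 1) * IMPORT_ENTRY_SIZE <= size &&
           rd32(dir + n * IMPORT_ENTRY_SIZE + MODULE_NAME_FIELD) != 0)
        n++;
    return n;
}
```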

The first field in the structure, dwRVAFunctionNameList, is a relative virtual address to a list of relative virtual addresses that each point to the function names within the file. As shown in the following data, the module and function names of all imported modules are listed in the .idata section data:

E6A7 0000 F6A7 0000  08A8 0000 1AA8 0000  ................
28A8 0000 3CA8 0000  4CA8 0000 0000 0000  (...<...L.......
0000 4765 744F 7065  6E46 696C 654E 616D  ..GetOpenFileNam
6541 0000 636F 6D64  6C67 3332 2E64 6C6C  eA..comdlg32.dll
0000 2500 4372 6561  7465 466F 6E74 496E  ..%.CreateFontIn
6469 7265 6374 4100  4744 4933 322E 646C  directA.GDI32.dl
6C00 A000 4765 7444  6576 6963 6543 6170  l...GetDeviceCap
7300 C600 4765 7453  746F 636B 4F62 6A65  s...GetStockObje
6374 0000 D500 4765  7454 6578 744D 6574  ct....GetTextMet
7269 6373 4100 1001  5365 6C65 6374 4F62  ricsA...SelectOb
6A65 6374 0000 1601  5365 7442 6B43 6F6C  ject....SetBkCol
6F72 0000 3501 5365  7454 6578 7443 6F6C  or..5.SetTextCol
6F72 0000 4501 5465  7874 4F75 7441 0000  or..E.TextOutA..

This particular section represents the beginning of the list of import module and function names. If you begin examining the right-hand part of the data, you should recognize the names of familiar Win32 API functions and the module names they are found in. Reading from the top down, you get GetOpenFileNameA, followed by the module name COMDLG32.DLL. Shortly after that, you get CreateFontIndirectA, followed by the module GDI32.DLL and then the functions GetDeviceCaps, GetStockObject, GetTextMetrics, and so forth.

This pattern repeats throughout the .idata section. The first module name is COMDLG32.DLL and the second is GDI32.DLL. Notice that only one function is imported from the first module, while many functions are imported from the second module. In both cases, the function names and the module name to which they belong are ordered such that a function name appears first, followed by the module name and then by the rest of the function names, if any.

The following function demonstrates how to retrieve the function names for a specific module:

int  WINAPI GetImportFunctionNamesByModule (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      *pszModule,
    char      **pszFunctions)
{
    PIMAGE_IMPORT_MODULE_DIRECTORY  pid;
    IMAGE_SECTION_HEADER     idsh;
    DWORD                    dwBase;
    int                      nCnt = 0, nSize = 0;
    DWORD                    dwFunction;
    char                     *psz;

    /* Locate section header for ".idata" section. */
    if (!GetSectionHdrByName (lpFile, &idsh, ".idata"))
        return 0;

    pid = (PIMAGE_IMPORT_MODULE_DIRECTORY)ImageDirectoryOffset
              (lpFile, IMAGE_DIRECTORY_ENTRY_IMPORT);

    /* Adding dwBase to an RVA in .idata gives a pointer into the file. */
    dwBase = ((DWORD)pid - idsh.VirtualAddress);

    /* Find module's pid. */
    while (pid->dwRVAModuleName &&
           strcmp (pszModule,
                   (char *)(pid->dwRVAModuleName+dwBase)))
        pid++;

    /* Exit if the module is not found. */
    if (!pid->dwRVAModuleName)
        return 0;

    /* Count number of function names and length of strings. */
    dwFunction = pid->dwRVAFunctionNameList;
    while (dwFunction                      &&
           *(DWORD *)(dwFunction + dwBase) &&
           *(char *)((*(DWORD *)(dwFunction + dwBase)) +
                     dwBase+2))
        {
        nSize += strlen ((char *)((*(DWORD *)(dwFunction +
                     dwBase)) + dwBase+2)) + 1;
        dwFunction += 4;
        nCnt++;
        }

    /* Allocate memory off heap for function names. */
    *pszFunctions = HeapAlloc (hHeap, HEAP_ZERO_MEMORY, nSize);
    psz = *pszFunctions;

    /* Copy function names to memory pointer. */
    dwFunction = pid->dwRVAFunctionNameList;
    while (dwFunction                      &&
           *(DWORD *)(dwFunction + dwBase) &&
           *((char *)((*(DWORD *)(dwFunction + dwBase)) +
                      dwBase+2)))
        {
        strcpy (psz, (char *)((*(DWORD *)(dwFunction + dwBase)) +
                      dwBase+2));
        psz += strlen((char *)((*(DWORD *)(dwFunction + dwBase))+
                      dwBase+2)) + 1;
        dwFunction += 4;
        }

    return nCnt;
}

Like the GetImportModuleNames function, this function relies on the end of each list of information to have a zeroed entry. In this case, the list of function names ends with one that is zero.
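
The double indirection and the +2 adjustment used in that function are easier to follow in isolation. The sketch below works on a raw copy of the section body and returns a pointer to the name at a given position in the list. It rests on a few assumptions that should be read as such: the function name import_name_at is hypothetical, the two bytes skipped before each name are the values visible in the dump above (for example, 2500 before CreateFontIndirectA), and an entry with its high bit set is treated as an import by ordinal with no name, a case the discussion above does not cover. The sketch also does not verify that the name is null-terminated inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the imported function name at position index in the list that
   starts at list_rva, or NULL if the list ends first or an address
   falls outside the buffer. sect is the raw section body and sect_va
   is the section's VirtualAddress. */
static const char *import_name_at(const uint8_t *sect, size_t size,
                                  uint32_t sect_va, uint32_t list_rva,
                                  unsigned index)
{
    assert(sect != NULL);
    if (list_rva < sect_va)
        return NULL;
    size_t slot = (size_t)(list_rva - sect_va);
    for (unsigned k = 0; slot + 4 <= size; k++, slot += 4) {
        uint32_t name_rva = rd32(sect + slot);
        if (name_rva == 0)
            return NULL;                    /* zero entry ends the list */
        if (k != index)
            continue;
        if ((name_rva & 0x80000000u) || name_rva < sect_va)
            return NULL;                    /* ordinal import or bad RVA */
        size_t off = (size_t)(name_rva - sect_va) + 2;  /* skip 2 bytes */
        return off < size ? (const char *)(sect + off) : NULL;
    }
    return NULL;
}
```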

The final field, dwRVAFunctionAddressList, is a relative virtual address to a list of virtual addresses that will be placed in the section data by the loader when the file is loaded. Before the file is loaded, however, these virtual addresses are replaced by relative virtual addresses that correspond exactly to the list of function names. So before the file is loaded, there are two identical lists of relative virtual addresses pointing to imported function names.

Debug information section, .debug

Debug information is initially placed in the .debug section. The PE file format also supports separate debug files (normally identified with a .DBG extension) as a means of collecting debug information in a central location. The debug section contains the debug information, but the debug directories live in the .rdata section mentioned earlier. Each of those directories references debug information in the .debug section. The debug directory structure is defined as an IMAGE_DEBUG_DIRECTORY, as follows:

WINNT.H

typedef struct _IMAGE_DEBUG_DIRECTORY {
    ULONG   Characteristics;
    ULONG   TimeDateStamp;
    USHORT  MajorVersion;
    USHORT  MinorVersion;
    ULONG   Type;
    ULONG   SizeOfData;
    ULONG   AddressOfRawData;
    ULONG   PointerToRawData;
} IMAGE_DEBUG_DIRECTORY, *PIMAGE_DEBUG_DIRECTORY;

The section is divided into separate portions of data representing different types of debug information. For each one there is a debug directory described above. The different types of debug information are listed below:

WINNT.H

#define IMAGE_DEBUG_TYPE_UNKNOWN          0
#define IMAGE_DEBUG_TYPE_COFF             1
#define IMAGE_DEBUG_TYPE_CODEVIEW         2
#define IMAGE_DEBUG_TYPE_FPO              3
#define IMAGE_DEBUG_TYPE_MISC             4

The Type field in each directory indicates which type of debug information the directory represents. As you can see in the list above, the PE file format supports many different types of debug information, as well as some other informational fields. Of those, the IMAGE_DEBUG_TYPE_MISC information is unique. This information was added to represent miscellaneous information about the executable image that could not be added to any of the more structured data sections in the PE file format. This is the only location in the image file where the image name is sure to appear. If an image exports information, the export data section will also include the image name.

Each type of debug information has its own header structure that defines its data. Each of these is listed in the file WINNT.H. One nice thing about the IMAGE_DEBUG_DIRECTORY structure is that it includes two fields that identify the debug information. The first of these, AddressOfRawData, is the relative virtual address of the data once the file is loaded. The other, PointerToRawData, is an actual offset within the PE file, where the data is located. This makes it easy to locate specific debug information.
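
Because PointerToRawData is already a file offset, finding a particular kind of debug data needs no address translation at all. Here is a hedged sketch of such a search over a raw array of debug directories. The name debug_data_offset is hypothetical, and the number of entries is derived from the byte size of the array divided by the 28-byte structure size, which is an assumption about how the caller obtained the array (for example, from the Size field of the debug data directory).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DEBUG_DIR_SIZE = 28 };   /* size of one IMAGE_DEBUG_DIRECTORY */

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scan the debug directories for the first entry of the given type.
   Returns its PointerToRawData (a file offset) and stores SizeOfData
   in *size; returns 0 if no entry matches. */
static uint32_t debug_data_offset(const uint8_t *dirs, size_t dir_bytes,
                                  uint32_t type, uint32_t *size)
{
    assert(dirs != NULL && size != NULL);
    for (size_t off = 0; off + DEBUG_DIR_SIZE <= dir_bytes;
         off += DEBUG_DIR_SIZE) {
        if (rd32(dirs + off + 12) == type) {      /* Type */
            *size = rd32(dirs + off + 16);        /* SizeOfData */
            return rd32(dirs + off + 24);         /* PointerToRawData */
        }
    }
    return 0;
}
```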

As a last example, consider the following function, which extracts the image name from the IMAGE_DEBUG_MISC structure:

int    WINAPI RetrieveModuleName (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszModule)
{
    PIMAGE_DEBUG_DIRECTORY    pdd;
    PIMAGE_DEBUG_MISC         pdm = NULL;
    int                       nCnt = 0;

    if (!(pdd = (PIMAGE_DEBUG_DIRECTORY)ImageDirectoryOffset
               (lpFile, IMAGE_DIRECTORY_ENTRY_DEBUG)))
        return 0;

    while (pdd->SizeOfData)
        {
        if (pdd->Type == IMAGE_DEBUG_TYPE_MISC)
            {
            /* PointerToRawData is a file offset, so add the file base. */
            pdm = (PIMAGE_DEBUG_MISC)
                ((DWORD)pdd->PointerToRawData + (DWORD)lpFile);

            nCnt = lstrlen (pdm->Data)*(pdm->Unicode?2:1);

            *pszModule = (char *)HeapAlloc (hHeap,
                                            HEAP_ZERO_MEMORY,
                                            nCnt+1);
            CopyMemory (*pszModule, pdm->Data, nCnt);

            break;
            }

        pdd ++;
        }

    if (pdm != NULL)
        return nCnt;
    else
        return 0;
}

As you can see, the structure of the debug directory makes it relatively easy to locate a specific type of debug information. Once the IMAGE_DEBUG_MISC structure is located, extracting the image name is as simple as invoking the CopyMemory function.

As mentioned above, debug information can be stripped into separate .DBG files. The Windows NT SDK includes a utility called REBASE.EXE that serves this purpose. For example, in the following statement an executable image named TEST.EXE is being stripped of debug information:

rebase -b 40000 -x c:\samples\testdir test.exe

The debug information is placed in a new file called TEST.DBG and located in the path specified, in this case c:\samples\testdir. The file begins with a single IMAGE_SEPARATE_DEBUG_HEADER structure, followed by a copy of the section headers that exist in the stripped executable image. Then the .debug section data follows the section headers. So, right after the section headers are the series of IMAGE_DEBUG_DIRECTORY structures and their associated data. The debug information itself retains the same structure as described above for normal image file debug information.

 

Copyright © 1996,1997 Johannes Plachy
Last modified: 25.08.97