Inside Windows: An In-Depth Look into the Win32 Portable Executable File Format



Matt Pietrek

This article assumes you're familiar with C++ and Win32.

Download the code for this article: PE.exe (98KB)

SUMMARY
A good understanding of the Portable Executable (PE) file format leads to a good understanding of the operating system. If you know what's in your DLLs and EXEs, you'll be a more knowledgeable programmer. This article, the first of a two-part series, looks at the changes to the PE format that have occurred over the last few years, along with an overview of the format itself.
      After this update, the author discusses how the PE format fits into applications written for .NET, PE file sections, RVAs, the DataDirectory, and the importing of functions. An appendix includes lists of the relevant image header structures and their descriptions.

A long time ago, in a galaxy far away, I wrote one of my first articles for Microsoft Systems Journal (now MSDN® Magazine). The article, "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format," turned out to be more popular than I had expected. To this day, I still hear from people (even within Microsoft) who use that article, which is still available from the MSDN Library. Unfortunately, the problem with articles is that they're static. The world of Win32® has changed quite a bit in the intervening years, and the article is severely dated. I'll remedy that situation in a two-part article starting this month.
      You might be wondering why you should care about the executable file format. The answer is the same now as it was then: an operating system's executable format and data structures reveal quite a bit about the underlying operating system. By understanding what's in your EXEs and DLLs, you'll find that you've become a better programmer all around.
      Sure, you could learn a lot of what I'll tell you by reading the Microsoft specification. However, like most specs, it sacrifices readability for completeness. My focus in this article will be to explain the most relevant parts of the story, while filling in the hows and whys that don't fit neatly into a formal specification. In addition, I have some goodies in this article that don't seem to appear in any official Microsoft documentation.

Bridging the Gap

      Let me give you just a few examples of what has changed since I wrote the article in 1994. Since 16-bit Windows® is history, there's no need to compare and contrast the format to the Win16 New Executable format. Another welcome departure from the scene is Win32s®. This was the abomination that ran Win32 binaries very shakily atop Windows 3.1.
      Back then, Windows 95 (codenamed "Chicago" at the time) wasn't even released. Windows NT® was still at version 3.5, and the linker gurus at Microsoft hadn't yet started getting aggressive with their optimizations. However, there were MIPS and DEC Alpha implementations of Windows NT that added to the story.
      And what about all the new things that have come along since that article? 64-bit Windows introduces its own variation of the Portable Executable (PE) format. Windows CE adds all sorts of new processor types. Optimizations such as delay loading of DLLs, section merging, and binding were still over the horizon. There are many new things to shoehorn into the story.
      And let's not forget about Microsoft® .NET. Where does it fit in? To the operating system, .NET executables are just plain old Win32 executable files. However, the .NET runtime recognizes data within these executable files as the metadata and intermediate language that are so central to .NET. In this article, I'll knock on the door of the .NET metadata format, but save a thorough survey of its full splendor for a subsequent article.
      And if all these additions and subtractions to the world of Win32 weren't enough justification to remake the article with modern day special effects, there are also errors in the original piece that make me cringe. For example, my description of Thread Local Storage (TLS) support was way out in left field. Likewise, my description of the date/time stamp DWORD used throughout the file format is accurate only if you live in the Pacific time zone!
      In addition, many things that were true then are incorrect now. I had stated that the .rdata section wasn't really used for anything important. Today, it certainly is. I also said that the .idata section is a read/write section, which has been found to be most untrue by people trying to do API interception today.
      Along with a complete update of the PE format story in this article, I've also overhauled the PEDUMP program, which displays the contents of PE files. PEDUMP can be compiled and run on both the x86 and IA-64 platforms, and can dump both 32 and 64-bit PE files. Most importantly, full source code for PEDUMP is available for download from the link at the top of this article, so you have a working example of the concepts and data structures described here.

Overview of the PE File Format

      Microsoft introduced the PE File format, more commonly known as the PE format, as part of the original Win32 specifications. However, PE files are derived from the earlier Common Object File Format (COFF) found on VAX/VMS. This makes sense since much of the original Windows NT team came from Digital Equipment Corporation. It was natural for these developers to use existing code to quickly bootstrap the new Windows NT platform.
      The term "Portable Executable" was chosen because the intent was to have a common file format for all flavors of Windows, on all supported CPUs. To a large extent, this goal has been achieved with the same format used on Windows NT and descendants, Windows 95 and descendants, and Windows CE.
      OBJ files emitted by Microsoft compilers use the COFF format. You can get an idea of how old the COFF format is by looking at some of its fields, which use octal encoding! COFF OBJ files have many data structures and enumerations in common with PE files, and I'll mention some of them as I go along.
      The addition of 64-bit Windows required just a few modifications to the PE format. This new format is called PE32+. No new fields were added, and only one field in the PE format was deleted. The remaining changes are simply the widening of certain fields from 32 bits to 64 bits. In most of these cases, you can write code that simply works with both 32 and 64-bit PE files. The Windows header files have the magic pixie dust to make the differences invisible to most C++-based code.
      The distinction between EXE and DLL files is entirely one of semantics. They both use the exact same PE format. The only difference is a single bit that indicates if the file should be treated as an EXE or as a DLL. Even the DLL file extension is artificial. You can have DLLs with entirely different extensions; for instance, .OCX controls and Control Panel applets (.CPL files) are DLLs.
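      The single bit in question is the IMAGE_FILE_DLL flag in the Characteristics field of the IMAGE_FILE_HEADER. As a minimal sketch, the test might look like the following. The flag values are the ones WINNT.H defines; the helper names is_dll and image_kind are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Flag values as defined in WINNT.H for IMAGE_FILE_HEADER.Characteristics.
constexpr std::uint16_t kImageFileExecutableImage = 0x0002;  // IMAGE_FILE_EXECUTABLE_IMAGE
constexpr std::uint16_t kImageFileDll             = 0x2000;  // IMAGE_FILE_DLL

// Hypothetical helper: true when the DLL bit is set in Characteristics.
inline bool is_dll(std::uint16_t characteristics) {
    return (characteristics & kImageFileDll) != 0;
}

// Hypothetical helper: names the kind of image. It expects a linked image,
// so it asserts that the executable-image bit is present.
inline const char* image_kind(std::uint16_t characteristics) {
    assert((characteristics & kImageFileExecutableImage) != 0);
    return is_dll(characteristics) ? "DLL" : "EXE";
}
```

      Nothing here looks at the file name, which is the point: an .OCX or .CPL file with the DLL bit set is treated as a DLL.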
      A very handy aspect of PE files is that the data structures on disk are the same data structures used in memory. Loading an executable into memory (for example, by calling LoadLibrary) is primarily a matter of mapping certain ranges of a PE file into the address space. Thus, a data structure like the IMAGE_NT_HEADERS (which I'll examine later) is identical on disk and in memory. The key point is that if you know how to find something in a PE file, you can almost certainly find the same information when the file is loaded in memory.
      It's important to note that PE files are not just mapped into memory as a single memory-mapped file. Instead, the Windows loader looks at the PE file and decides what portions of the file to map in. This mapping is consistent in that higher offsets in the file correspond to higher memory addresses when mapped into memory. The offset of an item in the disk file may differ from its offset once loaded into memory. However, all the information is present to allow you to make the translation from disk offset to memory offset (see Figure 1).


Figure 1 Offsets

      When PE files are loaded into memory via the Windows loader, the in-memory version is known as a module. The starting address where the file mapping begins is called an HMODULE. This is a point worth remembering: given an HMODULE, you know what data structure to expect at that address, and you can use that knowledge to find all the other data structures in memory. This powerful capability can be exploited for other purposes such as API interception. (To be completely accurate, an HMODULE isn't the same as the load address under Windows CE, but that's a story for yet another day.)
      A module in memory represents all the code, data, and resources from an executable file that is needed by a process. Other parts of a PE file may be read, but not mapped in (for instance, relocations). Some parts may not be mapped in at all, for example, when debug information is placed at the end of the file. A field in the PE header tells the system how much memory needs to be set aside for mapping the executable into memory. Data that won't be mapped in is placed at the end of the file, past any parts that will be mapped in.
      The central location where the PE format (as well as COFF files) is described is WINNT.H. Within this header file, you'll find nearly every structure definition, enumeration, and #define needed to work with PE files or the equivalent structures in memory. Sure, there is documentation elsewhere. MSDN has the "Microsoft Portable Executable and Common Object File Format Specification," for instance (see the October 2001 MSDN CD under Specifications). But WINNT.H is the final word on what PE files look like.
      There are many tools for examining PE files. Among them are Dumpbin from Visual Studio, and Depends from the Platform SDK. I particularly like Depends because it has a very succinct way of examining a file's imports and exports. A great free PE viewer is PEBrowse Professional, from Smidgeonsoft (http://www.smidgeonsoft.com). The PEDUMP program included with this article is also very comprehensive, and does almost everything Dumpbin does.
      From an API standpoint, the primary mechanism provided by Microsoft for reading and modifying PE files is IMAGEHLP.DLL.
      Before I start looking at the specifics of PE files, it's worthwhile to first review a few basic concepts that thread their way through the entire subject of PE files. In the following sections, I will discuss PE file sections, relative virtual addresses (RVAs), the data directory, and how functions are imported.

PE File Sections

      A PE file section represents code or data of some sort. While code is just code, there are multiple types of data. Besides read/write program data (such as global variables), other types of data in sections include API import and export tables, resources, and relocations. Each section has its own set of in-memory attributes, including whether the section contains code, whether it's read-only or read/write, and whether the data in the section is shared between all processes using the executable.
      Generally speaking, all the code or data in a section is logically related in some way. At a minimum, there are usually at least two sections in a PE file: one for code, the other for data. Commonly, there's at least one other type of data section in a PE file. I'll look at the various kinds of sections in Part 2 of this article next month.
      Each section has a distinct name. This name is intended to convey the purpose of the section. For example, a section called .rdata indicates a read-only data section. Section names are used solely for the benefit of humans, and are insignificant to the operating system. A section named FOOBAR is just as valid as a section called .text. Microsoft typically prefixes their section names with a period, but it's not a requirement. For years, the Borland linker used section names like CODE and DATA.
      While compilers have a standard set of sections that they generate, there's nothing magical about them. You can create and name your own sections, and the linker happily includes them in the executable. In Visual C++, you can tell the compiler to insert code or data into a section that you name with #pragma statements. For instance, the statement
#pragma data_seg( "MY_DATA" )
causes all data emitted by Visual C++ to go into a section called MY_DATA, rather than the default .data section. Most programs are fine using the default sections emitted by the compiler, but occasionally you may have funky requirements which necessitate putting code or data into a separate section.
      Sections don't spring fully formed from the linker; rather, they start out in OBJ files, usually placed there by the compiler. The linker's job is to combine all the required sections from OBJ files and libraries into the appropriate final section in the PE file. For example, each OBJ file in your project probably has at least a .text section, which contains code. The linker takes all the sections named .text from the various OBJ files and combines them into a single .text section in the PE file. Likewise, all the sections named .data from the various OBJs are combined into a single .data section in the PE file. Code and data from .LIB files are also typically included in an executable, but that subject is outside the scope of this article.
      There is a rather complete set of rules that linkers follow to decide which sections to combine and how. I gave an introduction to the linker algorithms in the July 1997 Under The Hood column in MSJ. A section in an OBJ file may be intended for the linker's use, and not make it into the final executable. A section like this would be intended for the compiler to pass information to the linker.
      Sections have two alignment values, one within the disk file and the other in memory. The PE file header specifies both of these values, which can differ. Each section starts at an offset that's some multiple of the alignment value. For instance, in the PE file, a typical alignment would be 0x200. Thus, every section begins at a file offset that's a multiple of 0x200.
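      Rounding a size or offset up to the next alignment boundary is simple arithmetic when the alignment is a power of two. The sketch below is an illustration, not code from PEDUMP or the linker; align_up is a hypothetical helper name. With a file alignment of 0x200, 0x74658 bytes of section data round up to 0x74800 bytes in the file.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds value up to the next multiple of alignment.
// The bit trick only works when alignment is a power of two, so that
// precondition is asserted.
inline std::uint32_t align_up(std::uint32_t value, std::uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

      The same helper covers both alignment values: pass 0x200 (or whatever the header specifies) for file offsets, and the in-memory alignment for addresses.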
      Once mapped into memory, sections always start on at least a page boundary. That is, when a PE section is mapped into memory, the first byte of each section falls at the start of a memory page. On x86 CPUs, pages are 4KB aligned, while on the IA-64, they're 8KB aligned. The following code shows a snippet of PEDUMP output for the .text and .data sections of the Windows XP KERNEL32.DLL.
Section Table
  01 .text     VirtSize: 00074658  VirtAddr:  00001000
    raw data offs:   00000400  raw data size: 00074800
•••
  02 .data     VirtSize: 000028CA  VirtAddr:  00076000
    raw data offs:   00074C00  raw data size: 00002400
The .text section is at offset 0x400 in the PE file and will be 0x1000 bytes above the load address of KERNEL32 in memory. Likewise, the .data section is at file offset 0x74C00 and will be 0x76000 bytes above KERNEL32's load address in memory.
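      Those four numbers per section are what let you translate between an in-memory offset and a file offset. The following sketch assumes a simplified SectionInfo record (a hypothetical subset of the real section header) and covers only addresses that fall inside a section's file-backed data; the headers at the start of the file are not handled. The sample table holds the KERNEL32 values from the PEDUMP output above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record holding the four values PEDUMP prints per section.
struct SectionInfo {
    std::uint32_t virtual_address;  // offset from the load address in memory
    std::uint32_t virtual_size;     // size of the section in memory
    std::uint32_t raw_offset;       // "raw data offs": offset in the file
    std::uint32_t raw_size;         // "raw data size": bytes stored in the file
};

// Hypothetical helper: translates an offset from the load address into a
// file offset. Returns false when no section's file data covers the address.
inline bool rva_to_file_offset(const std::vector<SectionInfo>& sections,
                               std::uint32_t rva, std::uint32_t* file_offset) {
    assert(file_offset != nullptr);
    for (const SectionInfo& s : sections) {
        if (rva >= s.virtual_address && rva - s.virtual_address < s.raw_size) {
            *file_offset = s.raw_offset + (rva - s.virtual_address);
            return true;
        }
    }
    return false;
}

// The .text and .data values quoted from the KERNEL32.DLL dump.
inline std::vector<SectionInfo> kernel32_sample_sections() {
    return { {0x00001000, 0x00074658, 0x00000400, 0x00074800},
             {0x00076000, 0x000028CA, 0x00074C00, 0x00002400} };
}
```

      For example, a byte 0x1234 bytes above the load address lies 0x234 bytes into .text, so it is found at file offset 0x400 + 0x234 = 0x634.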
      It's possible to create PE files in which the sections start at the same offset in the file as they start from the load address in memory. This makes for larger executables, but can speed loading under Windows 9x or Windows Me. The default /OPT:WIN98 linker option (introduced in Visual Studio 6.0) causes PE files to be created this way. In Visual Studio® .NET, the linker may or may not use /OPT:NOWIN98, depending on whether the file is small enough.
      An interesting linker feature is the ability to merge sections. If two sections have similar, compatible attributes, they can usually be combined into a single section at link time. This is done via the linker /merge switch. For instance, the following linker option combines the .rdata and .text sections into a single section called .text:
/MERGE:.rdata=.text
      The advantage to merging sections is that it saves space, both on disk and in memory. At a minimum, each section occupies one page in memory. If you can reduce the number of sections in an executable from four to three, there's a decent chance you'll use one less page of memory. Of course, this depends on whether the unused space at the end of the two merged sections adds up to a page.
      Things can get interesting when you're merging sections, as there are no hard and fast rules as to what's allowed. For example, it's OK to merge .rdata into .text, but you shouldn't merge .rsrc, .reloc, or .pdata into other sections. Prior to Visual Studio .NET, you could merge .idata into other sections. In Visual Studio .NET, this is not allowed, but the linker often merges parts of the .idata into other sections, such as .rdata, when doing a release build.
      Since portions of the imports data are written to by the Windows loader when they are loaded into memory, you might wonder how they can be put in a read-only section. This situation works because at load time the system can temporarily set the attributes of the pages containing the imports data to read/write. Once the imports table is initialized, the pages are then set back to their original protection attributes.

Relative Virtual Addresses

      In an executable file, there are many places where an in-memory address needs to be specified. For instance, the address of a global variable is needed when referencing it. PE files can load just about anywhere in the process address space. While they do have a preferred load address, you can't rely on the executable file actually loading there. For this reason, it's important to have some way of specifying addresses that are independent of where the executable file loads.
      To avoid having hardcoded memory addresses in PE files, RVAs are used. An RVA is simply an offset in memory, relative to where the PE file was loaded. For instance, consider an EXE file loaded at address 0x400000, with its code section at address 0x401000. The RVA of the code section would be:
(target address) 0x401000 - (load address) 0x400000 = (RVA) 0x1000.
      To convert an RVA to an actual address, simply reverse the process: add the RVA to the actual load address to find the actual memory address. Incidentally, the actual memory address is called a Virtual Address (VA) in PE parlance. Another way to think of a VA is that it's an RVA with the preferred load address added in. Don't forget the earlier point I made that a load address is the same as the HMODULE.
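      The two conversions are one subtraction and one addition. As a small sketch (the function names are hypothetical, and addresses are treated as plain integers rather than pointers):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: RVA -> address, given where the module actually loaded.
inline std::uintptr_t rva_to_va(std::uintptr_t load_address, std::uint32_t rva) {
    return load_address + rva;
}

// Hypothetical helper: address -> RVA. The address must not lie below the
// load address, otherwise it cannot belong to this module.
inline std::uint32_t va_to_rva(std::uintptr_t load_address, std::uintptr_t va) {
    assert(va >= load_address);
    return static_cast<std::uint32_t>(va - load_address);
}
```

      With the numbers from the example, va_to_rva(0x400000, 0x401000) gives 0x1000, and rva_to_va(0x400000, 0x1000) gives back 0x401000.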
      Want to go spelunking through some arbitrary DLL's data structures in memory? Here's how. Call GetModuleHandle with the name of the DLL. The HMODULE that's returned is just a load address; you can apply your knowledge of the PE file structures to find anything you want within the module.

The Data Directory

      There are many data structures within executable files that need to be quickly located. Some obvious examples are the imports, exports, resources, and base relocations. All of these well-known data structures are found in a consistent manner, and the location is known as the DataDirectory.
      The DataDirectory is an array of 16 structures. Each array entry has a predefined meaning for what it refers to. The IMAGE_DIRECTORY_ENTRY_xxx #defines are array indexes into the DataDirectory (from 0 to 15). Figure 2 describes what each of the IMAGE_DIRECTORY_ENTRY_xxx values refers to. A more detailed description of many of the pointed-to data structures will be included in Part 2 of this article.
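      To make the indexing concrete, here is a sketch of the index values and a presence check. The numeric values match the IMAGE_DIRECTORY_ENTRY_xxx #defines in WINNT.H; the type and function names are hypothetical, and the check assumes the usual convention that an unused entry has a zero address and a zero size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Index values as defined in WINNT.H (IMAGE_DIRECTORY_ENTRY_xxx).
enum DirectoryIndex : std::size_t {
    kExport = 0,       kImport = 1,       kResource = 2,     kException = 3,
    kSecurity = 4,     kBaseReloc = 5,    kDebug = 6,        kArchitecture = 7,
    kGlobalPtr = 8,    kTls = 9,          kLoadConfig = 10,  kBoundImport = 11,
    kIat = 12,         kDelayImport = 13, kComDescriptor = 14
};

constexpr std::size_t kNumberOfDirectoryEntries = 16;  // IMAGE_NUMBEROF_DIRECTORY_ENTRIES

// Hypothetical stand-in for one DataDirectory entry: where the data is and how big.
struct DataDirectoryEntry {
    std::uint32_t virtual_address;
    std::uint32_t size;
};

// Hypothetical helper: true when the entry at index points at something.
// Assumes unused entries are zero-filled.
inline bool has_directory(const DataDirectoryEntry (&dirs)[kNumberOfDirectoryEntries],
                          std::size_t index) {
    assert(index < kNumberOfDirectoryEntries);
    return dirs[index].virtual_address != 0 && dirs[index].size != 0;
}
```

      Finding the imports of a module then starts with dirs[kImport], finding the resources with dirs[kResource], and so on.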

Importing Functions

      When you use code or data from another DLL, you're importing it. When any PE file loads, one of the jobs of the Windows loader is to locate all the imported functions and data and make those addresses available to the file being loaded. I'll save the detailed discussion of data structures used to accomplish this for Part 2 of this article, but it's worth going over the concepts here at a high level.
      When you link directly against the code and data of another DLL, you're implicitly linking against the DLL. You don't have to do anything to make the addresses of the imported APIs available to your code. The loader takes care of it all. The alternative is explicit linking. This means explicitly making sure that the target DLL is loaded and then looking up the address of the APIs. This is almost always done via the LoadLibrary and GetProcAddress APIs.
      When you implicitly link against an API, LoadLibrary and GetProcAddress-like code still executes, but the loader does it for you automatically. The loader also ensures that any additional DLLs needed by the PE file being loaded are also loaded. For instance, every normal program created with Visual C++® links against KERNEL32.DLL. KERNEL32.DLL in turn imports functions from NTDLL.DLL. Likewise, if you import from GDI32.DLL, it will have dependencies on the USER32, ADVAPI32, NTDLL, and KERNEL32 DLLs, which the loader makes sure are loaded and all imports resolved. (Visual Basic 6.0 and the Microsoft .NET executables directly link against a different DLL than KERNEL32, but the same principles apply.)
      When implicitly linking, the resolution process for the main EXE file and all its dependent DLLs occurs when the program first starts. If there are any problems (for example, a referenced DLL that can't be found), the process is aborted.
      Visual C++ 6.0 added the delayload feature, which is a hybrid between implicit linking and explicit linking. When you delayload against a DLL, the linker emits something that looks very similar to the data for a regular imported DLL. However, the operating system ignores this data. Instead, the first time a call to one of the delayloaded APIs occurs, special stubs added by the linker cause the DLL to be loaded (if it's not already in memory), followed by a call to GetProcAddress to locate the called API. Additional magic makes it so that subsequent calls to the API are just as efficient as if the API had been imported normally.
      Within a PE file, there's an array of data structures, one per imported DLL. Each of these structures gives the name of the imported DLL and points to an array of function pointers. The array of function pointers is known as the import address table (IAT). Each imported API has its own reserved spot in the IAT where the address of the imported function is written by the Windows loader. This last point is particularly important: once a module is loaded, the IAT contains the address that is invoked when calling imported APIs.
      The beauty of the IAT is that there's just one place in a PE file where an imported API's address is stored. No matter how many source files you scatter calls to a given API through, all the calls go through the same function pointer in the IAT.
      Let's examine what the call to an imported API looks like. There are two cases to consider: the efficient way and the inefficient way. In the best case, a call to an imported API looks like this:
CALL DWORD PTR [0x00405030]
If you're not familiar with x86 assembly language, this is a call through a function pointer. Whatever DWORD-sized value is at 0x405030 is where the CALL instruction will send control. In the previous example, address 0x405030 lies within the IAT.
      The less efficient call to an imported API looks like this:
CALL 0x0040100C
•••
0x0040100C:
JMP       DWORD PTR [0x00405030]
In this situation, the CALL transfers control to a small stub. The stub is a JMP to the address whose value is at 0x405030. Again, remember that 0x405030 is an entry within the IAT. In a nutshell, the less efficient imported API call uses five bytes of additional code, and takes longer to execute because of the extra JMP.
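      If assembly isn't your thing, the two call forms can be modeled in C++ with a table of function pointers. This is only an analogy under stated assumptions: ImportAddressTable, fake_loader_resolve, real_api, and hooked_api are hypothetical names, and a real IAT is filled in by the Windows loader rather than by your own code. The sketch shows that both call forms end up reading the same slot.

```cpp
#include <cassert>

using ApiFn = int (*)(int);

// Hypothetical stand-ins for functions that live in another DLL.
inline int real_api(int x)   { return x + 1; }
inline int hooked_api(int x) { return x + 100; }

// A one-entry model of the IAT: one slot per imported API.
struct ImportAddressTable {
    ApiFn slot[1];
};

// Plays the loader's role: writes the API's address into its slot.
inline void fake_loader_resolve(ImportAddressTable& iat) {
    iat.slot[0] = &real_api;
}

// Efficient form: the call reads the slot directly, like
// CALL DWORD PTR [IAT entry].
inline int call_via_iat(const ImportAddressTable& iat, int x) {
    assert(iat.slot[0] != nullptr);  // the slot must have been resolved
    return iat.slot[0](x);
}

// Less efficient form: the call goes to a stub, and the stub goes
// through the slot, like CALL stub followed by JMP DWORD PTR [IAT entry].
inline int jmp_stub(const ImportAddressTable& iat, int x) {
    return iat.slot[0](x);
}
inline int call_via_stub(const ImportAddressTable& iat, int x) {
    return jmp_stub(iat, x);
}
```

      Because there is only one slot per API, overwriting iat.slot[0] with &hooked_api redirects both call forms at once. That single point of indirection is what makes the IAT attractive for API interception.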
      You're probably wondering why the less efficient method would ever be used. There's a good explanation. Left to its own devices, the compiler can't distinguish between imported API calls and ordinary functions within the same module. As such, the compiler emits a CALL instruction of the form
CALL XXXXXXXX
where XXXXXXXX is an actual code address that will be filled in by the linker later. Note that this last CALL instruction isn't through a function pointer. Rather, it's an actual code address. To keep the cosmic karma in balance, the linker needs to have a chunk of code to substitute for XXXXXXXX. The simplest way to do this is to make the call point to a JMP stub, like you just saw.
      Where does the JMP stub come from? Surprisingly, it comes from the import library for the imported function. If you were to examine an import library, and examine the code associated with the imported API name, you'd see that it's a JMP stub like the one just shown. What this means is that by default, in the absence of any intervention, imported API calls will use the less efficient form.
      Logically, the next question to ask is how to get the optimized form. The answer comes in the form of a hint you give to the compiler. The __declspec(dllimport) function modifier tells the compiler that the function resides in another DLL and that the compiler should generate this instruction
CALL DWORD PTR [XXXXXXXX]
rather than this one:
CALL XXXXXXXX
      In addition, the compiler emits information telling the linker to resolve the function pointer portion of the instruction to a symbol named __imp_functionname. For instance, if you were calling MyFunction, the symbol name would be __imp_MyFunction. (On x86, where C function names are decorated with a leading underscore, this shows up as __imp__MyFunction.) Looking in an import library, you'll see that in addition to the regular symbol name, there's also a symbol with the __imp_ prefix on it. This __imp_ symbol resolves directly to the IAT entry, rather than to the JMP stub.
      So what does this mean in your everyday life? If you're writing exported functions and providing a .H file for them, remember to use the __declspec(dllimport) modifier with the function:
__declspec(dllimport) void Foo(void);
If you look at the Windows system header files, you'll find that they use __declspec(dllimport) for the Windows APIs. It's not easy to see this, but if you search for the DECLSPEC_IMPORT macro, which is defined in WINNT.H and used in files such as WinBase.H, you'll see how __declspec(dllimport) is prepended to the system API declarations.

PE File Structure

      Now let's dig into the actual format of PE files. I'll start from the beginning of the file, and describe the data structures that are present in every PE file. Afterwards, I'll describe the more specialized data structures (such as imports or resources) that reside within a PE's sections. All of the data structures that I'll discuss below are defined in WINNT.H, unless otherwise noted.
      In many cases, there are matching 32 and 64-bit data structures (for example, IMAGE_NT_HEADERS32 and IMAGE_NT_HEADERS64). These structures are almost always identical, except for some widened fields in the 64-bit versions. If you're trying to write portable code, there are #defines in WINNT.H which select the appropriate 32 or 64-bit structures and alias them to a size-agnostic name (in the previous example, it would be IMAGE_NT_HEADERS). The structure selected depends on which mode you're compiling for (specifically, whether _WIN64 is defined or not). You should only need to use the 32 or 64-bit specific versions of the structures if you're working with a PE file with size characteristics that are different from those of the platform you're compiling for.

The MS-DOS Header

      Every PE file begins with a small MS-DOS® executable. The need for this stub executable arose in the early days of Windows, before a significant number of consumers were running it. When executed on a machine without Windows, the program could at least print out a message saying that Windows was required to run the executable.
      The first bytes of a PE file begin with the traditional MS-DOS header, called an IMAGE_DOS_HEADER. The only two values of any importance are e_magic and e_lfanew. The e_lfanew field contains the file offset of the PE header. The e_magic field (a WORD) needs to be set to the value 0x5A4D. There's a #define for this value, named IMAGE_DOS_SIGNATURE. In ASCII representation, 0x5A4D is MZ, the initials of Mark Zbikowski, one of the original architects of MS-DOS.
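      Reading those two fields takes only a few lines. The sketch below works on a raw byte buffer instead of the WINNT.H structure so that it stays self-contained; the helper names are hypothetical. It relies on two layout facts from the IMAGE_DOS_HEADER definition: the header is 64 bytes long and e_lfanew is its last field, at offset 0x3C.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint16_t kImageDosSignature = 0x5A4D;  // IMAGE_DOS_SIGNATURE ("MZ")
constexpr std::size_t   kDosHeaderSize     = 64;      // sizeof(IMAGE_DOS_HEADER)
constexpr std::size_t   kLfanewOffset      = 0x3C;    // offset of e_lfanew

// PE fields are little-endian; assemble them byte by byte so the code
// does not depend on the host's byte order or alignment rules.
inline std::uint16_t read_u16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
inline std::uint32_t read_u32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: checks e_magic and returns e_lfanew through pe_offset.
inline bool find_pe_header_offset(const unsigned char* data, std::size_t size,
                                  std::uint32_t* pe_offset) {
    assert(data != nullptr && pe_offset != nullptr);
    if (size < kDosHeaderSize) return false;                 // too small for a DOS header
    if (read_u16(data) != kImageDosSignature) return false;  // does not start with "MZ"
    const std::uint32_t offset = read_u32(data + kLfanewOffset);
    if (offset > size - 4) return false;                     // no room for the PE signature
    *pe_offset = offset;
    return true;
}
```

      The same logic applies to a module in memory: point data at the load address and the value returned is the offset of the PE header from that address.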

The IMAGE_NT_HEADERS Header

      The IMAGE_NT_HEADERS structure is the primary location where specifics of the PE file are stored. Its offset is given by the e_lfanew field in the IMAGE_DOS_HEADER at the beginning of the file. There are actually two versions of the IMAGE_NT_HEADERS structure, one for 32-bit executables and the other for 64-bit versions. The differences are so minor that I'll consider them to be the same for the purposes of this discussion. The only correct, Microsoft-approved way of differentiating between the two formats is via the value of the Magic field in the IMAGE_OPTIONAL_HEADER (described shortly).
      An IMAGE_NT_HEADERS structure is comprised of three fields:
typedef struct _IMAGE_NT_HEADERS {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
In a valid PE file, the Signature field is set to the value 0x00004550, which in ASCII is "PE\0\0" (the letters P and E followed by two zero bytes). A #define, IMAGE_NT_SIGNATURE, is defined for this value. The second field, a struct of type IMAGE_FILE_HEADER, predates PE files. It contains some basic information about the file; most importantly, a field describing the size of the optional data that follows it. In PE files, this optional data is very much required, but is still called the IMAGE_OPTIONAL_HEADER.
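      Checking the signature and the Magic field can be sketched as follows. The code again uses a raw byte buffer and hypothetical helper names. It assumes the layout described above: a 4-byte Signature, a 20-byte IMAGE_FILE_HEADER, and then the optional header, whose first WORD is Magic. The two Magic values are the ones WINNT.H defines as IMAGE_NT_OPTIONAL_HDR32_MAGIC and IMAGE_NT_OPTIONAL_HDR64_MAGIC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kImageNtSignature = 0x00004550;  // IMAGE_NT_SIGNATURE ("PE\0\0")
constexpr std::uint16_t kMagicPe32        = 0x10b;       // IMAGE_NT_OPTIONAL_HDR32_MAGIC
constexpr std::uint16_t kMagicPe32Plus    = 0x20b;       // IMAGE_NT_OPTIONAL_HDR64_MAGIC
constexpr std::size_t   kFileHeaderSize   = 20;          // sizeof(IMAGE_FILE_HEADER)

// Little-endian readers that do not depend on host byte order or alignment.
inline std::uint16_t read_u16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
inline std::uint32_t read_u32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

enum class PeKind { NotPe, Pe32, Pe32Plus, UnknownMagic };

// Hypothetical helper: pe_offset is the e_lfanew value from the DOS header.
inline PeKind classify_pe(const unsigned char* data, std::size_t size,
                          std::uint32_t pe_offset) {
    assert(data != nullptr);
    // Magic sits right after the Signature and the IMAGE_FILE_HEADER.
    const std::size_t magic_offset =
        static_cast<std::size_t>(pe_offset) + 4 + kFileHeaderSize;
    if (size < 2 || magic_offset > size - 2) return PeKind::NotPe;  // truncated
    if (read_u32(data + pe_offset) != kImageNtSignature) return PeKind::NotPe;
    const std::uint16_t magic = read_u16(data + magic_offset);
    if (magic == kMagicPe32) return PeKind::Pe32;
    if (magic == kMagicPe32Plus) return PeKind::Pe32Plus;
    return PeKind::UnknownMagic;
}
```

      A tool that handles both formats would branch on the result once and then use the 32-bit or 64-bit structure definitions accordingly.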
      Figure 3 shows the fields of the IMAGE_FILE_HEADER structure, with additional notes for the fields. This structure can also be found at the very beginning of COFF OBJ files. Figure 4 lists the common values of IMAGE_FILE_xxx. Figure 5 shows the members of the IMAGE_OPTIONAL_HEADER structure.
      The DataDirectory array at the end of the IMAGE_OPTIONAL_HEADERs is the address book for important locations within the executable. Each DataDirectory entry looks like this:
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD   VirtualAddress;     // RVA of the data
    DWORD   Size;               // Size of the data
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

The Section Table

      Immediately following the IMAGE_NT_HEADERS is the section table. The section table is an array of IMAGE_SECTION_HEADER structures. An IMAGE_SECTION_HEADER provides information about its associated section, including location, length, and characteristics. Figure 6 contains a description of the IMAGE_SECTION_HEADER fields. The number of IMAGE_SECTION_HEADER structures is given by the IMAGE_NT_HEADERS.FileHeader.NumberOfSections field.
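      Walking the section table from a raw buffer might look like the sketch below. The names Section and read_section_table are hypothetical. The field offsets follow the WINNT.H layouts: NumberOfSections and SizeOfOptionalHeader sit at offsets 2 and 16 of the IMAGE_FILE_HEADER, and each IMAGE_SECTION_HEADER is 40 bytes, starting with an 8-byte name. The table is located by skipping SizeOfOptionalHeader bytes past the file header, which lands immediately after the IMAGE_NT_HEADERS in an ordinary image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t kFileHeaderSize    = 20;  // sizeof(IMAGE_FILE_HEADER)
constexpr std::size_t kSectionHeaderSize = 40;  // sizeof(IMAGE_SECTION_HEADER)

// Little-endian readers that do not depend on host byte order or alignment.
inline std::uint16_t read_u16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
inline std::uint32_t read_u32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical record with a few of the IMAGE_SECTION_HEADER fields.
struct Section {
    std::string   name;
    std::uint32_t virtual_size;
    std::uint32_t virtual_address;
    std::uint32_t raw_size;
    std::uint32_t raw_offset;
};

// Hypothetical helper: pe_offset is the e_lfanew value from the DOS header.
// Stops early instead of reading past the end of the buffer.
inline std::vector<Section> read_section_table(const unsigned char* data,
                                               std::size_t size,
                                               std::uint32_t pe_offset) {
    assert(data != nullptr);
    std::vector<Section> sections;
    const std::size_t file_header = static_cast<std::size_t>(pe_offset) + 4;
    if (file_header + kFileHeaderSize > size) return sections;

    const std::uint16_t count    = read_u16(data + file_header + 2);   // NumberOfSections
    const std::uint16_t opt_size = read_u16(data + file_header + 16);  // SizeOfOptionalHeader
    std::size_t entry = file_header + kFileHeaderSize + opt_size;

    for (std::uint16_t i = 0; i < count; ++i, entry += kSectionHeaderSize) {
        if (entry + kSectionHeaderSize > size) break;
        const unsigned char* p = data + entry;
        Section s;
        // The name is 8 bytes, zero-padded, and not necessarily terminated.
        std::size_t len = 0;
        while (len < 8 && p[len] != 0) ++len;
        s.name.assign(reinterpret_cast<const char*>(p), len);
        s.virtual_size    = read_u32(p + 8);
        s.virtual_address = read_u32(p + 12);
        s.raw_size        = read_u32(p + 16);
        s.raw_offset      = read_u32(p + 20);
        sections.push_back(s);
    }
    return sections;
}
```

      The four numeric fields collected here are the same ones PEDUMP prints as VirtSize, VirtAddr, raw data size, and raw data offs.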
      The file alignment of sections in the executable file can have a significant impact on the resulting file size. In Visual Studio 6.0, the linker defaulted to a 4KB file alignment for sections, unless /OPT:NOWIN98 or the /ALIGN switch was used. The Visual Studio .NET linker, while still defaulting to /OPT:WIN98, determines if the executable is below a certain size and if that is the case uses 0x200-byte alignment.
      Another interesting alignment comes from the .NET file specification. It says that .NET executables should have an in-memory alignment of 8KB, rather than the expected 4KB for x86 binaries. This is to ensure that .NET executables built with x86 entry point code can still run under IA-64. If the in-memory section alignment were 4KB, the IA-64 loader wouldn't be able to load the file, since pages are 8KB on IA-64 versions of Windows.

Wrap-up

      That's it for the headers of PE files. In Part 2 of this article I'll continue the tour of portable executable files by looking at commonly encountered sections. Then I'll describe the major data structures within those sections, including imports, exports, and resources. And finally, I'll go over the source for the updated and vastly improved PEDUMP.
For related articles see:
Peering Inside the PE: A Tour of the Win32 Portable Executable File Format

For background information see:
The Common Object File Format (COFF)

 

Matt Pietrek is an independent writer, consultant, and trainer. He was the lead architect for Compuware/NuMega's Bounds Checker product line for eight years and has authored three books on Windows system programming. His Web site, at http://www.wheaty.net, has a FAQ page and information on previous columns and articles.