G Common Object File Format (COFF)


Overall structure
File header
Optional header
Section headers
Raw data sections
COFF relocation information
Line number information
Symbol table
Additional symbols
String table


This section describes the Common Object File Format, COFF, used by the linker.

For further information on COFF, including the meaning of debugging symbols generated by Wind River compilers, see Understanding and Using COFF, Gircys, Gintaras R., O'Reilly & Associates, Inc., November, 1988.


Overall structure

COFF is used for both object files (.o extension) and executable files. Some information is present only in object files; other information is present only in executable files.

Table G-1   COFF file components 
Section   Description  

File header  

Contains general information; always present.  

Optional header  

Contains information about an executable file; usually only present in executables.  

Section header  

Contains information about the different COFF sections; one for each section.  

Raw data sections  

One for each section containing raw data, such as machine instructions and initialized variables.  

Relocation information  

Contains information about unresolved references to symbols in other modules; one for each section having external references. Usually only present in object files and not in executable files.  

Line number information  

Contains debugging information about source line numbers; one for each section if compiled with the -g option.  

Symbol table  

Contains information about all the symbols in the object file; present if not stripped from an executable file.  

String table  

Contains long symbol names.  

The following figure shows the COFF file structure:


File header

The file header contains general information about the object file and has the following structure from the file filehdr.h:

struct filehdr {
    unsigned short  f_magic;    /* magic */
    unsigned short  f_nscns;    /* number of sections */
    long            f_timdat;   /* date stamp */
    long            f_symptr;   /* fileptr to symtab */
    long            f_nsyms;    /* symtab count */
    unsigned short  f_opthdr;   /* sizeof(optional hdr) */
    unsigned short  f_flags;    /* flags */
};

Table G-2   COFF header fields 
Field   Description  

f_magic  

Magic number used to identify the file as a COFF file. It has the value 0x170 for the PowerPC family of processors.

f_nscns  

Number of sections this file contains.

f_timdat  

Creation time of the file, represented as a 32-bit value.

f_symptr  

File offset of the symbol table.

f_nsyms  

Number of entries in the symbol table.

f_opthdr  

Number of bytes in the Optional Header.

f_flags  

Bit field containing the following flags:

 

F_RELFLG (0x1)  

Set if the COFF file does not contain relocation information; normally true only for executable files.  

 

F_EXEC (0x2)  

Set if the file is executable and all references are resolved.  

 

F_LNNO (0x4)  

Set if the COFF file does not contain line number information; this symbolic debugging information can be stripped with the -s option or the strip program.  

 

F_LSYMS (0x8)  

Set if the COFF file does not contain local symbols; these symbols can be stripped with the -X and -x options to the assembler and linker.  

 

F_AR32W (0x200)  

Always set to indicate Big-Endian byte ordering.  
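
As an illustration of the fields and flags above, the following is a minimal sketch of decoding a file header from raw bytes. It assumes the header occupies 20 bytes on disk (2+2+4+4+4+2+2) in Big-Endian order with no padding; the names filehdr_fields, read_be16, read_be32, coff_parse_filehdr, coff_has_relocations, and coff_is_executable are hypothetical and do not come from filehdr.h.

```c
#include <stddef.h>
#include <stdint.h>

#define COFF_MAGIC_PPC 0x170u   /* f_magic value for PowerPC, per Table G-2 */
#define F_RELFLG       0x1u     /* no relocation information */
#define F_EXEC         0x2u     /* executable, all references resolved */
#define F_LNNO         0x4u     /* no line number information */
#define F_LSYMS        0x8u     /* no local symbols */
#define F_AR32W        0x200u   /* Big-Endian byte ordering */
#define FILHSZ         20u      /* assumed on-disk header size */

/* Decoded copy of the header; fixed-width types avoid host-size surprises. */
struct filehdr_fields {
    uint16_t f_magic;
    uint16_t f_nscns;
    uint32_t f_timdat;
    uint32_t f_symptr;
    uint32_t f_nsyms;
    uint16_t f_opthdr;
    uint16_t f_flags;
};

static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the magic differs. */
int coff_parse_filehdr(const unsigned char *buf, size_t len,
                       struct filehdr_fields *out)
{
    if (len < FILHSZ)
        return -1;
    out->f_magic  = read_be16(buf + 0);
    out->f_nscns  = read_be16(buf + 2);
    out->f_timdat = read_be32(buf + 4);
    out->f_symptr = read_be32(buf + 8);
    out->f_nsyms  = read_be32(buf + 12);
    out->f_opthdr = read_be16(buf + 16);
    out->f_flags  = read_be16(buf + 18);
    return out->f_magic == COFF_MAGIC_PPC ? 0 : -1;
}

/* F_RELFLG is set when relocation information is absent. */
int coff_has_relocations(const struct filehdr_fields *h)
{
    return (h->f_flags & F_RELFLG) == 0;
}

int coff_is_executable(const struct filehdr_fields *h)
{
    return (h->f_flags & F_EXEC) != 0;
}
```

Decoding byte by byte, rather than overlaying the struct on the buffer, keeps the sketch independent of the host's byte order and structure padding.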


Optional header

The optional header contains information about an executable file and has the following structure from the file aouthdr.h:

typedef struct aouthdr {
    short   magic;              /* a.out magic */
    short   vstamp;             /* version stamp */
    long    tsize;              /* .text size */
    long    dsize;              /* .data size */
    long    bsize;              /* .bss size */
    long    entry;              /* entry point */
    long    text_start;         /* fileptr to .text */
    long    data_start;         /* fileptr to .data */
} AOUTHDR;

Table G-3   COFF optional (executable) header fields 
Field   Description  

magic  

Value 0x10b.  

vstamp  

Set by the option -VS, but not used by the linker.  

tsize  

Size of the .text section.  

dsize  

Size of the .data section.  

bsize  

Size of the .bss section.  

entry  

Entry point in the executable program where execution will begin. The default entry point is the symbol start defined in the file function main(). The -e option can change this to any other symbol in the program.  

text_start  

File offset to the .text section in the COFF file.  

data_start  

File offset to the .data section in the COFF file.  


Section headers

There is one section header for each section in the COFF file; the number of sections is given by the f_nscns field in the COFF File Header. Section headers have the following structure from the file scnhdr.h:

struct scnhdr {                     /* modified COFF*/
    char            s_name[8];      /* section name */
    long            s_paddr;        /* physical address */
    long            s_vaddr;        /* virtual address */
    long            s_size;         /* size of section */
    long            s_scnptr;       /* fileptr to raw data*/
    long            s_relptr;       /* fileptr to reloc */
    long            s_lnnoptr;      /* fileptr to lineno */
    unsigned long   s_nreloc;       /* reloc count (unsigned short in standard COFF) */
    unsigned long   s_nlnno;        /* line number count (unsigned short in standard COFF) */
    long            s_flags;        /* flags */
};

#define SCNHDR struct scnhdr
#define SCNHSZ sizeof(SCNHDR)

Table G-4   COFF section header fields 
Field   Description  

s_name[8]  

Eight byte null terminated section name. Standard names include .text, .data, and .bss.

s_paddr  

Physical start address of the section. It is usually set to the same value as s_vaddr, but can be set to a different value with a command in the linker command language. This is useful when initialized data is physically allocated to a ROM address but moved to a logical address in RAM at start-up.

s_vaddr  

Logical start address of the section as allocated by the assembler or linker.

s_size  

Size in bytes of the memory allocated to the section.

s_scnptr  

File offset to the raw data of the section. Note that the .bss section does not have any raw data since it will be initialized by the operating system.

s_relptr  

File offset to the relocation information of the section.

s_lnnoptr  

File offset to the line number information of the section.

s_nreloc  

Number of relocation information entries.

s_nlnno  

Number of line number information entries.

s_flags  

Bit field containing the following flags:

 

STYP_TEXT (0x20)  

Set for a .text section.  

 

STYP_DATA (0x40)  

Set for a .data section.  

 

STYP_BSS (0x80)  

Set for a .bss section.  

 

STYP_INFO (0x200)  

Set for a .comment section.  

The following table shows the correspondence between the type-spec as defined on p.409 and the COFF section flags assigned to the output section.

Table G-5   type-spec - COFF section flag correspondence
type-spec   Section flags (s_flags)

BSS         STYP_BSS
COMMENT     STYP_INFO
CONST       STYP_DATA
DATA        STYP_DATA
TEXT        STYP_TEXT
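
The correspondence in Table G-5 can be sketched as a small lookup. This is only an illustration: the function name type_spec_to_s_flags is hypothetical, the type-spec is assumed to arrive as an upper-case string, and a return value of 0 is used here to mean "not listed in the table".

```c
#include <string.h>

#define STYP_TEXT 0x20L
#define STYP_DATA 0x40L
#define STYP_BSS  0x80L
#define STYP_INFO 0x200L

/* Map a type-spec name to the s_flags value listed in Table G-5.
   Returns 0 for a name that the table does not list. */
long type_spec_to_s_flags(const char *type_spec)
{
    static const struct {
        const char *name;
        long        flags;
    } map[] = {
        { "BSS",     STYP_BSS  },
        { "COMMENT", STYP_INFO },
        { "CONST",   STYP_DATA },
        { "DATA",    STYP_DATA },
        { "TEXT",    STYP_TEXT },
    };
    size_t i;

    for (i = 0; i < sizeof map / sizeof map[0]; i++) {
        if (strcmp(type_spec, map[i].name) == 0)
            return map[i].flags;
    }
    return 0;
}
```

Note that CONST and DATA both map to STYP_DATA, so the type-spec cannot be recovered from s_flags alone.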


Raw data sections

The Raw Data Sections contain the actual raw data for each section.

Table G-6   COFF section names 

.text  

Machine instructions, constant data, and strings.  

.sdata2  

Small constant data; see the Set size limit for "small const" variables (-Xsmall-const=n), p.106.  

.data  

Initialized data.  

.sdata  

Small initialized data; see the Set size limit for "small data" variables (-Xsmall-data=n), p.106.  

.bss  

Uninitialized data; does not have any raw data.  

.sbss  

Small uninitialized data.  

.comment  

Comments from #ident directives in C.  

.init  

Code that is to be executed before the main() function.  

.fini  

Code that is to be executed when the user program has finished execution.  

.eini  

The instructions of the .fini code; the .init, .fini, and .eini sections should be placed after each other in memory.  


COFF relocation information

The Relocation Information segment contains information about unresolved references. Since compilers and assemblers do not know at what absolute memory address a symbol will be allocated, and since they are unaware of definitions of symbols in other files, every reference to such a symbol will create a relocation entry. The relocation entry will point to the address where the reference is being made, and to the symbol table entry that contains the symbol that is referenced. The linker will use this information to fill in the correct address after it has allocated addresses to all symbols.

When an offset is added to a symbol in the assembly source,

lwz     r3,(var+16)(r0)
move.l  var+16,d0

that offset is stored in the address field of the instruction, so that adding the real address of the symbol to the address field yields a correct reference.

The relocation segment does not exist in executable files.

A relocation entry has the following structure from the file reloc.h:

struct reloc {                  /* modified COFF */
    long            r_vaddr;    /* address of reference */
    long            r_symndx;   /* index into symtab */
    unsigned short  r_type;     /* relocation type */
    unsigned short  r_offset;   /* hi word of rel addr */
};

#define RELOC   struct reloc
#define RELSZ   sizeof(RELOC)   /* 10 in standard COFF, which has no r_offset */

Table G-7   COFF relocation entry fields  
Field   Description  

r_vaddr  

The relative address of the area within the current section to be patched with the correct address.  

r_symndx  

Index into the symbol table pointing to the entry describing the symbol that is referenced at r_vaddr.  

r_type  

Type of addressing mode used; it describes whether the mode is absolute or relative, and the size of the addressing mode. See the table below for relocation types used by the Wind River tools.  

r_offset  

The high 16 bits of any offset that is added to the symbol in the R_HVRT16, R_LVRT16, and R_HAVRT16 relocation modes. Since the address field in the instruction is only 16 bits wide, it cannot represent a large offset. Example:

addis r13,r0,(var+0x123456)@ha

The address field in the addis instruction will contain 0x3456 and r_offset will contain 0x12.  
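
The split described for r_offset can be worked through in a few lines. This sketch only reproduces the arithmetic of the example (0x123456 becomes 0x3456 in the address field and 0x12 in r_offset); the names split_offset, split_symbol_offset, and join_symbol_offset are hypothetical, and how the linker then combines the rejoined offset with the symbol address is not covered here.

```c
#include <stdint.h>

/* The two 16-bit halves of a symbol offset, as stored in the object file. */
struct split_offset {
    uint16_t address_field;  /* low 16 bits, kept in the instruction */
    uint16_t r_offset;       /* high 16 bits, kept in the relocation entry */
};

/* Split a 32-bit offset the way the r_offset example describes. */
struct split_offset split_symbol_offset(uint32_t offset)
{
    struct split_offset s;
    s.address_field = (uint16_t)(offset & 0xFFFFu);
    s.r_offset      = (uint16_t)(offset >> 16);
    return s;
}

/* Rebuild the full 32-bit offset from its two halves. */
uint32_t join_symbol_offset(struct split_offset s)
{
    return ((uint32_t)s.r_offset << 16) | s.address_field;
}
```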

  

 

Table G-8   COFF relocation types 
Relocation type   Number   Description  

R_RELWORD   16  

16 bit absolute address:

lwz    r3,var(r0)  

R_HVRT16   131  

Higher 16 bits of an absolute address:

addis  r3,r0,var@h  

R_LVRT16   132  

Lower 16 bits of an absolute address:

lwz    r3,var@l(r0)  

R_HAVRT16   136  

Adjusted higher 16 bits of an absolute address. If the lower 16 bits is a negative number, one is added to the upper 16 bits:

addis  r3,r0,var@ha  

R_PCR16S2   137  

16 bit PC relative address where the lower two bits are ignored:

bc     4,2,label  

R_PCR26S2   138  

26 bit PC relative address where the lower two bits are ignored:

bl     func  

R_REL16S2   139  

16 bit absolute address where the lower two bits are ignored:

bca    4,2,label  

R_REL26S2   140  

26 bit absolute address where the lower two bits are ignored:

bla    func  
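
The difference between the @h, @l, and @ha values used by R_HVRT16, R_LVRT16, and R_HAVRT16 can be checked with a short calculation. This is a sketch with hypothetical function names (ppc_hi, ppc_lo, ppc_ha, ppc_combine); it models an addis that loads the upper half followed by an instruction that adds the lower half as a sign-extended 16-bit value.

```c
#include <stdint.h>

/* @h: upper 16 bits of the address, unadjusted. */
uint16_t ppc_hi(uint32_t addr)
{
    return (uint16_t)(addr >> 16);
}

/* @l: lower 16 bits of the address. */
uint16_t ppc_lo(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* @ha: upper 16 bits, plus one when the lower half is negative as a
   signed 16-bit number (bit 15 set). Adding 0x8000 before the shift
   produces exactly that carry. */
uint16_t ppc_ha(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Model "upper << 16, then add sign-extended lower". */
uint32_t ppc_combine(uint16_t upper, uint16_t lower)
{
    uint32_t sext = (lower & 0x8000u) ? (0xFFFF0000u | lower) : lower;
    return ((uint32_t)upper << 16) + sext;
}
```

For the address 0x12348000 the lower half is 0x8000, which is negative when sign-extended, so @ha yields 0x1235 while @h yields 0x1234. Combining @ha with @l gives back 0x12348000; combining @h with @l would give 0x12338000.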


Line number information

The line number information segment contains the mapping from source line numbers to machine instruction addresses used by symbolic debuggers. This information is only available if the -g option is specified to the compiler.

Line number entries for a section form groups of pairs where the first pair in a group is a pointer to the function containing the source. After that, every source line that has generated any instruction has an entry specifying the line number relative to the beginning of the function, and the corresponding instruction address. Normally only the .text section has line number information. The following table demonstrates the layout of the line number entries:

A line number entry has the following structure from the file linenum.h:

struct lineno {
    union {
        long        l_symndx;
        long        l_paddr;
    } l_addr;
    unsigned long   l_lnno;     /* unsigned short in standard COFF */
};

#define LINENO      struct lineno
#define LINESZ      sizeof(LINENO)  /* 6 in standard COFF */

Table G-9   COFF line number fields 
Field   Description  

l_symndx  

Symbol table index for a new function; only valid if l_lnno is set to zero.  

l_paddr  

Instruction address corresponding to the source line l_lnno.  

l_lnno  

Source line relative to the start of the current function.  
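
The grouping rule (an entry with l_lnno equal to zero starts a new function and carries a symbol table index; the entries that follow carry addresses) can be sketched as a lookup over already-decoded entries. The type lineno_entry and the function lineno_lookup are hypothetical; the union is flattened into a single l_addr member for brevity.

```c
#include <stddef.h>

/* Decoded line number entry. l_addr holds l_symndx when l_lnno is 0,
   and l_paddr otherwise. */
struct lineno_entry {
    long          l_addr;
    unsigned long l_lnno;
};

/* Find the instruction address for function-relative line `line` in the
   function whose symbol table index is `symndx`.
   Returns 1 and stores the address in *paddr if found, else returns 0. */
int lineno_lookup(const struct lineno_entry *tab, size_t count,
                  long symndx, unsigned long line, long *paddr)
{
    int in_function = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (tab[i].l_lnno == 0) {
            /* Start of a new group: l_addr is a symbol table index. */
            in_function = (tab[i].l_addr == symndx);
        } else if (in_function && tab[i].l_lnno == line) {
            *paddr = tab[i].l_addr;
            return 1;
        }
    }
    return 0;
}
```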


Symbol table

The symbol table is an array of entries containing information about the symbols referenced in the COFF file. A symbol table entry has the following structure from the file syms.h:

struct syment {
    union {
        char        _n_name[8];
        struct {
            long    _n_zeroes;
            long    _n_offset;
        } _n_n;
        char        *_n_nptr[2];
    } _n;
    long            n_value;
    short           n_scnum;

    unsigned short  n_type;
    char            n_sclass;
    char            n_numaux;
    short           n_pad;
};

#define SYMENT      struct syment
#define SYMESZ      20              /* 18 in standard COFF, which has no n_pad */
#define n_name      _n._n_name
#define n_nptr      _n._n_nptr[1]
#define n_zeroes    _n._n_n._n_zeroes
#define n_offset    _n._n_n._n_offset

Table G-10   COFF symbol table fields 
Field   Description  

n_name  

Name of the symbol if the length is less than or equal to 8 bytes. If it is less than 8 bytes the name is terminated by a null character.  

n_zeroes  

Zero if a symbol name is longer than 8 bytes. This field overlaps the first 4 bytes of n_name.  

n_offset  

An offset into the String Table if n_zeroes is zero.  

n_nptr  

This pointer allows for overlays.  

n_value  

A value whose contents depend on the symbol type. Normally it contains the address of the symbol, or the size of the symbol if it is a common block. A zero value indicates an undefined symbol if n_scnum is also zero.  

n_scnum  

Section number of the symbol starting with one. A zero value indicates one of two things:

If n_value is zero then the symbol is an undefined symbol that must be defined in another file.

If n_value is not zero then the symbol is a common block of size n_value. All common blocks with the same name are combined by the linker and put in the .bss section, unless some other file defines that symbol in a section.  

n_type  

Type of the symbol; only set if compiled with -g.  

n_sclass  

Storage class of the symbol. There are over 20 storage classes, but most are used only with the -g compiler option. The two classes of interest to the linker are C_EXT, external storage, and C_STAT, static (local to the file) storage.  

n_numaux  

Number of auxiliary entries used by the symbol.  

n_pad  

Pads the structure to a multiple of four bytes.  

Any auxiliary entries to a symbol are stored immediately after the symbol in the table. They are mainly used for symbolic debugging (-g option) and are not discussed here.
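
The rules given for n_scnum and n_value can be summarized in a small classifier. This is a sketch with hypothetical names (sym_kind, classify_symbol). Negative section numbers are not described in this section, so they are reported separately as SYM_OTHER rather than interpreted.

```c
enum sym_kind {
    SYM_UNDEFINED,  /* n_scnum == 0 and n_value == 0 */
    SYM_COMMON,     /* n_scnum == 0 and n_value != 0 (n_value is the size) */
    SYM_DEFINED,    /* n_scnum >= 1: defined in that section */
    SYM_OTHER       /* negative n_scnum: not covered by the text above */
};

/* Classify a symbol from its section number and value. */
enum sym_kind classify_symbol(short n_scnum, long n_value)
{
    if (n_scnum == 0)
        return n_value == 0 ? SYM_UNDEFINED : SYM_COMMON;
    if (n_scnum > 0)
        return SYM_DEFINED;
    return SYM_OTHER;
}
```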


Additional symbols

Wind River uses special COFF symbols as follows:

Table G-11   Special COFF Symbols
Extension   Description  

!sn!section-name  

Long section-name.  

!cd!name  

COMDAT-section-name. See Mark sections as COMDAT for linker collapse (-Xcomdat), p.71.  

!sf!flags  

Section flags (a: allocate, w: write, x: execute, b: bss/nocode).  

!al!value  

Section alignment.  

!wk!symbol-name  

Weak symbol. See weak pragma, p.138.  
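
A tool that reads these symbols might separate the marker from the text that follows it. The sketch below assumes the marker appears literally as the first four characters of the symbol name, as Table G-11 suggests; the exact encoding is not specified here, and the names special_kind and classify_special_symbol are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum special_kind {
    SPECIAL_NONE,
    SPECIAL_LONG_SECTION_NAME,  /* !sn! */
    SPECIAL_COMDAT,             /* !cd! */
    SPECIAL_SECTION_FLAGS,      /* !sf! */
    SPECIAL_ALIGNMENT,          /* !al! */
    SPECIAL_WEAK                /* !wk! */
};

/* Identify a special symbol by its 4-character marker. If payload is not
   NULL, it is set to the text after the marker (or to the whole name when
   no marker matches). */
enum special_kind classify_special_symbol(const char *name,
                                          const char **payload)
{
    static const struct {
        const char       *tag;
        enum special_kind kind;
    } tags[] = {
        { "!sn!", SPECIAL_LONG_SECTION_NAME },
        { "!cd!", SPECIAL_COMDAT },
        { "!sf!", SPECIAL_SECTION_FLAGS },
        { "!al!", SPECIAL_ALIGNMENT },
        { "!wk!", SPECIAL_WEAK },
    };
    size_t i;

    for (i = 0; i < sizeof tags / sizeof tags[0]; i++) {
        if (strncmp(name, tags[i].tag, 4) == 0) {
            if (payload != NULL)
                *payload = name + 4;
            return tags[i].kind;
        }
    }
    if (payload != NULL)
        *payload = name;
    return SPECIAL_NONE;
}
```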


String table

The string table contains the null terminated names of symbols longer than eight characters. Those symbols point into the string table through an offset, n_offset. The first four bytes of the string table contain the size of the table and after that all strings are stored sequentially.
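
Resolving a symbol name from the 8-byte name field and the string table can be sketched as follows. The sketch assumes that n_offset is stored Big-Endian and is measured from the start of the string table, so that valid offsets begin at 4, just past the size field; this follows conventional COFF practice and is not stated explicitly above. The function names be32 and coff_symbol_name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return the symbol name for an 8-byte name field.
   Short names (up to 8 bytes) are copied into short_buf and terminated.
   Long names are returned as a pointer into strtab.
   Returns NULL if the offset is out of range or the string is unterminated. */
const char *coff_symbol_name(const unsigned char name_field[8],
                             const unsigned char *strtab, size_t strtab_len,
                             char short_buf[9])
{
    uint32_t off;

    /* n_zeroes overlaps the first four bytes; nonzero means inline name. */
    if (name_field[0] | name_field[1] | name_field[2] | name_field[3]) {
        memcpy(short_buf, name_field, 8);
        short_buf[8] = '\0';   /* an 8-byte name has no terminator of its own */
        return short_buf;
    }

    off = be32(name_field + 4);            /* n_offset */
    if (off < 4 || off >= strtab_len)
        return NULL;
    if (memchr(strtab + off, '\0', strtab_len - off) == NULL)
        return NULL;
    return (const char *)(strtab + off);
}
```

The bounds check uses the length of the buffer actually read rather than trusting the size stored in the first four bytes of the table.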

 

Copyright © 2002, Wind River Systems, Inc. All rights reserved.