Program Structure: .data, .bss, .text, global _start

Program Structure (.data, .bss, .text, global _start)

Assembly programs are organized into named sections that separate code from data. Understanding this layout is essential for writing, linking, and debugging NASM programs.

Overview

A typical x86-64 assembly program for GNU/Linux contains:

  • A **comment block** describing metadata
  • A **.data** section for initialized data
  • A **.bss** section for uninitialized data
  • A **.text** section for executable code
  • A **global _start** declaration, which exports the `_start` label that marks the entry point

Comment Block

The top of an assembly source file often contains documentation for readability and maintenance.

Example:

; Executable name : eatsyscall64
; Version         : 1.0
; Author          : E. Benoist
; Description     : A simple NASM Linux 64-bit program
; Build with:
; nasm -f elf64 -g -F dwarf eatsyscall64.asm
; ld -o eatsyscall64 eatsyscall64.o

.data Section

This section holds **initialized data**, such as strings and variables with predefined values. The values are stored directly in the assembled object file and in the final executable.

Example:

SECTION .data
Msg: db "Hello, world!", 10
MsgLen: equ $ - Msg

Characteristics:

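  • Contents are included in the binary, so they add to its file size.
  • Readable and writable at runtime; used for strings and variables that need a starting value. Data that must stay read-only conventionally goes in a separate .rodata section.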
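The following is a minimal sketch of how different data-definition directives might appear together in .data. `Msg` and `MsgLen` come from the example above; `Answer`, `Table`, and `TableLen` are hypothetical names added only for illustration.

```nasm
SECTION .data
Msg:      db "Hello, world!", 10   ; 14 bytes: 13 characters plus a newline (10)
MsgLen:   equ $ - Msg              ; assemble-time constant (14); occupies no memory
Answer:   dq 42                    ; hypothetical 64-bit initialized variable
Table:    dw 1, 2, 3               ; hypothetical array of three 16-bit words
TableLen: equ ($ - Table) / 2      ; number of entries in Table (3)
```

Note that `equ` only defines a symbol for the assembler; unlike `db`, `dw`, and `dq`, it stores nothing in the section. Each `equ` must follow its data immediately, because `$` refers to the current position in the section.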

.bss Section

The **Block Started by Symbol (BSS)** section reserves space for uninitialized data. The file records only how much space is needed, so reservations in .bss do not increase the program’s file size; the memory itself is provided when the program is loaded.

Example:

SECTION .bss
Buffer: resb 64     ; reserve 64 bytes
Counter: resq 1     ; reserve one 64-bit integer

Characteristics:

  • Space is allocated when the program is loaded, not stored in the file.
  • Memory is zero-filled by the operating system before the program starts.
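As a rough illustration of how reserved space might be used, the sketch below reads up to 64 bytes from standard input into `Buffer` and writes them back out. `BufLen` is a hypothetical name introduced here; the system call numbers (0 for read, 1 for write, 60 for exit) are those of x86-64 Linux.

```nasm
SECTION .bss
Buffer: resb 64             ; reserve 64 bytes (zero-filled at load time)
BufLen: equ $ - Buffer      ; hypothetical constant: size of Buffer (64)

SECTION .text
global _start
_start:
    mov rax, 0              ; sys_read
    mov rdi, 0              ; stdin
    mov rsi, Buffer         ; destination: the reserved buffer
    mov rdx, BufLen         ; read at most 64 bytes
    syscall                 ; rax = bytes read, 0 at end of input, negative on error
    test rax, rax
    jle .exit               ; nothing to echo on end of input or error

    mov rdx, rax            ; write exactly as many bytes as were read
    mov rax, 1              ; sys_write
    mov rdi, 1              ; stdout
    mov rsi, Buffer
    syscall
.exit:
    mov rax, 60             ; sys_exit
    xor edi, edi            ; exit status 0
    syscall
```

Because the buffer lives in .bss, the executable stores only its size, not 64 bytes of zeros.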

.text Section

This section contains the program’s **executable instructions**. It is where the program’s logic resides.

Example:

SECTION .text
global _start
_start:
    mov rax, 1        ; sys_write
    mov rdi, 1        ; file descriptor 1: stdout
    mov rsi, Msg      ; pointer to message
    mov rdx, MsgLen   ; message length in bytes
    syscall
    mov rax, 60       ; sys_exit
    xor rdi, rdi      ; exit status 0
    syscall

The global _start Label

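A standalone Linux assembly program linked directly with `ld`, without the C runtime, normally defines a global entry point called `_start`. This is the symbol the linker looks for by default, and the `global` directive is what makes the label visible to it. Execution begins at `_start` once the operating system has loaded the program.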

Example:

global _start
_start:
    ; program entry point
    mov rax, 60       ; sys_exit
    xor rdi, rdi      ; exit status 0
    syscall
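The name `_start` is the default that `ld` searches for rather than a fixed rule; its `-e` option names a different entry symbol. The sketch below assumes a hypothetical file `entry.asm` and a hypothetical label `main_entry`, and exits with status 3 so the result is easy to observe from the shell.

```nasm
; Hypothetical file: entry.asm
; Build with:
;   nasm -f elf64 entry.asm
;   ld -e main_entry -o entry entry.o
; Run and inspect the exit status:
;   ./entry ; echo $?

SECTION .text
global main_entry           ; export the label so the linker can find it
main_entry:
    mov rax, 60             ; sys_exit
    mov rdi, 3              ; exit status 3
    syscall
```

Without `-e main_entry`, `ld` would warn that it cannot find `_start`. Keeping the conventional name avoids the extra option.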

Summary

  • **.data** — constants and initialized data
  • **.bss** — uninitialized variables (space reserved)
  • **.text** — program code
  • **_start** — default entry point that the linker looks for

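Placing initialized data, reserved space, and code in their proper sections lets the assembler, linker, and loader treat each region appropriately, and a visible entry point tells the linker where execution should begin.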