Assembly Programming: Overview & Conventions

Revision as of 14:46, 20 October 2025 by Bfh-sts

This page introduces x86-64 assembly programming, the syntax conventions used in this course, and the tools involved.

Introduction

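Assembly language (ASM) is a low-level programming language that provides direct access to the CPU’s instruction set. Unlike high-level languages, which hide the hardware behind variables and types, assembly instructions operate directly on registers, memory addresses, and immediate values (constants encoded in the instruction itself).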

In this course, we use NASM (Netwide Assembler) on GNU/Linux with Intel syntax. Many beginners find Intel syntax easier to read, and it matches the notation used in the Intel and AMD manuals.
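To make the three operand kinds concrete, here is a minimal sketch of a complete NASM program for x86-64 Linux. The file name `operands.asm`, the label `counter`, and the build commands in the comments are assumptions chosen for illustration, not part of the course material.

```nasm
; operands.asm (hypothetical file name)
; Assumed build and run commands:
;   nasm -f elf64 operands.asm -o operands.o
;   ld operands.o -o operands
;   ./operands ; echo $?

        default rel             ; use RIP-relative addressing for [label]
        global  _start          ; export the entry point for the linker

        section .data
counter: dq     40              ; 8 bytes of memory holding the value 40

        section .text
_start:
        mov     rax, [counter]  ; memory operand    -> register (rax = 40)
        add     rax, 2          ; immediate operand -> register (rax = 42)
        mov     [counter], rax  ; register          -> memory
        mov     rdi, rax        ; register          -> register (exit status)

        mov     eax, 60         ; syscall number 60 = exit on x86-64 Linux
        syscall                 ; exits with status 42
```

After running the program, `echo $?` in the shell should report the exit status 42.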

Intel vs. AT&T syntax

There are two dominant syntaxes for x86 assembly:

  • **Intel syntax** – used by NASM, MASM, and the Intel and AMD manuals.
  • **AT&T syntax** – the default for the GNU Assembler (GAS) and for GCC's assembly output, and traditional on Unix systems.

Main differences in AT&T syntax:

  • Immediate values are prefixed with `$`, e.g. `push $4` instead of `push 4`.
  • Registers are prefixed with `%`, e.g. `inc %eax` instead of `inc eax`.
  • The operand order is reversed: Intel syntax puts the destination first, AT&T syntax puts the source first, so `add eax, 4` becomes `add $4, %eax`.
  • The operand size is given as a suffix on the mnemonic (`b`, `w`, `l`, `q` for 8, 16, 32, 64 bits): `mov ax, bx` becomes `movw %bx, %ax`.

In this course we use **Intel syntax** exclusively.

Example comparison

Intel syntax:

mov eax, 4
add eax, 2
mov ebx, eax

AT&T syntax:

movl $4, %eax
addl $2, %eax
movl %eax, %ebx

Intel syntax is often considered easier to read, especially for memory operands, because the address is written as an arithmetic expression:

lea eax, [rcx + rax*8 - 0x30]

vs.

lea -0x30(%rcx, %rax, 8), %eax
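Both forms encode the same address expression, base + index*scale + displacement. AT&T syntax writes it as `disp(base, index, scale)`. The following sketch works through the arithmetic for the `lea` above. The input values `0x1000` and `3` are assumptions chosen only to make the calculation easy to follow.

```nasm
; lea_demo.asm (hypothetical file name)
; Assumed build commands:
;   nasm -f elf64 lea_demo.asm -o lea_demo.o
;   ld lea_demo.o -o lea_demo

        global  _start

        section .text
_start:
        mov     rcx, 0x1000             ; base  (assumed example value)
        mov     rax, 3                  ; index (assumed example value)

        ; lea only computes the address expression; it does not read memory.
        ; 0x1000 + 3*8 - 0x30 = 0x1000 + 0x18 - 0x30 = 0xFE8
        lea     eax, [rcx + rax*8 - 0x30]

        xor     edi, edi                ; rdi = 0 (must precede cmp: xor changes flags)
        cmp     eax, 0xFE8              ; compare with the hand-computed result
        setne   dil                     ; exit status 1 if they differ, else 0

        mov     eax, 60                 ; syscall number 60 = exit on x86-64 Linux
        syscall
```

The program exits with status 0 when the computed value matches the hand calculation.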

Why learn assembly

Learning assembly shows how a computer actually executes code:

  • How registers, memory, and the stack interact.
  • How arithmetic and logic are performed at CPU level.
  • How operating system services are invoked through syscalls.

It builds a foundation for understanding compilers, reverse engineering, performance optimization, and embedded systems.
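The three points listed above can be seen together in one short program. This is a minimal sketch, assuming x86-64 Linux and a static link with `ld`. The file name `why.asm`, the labels `msg` and `msglen`, and the message text are made up for illustration.

```nasm
; why.asm (hypothetical file name)
; Assumed build commands:
;   nasm -f elf64 why.asm -o why.o
;   ld why.o -o why

        default rel             ; use RIP-relative addressing for [label]
        global  _start

        section .data
msg:    db      "hi", 10        ; message bytes followed by a newline
msglen  equ     $ - msg         ; length computed by the assembler (3)

        section .text
_start:
        ; Arithmetic in registers
        mov     eax, 4
        add     eax, 2          ; eax = 6

        ; Stack: save the result, because the syscall below overwrites rax
        push    rax

        ; Syscall: write(1, msg, msglen)
        mov     eax, 1          ; syscall number 1 = write
        mov     edi, 1          ; file descriptor 1 = stdout
        lea     rsi, [msg]      ; address of the buffer
        mov     edx, msglen     ; number of bytes to write
        syscall

        ; Stack: restore the saved result into rdi
        pop     rdi             ; rdi = 6

        ; Syscall: exit(6)
        mov     eax, 60         ; syscall number 60 = exit
        syscall
```

The program prints `hi` and exits with status 6. The `push`/`pop` pair is needed because the kernel returns the syscall result in `rax`, replacing the value computed earlier.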