Assembly Language: Difference between revisions

Latest revision as of 06:26, 11 November 2024

This article mostly describes the x86 instruction set, specifically the i586 version (Pentium 100 processor).

See: X86 instruction set

Basics

What is Assembly Language

Assembly language is a low-level programming language for a computer or other programmable device.

What is the difference between assembly language and high-level programming languages

Each assembly language is specific to a given computer architecture (instruction set).

High-level programming languages are mostly portable across multiple systems.

How is the source code of a high-level programming language converted to executable machine code

Via a Compiler

How is the source code of an assembly language converted to the executable machine code

Via an Assembler

List examples of some assemblers

NASM - Netwide Assembler; free, well documented, can be used on both Linux and Windows

MASM - Microsoft Assembler

TASM - Borland Turbo Assembler

GAS - GNU Assembler

Why do assembly languages exist?

Each computer has a microprocessor that manages its arithmetic, logical and control activities.

Each family of processors has its own set of instructions.

Processors understand only machine language instructions, which are sequences of ones and zeros. Developing software using only ones and zeros is too hard and complex. As a solution, there is an assembly language for each instruction set, in which instructions are represented by symbolic codes in a more understandable form.
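
To make the contrast concrete, here is a small sketch in NASM syntax that puts three symbolic instructions next to the machine-code bytes they assemble to. The byte values in the comments are the usual 32-bit x86 encodings that NASM is expected to emit; the `_start` label is only an illustrative choice.

```nasm
bits 32                      ; assemble for 32-bit mode

section .text
global _start
_start:
    mov eax, 1               ; machine code bytes: B8 01 00 00 00
    xor ebx, ebx             ; machine code bytes: 31 DB
    int 0x80                 ; machine code bytes: CD 80
```

Assembling with a listing file, for example `nasm -f elf32 -l listing.lst file.asm`, should show these bytes next to each source line, so the symbolic form and the machine form can be compared directly.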

Advantages of understanding the assembly language

You will know:

  • How applications communicate with the operating system, processor and BIOS
  • How data is represented in memory and on external devices
  • How the processor accesses and executes instructions
  • How instructions access and process data

Advantages of the assembly language

Lower memory (RAM) usage

Shorter execution time

Suitable for time-critical jobs

List 3 basic parts of the computer hardware related to the computing part

Processor, memory and registers.

Briefly describe the processor

The processor executes program instructions.

Briefly describe registers

Registers hold data and addresses.

Briefly describe memory

Storage for data. Its transfer speed is much higher than that of an HDD or SSD, but lower than that of registers.

What is a bit

The smallest unit of the storage is a bit, which can be ON (1) or OFF (0).

What is the name for a group of 8 bits?

A group of 8 related bits is called a byte.

Which data sizes are supported by the processor?

  • Word: a 2-byte data item
  • Doubleword: a 4-byte (32 bit) data item
  • Quadword: an 8-byte (64 bit) data item
  • Paragraph: a 16-byte (128 bit) area
  • Kilobyte: 1024 bytes
  • Megabyte: 1,048,576 bytes
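
As a rough illustration, the first three data sizes above, plus the single byte, map onto NASM's data-definition pseudo-instructions as follows. The label names are hypothetical and chosen only for this sketch.

```nasm
section .data
    one_byte    db 0x12                  ; byte: 1 byte
    one_word    dw 0x1234                ; word: 2 bytes
    one_dword   dd 0x12345678            ; doubleword: 4 bytes
    one_qword   dq 0x1122334455667788    ; quadword: 8 bytes
```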

Binary number system

The base is 2.

Hexadecimal number system

The base is 16.

Octal number system

The base is 8.

Decimal number system

The base is 10.
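
The four number systems can be compared by writing one value in each notation. The sketch below uses NASM's numeric literal syntax; the label names are hypothetical, and every line stores the same value, 200.

```nasm
section .data
    dec_value   db 200           ; decimal (base 10)
    hex_value   db 0xC8          ; hexadecimal (base 16), also written 0C8h
    oct_value   db 0o310         ; octal (base 8), also written 310q
    bin_value   db 0b11001000    ; binary (base 2), also written 11001000b
```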

What are the steps of an execution cycle of the processor

  • Fetching the instruction from memory
  • Decoding or identifying the instruction
  • Executing the instruction

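How the processor stores and loads data

The processor stores and loads multi-byte data in reverse-byte sequence (little-endian): the low-order byte is kept at the lowest memory address and the high-order byte at the highest.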
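
A minimal sketch of this byte order, assuming a 32-bit Linux target and the hypothetical label `value`: a doubleword is defined, then only the byte at its lowest address is read back and returned as the exit status.

```nasm
section .data
    value   dd 0x12345678    ; laid out in memory as the bytes 78 56 34 12

section .text
global _start
_start:
    mov   eax, [value]       ; eax = 0x12345678 (the whole doubleword)
    movzx ebx, byte [value]  ; ebx = 0x78, the byte at the lowest address
    mov   eax, 1             ; syscall number for sys_exit
    int   0x80               ; exit status is taken from ebx
```

If the program is assembled and linked as a 32-bit executable, running it and then `echo $?` should report 120 (0x78), which shows that the low-order byte comes first in memory.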

Kinds of memory addressing

  • Absolute address - a direct reference to a specific location.
  • Segment address (or offset) - the starting address of a memory segment together with an offset value.
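
The two kinds of reference can be sketched in NASM as shown below. This is only an approximation of the idea: in the flat memory model used by 32-bit Linux programs the segment bases are zero, so the "starting address plus offset" form is illustrated here with a base register instead of a segment register. The `table` label is hypothetical.

```nasm
section .data
    table   dd 10, 20, 30    ; three doublewords, 4 bytes each

section .text
global _start
_start:
    mov eax, [table]         ; direct reference to one location: eax = 10
    mov esi, table           ; esi = starting address of the table
    mov ebx, [esi + 8]       ; starting address plus offset 8: ebx = 30
    mov eax, 1               ; syscall number for sys_exit
    int 0x80                 ; exit status is taken from ebx
```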

Assembly syntax

Parts of an assembly program

  • data section
  • bss section
  • text section

Data section

Used for declaring initialized data and constants, which do not change at runtime.

The start of this section is declared as: section .data
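
A short sketch of what a data section might contain, in NASM syntax. The names `greeting`, `greeting_len` and `max_items` are made up for this example.

```nasm
section .data
    greeting     db 'Hi', 10         ; two characters followed by a newline (10)
    greeting_len equ $ - greeting    ; assembly-time constant: 3 bytes
    max_items    dd 100              ; an initialized doubleword
```

Here `equ` defines a constant that exists only at assembly time, while `db` and `dd` place initialized bytes in the executable.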

Bss section

Used for declaring variables

The start of this section is declared as: section .bss
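
For comparison, a bss section reserves space without giving it an initial value. The sketch below uses NASM's `resb`, `resd` and `resq` pseudo-instructions; the label names are hypothetical.

```nasm
section .bss
    buffer  resb 64      ; reserve 64 uninitialized bytes
    counter resd 1       ; reserve one doubleword (4 bytes)
    totals  resq 4       ; reserve four quadwords (32 bytes)
```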

Text section

Used for declaring the code.

The start of this section is declared as:

section .text

global _start

_start:

Comments

Comments start with the semicolon (;) character.

A comment cannot span multiple lines; assembly language comments are single-line only.

Comments (;) can start on a new line or after an instruction.
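
The following sketch shows both comment placements in NASM syntax. A longer note is simply written as several consecutive one-line comments.

```nasm
; A comment on a line of its own.
; A longer note is written as several
; consecutive single-line comments.
section .text
global _start
_start:
    mov eax, 1      ; a comment after an instruction (sys_exit)
    xor ebx, ebx    ; exit status 0
    int 0x80        ; invoke the system call
```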

Types of Assembly Language Statements

  • Executable instructions or instructions,
  • Assembler directives or pseudo-ops, and
  • Macros.

Executable instructions, or simply "instructions," direct the processor's actions. Each one includes an operation code (opcode) and corresponds to a single machine language instruction.

Assembler directives, or pseudo-ops, provide guidance to the assembler on aspects of the assembly process. They are non-executable and do not produce machine language instructions.

Macros act as a mechanism for text substitution.
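
The three statement types can be seen side by side in the sketch below, written for NASM. The names `SYS_EXIT` and `exit_with` are hypothetical and exist only for this illustration.

```nasm
%define SYS_EXIT 1          ; single-line macro: plain text substitution

%macro exit_with 1          ; multi-line macro taking one parameter
    mov eax, SYS_EXIT
    mov ebx, %1
    int 0x80
%endmacro

section .text               ; assembler directive: emits no machine code
global _start               ; assembler directive
_start:
    nop                     ; executable instruction: one machine instruction
    exit_with 7             ; macro call: expands to the three lines above
```

After expansion the assembler sees only ordinary instructions, so a macro costs nothing at run time compared with writing the same lines by hand.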

Syntax of Assembly Language Statements

Assembly language statements are entered one statement per line.

[label]   mnemonic   [operands]   [;comment]
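
As an illustration of this layout, the NASM sketch below annotates which of the four fields each line uses. The `count` label is a hypothetical example.

```nasm
section .data
count:  dd 0                    ; label, pseudo-op, operand, comment

section .text
global _start
_start:                         ; a label on its own line
        inc dword [count]       ; mnemonic with one operand
        mov eax, 1              ; mnemonic with two operands
        xor ebx, ebx            ; exit status 0
        int 0x80                ; mnemonic, operand, comment
```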

Hello World program in the Assembly Language

This example shows how to write a simple program in assembly for the x86 architecture that prints "Hello, World!" to the screen using Linux system calls.

section .data
    msg db 'Hello, World!', 0    ; null-terminated string

section .text
    global _start                 ; entry point for the program

_start:
    ; Write the string to stdout (file descriptor 1)
    mov eax, 4                    ; syscall number for sys_write
    mov ebx, 1                    ; file descriptor 1 (stdout)
    mov ecx, msg                  ; pointer to the string
    mov edx, 13                   ; length of the string
    int 0x80                      ; invoke the system call

    ; Exit the program
    mov eax, 1                    ; syscall number for sys_exit
    xor ebx, ebx                  ; exit status 0
    int 0x80                      ; invoke the system call

Explanation

1. Section .data: Defines the data segment, which contains the string "Hello, World!".

2. Section .text: Defines the code segment where the program starts executing.

3. System Call (int 0x80): This is the interface to Linux system calls.

  - eax = 4: Specifies the sys_write system call, which writes to a file descriptor (stdout in this case).
  - ebx = 1: Specifies the file descriptor for standard output.
  - ecx = msg: Points to the memory address of the message string.
  - edx = 13: Specifies the length of the string.

4. The program exits with a status of 0 (xor ebx, ebx clears the register).
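
One possible variation, shown here only as a sketch, lets the assembler compute the string length instead of hard-coding 13, and ends the message with a newline so that the shell prompt starts on a fresh line. The name `msg_len` is an assumption introduced for this example.

```nasm
section .data
    msg     db 'Hello, World!', 10   ; message followed by a newline (10)
    msg_len equ $ - msg              ; length computed by the assembler: 14

section .text
    global _start
_start:
    mov eax, 4          ; syscall number for sys_write
    mov ebx, 1          ; file descriptor 1 (stdout)
    mov ecx, msg        ; pointer to the string
    mov edx, msg_len    ; number of bytes to write
    int 0x80            ; invoke the system call

    mov eax, 1          ; syscall number for sys_exit
    xor ebx, ebx        ; exit status 0
    int 0x80            ; invoke the system call
```

With `$ - msg` the length stays correct if the text of the message is edited later.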

Compilation and Execution

To assemble and run this code on a Linux system, follow these steps:

  1. Save the code in a file (e.g., hello.asm or any name you prefer).
  2. Assemble and link it:
nasm -f elf32 hello.asm -o hello.o
ld -m elf_i386 -s -o hello hello.o
  3. Run the resulting executable:
./hello

This program will print "Hello, World!" to the screen.

What are i586 registers

See: I586 registers

External links

https://www.tutorialspoint.com/assembly_programming/index.htm