The Dumbest Program in the World! Lesson 1

We're going to write our first program together. It's going to be the best program at doing nothing that has ever been programmed. Type this into your terminal.

What this does is create a file with nothing in it. Believe it or not, it will assemble. It won't link, though. We'll see why in a bit. If you enter that command again, it'll just update the time stamp on the file, even if there's something in it by then. So let's look at it.
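A command that behaves the way described here is touch. Assuming that's the one meant, a minimal sketch of creating the file and looking at it:

```shell
touch nothing.asm              # create the file, or just refresh its time stamp
ls -l nothing.asm              # the size column should read 0 for a new file
size=$(wc -c < nothing.asm)    # count the bytes directly
echo "$size"
```

Running touch a second time leaves the contents alone and only changes the modification time that ls -l shows.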

As you can see, it is zero bytes long. Let's assemble it and see nothing happen. When NASM runs, the only time it will output any text is when something is wrong. So if it doesn't say anything, that's good. GCC, GAS, TCC, and LD all do that, too.

The -f elf64 argument tells the assembler to produce a 64-bit ELF object file. That's what we want. nothing.asm is our source file. -o nothing.o is the output file. So, if we ls -l again, we should see the .o file.
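Putting those arguments together gives a command along these lines. This is a sketch, assuming NASM is installed and you are in the directory that holds nothing.asm:

```shell
touch nothing.asm                          # make sure the (empty) source file exists
nasm -f elf64 nothing.asm -o nothing.o     # assemble to a 64-bit ELF object file
ls -l nothing.asm nothing.o                # both files should be listed now
```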

There it is. 304 bytes of nothing. Let's try to link it. The command is
ld -z noexecstack -no-pie -s nothing.o -o nothing
The -z noexecstack and -no-pie parts are there to get rid of some annoying warnings. We'll eventually learn about those. The -s is to strip the executable of labels and symbols to make the file a bit smaller. You can actually leave that out. It won't hurt a thing.

Like I said, if we see any of these programs output text, it's probably because we effed something up. That last line is what the problem is.
"ld: warning: cannot find entry symbol _start"
We didn't tell the program where to start. It's an empty file, so there's nothing there to start. I guess we'll have to open that text editor and type something in. Open your text editor. Enter this and save it as nothing.asm

global _start

section .text

_start:
    mov    rax,    60
    syscall

So let's assemble and link it again. Isn't it nice to see nothing, again? Let's run it too. It looks like this program did absolutely nothing and output nothing at all.
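For reference, here is the whole cycle as one sketch. It assumes x86-64 Linux with NASM and GNU ld installed. The heredoc only recreates the listing above so the block stands on its own; if you already saved nothing.asm from your editor, skip that part.

```shell
# Write the source file (same contents as the listing above).
cat > nothing.asm <<'EOF'
global _start

section .text

_start:
    mov    rax,    60
    syscall
EOF

nasm -f elf64 nothing.asm -o nothing.o               # assemble
ld -z noexecstack -no-pie -s nothing.o -o nothing    # link
./nothing                                            # run: prints nothing
```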

But it did something. The first line, global _start, makes the _start symbol visible to the linker and to any other modules. That's what the linker complained about. _start tells the linker where the program starts executing code. section .text is where the executable code and instructions go. There are a couple of other sections to know about, too: the .data section and the .bss section. We'll get into them later.

_start: tells the program that this is where we start running. It's what's called a label. Every label is a number, too. It is the location in memory where the code or data at the label lives.

mov rax, 60 is an instruction. RAX is what's called a register. Registers are like little pieces of memory, but they are in the CPU itself and not in RAM or ROM. They are very fast. There are more registers than this one. In this lesson, we'll learn about one more register.

60 is the number that we are "moving" into the RAX register. The destination is first (rax) and the source is second (60). 60 is also known as an immediate value. An immediate value is a number constant that you operate on instead of a register or memory. You can move memory into registers, registers into memory, and immediates into registers or memory, but you can't move memory into memory. x86 has no mov form that takes two memory operands, so you'll tick off the assembler if you try.

RAX is also a 64-bit register. It can hold 8 bytes of information. That's a signed number from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. You can also use the low half of that register as a 32-bit register (EAX), the low fourth of it as a 16-bit register (AX), and the low eighth of it as an 8-bit register (AL). We could write mov eax, 60 and get the same result, because writing to a 32-bit register clears the upper half of the 64-bit one. mov al, 60 and mov ax, 60 only change the low bits and leave the rest of RAX alone, so they do the same thing here only if the rest of RAX already happens to be zero.
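To picture how EAX, AX, and AL sit at the low end of RAX, here is a sketch in shell arithmetic (not assembly). The value in rax is made up for illustration, and the masks just pick out the same bits the smaller register names refer to.

```shell
rax=$(( 0x1122334455667788 ))    # hypothetical 64-bit value sitting in RAX
eax=$(( rax & 0xFFFFFFFF ))      # low 32 bits, what EAX would show
ax=$(( rax & 0xFFFF ))           # low 16 bits, what AX would show
al=$(( rax & 0xFF ))             # low 8 bits, what AL would show
printf 'rax=%016x eax=%08x ax=%04x al=%02x\n' "$rax" "$eax" "$ax" "$al"
# prints: rax=1122334455667788 eax=55667788 ax=7788 al=88
```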

syscall tells the program to execute a system call. These are functions in the Linux kernel that do things. The particular function we are calling is exit. The number 60 in the RAX register tells syscall that the exit call is what to do. There are a lot of syscall functions. There are over 300 of them in Linux, and I know what a dozen of them do. You can look at all of them at this page. I encourage you to. So after the system call of exit is made, we return to the command prompt.

Every program does output one thing, though: the exit code, sometimes called the error code. Even if we don't tell the program what exit code to pass to the operating system, it will output something. To see what exit code the program output, we enter this command.
echo $?

This time it exited on a 0. An exit code of zero is usually considered a success. However, we didn't tell it what exit code to quit on. It was left undefined, and I wouldn't have been surprised to see any random number in there. The exit syscall uses whatever is in the RDI register as the exit code. Right now, just think of RDI as another RAX. We can move a number into RDI the same way we did with RAX. After the mov rax, 60 line, insert this line: mov rdi, 5. Save, assemble, link, run and check the exit code.

global _start

section .text

_start:
    mov    rax,    60
    mov    rdi,    5
    syscall
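Once this version is rebuilt, running it and checking $? should show 5. Here is a sketch of that check, using sh -c 'exit 5' as a stand-in for ./nothing so it runs even without the binary:

```shell
sh -c 'exit 5'     # stand-in for ./nothing: a child process that exits with 5
status=$?          # $? holds the exit code of the most recent command
echo "$status"     # prints 5
```

Grab $? right away. Every command you run afterwards, including echo, replaces it with its own exit code.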

Now we got a program that can output a 5 all day without getting tired. Let's end on something more interesting.

global _start

section .text

_start:
    mov    rax,    60
    rdrand rdi
    syscall

rdrand rdi fills all 64 bits of RDI with random 1s and 0s, on CPUs that support the instruction. Only the lowest byte of RDI ends up as the exit code, because exit codes go from 0 to 255. That's an unsigned byte. So let's do it.
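Keeping only the lowest byte works the same way as masking with 255. A small sketch in shell arithmetic, where the value of rdi is made up and simply stands in for whatever rdrand might leave behind:

```shell
rdi=1000                       # hypothetical value left in RDI
exit_code=$(( rdi & 0xFF ))    # keep only the lowest 8 bits
echo "$exit_code"              # prints 232
```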

So it turns out that "The Dumbest Program in the World" actually does something maybe useful. A program that outputs a random exit code can pass that code to other programs and can be used in batch files or bash scripts. Next, we'll talk more about registers and system calls. We'll build some more dumb programs, too. Most tutorials start with a program that shouts "Hello" to the world like some kind of idiot. We'll make that one, too.
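As a rough idea of how a script could use that exit code, here is a sketch. The roll function is a made-up stand-in that always exits with 7; swap its body for ./nothing once the real program is built.

```shell
# Hypothetical stand-in for ./nothing; replace the body with ./nothing.
roll() { sh -c 'exit 7'; }

roll
code=$?                               # capture the exit code right away
if [ $(( code % 2 )) -eq 0 ]; then    # branch on whether it is even or odd
    result="even"
else
    result="odd"
fi
echo "exit code $code is $result"     # prints: exit code 7 is odd
```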
