Learning assembly is a major pain in the ass because there is little documentation on the subject compared to higher level languages like C++ or Java. Furthermore, different assemblers (Things that turn your assembly code into programs) have different ways of doing things, making switching between assemblers a difficult task.
To make things even trickier, many assembly tutorial writers write their tutorials so that the examples only work on one operating system. To break from the mold, this tutorial gives examples that should work on Windows, OS X, Linux, BSD, Solaris and most other systems. In fact, you would have to be using a pretty strange system if these examples didn't work. This tutorial will also point out when parts of the code may vary on other popular assemblers.
So, in light of the lack of *easy to understand* documentation on the internet, I have decided to start writing a massive assembly tutorial, split up into many manageable sections for everyone to read. I have even targeted this tutorial at people who can't program at all. This tutorial isn't designed as a reference. Later parts build on content from previous parts, so you should read all the parts in the right order.
This part (Part 1) will discuss why you would want to learn assembly and why I made the choices I made in this tutorial. I will also discuss how to use assemblers, which will lead nicely into Part 2, which is where I will show you how to write a "Hello World" program. "Hello World" is the classic program that people learning how to program write, and it just displays the phrase "Hello, World!" on the screen when it is run. I know you want to write a "Hello World" program right away, but before we can do this, we need to get a little info, which is what this part is all about.
I will also teach you how to use C functions (Features in the C language that give you the ability to achieve certain things), but I do *not* assume you know C, so I will tell you exactly what the functions are doing.
As I said before, all the assembly examples in my tutorial will be portable. You should be able to assemble them on Windows, OS X, and *nix. That is why you won't see unportable stuff like stand-alone system calls. System calls are a way assembly can do things, but I won't be using them on their own in this tutorial, because they are not portable. I will, however, be teaching how to use 'system call variables' in a later part of this tutorial, which are system calls that are portable. In the earlier parts, I will just use C functions, which are also portable and a lot easier.
Don't worry if you don't know what 'C functions' or 'system call variables' are. I will explain all of this later on!
Just to say, I haven't copied and pasted this off a site. This is completely my own work.
Why Assembly
Why the hell would anyone learn assembly when excellent languages like C++ and Java exist? Why would a n00b want to learn assembly before any other language? Well, there are a few reasons, both for and against. Here are a few....
Against Assembly -
> Takes ages to write even a simple program.
> Different across operating systems, assemblers and platforms.
> Very easy to crash programs, and low-level code that talks to hardware directly can do real harm if it goes wrong.
> Little support and documentation.
> Fewer job opportunities than higher level languages.
For Assembly -
> Blazingly fast programs.
> You can reverse engineer programs without the source code (The code the program was built from).
> You can use C functions and system call variables to make code portable. These features will be explained throughout the tutorial.
> You can optimize compiled code generated by C, C++, Pascal etc compilers (Compilers turn higher level code like C++ into a working program).
> You can access powerful capabilities in hardware, which is something higher level languages couldn't dream of.
> You can use inline assembly (Assembly code embedded into your higher level code) to speed up trouble spots in higher level languages.
> You can avoid using capabilities exclusive to certain hardware to make your code, again, portable.
> It will be very easy to learn a higher level language once you know assembly, and you will have a solid understanding about how higher level languages actually work.
Which one?
There are many different types of assembly, and there are different ways of doing the same thing. In my opinion, there are a few assemblers you should know about.
MASM -
Microsoft's assembler. This is only for Windows and is not maintained as an individual product anymore, but it is included with Visual Studio. It uses the Intel syntax (See more about syntaxes below). There is another project called MASM32, which is non Microsoft, but I know little about this project...
NASM -
The Netwide assembler. It can be installed on many operating systems, including Windows, OS X and of course, *nix. It uses the Intel syntax.
Gas -
The GNU assembler. It is available on many operating systems including Windows, and it's on *nix and OS X by default on many versions. It uses the AT&T syntax by default. It can assemble code for many platforms, including SPARC, x64 etc.
Other assemblers include FASM (Flat Assembler), YASM (Don't know what its name stands for) and SOL_ASM (Solar Assembler).
When I say 'Intel syntax' or 'AT&T syntax', I mean how the language looks. A line of code that performs a certain task may look different from one syntax to another, even though they are both assembly. For example, there may be certain symbols in one syntax that aren't used in the other. One may use a certain keyword, whilst the other uses a different one.
The one we will be using is Gas, which uses the AT&T syntax. This is because it's the assembler that GCC uses, which means it will be easy to integrate C functions into it. So many people use GCC these days that Gas seems a good choice. Gas supports loads of hardware, more than NASM and MASM. It works very well with many GNU tools that are useful for assembly, like the profiler, debugger and compiler. The profiler allows us to check the speed of programs. The debugger lets us go through our programs step by step. The compiler is used to assemble code that uses C functions (Because it's just so much easier than using the assembler). I doubt I will be covering these tools in my tutorial, except the assembler and the compiler. This tutorial is about teaching assembly, not how to use GNU assembly tools, since there is enough documentation on that subject.
The unfortunate thing about Gas is that it uses AT&T syntax rather than Intel. This means using Gas will be different from using NASM or MASM. But once you've learned assembly, it shouldn't be that hard to learn the differences. Major differences between them will be pointed out throughout the tutorial.
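To give a rough idea of what "looks different" means, here is a small sketch that writes the same x86 instruction in both syntaxes into one file. The file name syntax_demo.s is just something I made up for this example. It relies on Gas's `.intel_syntax noprefix` and `.att_syntax prefix` directives to switch styles half way through, and it only makes sense on x86 hardware. Don't worry about what the instruction does yet, just compare how the two lines look.

```shell
# Write a tiny x86 source file that states the same instruction twice.
# syntax_demo.s is a made-up file name for this illustration.
cat > syntax_demo.s <<'EOF'
    .text
    movl $5, %eax           # AT&T: source first, $ on constants, % on registers
    .intel_syntax noprefix  # Gas directive: switch to Intel syntax
    mov eax, 5              # Intel: destination first, no $ or % prefixes
    .att_syntax prefix      # switch back to the AT&T default
EOF

# Show the file so the two styles can be compared side by side.
cat syntax_demo.s
```

Both instruction lines put the number 5 into the same register. Only the spelling differs.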
How to assemble stuff
Before I give you the example code in Part 2, I need to tell you how to assemble your code. Firstly, you will be writing your code into a text file. You can use Notepad (I wouldn't recommend it though) for Windows, OS X's text editor, or Gedit in Linux. You don't need to do anything special, you just type your code into it and save it. But here is the difference... when you save it you need to put .s at the end of the file name. This is called a file extension, and it's used to identify the file type to the operating system. .s means assembly source for Gas. Other assemblers like MASM use the .asm extension instead.
For example, here is a good file name for some assembly code.
really_cool_computer_program.s
Here is a file name that isn't good....
really_cool_computer_program.txt.s
The file name can have spaces, but it's easier to just use underscores instead. This avoids confusion when you tell Gas to assemble your code, which leads us to the next bit...
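Here is a quick sketch of why spaces cause confusion on the command line. It assumes a shell like sh or bash (some other shells, such as zsh, treat unquoted variables differently). count_args is a little helper I made up for this example, and the file names are made up too. It just reports how many separate pieces a command would receive.

```shell
# count_args is a made-up helper: it prints how many arguments it was given.
count_args() { echo "$#"; }

spaced="my cool program.s"        # made-up file name with spaces
underscored="my_cool_program.s"   # the same name with underscores

count_args $spaced         # unquoted: split at each space, prints 3
count_args "$spaced"       # quoted: stays in one piece, prints 1
count_args $underscored    # no spaces: one piece either way, prints 1
```

So a name with spaces only works if you remember to put quotes around it every time, whilst a name with underscores always works.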
Using Gas is quite simple. It needs to be done in the command line (Don't worry, no command line knowledge needed, though it helps). Here is what you do....
Open the command line. It's in Start>All Programs>Accessories>Command Prompt on Windows. On OS X, it's in Applications/Utilities/Terminal. On Linux/BSD it's...well.... everywhere.
Now you're in the evil command line, let's use Gas. Windows users will have to actually find the as.exe file on their system (Use the search tool) and manually type the entire location into the command line. OS X and *nix users, on the other hand, should only need to type 'as' into the command line.
Note that the ~ symbol means your Home directory in OS X/*nix.
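If you are on OS X or *nix and want to check that the tools are actually there before going any further, something like the sketch below should do it. have_tool is a helper name I made up. It uses the shell's standard `command -v` lookup, which searches the same places the command line searches when you type a program name.

```shell
# have_tool is a made-up helper: it succeeds if the named program is on the PATH.
have_tool() { command -v "$1" >/dev/null 2>&1; }

# Check for the assembler (as) and the linker (ld), which is used later on.
for tool in as ld; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If either one is reported as not found, you will need to install it first (on many systems both come as part of the GCC tools).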
So, you first type the program in. On Windows it would look something like this (I'm using the Gas included with the popular Dev-C++ compiler, but you could use any) -
c:\dev-cpp\bin\as.exe
On *nix/OS X, you only have to type this -
as
Now, put a space after that. Now you have to type the address of the assembly code you want to assemble -
On Windows -
c:\dev-cpp\bin\as.exe "c:\documents and settings\user\desktop\assembly_source.s"
(The quotes are needed because this address has spaces in it.)
On *nix/OS X -
as ~/Desktop/assembly_source.s
Don't hit enter yet....
Now, that's all well and good, but we need to tell the assembler where to dump the 'object code' it creates using your assembly code. Object code is the 'half way' point between an assembly source and a finished program. Notice in the next example how the second address is the same file, but it ends with .o rather than .s? The output should always end with .o. Now, add another space and add '-o' and put a space after this. Then you write the address of the place where you want to put the finished object code.
On Windows it would be like this -
c:\dev-cpp\bin\as.exe "c:\documents and settings\user\desktop\assembly_source.s" -o "c:\documents and settings\user\desktop\assembly_source.o"
And on *nix/OS X -
as ~/Desktop/assembly_source.s -o ~/Desktop/assembly_source.o
Now see that file with .o at the end of it? That's the file that will be converted to a finished program using a process called 'Linking'. You don't need to know how this works right now, just accept that it works =)
Linking a file is the same process as before, but with a few modifications....
Firstly, as is replaced with ld. And now the file that ends with a .o is the first address, not the second. The second address is the finished program. Here is an example in Windows -
c:\dev-cpp\bin\ld.exe "c:\documents and settings\user\desktop\assembly_source.o" -o "c:\documents and settings\user\desktop\assembly_source.exe"
And on OS X/*nix -
ld ~/Desktop/assembly_source.o -o ~/Desktop/assembly_source
Note that a Windows program has a .exe extension, whilst a *nix program doesn't have one at all. OS X command line programs don't have one either (.app is used for full graphical applications, which are packaged differently).
Now you know how to assemble programs, you should go onto part 2, which will tell you the basic layout of an assembly program, and a "Hello World" program.
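Once you are tired of typing both commands by hand, the two steps can be wrapped up in a small script. This is only a rough sketch for OS X/*nix. The names run, assemble_and_link and DRY_RUN are my own inventions, not part of Gas. With DRY_RUN set, it just prints the commands it would run, so you can try it without having any assembly code yet.

```shell
# run is a made-up helper: with DRY_RUN set it prints a command,
# otherwise it actually runs it.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# assemble_and_link is a made-up helper: it turns NAME.s into NAME.o
# with as, then turns NAME.o into the finished program NAME with ld.
assemble_and_link() {
    src=$1
    base=${src%.s}          # the file name with the .s extension removed
    run as "$src" -o "$base.o" && run ld "$base.o" -o "$base"
}

DRY_RUN=1                   # only print the commands for now
assemble_and_link assembly_source.s
```

With DRY_RUN set as above, this prints the same as and ld commands shown earlier, minus the ~/Desktop/ part. Remove the DRY_RUN=1 line to run them for real.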
PART 2 WILL BE WRITTEN SOON! =)
SunSpyda