Unassemble:
Unassembles a block of code. Great for debugging (and cracking)

  -u 100 L 8                  <-- unassembles 8 bytes starting at offset 100
  107A:0100 MOV AH,02         <-- debug's response
  107A:0102 MOV DL,41
  107A:0104 INT 21
  107A:0106 INT 20

Write:
This command works very similarly to Load. It also has 2 ways it can operate:
using a name, and by specifying an exact location. Refer back to Load for
more information.

NOTE: The register CX must be set to the file size in order to write! (Debug
      takes the size from BX:CX, so BX has to be 0 for files under 64k.)

NOTE: Write will not write files with a .EXE or .HEX extension.

Enough about debug, let's move on to CodeView.

CodeView
--------
CodeView is another program that might come in handy sometimes. However it is
not free. There are many debuggers similar to CodeView out there, but it is
enough for you to understand one.

CodeView has a number of different windows: Help, Locals, Watch, Source 1,
Source 2, Memory 1, Memory 2, Registers and a few more, depending on the
version number.

The Source Windows
Source 1 and 2 let you view 2 different source code segments at the same
time. This is very useful for comparing.

Memory Windows
These windows let you view and edit different sections of memory. On the left
side you have the memory location in segment:offset form, in the middle the
hex value of each byte, and on the right side the ASCII value. Again,
non-printable characters are represented by a ".". You can switch between
the windows using F6. You can also press Shift+F4 to switch between
hexadecimal, ASCII, words, double words, signed integers, floating point
values, and more.

Register
This window lets you view and change the value in each register. The FL
register near the bottom stands for Flags. At the very bottom you should see
8 different values. They are the specific flag values.

  OV/NV = Overflow        (OVerflow/No oVerflow)
  DN/UP = Direction       (DowN/UP)
  DI/EI = Interrupt       (Disable Interrupts/Enable Interrupts)
  PL/NG = Sign            (PLus/NeGative)
  NZ/ZR = Zero            (Not Zero/ZeRo)
  NA/AC = Auxiliary Carry (No Auxiliary carry/Auxiliary Carry)
  PO/PE = Parity          (Parity Odd/Parity Even)
  NC/CY = Carry           (No Carry/CarrY)

Command
This window lets you pass commands to CodeView. I will not explain these as
they are almost identical to the ones Debug uses, just a bit more powerful.

This chapter went through a lot of material. Make sure you actually get it
all, or at least most of it. Debug will be insanely useful later on, so learn
it now! The key is practice, lots of practice!

Exercises:

1. Make a program that prints an A on the screen using debug, save it to
   C drive as cow.com. Quit debug and delete it. Now get back into debug and
   restore it again.
   HINT: If you delete a file in DOS, DOS simply changes the first character
         of its name to E5
   It's not as hard as it sounds, basically here's what you do:
     I) Load as many sectors of your drive as you think you will need
    II) Search those sectors for the hex value E5 and the string "ow"
   III) Dump the offset of the location the search returned
    IV) Edit that offset and change the E5 byte to a letter of your
        choice (41)
     V) Write the sectors you loaded into RAM back to C drive

2. Use debug to get your modem into CTS (Clear To Send) mode. The hex value
   is 2.

3. Make a program called cursor.com using debug that will change the cursor
   size.
     I) Move 01 into AH
    II) Move 0007 into CX
   III) Call interrupt 10
    IV) Call interrupt 20


6. More basics
===============
Before reading this chapter, make sure you completely understood EVERYTHING
I talked about so far.

.COM File Format
----------------
COM supposedly stands for COre iMage, but it is much easier to memorize it as
Copy Of Memory, as that description is even better. A COM file is nothing
more than a binary image of what should appear in RAM. It was originally used
on CP/M, and even though CP/M dates back to the Z80/8080 period, COM files
still have the same features as they did back in the 70's.

Let's examine how a COM file is loaded into memory:

1. You type in the file name, DOS searches for filename + .com, if found that
   file gets executed.
   If not, DOS will search for filename + .exe, if it can't find that it will
   search for filename + .bat, and if that search fails it will display the
   familiar "Bad command or file name" message.
2. If it found a .com file in step 1, DOS will check its records and make
   sure that a 64k block of memory is available. This is necessary or else
   the new program could overwrite memory that is already in use.
3. Next DOS builds the Program Segment Prefix. The PSP is a 256 byte long
   block of memory which looks like the table below:

   Address   Description
   00h-01h   Instructions to terminate the program, usually interrupt 20h
   02h-03h   Segment pointer to next available block
   04h       Reserved, should be 0
   05h-09h   Far call to DOS dispatcher
   0Ah-0Dh   INT 22h vector (Terminate address)
   0Eh-11h   INT 23h vector (Ctrl+C handler)
   12h-15h   INT 24h vector (Critical Error)
   16h-17h   PSP segment of parent process
   18h-2Bh   File handle table
   2Ch-2Dh   DOS environment segment
   2Eh-31h   SS:SP save area
   32h-33h   Number of file handles
   34h-37h   Pointer to file handle table
   40h-41h   DOS version
   5Ch-6Bh   File control block 1
   6Ch-7Bh   File control block 2
   7Ch-7Fh   Reserved
   80h       Length of parameter string
   81h-FFh   Parameter string, also used as the default DTA (Disk Transfer
             Area)

4. DS, ES, and SS are set to point to the block of memory
5. SP is set to point to the end of that block
6. 0000h is pushed on the stack, which leaves SP at FFFEh. This way a plain
   RET ends up at offset 0 of the PSP, right on the INT 20h
7. CS is set to point to the memory (segment), IP is set to 0100h (offset,
   remember debug?)

The PSP is exactly 256 bytes long, meaning that to fit into one segment (aka.
to be a valid .com file) your program cannot be larger than 65280 bytes.
However as I mentioned before, by the time you can code a program in assembly
that is that large, you already know well more than enough to make a .EXE
file.

So what do you need this information for? Well like all other memory, you can
view and edit the PSP. So you could play around with it. For example, later
when we get into file operations you will be working with the DTA.
Or maybe you need to know the DOS version, you can just check 40h-41h, etc.

Flow control operations
-----------------------
Flow control operations are just what the name says, operations that control
the flow of your program. If you have worked with another language before,
those are the if/then statements. From what you've heard about assembly, you
might think that this is fairly difficult, but it's not. To do the equivalent
of an if/then I will have to introduce you to 3 new things: labels, the
compare command and jump instructions.

First things first, maybe you recall the simple program that prints A from
the interrupts section. Notice how our layout contains the line START:?
START: is a label. If you come from C/C++/Pascal you can think of a label
almost like a function/procedure. Take a look at the following code, by now
you should know what's happening here:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,SS:NOTHING

        ORG 100h

START:
        INT 20h
MAIN    ENDS
        END START

Notice the line that says START:, that's a label. So what's the point of
putting labels in your code? Simple, you can easily jump to any label in your
program using the JMP instruction. For example, consider the following code:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,02h
        MOV DL,41h
        INT 21h
        JMP EXIT
        MOV AH,02h
        MOV DL,42h
        INT 21h
EXIT:
        INT 20h
MAIN    ENDS
        END START

First the program prints an A using the familiar routine, but instead of
using INT 20h to exit, it jumps to the label EXIT and calls INT 20h from
there. The result is that it completely skips everything between the JMP EXIT
line and the label, so the B doesn't get printed at all. So by using a label
you can easily close the program from any location.

This is fairly useless so far though. It gets interesting when you start
using some of the other jump commands. I will only explain a few here as
there are just too many.
Below is an alphabetical list of most of them:

JA   - Jump if Above
JAE  - Jump if Above or Equal
JB   - Jump if Below
JBE  - Jump if Below or Equal
JC   - Jump on Carry
JCXZ - Jump if CX is Zero
JE   - Jump if Equal
JG   - Jump if Greater
JGE  - Jump if Greater than or Equal
JL   - Jump if Less than
JLE  - Jump if Less than or Equal
JMP  - Jump unconditionally
JNA  - Jump if Not Above
JNAE - Jump if Not Above or Equal
JNB  - Jump if Not Below
JNE  - Jump if Not Equal
JNG  - Jump if Not Greater
JNGE - Jump if Not Greater or Equal
JNL  - Jump if Not Less
JNLE - Jump if Not Less or Equal
JNO  - Jump if No Overflow
JNP  - Jump on No Parity
JNS  - Jump on No Sign
JNZ  - Jump if Not Zero
JO   - Jump on Overflow
JP   - Jump on Parity
JPE  - Jump on Parity Even
JPO  - Jump on Parity Odd
JS   - Jump on Sign
JZ   - Jump on Zero

Some of these are fairly self-explanatory (like JCXZ), but others require
some more explanation. What is being compared to what, and how does the jump
know the result? Well almost anything can be compared to almost anything. The
result of that comparison is stored in the flags register. The jump command
simply checks there and responds accordingly. Let's make a simple if/then
like structure:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV DL,41h
        MOV DH,41h
        CMP DH,DL
        JE  TheyAreTheSame
        JMP TheyAreNotSame

TheyAreNotSame:
        MOV AH,02h
        MOV DL,4Eh
        INT 21h
        INT 20h

TheyAreTheSame:
        MOV AH,02h
        MOV DL,59h
        INT 21h
        INT 20h
MAIN    ENDS
        END START

This code is fairly straightforward, it could be expressed in C++ as:

#include <iostream>
using namespace std;

int main ()
{
        int DH, DL;
        DL = 0x41;
        DH = 0x41;
        if (DH == DL) {
                cout << "Y";
        } else {
                cout << "N";
        }
        return 0;
}

In this case the program will print Y, but try changing either DH or DL to
some other value. It should display N.

HINT: Tired of constantly typing "tasm blah.asm", "tlink /t blah.obj"? Make
      a simple batch file containing the following 3 lines and save it as
      a.bat in your tasm dir.

        @ECHO OFF
        TASM %1.ASM
        TLINK /T %1.OBJ

      Now you can just type "a blah", even without the file extension.
      If you have A86 and are sick of typing "a86 blah.asm", just rename
      a86.exe to something like a.exe.

Loops
-----
Loops are an essential part of programming, in fact loops make the difference
between being a programming language and being something like HTML. If you
don't know what loops are, they are just the repeated execution of a block of
code. The 2 most common types of loops are For and While.

A for loop repeats a block of code until a certain condition is met. Look at
the following C++ code:

#include <iostream>
using namespace std;

int main()
{
        for (int counter = 0; counter < 5; counter++)
        {
                cout << "A";
        }
        return 0;
}

This will produce the following output:

        AAAAA

This code is fairly easy, it initializes a variable and sets it equal to
zero. It will loop as long as the variable is less than 5, and after each
execution the variable gets incremented.
Now let's make an exact copy of that program in assembly:

blah    segment
        assume cs:blah, ds:blah, ss:blah, es:blah ;do the usual setup

        org 0100h

start:                          ;label for start of program
        MOV CX,5                ;cx is always the counter
LOOP_LABEL:                     ;label to loop
        MOV AH,02h              ;do the familiar A shit (this printed some weird
        MOV DL,41h              ;character instead for me, anyone know why?)
        INT 21h
        LOOP LOOP_LABEL         ;loop everything in between loop_label: and the
                                ;loop statement as many times as specified in CX
        INT 20h                 ;usual ending shit
blah    ends
        end start

Output should be:

        AAAAA
        C:\>

But as I said, it printed some other shit for me. Well who cares, as long as
it looped. This code is basically doing this:

1. Set CX to 5
2. Print an A
3. Decrement CX
4. If CX is not 0, go back to loop_label
5. Print an A
6. Decrement CX
7. etc

Next we have the While loop. It also repeats a block of code as long as a
condition is true, but the condition is not changed in the loop declaration
as with the For loop. Take a look at a simple C++ While loop:

#include <iostream>
using namespace std;

int main()
{
        int counter = 0;
        while (counter < 5)
        {
                counter++;
                cout << "A";
        }
        return 0;
}

Notice how the condition is being changed in the actual loop.
This is very important as you may already know. Let's convert that piece of
code to assembly:

blah    segment
        assume cs:blah, ds:blah, ss:blah, es:blah ;do the usual setup

        org 0100h

start:
        MOV CX,5                ;set CX equal to 5
loop_label:
        MOV AH,02h              ;print the A
        MOV DL,41h
        INT 21h
        DEC CX                  ;decrement CX
        CMP CX,0                ;check if CX is zero
        JNZ loop_label          ;no? go back to loop_label
        INT 20h                 ;yes? terminate program
                                ;usual ending shit
blah    ends
        end start

Output:

        AAAAA
        C:\>

They look almost identical, so what's different? And why use 'em? Well the
For loop is good for loops that have a set number of repetitions, while the
While loop can change the number of repetitions during the loop execution.
This is useful for user input for example. It is also possible to make a for
loop without using the LOOP statement, just like you would do a while loop.
That might not look as pretty, but it can potentially be a bit faster.

Sometimes interrupts can modify the CX register, in which case your loop
would loop an unpredictable number of times. That's not good. To stop that
you can make use of the stack:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV CX,5
Loop_Label:
        PUSH CX                 ;store CX on the stack
        MOV AH,02h
        MOV DL,41h
        INT 21h
        POP CX                  ;restore it after the pass is completed
        LOOP Loop_Label
        INT 20h                 ;usual ending shit
MAIN    ENDS
        END START

Variables
---------
Yeah, yeah, yeah, I know there are no variables in assembly, but this is
close enough for me. You may be familiar with variables if you've come from
another language, if not, a variable is simply a name given to a memory area
that contains data. To access that data you don't have to specify the memory
address, you can simply refer to the variable. In this chapter I will
introduce you to the 3 most common variable types: bytes, words, and double
words. You declare variables in this format:

        NAME TYPE CONTENTS

Where type is either DB (Define Byte), DW (Define Word), or DD (Define
Doubleword).
Variable names can consist of numbers, letters and underscores (_), but must
begin with a letter.

Example 1:

        A_Number DB 1

This creates a byte long variable called A_Number and sets it equal to 1

Example 2:

        A_Letter DB "1"

This creates a byte long variable called A_Letter and sets it equal to the
ASCII character 1 (that is 31h). Note that this is NOT a number.

Example 3:

        Big_number DD 1234

This declares a Double Word long variable and sets it equal to 1234.

You can also create constants. Constants are names that, like variables,
stand for a value in your program, but unlike variables they can not be
changed during the execution of a program. You declare constants almost
exactly like variables, but instead of using D?, you use EQU. Well actually
EQU declares a Text Macro. But since we haven't covered macros yet and the
effect is basically the same, we will just treat it as a constant.

Example 4:

        constant EQU 0DEADh

(A hex number that starts with a letter needs a leading 0, otherwise the
assembler thinks it's a name.)

So how do you use variables and constants? Just as if they were data. Take a
look at the next example:

Example 5:

        constant EQU 100
        mov dx,constant
        mov ax,constant
        add dx,ax

This declares a constant called constant and sets it equal to 100, then it
assigns the value of constant to dx and ax and adds them. This is the same as

        mov dx,100
        mov ax,100
        add dx,ax

The EQU directive is a bit special though. It's not really a standard
assembly instruction. It's assembler specific. That means that we can for
example do the following:

        bl0w EQU PUSH
        sUcK EQU POP

        bl0w CX
        sUcK CX

When you assemble this, the assembler simply substitutes every occurrence of
bl0w and sUcK with PUSH and POP respectively.

Arrays
------
Using this knowledge it is possible to create simple arrays.

Example 1:

        A_String DB "Cheese$"

This creates a 7 byte long array called A_String and sets it equal to the
string Cheese. Notice the $ at the end.
This has to be there, otherwise the print string function won't know where
the string ends and will keep on printing after the last character, which is
whatever is in memory at that particular location. There probably won't be
any damage done, but who knows what's hidden in those dark corners...

To use quotes (single or double) within a string you can use a little trick:

Example 2:

        Cow DB 'Ralph said "Cheese is good for you!"$'
or
        Cow DB "Ralph said 'Cheese is good for you!'$"

Use whichever you think looks better. What if you have to use both types of
quotes?

Example 3:

        Cow DB 'Ralph said "I say: ''GNAAAARF!''"$'

Type the quote that encloses the string twice wherever you need it inside the
string.

What if you don't know what the variable is going to equal? Maybe it's user
input.

Example 4:

        Uninitialized_variable DB ?

Now let's use a variable in an actual program:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,09h
        MOV DX,OFFSET COW
        INT 21h
        INT 20h

COW     DB "Hello World!$"
MAIN    ENDS
        END START

Yes, even in assembly you finally get to make a Hello World program! Here
we're using interrupt 21h, function 9h to print a string. To use this
interrupt you have to set AH to 9h and DX must point to the location of the
string. Note that COW is declared after INT 20h. If you put it right after
START: the CPU would try to execute the string as if it was code.

NOTE: VERY important! ALWAYS declare uninitialized arrays at the VERY END of
      your program, or in a special UDATA segment! That way they will take up
      no space at all, regardless of how big you decide to make them. For
      example say you have this in a program:

        Some_Data  DB 'Cheese'
        Some_Array DB 500 DUP (?)
        More_Data  DB 'More Cheese'

      This will automatically add 500 bytes of NULL characters to your
      program. However if you do this instead:

        Some_Data  DB 'Cheese'
        More_Data  DB 'More Cheese'
        Some_Array DB 500 DUP (?)

      Your program will become 500 bytes smaller.

String Operations
-----------------
Now that you know some basics of strings, let's use that knowledge. There are
a number of string operations available to you. Here I will discuss 5 of
them. Let's start with MOVSB. This command will move a byte from one location
to another.
The source is DS:SI and the destination is ES:DI.

Example 1:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,9
        MOV DX,OFFSET NEWSTRING
        INT 21h
        MOV DX,OFFSET OLDSTRING
        INT 21h
        MOV CX,9
        LEA SI,OLDSTRING
        LEA DI,NEWSTRING
        REP MOVSB
        MOV DX,OFFSET NEWSTRING
        INT 21h
        MOV DX,OFFSET OLDSTRING
        INT 21h
        INT 20h

OLDSTRING DB 'ABCDEFGHI $'
NEWSTRING DB '123456789 $'
MAIN    ENDS
        END START

Output:

        123456789 ABCDEFGHI ABCDEFGHI ABCDEFGHI

This little example has a few instructions that you haven't seen before, so
let's go through this thing step by step.

1. We do the regular setup
2. We use the method from the previous section to print NEWSTRING (123456789)
3. We print OLDSTRING (ABCDEFGHI)
4. We set CX equal to 9. Remember that the CX register is the counter.
5. Here's a new instruction, LEA. LEA stands for Load Effective Address. This
   instruction will load the address (not the contents) of a "variable" into
   a register. Since DI contains the destination and SI the source, we assign
   the location of NEWSTRING and OLDSTRING to them respectively
6. MOVSB is the string operator that will move a byte from SI to DI. Since we
   have an array of 9 characters (well 10 if you count the space, but that is
   the same in both anyway) we have to move 9 bytes. To do that we use REP.
   REP will REPeat the given instruction for as many times as specified in
   CX. So REP MOVSB will perform the move instruction 9 times, once for each
   character.
7. To see our result we simply print each string again using the same code we
   used in step 2 and 3.

The next string operator is not only very easy to use, but also very useful.
It will scan a string for a certain character and set the zero flag (the one
JE checks) if the search was successful. The operator is SCASB, the location
of the string is in ES:DI, and the character is stored in AL.
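If the mechanics of a repeated scan are hard to picture, here is a rough sketch in Python of what REPNE SCASB does with its registers. This is a model and not DOS code: the function name repne_scasb and the idea of passing memory in as a bytes object are my own assumptions, made only for illustration.

```python
def repne_scasb(memory, di, cx, al):
    """Model of REPNE SCASB with the direction flag clear.

    memory stands in for the segment ES points to, di/cx/al for the
    registers. Returns the new DI, the new CX and the zero flag.
    """
    zf = False                      # assumption: ZF starts out clear
    while cx != 0:
        zf = memory[di] == al       # SCASB compares AL with the byte at ES:DI
        di += 1                     # ... and moves DI on to the next byte
        cx -= 1                     # REPNE counts the pass in CX
        if zf:                      # and stops as soon as a match sets ZF
            break
    return di, cx, zf


text = b"Cheese is good for you!"
print(repne_scasb(text, 0, len(text), ord("!")))   # (23, 0, True)
print(repne_scasb(text, 0, len(text), ord("x")))   # (23, 0, False)
```

Notice that DI ends up one byte past the match, and that CX alone can't tell a hit on the very last byte from a miss. That is why the result is checked through the flag (JE/JNE) and not through CX.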
Example 2:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV CX,17h
        LEA DI,STRING
        MOV AL,SEARCH
        REPNE SCASB
        JE  FOUND
        JNE NOTFOUND

NOTFOUND:
        MOV AH,09h
        MOV DX,OFFSET NOTFOUND_S
        INT 21h
        INT 20h

FOUND:
        MOV AH,09h
        MOV DX,OFFSET FOUND_S
        INT 21h
        INT 20h

SEARCH     DB '!'
STRING     DB 'Cheese is good for you!'
FOUND_S    DB 'Found$'
NOTFOUND_S DB 'Not Found$'
MAIN    ENDS
        END START

This should be fairly easy to figure out for you. If you can't, I'll explain
it:

1. We do the usual setup
2. We set CX equal to 17h (23 in decimal), since our string is 17h characters
   long
3. We load the location of STRING into DI
4. And the value of the variable SEARCH into AL
5. Now we repeat the SCASB operation up to 23 times. REPNE (REPeat while Not
   Equal) stops early as soon as SCASB finds a match
6. And use a jump to signal whether or not we found the character

Finally we have the CMPS instruction. This operator will compare two strings
with each other, byte by byte, until it finds a difference.

Example 3:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV CX,17h
        LEA SI,STRING1
        LEA DI,STRING
        REPE CMPSB
        JE  EQUAL
        JNE NOTEQUAL

NOTEQUAL:
        MOV AH,09h
        MOV DX,OFFSET NOT_EQUAL
        INT 21h
        INT 20h

EQUAL:
        MOV AH,09h
        MOV DX,OFFSET EQUAL1
        INT 21h
        INT 20h

STRING1   DB 'Cheese is good for you!'
STRING    DB 'Cheese is good for you!'
EQUAL1    DB 'They''re equal$'
NOT_EQUAL DB 'They''re not equal$'
MAIN    ENDS
        END START

By now you should know what's going on. SI and DI point to the two strings to
be compared, and REPE CMPSB (REPeat while Equal, a plain REP in front of
CMPSB means the same thing) does the comparison 17h times, or until it comes
across two bytes that are not equal. The two strings above are identical, so
change a letter in one of them to see the other message. Then it does a jump
command to display the appropriate message.

The final string operations I will introduce you to are STOSB and LODSB.
STOSB will store a byte from AL at the location that ES:DI points to. LODSB
will load the byte that DS:SI points to into AL. These two instructions are
very very powerful as you will see if you continue learning assembly. Take a
look at the next example.
MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,9
        MOV DI,OFFSET STRING
        MOV SI,DI
        LODSB
        INC AL
        STOSB
        MOV DX,DI
        DEC DX
        INT 21h
        INT 20h

STRING  DB "oh33r! $"
MAIN    ENDS
        END START

This code will print:

        ph33r!

So what does it do?

 1. It moves 9 into AH to set it up for interrupt 21's Print String function
 2. Move the location of STRING into DI for the STOSB instruction
 3. Do the same with SI for the LODSB instruction
 4. Load the byte at DS:SI into AL
 5. Increment AL, thus changing it from o to p
 6. And put the contents of AL back to ES:DI
 7. Put DI into DX for interrupt 21's Print String function
 8. STOSB will increment DI after a successful operation, so decrement DX to
    get back to the start of the string
 9. Call interrupt 21h
10. And terminate the program

And here's a final note that I should have mentioned earlier: All these
string instructions actually don't always end in B. The B simply means Byte
but could be replaced by a W for example. That is, MOVSB will move a byte,
and MOVSW will move a word. If you're using an instruction that requires
another register like AL for example, you use that register's 16 or 32 bit
part. For example, LODSW will move a word into AX.

Sub-Procedures
--------------
This chapter should be fairly easy as I will only introduce one new operator,
CALL. CALL does just that, it CALLs a sub-procedure. Sub-procedures are
almost exactly like labels, but they don't end with a : and have to have a
RET statement at the end of the code. The purpose of sub-procedures is to
make your life easier. Since you can call them from anywhere in the program
you don't need to write certain sections over and over again.

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        CALL CHEESE
        CALL CHEESE
        CALL CHEESE
        CALL CHEESE
        CALL CHEESE
        CALL CHEESE
        CALL CHEESE
        INT 20h

CHEESE  PROC
        MOV AH,09
        LEA DX,MSG
        INT 21h
        RET
CHEESE  ENDP

MSG     DB 'Cheese is good for you! $'
MAIN    ENDS
        END START

1. We use the CALL command to call the sub-procedure CHEESE 7 times.
2. We set up a sub-procedure called CHEESE.
   This is done in the following format:
        LABEL PROC
3. We type in the code that we want the sub-procedure to do
4. And add a RET statement to the end. This is necessary as it returns
   control to the code that called it. Without it the CPU would just keep
   going into whatever comes after the procedure, and INT 20h would never
   get executed.
5. We end the procedure using
        LABEL ENDP
6. The usual...

User Input
----------
Finally! User Input has arrived. This chapter will discuss simple user input
using BIOS interrupts. The main keyboard interrupt is 16h. For the first part
of this chapter we will be using function 0h.
Let's start with a simple program that waits for a keypress:

Example 1:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,0
        INT 16h
        INT 20h
MAIN    ENDS
        END START

This program waits for you to press a key, and then just quits. Expected
more? Of course. We have to echo the key back. Only then will it be truly
3|337.
Remember all those programs you did in the debug part of this tutorial that
printed out an A? Remember how we did it? No? Like this:

Example 2:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,2h
        MOV DL,41h
        INT 21h
        INT 20h
MAIN    ENDS
        END START

Notice how the register DL contains the value that we want to print. Well if
we use interrupt 16h to get a key using function 0h, the ASCII code gets
stored in AL (and the scan code in AH), so all we have to do is move AL into
DL, then call the old interrupt 21h, function 2h.

Example 3:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,0h
        INT 16h
        MOV AH,2h
        MOV DL,AL
        INT 21h
        INT 20h
MAIN    ENDS
        END START

Isn't this awesome? Well that's not all INT 16h can do. It can also check the
status of the different keys like Ctrl, Alt, Caps Lock, etc. Check Appendix A
for links to interrupt listings and look them up.
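The key status mentioned above comes back as a single byte (function 02h of INT 16h returns it in AL) in which every bit stands for one key. Here is a small Python sketch of picking such a byte apart. The name decode_shift_status and the sample value 44h are made up for illustration, and the bit layout is the one the usual interrupt listings give for this function, so double check it against your own listing.

```python
# One (mask, key) pair per bit of the status byte, lowest bit first.
SHIFT_FLAGS = [
    (0x01, "Right Shift"),   # bits 0-3: key is being held down
    (0x02, "Left Shift"),
    (0x04, "Ctrl"),
    (0x08, "Alt"),
    (0x10, "Scroll Lock"),   # bits 4-7: toggle is switched on
    (0x20, "Num Lock"),
    (0x40, "Caps Lock"),
    (0x80, "Insert"),
]


def decode_shift_status(al):
    """Return the names of all keys whose bit is set in the status byte."""
    return [name for mask, name in SHIFT_FLAGS if al & mask]


print(decode_shift_status(0x44))   # ['Ctrl', 'Caps Lock']
```

In assembly you would do the same test with an AND or TEST against the mask of the key you care about, followed by JZ or JNZ.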
Let's use our new found t3kn33kz to create another truly 3|337 program:

Example 4:

MAIN    SEGMENT
        ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,0h
        INT 16h
        MOV KEY,AL
        CMP KEY,5Ah
        JE  ITS_A_Z
        JNE NOT_A_Z

ITS_A_Z:
        MOV AH,9h
        MOV DX,OFFSET NOTA
        INT 21h
        INT 20h

NOT_A_Z:
        MOV AH,2h
        MOV DL,KEY
        INT 21h
        INT 20h

KEY     DB ?
NOTA    DB "You pressed Z!!!!!!!!",10,13,"Ph33r! $"
MAIN    ENDS
        END START

Well you should be able to understand this program without any problems. If
you don't:

1. We set AH to 0h and call interrupt 16h, that waits for the user to press a
   key
2. We move the value of AL (which holds the ASCII value of the key pressed)
   into a variable. This way we can manipulate the registers without having
   to worry about destroying it
3. We compare KEY with 5Ah, which is hex for Z (case sensitive)
4. If it is a Z we jump to ITS_A_Z which displays the message
5. If not, we jump to NOT_A_Z, which simply echoes the key back.
6. We declared 2 variables, one which is not initialized yet called KEY, and
   one that holds the value "You pressed Z!!!!!!!!",10,13,"Ph33r! $"
   Which looks like this on a DOS computer:
        You pressed Z!!!!!!!!
        Ph33r!

Exercises:

1. Make a program that will accept a series of keypresses, but when the user
   enters the following characters, convert them to their real values as
   shown below:

        S  = Z
        F  = PH
        PH = F
        E  = 3
        I  = 1
        EA = 33
        T  = 7
        O  = 0
        A  = 4
        L  = |

   NOTE: This is NOT case sensitive. In other words, you're going to either
         have to convert lower case to upper case (or the other way around)
         as soon as it's entered, for example by subtracting 20h from the
         ASCII value, or make a branch for either case. Also, try using
         procedures to do this.


7. Basics of Graphics
======================
Graphics are something we all love. Today you will learn how to create some
bad ass graphics in assembly! Well actually I will tell you how to plot a
pixel using various methods.
You can apply that knowledge to create some other graphics routines, like
line drawing shit, or a circle maybe. It's all just grade 11 math.

Using interrupts
----------------
This is the easiest method. We set up some registers and call an interrupt.
The interrupt we will be using is 10h, BIOS video. Before we do anything, we
have to get into graphics mode. For the purpose of simplicity I will just
cover 320x200x256 resolution (that is 320 horizontal pixels, 200 vertical
pixels, and 256 colors). So how do you get into this mode? You set AH to 00h
and AL to 13h. 00h tells interrupt 10h that we want to set the video mode,
and 13h is the mode (320x200x256).

Example 1:

MAIN    SEGMENT
        ASSUME DS:MAIN,ES:MAIN,SS:MAIN,CS:MAIN

        ORG 100h

START:
        MOV AH,00h
        MOV AL,13h
        INT 10h
        INT 20h
MAIN    ENDS
        END START

This isn't too exciting, everything just looks bigger. Let's plot a pixel.

Example 2:

MAIN    SEGMENT
        ASSUME DS:MAIN,ES:MAIN,SS:MAIN,CS:MAIN

        ORG 100h

START:
        MOV AH,00h
        MOV AL,13h
        INT 10h
        MOV AH,0Ch
        MOV AL,10
        MOV CX,100
        MOV DX,100
        MOV BX,1h
        INT 10h
        INT 20h
MAIN    ENDS
        END START

First we get into graphics mode, then we set AH to 0Ch which is the Draw
Pixel function of interrupt 10h. In order to use 0Ch we have to set up some
other registers as well. AL contains the color of the pixel, CX the location
on the X axis and DX the location on the Y axis. Finally BH tells interrupt
10h which video page to draw on. MOV BX,1h leaves BH at 0, which is the first
page. Don't worry about what pages are until you get into more advanced shit.
Once in graphics mode you can switch back to text using

        MOV AH,00h
        MOV AL,03h
        INT 10h

So putting it all together, the following program will draw a red pixel
(color 4) at location 100,100 on the first page, then switch back to text
mode, clearing the pixel along the way. Notice that it sets the AL and AH
registers using only 1 move by moving them into AX. This might save you a
clock tick or two and makes the executable file a whopping 3 bytes smaller!
Example 3:

MAIN    SEGMENT
        ASSUME DS:MAIN,ES:MAIN,SS:MAIN,CS:MAIN

        ORG 100h

START:
        MOV AX,0013h
        INT 10h
        MOV AX,0C04h
        MOV CX,100
        MOV DX,100
        MOV BX,1h
        INT 10h
        MOV AX,0003h
        INT 10h
        INT 20h
MAIN    ENDS
        END START

Even though we did a bit of optimization there, it's still very slow. Maybe
with one pixel you won't notice a difference, but if you start drawing screen
full after screen full using this method, even a fast computer will start to
drag. So let's move on to something quite a bit faster.
By the way, if your computer is faster than an 8086, you will see nothing at
all because even though the routine is slow, a single pixel can still be
drawn fast. So the program will draw the pixel and erase it before your eye
can comprehend its existence.

Writing directly to the VRAM
----------------------------
This is quite a bit harder than using interrupts as it involves some math. To
make things even worse I will introduce you to some new operators that will
make the pixels appear even faster.
When you used interrupts to plot a pixel you were just giving the X,Y
coordinates, when writing directly to the VRAM you can't do that. Instead you
have to find the offset of the X,Y location. To do this you use the following
equation:

        Offset = Y x 320 + X

The segment is A000, which is where VRAM starts, so we get:

        A000:Y x 320 + X

However computers hate multiplication, it is slow. Let's break that equation
down into different numbers:

        A000:Y x 256 + Y x 64 + X
or
        A000:Y x 2^8 + Y x 2^6 + X

Notice how now we're working with base 2? But how do we get the power of
stuff? Using Shifts. Shifting is a fairly simple concept. There are two kinds
of shifts, shift left and shift right. When you shift a number, the CPU
simply adds a zero to one end, depending on the shift that you used.
For example, say you want to shift 256

        256          = 0100000000b
        Shift Left:    1000000000b = 512
        Shift Right:   0010000000b = 128

Shifting left by n is the same as multiplying by 2^n, and shifting right by n
is the same as dividing by 2^n. So we can easily plug shifts into the
previous equation.

        A000:Y SHL 8 + Y SHL 6 + X

This is still only a formula. Let's code that in assembly:

SET_VSEGMENT:                   ;set up video segment
        MOV AX,0A000h           ;point ES to VGA segment
        MOV ES,AX

VALUES:                         ;various values used for plotting later on
        MOV AX,100              ;X location
        MOV BX,100              ;Y location

GET_OFFSET:                     ;get offset of pixel location using X,Y
        MOV DI,AX               ;put X location into DI
        MOV DX,BX               ;and Y into DX
        SHL BX,8                ;Y * 2^8. same as saying Y * 256
        SHL DX,6                ;Y * 2^6. same as saying Y * 64
        ADD DX,BX               ;add the two together
        ADD DI,DX               ;and add the X location

Now all we have to do is plot the pixel using the STOSB instruction. The
color of the pixel will be in AL.

        MOV AL,4                ;set color attributes
        STOSB                   ;and store a byte

So the whole code to plot a pixel by writing directly to the VRAM looks like
this:

        .286                    ;SHL with a count other than 1 needs a 186
                                ;or better
MAIN    SEGMENT
        ASSUME CS:MAIN,ES:MAIN,DS:MAIN,SS:MAIN

        ORG 100h

START:
        MOV AH,00h              ;get into video mode. 00 = Set Video Mode
        MOV AL,13h              ;13h = 320x200x256
        INT 10h

SET_VSEGMENT:                   ;set up video segment
        MOV AX,0A000h           ;point ES to VGA segment
        MOV ES,AX

VALUES:                         ;various values used for plotting later on
        MOV AX,100              ;X location
        MOV BX,100              ;Y location

GET_OFFSET:                     ;get offset of pixel location using X,Y
        MOV DI,AX               ;put X location into DI
        MOV DX,BX               ;and Y into DX
        SHL BX,8                ;Y * 2^8. same as saying Y * 256
        SHL DX,6                ;Y * 2^6. same as saying Y * 64
        ADD DX,BX               ;add the two together
        ADD DI,DX               ;and add the X location
                                ;this whole thing gives us the offset
                                ;location of the pixel
        MOV AL,4                ;set color attributes
        STOSB                   ;and store

        XOR AX,AX               ;wait for keypress
        INT 16h
        MOV AX,0003h            ;switch to text mode
        INT 10h
        INT 20h                 ;and exit
MAIN    ENDS
        END START

If you don't understand this yet, study the source code. Remove all comments
and add them yourself in your own words.
Know what each line does and why it does what it does.

A line drawing program
----------------------

To finish up the graphics section I'm going to show you a little modification
to the previous program to make it draw a horizontal line instead of just a
pixel. All you have to do is repeat the pixel plotting procedure as many
times as required. It should be commented well enough, so I won't bother
explaining it.

    .286                    ;SHL with a count other than 1 needs a 186 or later
    MAIN SEGMENT
    ASSUME CS:MAIN,ES:MAIN,DS:MAIN,SS:MAIN
    ORG 100h
    START:
        MOV AH,00h          ;get into video mode. 00 = Set Video Mode
        MOV AL,13h          ;13h = 320x200x256
        INT 10h
    SET_VSEGMENT:           ;set up video segment
        MOV AX,0A000h       ;point ES to VGA segment
        MOV ES,AX
    VALUES:                 ;various values used for plotting later on
        MOV AX,100          ;X location
        MOV BX,100          ;Y location
        MOV CX,120          ;length of line. used for REP
    GET_OFFSET:             ;get offset of pixel location using X,Y
        MOV DI,AX           ;put X location into DI
        MOV DX,BX           ;and Y into DX
        SHL BX,8            ;Y * 2^8. same as saying Y * 256
        SHL DX,6            ;Y * 2^6. same as saying Y * 64
        ADD DX,BX           ;add the two together (Y * 320)
        ADD DI,DX           ;and add that to the X location
                            ;this whole thing gives us the offset location
                            ;of the pixel
        MOV AL,4            ;set color attributes
        REP STOSB           ;and store CX (120) bytes, decrementing CX and
                            ;incrementing DI each time
        XOR AX,AX           ;wait for keypress
        INT 16h
        MOV AX,0003h        ;switch to text mode
        INT 10h
        INT 20h             ;and exit
    MAIN ENDS
    END START

8. Basics of File Operations
=============================

In the old days, DOS did not include interrupts that would handle file
operations. So programmers had to use some complicated t3kn33kz to write/open
files. Today we don't have to do that anymore. DOS includes quite a few
interrupts to simplify this process.

File Handles
------------

File handles are numbers assigned to a file upon opening it. Note that
opening a file does not mean displaying it or reading it.
Take a look at the following code:

    MAIN SEGMENT
    ASSUME CS:MAIN,DS:MAIN,SS:MAIN,ES:MAIN
    ORG 100h
    START:
        MOV AX,3D00h        ;AH = 3Dh (Open File), AL = 00h (read access)
        LEA DX,FILENAME     ;DX points to the ASCIIZ file name
        INT 21h
        JC ERROR            ;carry set = the file could not be opened
        INT 20h
    ERROR:
        MOV AH,09h          ;print the error message
        LEA DX,ERRORMSG
        INT 21h
        INT 20h
    FILENAME DB 'TEST.TXT',0
    ERRORMSG DB 'Unable to open [test.txt]$'
    MAIN ENDS
    END START

If you have a file called test.txt in the current directory the program will
simply quit. If the file is missing it will display an error message. So
what's happening here?

1. We move 3D00h into AX. This is a shorter way of saying:

        MOV AH,3Dh
        MOV AL,00h

   3Dh is the interrupt 21h function for opening files. The interrupt checks
   the AL register to see how it should open the file. The value of AL is
   broken down into the following:

        Bit 0-2: Access mode
                 0 - Read
                 1 - Write
                 2 - Read/Write
        Bit 3:   Reserved (0)
        Bit 4-6: Sharing mode
                 0 - Compatibility
                 1 - Exclusive
                 2 - Deny Write
                 3 - Deny Read
                 4 - Deny None
        Bit 7:   Inheritance flag
                 0 - File is inherited by child processes
                 1 - Private to current process

   Don't worry too much about what all this means. We will only use the
   Access mode bits.

2. We load the address of the file name into DX. Note that the filename has
   to be an ASCIIZ string, meaning it is terminated with a NULL character (0).

3. We call interrupt 21h.

4. If an error occurred while opening the file, the carry flag is set and the
   error code is returned in AX. In this case we jump to the ERROR label.

5. If no error occurred, the file handle is stored in AX. Since we don't know
   what to do with that yet, we terminate the program at this point.

Reading files
-------------

Having obtained the file handle of the file, we can now use the file, for
example to read it. When you use interrupt 21h's file read function, you have
to set up the registers as follows:

    AH = 3Fh
    BX = File handle
    CX = Number of bytes to read
    DX = Pointer to buffer to put file contents in

Then you simply print out that buffer using interrupt 21h's print string
function.
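The register setup above can be sketched as a small program. This is only an illustration under a few assumptions: the labels, the 200-byte limit and the name BUFFER are made up for this sketch, and it relies on function 3Fh returning the number of bytes actually read in AX so that a '$' can be placed right after the data for function 09h.

```asm
MAIN SEGMENT
ASSUME CS:MAIN,DS:MAIN,SS:MAIN,ES:MAIN
ORG 100h
START:
    MOV AX,3D00h            ;open TEST.TXT for reading
    MOV DX,OFFSET FILENAME
    INT 21h
    JC QUIT                 ;carry set = could not open the file
    MOV BX,AX               ;the read function wants the handle in BX
    MOV AH,3Fh              ;read from file
    MOV CX,200              ;ask for at most 200 bytes (arbitrary limit)
    MOV DX,OFFSET BUFFER
    INT 21h
    JC CLOSE_IT             ;carry set = the read failed
    MOV DI,OFFSET BUFFER
    ADD DI,AX               ;AX = number of bytes actually read
    MOV BYTE PTR [DI],'$'   ;terminate the text for function 09h
    MOV AH,09h              ;print string
    MOV DX,OFFSET BUFFER
    INT 21h
CLOSE_IT:
    MOV AH,3Eh              ;close the handle that is still in BX
    INT 21h
QUIT:
    INT 20h
FILENAME DB 'TEST.TXT',0
BUFFER DB 201 DUP (?)       ;200 bytes plus room for the '$'
MAIN ENDS
END START
```

Keep in mind that function 09h stops at the first '$', so a file that contains that character would be cut short, and anything past the first 200 bytes is never read.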
However notice how you have to specify the amount of data to read. That's not
good since most of the time we don't know how much data is in a file. So we
can use a little trick. The read function returns the number of bytes it
actually read in AX (if the carry flag is set, AX holds an error code
instead). A value of 0 means that the program has hit EOF (End Of File). So
we can simply make a loop that reads and prints a single byte from the file
as long as AX is not equal to zero. If it is, we know that the file ended and
we can terminate the program. Note that it is good coding practice to close
the file first, using interrupt 21h's Close File function. Here is the code
for this thing:

    MAIN SEGMENT
    ASSUME CS:MAIN,DS:MAIN,SS:MAIN,ES:MAIN
    ORG 100h
    START:
        MOV AX,3D00h        ;open the file for reading
        LEA DX,FILENAME
        INT 21h
        JC ERROR
        MOV BX,AX           ;file handle goes into BX
    READFILE:
        MOV AH,3Fh          ;read from file
        MOV CX,0001h        ;one byte at a time
        LEA DX,CHARACTER
        INT 21h
        JC ENDPROGRAM       ;carry set = read error, stop reading
        CMP AX,0000h        ;0 bytes read = EOF
        JE ENDPROGRAM
        MOV AH,02h          ;print the character we just read
        MOV DL,CHARACTER
        INT 21h
        JMP READFILE
    ENDPROGRAM:
        MOV AH,3Eh          ;close the file (handle still in BX)
        INT 21h
        INT 20h
    ERROR:
        MOV AH,09h
        LEA DX,ERRORMSG
        INT 21h
        INT 20h
    FILENAME DB 'TEST.TXT',0
    ERRORMSG DB 'Unable to open [test.txt]$'
    CHARACTER DB ?
    MAIN ENDS
    END START

This is a fairly big piece of code, but you should be able to understand it.

1. We get the file handle using the method discussed in the previous section.
2. We move the file handle from AX into BX. This is because interrupt 21h's
   function to read a file requires the handle to be in BX.
3. We move 3Fh into AH, which tells interrupt 21h that we want to read a file.
4. CX contains the number of bytes to read; we only want one.
5. The byte that was read is put into the buffer that DX points to. In this
   case it's called CHARACTER. Notice how we set up CHARACTER as an
   uninitialized variable.
6. We compare AX to 0, which it would be if EOF is encountered. If it is (or
   if the carry flag signals a read error), we end the program.
7. Otherwise we use interrupt 21h's function Print Character to print the
   character in the buffer. You should be familiar with that from previous
   chapters.
8. We return to the label READFILE to read another byte.
9.
If EOF is encountered, we use function 3Eh to close the file and terminate
the program.

Creating files
--------------

To create files you have to:

1. Create an empty file
2. Write a buffer to the file through its file handle

The following code will do that for us:

    MAIN SEGMENT
    ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN
    ORG 100h
    START:
        MOV AH,3Ch          ;create file
        XOR CX,CX           ;attributes: 0 = normal file
        MOV DX,OFFSET FILE_NAME
        INT 21h
        JC ERROR1           ;carry set = could not create the file
        MOV BX,AX           ;file handle goes into BX
        MOV AH,40h          ;write to file
        MOV CX,9            ;number of bytes to write
        MOV DX,OFFSET SHIT
        INT 21h
        JC ERROR            ;carry set = could not write
        MOV AH,3Eh          ;close the file (handle still in BX)
        INT 21h
        INT 20h
    ERROR:
        MOV AH,09h
        MOV DX,OFFSET ERROR_WRITING
        INT 21h
        INT 20h
    ERROR1:
        MOV AH,09h
        MOV DX,OFFSET ERROR_CREATING
        INT 21h
        INT 20h
    FILE_NAME db "blow.me",0
    SHIT db "123456789"
    ERROR_WRITING db "error writing to file$"
    ERROR_CREATING db "error creating file$"
    MAIN ENDS
    END START

1. We create the file using function 3Ch. The registers have to be set up
   like this:

        CX - Type of file. 0 - Normal, 1 - Read Only, 2 - Hidden
        DX - Name of the file. Has to be an ASCIIZ string.

   This function returns the file handle of the new file in AX.
2. We check for an error, and jump if necessary.
3. We move the file handle from AX into BX.
4. And choose interrupt 21h's function Write File (40h). For this function we
   need the registers set up like this:

        BX - File handle
        CX - Number of bytes to write (9 in our case)
        DX - Points to buffer to be written

5. We check for an error. If there was one we jump, otherwise we close the
   file and terminate the program.

Search operations
-----------------

In assembly you have two search functions at your disposal, Search First and
Search Next. Out of those two, Search First is the more complicated one. As
the name implies, Search Next can only be done after a Search First function.
So the first thing we do to search for a file is set up a Search First
routine. The registers have to be set up as follows:

    AH - 4Eh
    CX - File attributes
    DX - Pointer to ASCIIZ path/file name

The file attributes are set up in a weird way, and I will not get into those.
It's enough for you to know that we will be using 6h, which matches normal
files.
Well actually DOS will read 6h as 00000110, and each bit has a different
meaning (bits 1 and 2 are Hidden and System, so 6h finds hidden and system
files in addition to normal ones).

This function reports a failure by setting the carry flag and returning an
error code in AX. The documented way to detect a failure is to test the carry
flag; the code in this tutorial tests AL for zero instead, which relies on
DOS leaving 0 in AX when the search was successful.

If it found the files to search for, DOS will set up a block of memory 43
bytes long in the DTA. DTA stands for Disk Transfer Area and for now it's
enough to think of it as a "scratch pad" for DOS. In this tutorial I will not
get into reading it, but it doesn't hurt telling you what these 43 bytes
contain:

     0 - 21: Reserved for the Find Next function. This saves us from having
             to do the setup again.
    21 - 22: Attributes of the file found
    22 - 24: Time the file found was last written
    24 - 26: Date the file found was last written
    26 - 30: Size of the file found (in bytes)
    30 - 43: File name of the file found (ASCIIZ)

So our Search First function will look like this:

    SEARCH:
        MOV DX,OFFSET FILE_NAME
        MOV CX,6h
        MOV AH,4Eh
        INT 21h
        OR AL,AL
        JNZ NOT_FOUND

Notice how we use a bitwise operator instead of a CMP? OR AL,AL sets the zero
flag exactly like CMP AL,0 would, and it is the idiom you will see in most
assembly sources for testing a register against zero. Remember how OR works?

    0 OR 0 = 0
    1 OR 0 = 1
    0 OR 1 = 1
    1 OR 1 = 1

So OR AL,AL will only return 0 if every single bit in AL is 0. So if it
doesn't return 0, we know that it contains an error code and the search
failed. We won't bother checking what the error code is, we just jump to a
label that will display an error message. If the search was successful we
move on to the Search Next function to check if any more files meet our
description. Search Next is a fairly easy function. All we have to do is move
4Fh into AH and call int 21h.

    FOUND:
        MOV AH,4Fh
        INT 21h
        OR AL,AL
        JNZ ONE_FILE

This code will perform the Search Next function, and if it fails jump to the
label ONE_FILE. But what happens if it found another file? Well we could do
another Search Next function.

    MORE_FOUND:
        MOV AH,4Fh
        INT 21h
        OR AL,AL
        JNZ MORE_FILES_MSG

This will check if yet another file is found.
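The DTA layout listed above can be put to use to show what the search actually found. The following is a minimal sketch and not part of the original program: it assumes a .COM program that has not moved the DTA, so DOS still uses the default one at offset 80h of the PSP and the file name sits at 80h + 30 = 9Eh. The labels and FILE_SPEC are made up for illustration, and it tests the carry flag rather than AL to detect the end of the search.

```asm
MAIN SEGMENT
ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN
ORG 100h
START:
    CLD                     ;make sure LODSB moves forward
    MOV AH,4Eh              ;Search First
    XOR CX,CX               ;attributes 0 = normal files only
    MOV DX,OFFSET FILE_SPEC
    INT 21h
    JC DONE                 ;carry set = nothing found
NEXT_NAME:
    MOV SI,009Eh            ;default DTA (80h) + 30 = ASCIIZ file name
PRINT_CHAR:
    LODSB                   ;AL = next character of the name
    OR AL,AL                ;0 marks the end of the name
    JZ END_OF_NAME
    MOV DL,AL
    MOV AH,02h              ;print character
    INT 21h
    JMP PRINT_CHAR
END_OF_NAME:
    MOV AH,09h              ;move to the next screen line
    MOV DX,OFFSET NEWLINE
    INT 21h
    MOV AH,4Fh              ;Search Next
    INT 21h
    JNC NEXT_NAME           ;carry clear = another file was found
DONE:
    INT 20h
FILE_SPEC DB "*.*",0
NEWLINE DB 13,10,'$'
MAIN ENDS
END START
```

Run in a directory with a few files, this should list one name per line, much like a bare DIR.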
Now we should implement a way of knowing how many files we found. We can do
so by setting a register to 1 after the Search First function was successful,
and incrementing it each time it finds another file. So let's put all this
together and create a program that will search for a file, search again if it
found it, and start a loop that keeps searching for files and keeps track of
how many it found:

    MAIN SEGMENT
    ASSUME CS:MAIN,DS:MAIN,ES:MAIN,SS:MAIN
    ORG 100h
    SEARCH:
        MOV DX,OFFSET FILE_NAME
        MOV CX,6h
        MOV AH,4Eh
        INT 21h
        OR AL,AL
        JNZ NOT_FOUND
    FOUND:
        MOV CL,1            ;the counter that keeps track of how many
                            ;files we found
        MOV AH,4Fh
        INT 21h
        OR AL,AL
        JNZ ONE_FILE
    MORE_FOUND:
        INC CL              ;here we increment it
        MOV AH,4Fh
        INT 21h
        OR AL,AL
        JNZ MORE_FILES_MSG
        JMP MORE_FOUND
    MORE_FILES_MSG:
        MOV AH,02h
        OR CL,30h           ;convert counter to an ASCII digit (see below)
                            ;this only works for counts up to 9
        MOV DL,CL           ;and display it.
        INT 21h
        MOV AH,9h
        MOV DX,OFFSET MORE_FILES
        INT 21h
        INT 20h
    ONE_FILE:
        MOV AH,9h
        MOV DX,OFFSET FILE_FOUND
        INT 21h
        INT 20h
    NOT_FOUND:
        MOV AH,9h
        MOV DX,OFFSET FILE_NOT_FOUND
        INT 21h
        INT 20h
    MORE_FILES DB " FILES FOUND",10,13,'$'
    FILE_NOT_FOUND DB "FILE NOT FOUND",10,13,'$'
    FILE_FOUND DB "1 FILE FOUND",10,13,'$'
    FILE_NAME DB "*.AWC",0  ;this is the file we search for
    MAIN ENDS
    END SEARCH

Returns:

    FILE NOT FOUND  - If no files with extension .AWC are found
    1 FILE FOUND    - If the current directory contains 1 file with the
                      extension .AWC
    X FILES FOUND   - If more than one file with extension .AWC was found.
                      X stands for the number of files found.

Remember how function 2h prints the ASCII character whose code is in DL? If
we put the raw count in there we would get a control character instead of a
digit, and we don't really want that. So to convert it to a digit we OR it
with 30h. That's because if you look at an ASCII chart you'll notice that the
ASCII code of a digit is always 30h more than the digit itself. For example,
the number 5 is equal to 35h, 6 is 36h, etc.
So to convert it we OR it with 30h:

     5h = 000101
    30h = 110000
    35h = 110101 (ASCII Value: "5")

     6h = 000110
    30h = 110000
    36h = 110110 (ASCII Value: "6")

etc.

Exercises:

1. Create a program that will display how many files are in the current
   directory
2. Create a program that will create a new file, write something to it, close
   it, open it, and read its contents.

Basics of Win32
===============

Introduction
------------

I didn't want to include this as I absolutely HATE microsoft, but I guess I
have to face the fact that it sadly took over all other good operating
systems and people have started to switch to it. This chapter will be quite a
bit different from the previous ones as Win32 programming is not really low
level. Basically all you're doing is making calls to internal windows .DLL
files. But the most significant difference is the fact that you will be
working in Protected Mode. This is the mode I briefly mentioned where you
have a 4 gig limit instead of the old 64k you've been working with so far. I
won't heavily get into what protected mode is and does as that is out of the
scope of this tutorial (my next asm tutorial will though), but you will need
to refer back to the .EXE file layout I talked about in chapter 3.

Tools
-----

Well first of all you will have to download a new assembler. That's because
my version of TASM is older and doesn't support Win32. So for this chapter
get yourself a copy of MASM. That's an assembler by microsoft that has now
become freeware. Why didn't I mention MASM before since it's free? Well the
only thing MASM is now good for is Win32 programming. TASM uses something
called IDEAL mode which is a much better way of programming in assembly. MASM
uses MASM mode which quite frankly blows. Get MASM from:

    <url>

Download and install it, then move on to the next section.

A Message Box
-------------

First of all you have to get familiar with the program layout:

    .386
    .MODEL Flat, STDCALL
    .DATA
    .DATA?
    .CONST
    .CODE
    LABEL:
    END LABEL

This should look fairly familiar to you. If it doesn't, let's go over it
again:

    .386       - This declares the processor type to use. You can also use
                 .486 and .586, but for the sake of backwards compatibility
                 you should stick with .386 unless you have to use something
                 higher.
    .MODEL FLAT, STDCALL
               - This declares the memory model to use. In a Win32 program
                 you don't have the choices you did before anymore, FLAT is
                 the only one. The STDCALL tells the assembler how to pass
                 parameters. Don't worry about what that means just yet, you
                 will most likely never use anything but STDCALL in Win32
                 programming as there is only 1 function that needs a
                 different one (C).
    .DATA      - All your initialized data should go in here
    .DATA?     - All your uninitialized data should go here
    .CONST     - Constants go here
    .CODE      - And your code goes here
    LABEL:     - Just like before, you have to define a starting label
    END LABEL  - And END it

Now I'm gonna ask you to take a different look at this whole assembly thing.
So far you have been manipulating memory and the CPU, with Win32 you
manipulate memory and Windows components. I'm sure you know what include
files are, files that will be included with your program when you compile it.
Well, in Win32 programming the functions you call live in Windows DLLs.
Together these functions are known as the Application Programming Interface,
or API for short. For example, Kernel32.dll, User32.dll and gdi32.dll are all
part of the API. Again, I won't bother getting into details on how APIs work.
Assuming you have included all the include files and libraries you need, you
call specific Win32 functions in the following format:

    INVOKE expression,arguments

So for example, to exit a program by making a call to the exit function you
do:

    INVOKE ExitProcess,0

So let's make a program that does just that, exits:

    .386
    .MODEL Flat, STDCALL
    option casemap:none                 ;turn case sensitivity on
    include \masm32\include\windows.inc ;the include files that we need
    include \masm32\include\kernel32.inc
    includelib \masm32\lib\kernel32.lib
    .DATA
    .CODE
    START:
        INVOKE ExitProcess,0
    END START

To get an .EXE out of this, get into your MASM directory and then into BIN.
Then assemble with:

    ml /c /coff /Cp filename.asm

And link with:

    link /SUBSYSTEM:WINDOWS /LIBPATH:c:\masm32\lib filename.obj

This will get you a file called filename.exe, run it and ph33r.

Now let's make this into a message box. We use the INVOKE command again, but
instead of using the ExitProcess function, we use MessageBox.

    INVOKE MessageBox, 0, OFFSET MsgBoxText, OFFSET MsgBoxCaption, MB_OK

Let's dissect this thing: MessageBox tells windows what function we want. The
first argument, 0, is the handle of the window that owns the message box; 0
means it has no owner. Next we put the location of MsgBoxText in there. This
is done just like you would do it using INT 21h, OFFSET LOCATION. We do the
same with MsgBoxCaption and finally specify what kind of message box we want.
In this case MB_OK is a constant representing the familiar box where you can
only press Ok. Usually this would be a number, but we're including a file
that contains definitions of them. So how did I know what goes where? A Win32
reference will tell you.

We also have to define MsgBoxText and MsgBoxCaption.
We do this the way we always did, except that both strings end with a 0,
because all ANSI strings in windows must be terminated with a 0:

    MsgBoxCaption DB "ph33r b1ll g473z!",0
    MsgBoxText DB "Yes, eYe ph33r",0

So throwing it all together, the code would look like this:

    .386
    .MODEL FLAT,stdcall
    option casemap:none
    include \masm32\include\windows.inc
    include \masm32\include\kernel32.inc
    includelib \masm32\lib\kernel32.lib
    include \masm32\include\user32.inc
    includelib \masm32\lib\user32.lib
    .DATA
    MsgBoxCaption DB "ph33r b1ll g473z!",0
    MsgBoxText DB "Yes, eYe ph33r",0
    .CODE
    START:
        INVOKE MessageBox, 0, OFFSET MsgBoxText, OFFSET MsgBoxCaption, MB_OK
        INVOKE ExitProcess, 0
    END START

NOTE: Instead of OFFSET you could have used ADDR. ADDR does basically the
same inside an INVOKE line, with two differences: ADDR can also take the
address of local variables, which OFFSET can't, but ADDR cannot handle
forward references while OFFSET can. In other words, if you had declared
MsgBoxCaption and MsgBoxText after you use them (INVOKE.....), using ADDR
would return an error. As long as you declare your data before your code, as
we did here, you can get into the habit of using ADDR in Win32.

Now assemble and link with:

    ml /c /coff /Cp filename.asm
    link /SUBSYSTEM:WINDOWS /LIBPATH:c:\masm32\lib filename.obj

By the way, you should have made a .bat file by now that does this for you.
If you haven't, make a file containing the following lines and save it as
whatever.bat:

    @echo off
    ml /c /coff /Cp %1.asm
    link /SUBSYSTEM:WINDOWS /LIBPATH:c:\masm32\lib %1.obj

A Window
--------

As I said, I hate Win32 programming, so I'm thinking of scratching this part
as it's quite a bit more complex than a message box.

Some final words:

The key to mastering assembly is LOTS of practice! Don't worry if you don't
understand half of the stuff I talked about here. Put this thing aside and
just make lots and lots of little programs. If they don't work, debug them,
even if that takes you all night or longer. Then come back to this. And don't
bother trying to find help. There are only very few people who know assembly,
and if you can figure it out yourself you learn more.
By the way, 4 months after I first opened up a text file on assembly my tasm
directory contains 93 working .asm files coded by myself. On average that's
almost 1 program a day. Remember, you don't have to start coding something
big, as long as you code _something_! Don't expect to learn everything in
this tutorial within a few days. I would say that if you can do it in 4
months or so you are doing great.
Sunday, 27 February 2011
Assembly tutorial, continued