**Foreword:**

p0.1.0

Hello. It's imamelia here, ready to teach you as much as I know about ASM, with lessons, examples, and a few bad jokes. Now, of course, I'm not an expert by any means; I'm still learning, too. But they say that the best way to learn is to teach others. I had fun writing this tutorial, anyway, and I certainly hope that it can help people in their ASM endeavors.

p0.1.1

What,
right now, do you think you know about ASM? Lots? Some? None at all?
Maybe you've looked through the custom blocks, sprites, or patches
section at SMW Central and thought, “Wow, cool. I want to make
some of those, too!” Perhaps you have opened the .asm file of
a custom sprite intending to remap its graphics, and you couldn't
help but be curious about the rest of the code. What is “JSR
GET_DRAW_INFO”? What does “PLB” do? Why is it that
some numbers have both a # sign and a $ sign in front of them, but
others have only the $ sign? What is the average flight speed of a
swallow? Okay, maybe you weren't thinking too much about that last
question. Well, in any case, don't worry. This tutorial may not
answer *all* of your questions, but it should answer a lot of
them. You'll learn why GET_DRAW_INFO is so important (Lab 3), what
PLB does and how to use it (Lesson 3-2), what the difference between
#$-- and $-- is and when to use which (Lesson 2-1), and...while ASM
may not be the best tool for calculating the average flight speed of
a swallow, you *can* use it to calculate the average flight
speed of a rideable bomb-dropping Albatoss.

p0.1.2

Well, enough blabbering. I'd say it's about time for the actual lessons, don't you think? Let's go!

—imamelia

**Part 1: An Introduction to the Assembly System**

Lesson 1-1: Hexadecimal and Binary

Lesson 1-2: A Bit of Vocabulary

__Lesson 1-1: Hexadecimal and Binary__

p1.1.0

First of all, if you've looked at any ASM, you may have noticed something. Some numbers, like 14, 28, and 80, look normal, but some, like 7F, A8, and EC, have letters (usually capital) in them. Since when do letters count as numbers? What kind of weirdos run this place? Sure, a number like 7F wouldn't exactly be a “number” per se in our normal base 10 system...but we're not counting in base 10. We're counting in base 16, or hexadecimal. While base 10, the decimal system, uses 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), hexadecimal uses 16 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F). That means that, almost invariably, when you see the number 10 in ASM code, it doesn't mean 10, ten, the number of fingers a normal human has. It means hexadecimal 10, which is 16 in decimal (4 times 4). If you wanted to express the number of fingers you have in hexadecimal, then you'd write it as A or 0A (unless, of course, you have a birth defect or are from another planet where people don't have exactly 10 fingers). See that? In hexadecimal, after 9 comes A, which is the same as 10 in decimal. After A comes B, then C, D, E, and F, which correspond to 11, 12, 13, 14, and 15 in decimal. After F, as you might have guessed, comes 10, which is 16 in decimal. Base 10 uses ones, tens, hundreds, thousands, ten-thousands, hundred-thousands, etc., but base 16 uses ones, sixteens, two hundred and fifty-sixes, four thousand and ninety-sixes, etc., multiplying the value of each column by 16 to get the value of the next one.
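If you'd like to check the column idea for yourself, here is a minimal sketch. It is not ASM, just a few lines of Python used as a calculator, and the name `hex_to_decimal` is made up for this example.

```python
# The 16 hexadecimal digits, in order of value (0 through 15).
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_decimal(text):
    """Convert a string of hexadecimal digits to its decimal value."""
    total = 0
    # Walk the digits right to left: the rightmost column is worth 1,
    # and each column to the left is worth 16 times more.
    for column, digit in enumerate(reversed(text.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** column
    return total

print(hex_to_decimal("10"))  # 16
print(hex_to_decimal("0A"))  # 10
print(hex_to_decimal("7F"))  # 127

# Python's built-in int() does the same job when told the base is 16.
print(int("7F", 16))  # 127

# The values of the first four columns.
print([16 ** column for column in range(4)])  # [1, 16, 256, 4096]
```
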

p1.1.1

Now,
the question is, why use hexadecimal at all? Who the heck wants to be
bothered with remembering numbers like 256 and 65,536 when numbers
like 100 and 10,000 are so much easier? Well, you have to remember,
*humans* really only like base 10 because we have 10 fingers.
Computers, obviously, have no fingers at all (but if you see one that
does, *run*), and at the
most basic level, they can understand only two things: off and on. Up
and down. Left and right. Indented or flat. Present or absent. They
process data with zillions of tiny “pieces” that can only
have two possible states, and if we translate that into numbers, the
result is that the numbers we end up with have only two different
digits: 0 and 1. I'll touch on the hexadecimal question later, but
this brings me to the second part of this lesson...binary.

p1.1.2

Binary is almost like hexadecimal on a smaller scale. Binary is base 2, so it uses only two digits: 0 and 1. Sound familiar? In binary, instead of ones, tens, hundreds, and so on, we have ones, twos, fours, eights, etc., multiplying the value of each column by 2 to get the value of the next one. So if you see a “10” that you know is in binary, its equivalent base-ten value would be only 2. 10 in binary = 2 in decimal. 11 in binary = 3 in decimal. 100 in binary = 4 in decimal, 101 = 5, 110 = 6, and so on and so forth. If you want to talk about having “ten” fingers, but you want to express it in binary, the number you would write is 1010. Count the value of the columns (zero ones, one two, zero fours, and one eight), and you'll see that they add up to 10 (decimal).
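The same column trick works for binary, only with 2 in place of 16. Here is a rough Python sketch (again, not ASM; `binary_to_decimal` is a name invented for this example) that adds up the columns for the numbers mentioned above.

```python
def binary_to_decimal(text):
    """Convert a string of 0s and 1s to its decimal value."""
    total = 0
    # The rightmost column is worth 1; each column to the left is worth double.
    for column, digit in enumerate(reversed(text)):
        total += int(digit) * 2 ** column
    return total

for bits in ["10", "11", "100", "101", "110", "1010"]:
    print(bits, "in binary =", binary_to_decimal(bits), "in decimal")

# Going the other way: format() with "b" writes a number in binary.
print(format(10, "b"))  # 1010
```
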

p1.1.3

Now,
to tie this all together and conclude this lesson, let's go back to
the hexadecimal question that I posed two paragraphs ago. Why even
use hexadecimal, as opposed to decimal? If we like base 10 so much,
then why can't we just use that in our ASM? (Well, technically we
can, but that's for a later lesson.) The answer, if you haven't
guessed yet, lies in the two-value scenario. Computers think in
binary, which consists of 1s and 0s and is based around the number
2...BUT the number 16 just so happens to be a power of 2; 2^4 = 16.
So hexadecimal really isn't that big of a stretch from binary, but
it's a lot easier on us humans than long strings of 0s and 1s. *That's*
why we use it for things like ASM. (Side note: “Hexadecimal”
is often shortened to just “hex”. The word is a blend of
Greek and Latin [*hexa-*: six, from Greek; *decem*: ten, from
Latin], so unless you're a native speaker of Greek, you may be
tempted to make some Harry Potter reference right about now.)
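Because 16 is 2^4, every hexadecimal digit stands for exactly four binary digits. The Python sketch below (not ASM; `hex_to_binary` is a hypothetical helper name) expands a few of the hex numbers from earlier in this lesson, one digit at a time.

```python
def hex_to_binary(text):
    """Expand each hexadecimal digit into its own group of four binary digits."""
    return " ".join(format(int(digit, 16), "04b") for digit in text)

print(hex_to_binary("7F"))  # 0111 1111
print(hex_to_binary("A8"))  # 1010 1000
print(hex_to_binary("EC"))  # 1110 1100
```

Two hex digits are a lot easier to read than eight 0s and 1s, and nothing is lost in the translation.
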

__Lesson 1-2: A Bit of Vocabulary__

p1.2.0

Now,
if you really want to learn ASM, there might be a few terms you'll
want to know. First of all, a *bank* is
a block of 65,536 bytes of data (10000 in hexadecimal). BUT...not all
data is used for the same purposes. Plain and simple:

- $0000-$1FFF are RAM. They hold the same values in every bank where they appear (banks 00-3F and 80-BF, where they mirror $7E0000-$7E1FFF), and those values can and do change during the game.

- $8000-$FFFF, except in banks 70-7F, are ROM, and their values are different in every bank. ROM data never changes; once it's there, it's there to stay.

- $2000-$FFFF in bank 7E and $0000-$FFFF in bank 7F are *not* ROM, but RAM. They work like any other RAM, but the bank has to be specified. You can't use just $8200; it has to be $7E8200 (or $7F8200).

- $0000-$FFFF in banks 70-7? are SRAM; this is used for things in the game that save, such as the number of exits found.

A big thank-you goes to Alcaro for compiling most of the bank data in the first place.
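To see how an address like $7E8200 breaks down into a bank and a position inside that bank, here is a small Python sketch. It is only an illustration, not ASM, and `split_address` is a name made up for this example.

```python
BANK_SIZE = 0x10000  # 65,536 bytes per bank (10000 in hexadecimal)

def split_address(address):
    """Split a full address into (bank, offset within that bank)."""
    return address // BANK_SIZE, address % BANK_SIZE

bank, offset = split_address(0x7E8200)
print(f"bank ${bank:02X}, offset ${offset:04X}")  # bank $7E, offset $8200
```

In other words, the first two hex digits of a six-digit address are the bank, and the last four are the location within it.
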

p1.2.1

A
*bit* is a single binary digit. In fact, the very word “bit”
is short for __bi__nary digi__t__. This digit, obviously, can
be either 0 or 1; if the bit is *clear* or *reset*, then it
is equal to 0. If it is *set*, then it is equal to 1.

p1.2.2

A
*byte* is a two-digit hexadecimal number that can have any value
from 00 to FF and is composed of 8 bits. The bits are in the
following order:

76543210

The bit farthest to the right, therefore, is bit 0, the bit farthest to the left is bit 7, and all the ones in between are...well, all the numbers in between. Now each bit has a certain hexadecimal value. If our byte has the value 00000001, then bit 0 is set and all other bits are clear. This is equal to 01 in hexadecimal. If our byte has the value 00000010, then bit 1 is set and all other bits are clear. This is equal to 02 in hexadecimal. You can continue the pattern by simply setting the next bit and clearing all others, so that 00000100 = 04, 00001000 = 08, 00010000 = 10, 00100000 = 20, 01000000 = 40, and 10000000 = 80. One way you may see bits expressed is as letters, with a lowercase letter indicating a clear bit and a capital letter indicating a set one. (E.g. if “bbbbbbbb” are our eight bits, then BbbbbbbB would indicate that bits 0 and 7 are both set and all others are clear.) To recap:

- A byte is any 2-digit hexadecimal number, from 00 to FF.

- A byte is composed of 8 bits.

- The bits are numbered 7 through 0 from left to right: 76543210.

- They are often called by their numbers, e.g. “bit 2”.

- They are sometimes represented by letters, with a lowercase letter indicating a clear bit (a 0) and a capital letter indicating a set bit (a 1).
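Here is a rough Python sketch of the bit values described above (not ASM; `bit_value` and `letters_to_byte` are names invented for this example). It prints the value of each bit and then works out the byte for the letter pattern BbbbbbbB.

```python
def bit_value(bit):
    """Return the value of a byte in which only the given bit (0-7) is set."""
    return 1 << bit

def letters_to_byte(pattern):
    """Convert an 8-letter pattern (capital = set, lowercase = clear) to a byte."""
    value = 0
    for position, letter in enumerate(pattern):
        if letter.isupper():
            # The leftmost letter is bit 7 and the rightmost is bit 0.
            value |= bit_value(7 - position)
    return value

for bit in range(8):
    print(f"bit {bit}: {bit_value(bit):08b} = {bit_value(bit):02X}")

print(f"{letters_to_byte('BbbbbbbB'):02X}")  # 81 (bits 7 and 0 set: 80 + 01)
```
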

p1.2.3

An
*address*, or RAM address, is a byte—or sometimes 2
bytes—at a specific point in the RAM. Examples: $13, $0680,
$13BF, $7FAB10. A single byte of RAM can contain any hexadecimal
value from 00 to FF at any given moment. RAM is essentially used to
tell the game what to do. In Super Mario World, doing a certain thing
to one RAM address, for example, could activate the P-switch, but
doing the same thing to a different RAM address could give the player
invincibility.

p1.2.4

An
*opcode* is basically a command that tells the computer what to
do. In 65c816 ASM (that's the formal name of the ASM Super Mario
World uses, but you don't need to know it), all opcodes consist of
three letters, such as LDA, RTS, and EOR. There are a total of 99
opcodes in 65c816 ASM (I counted) including useless ones, although
you'll mainly be using just a fraction of them. (If you're curious,
that list at the beginning of this document has all of them in
alphabetical order.) An opcode tells the processor whether to put a
number into *this* place, into *that* place, compare it to
another number, decrease its value by 1, multiply it by 2, jump to
another part of the code, copy something to something else, and more.
Also, many opcodes are immediately followed by a number or sometimes
other data, which, of course, is the number, address, etc. that the
opcode will affect.

p1.2.5

An
*addressing mode* is...well, essentially, you can make a single
opcode do slightly different things by using different addressing
modes. For example, you can make some opcodes affect a RAM address
(which you may or may not know the exact value of), or you can make
them affect a specific number. These are two different addressing
modes. So an addressing mode is basically how an opcode uses the data
that follows it. (If it's a stand-alone opcode and doesn't have
anything after it, then it has only one addressing mode or doesn't
exactly use addressing modes per se.)

p1.2.6

There
are also three *registers* that are used in ASM. The
*accumulator*, also known as A, is the most common. “A”
is...sort of a temporary place to hold data. You can *load* a
number into A, you can *store* the value in A to a RAM address,
you can *compare* the value in A to another value, and more.
There are two more registers you should know about: the X register
and the Y register. These are very similar to A, but their opcodes
are more limited and they are usually used for different purposes.
The most common use of the X and Y registers is for *indexing*,
which is covered in Lesson 2-6. Together, A, X, and Y are the three
variables you use to get data into and out of the game. Think of each
of them like...one of those vertical carts that you can push. You
take a box off of a shelf, put it onto the cart, and then transfer it
to another shelf. Similarly, in ASM, you can take a number or the
value of a RAM address, put it into A, X, or Y, and then transfer it
to another RAM address.

p1.2.7

An
*operand* is something, such as
a RAM address or value, that is affected by an opcode. For example,
in “LDA #$05”, the #$05 is the operand (the LDA is the
opcode). It sounds more difficult to understand than it is.
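If it helps, you can picture a line of ASM as two pieces separated by a space. This Python sketch (not ASM; `split_instruction` is a hypothetical name used only here) pulls those two pieces apart.

```python
def split_instruction(line):
    """Split one line such as 'LDA #$05' into (opcode, operand)."""
    parts = line.split(None, 1)
    opcode = parts[0]
    # Stand-alone opcodes have nothing after them, so the operand is None.
    operand = parts[1] if len(parts) > 1 else None
    return opcode, operand

print(split_instruction("LDA #$05"))  # ('LDA', '#$05')
print(split_instruction("RTS"))       # ('RTS', None)
```
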

p1.2.8

I think we're ready for some actual lessons, don't you? Now that you have a foundation...LET'S DO IT!!