dwangoAC recently asked me to determine where the code that draws the fireball sprite in Super Mario Bros. 3 for the NES is located. I wanted to quickly document the work I did in case this is useful for anyone else in the future. For the work below, I used FCEUX to execute my SMB3 ROM and step/trace what I needed. I also used Binary Ninja to disassemble the ROM and mark it up in a way that would help me understand what was going on.
UPDATE (2016-10-04): The code and analysis below are still useful, but I wanted to point out that there's little need to reverse engineer SMB3 anymore. I was just made aware of Southbird's SMB3 disassembly, which is amazing and incredibly detailed. He's got a nice little overview page as well that's pretty helpful. I would recommend anyone doing anything with SMB3 look at this as they're going through the code. It's an invaluable resource and will get you quickly up-to-speed. Huge props to Southbird for the work and I really wish I'd seen it sooner...
By playing and stepping through in a debugger, I found what I believe to be the beginning of the projectile drawing code at $A329 in ROM bank 7:
sub_a329:
0000a329 a201 ldx #$01 // ensures we run this code twice
// with x == 1 and x == 0 (see $A330)
// (assuming one slot per player?)
0000a32b 86cd stx data_cd
sub_a32d:
0000a32d 2033a3 jsr sub_a333
0000a330 c6cd dec data_cd // x = 0 now
0000a332 ca dex
sub_a333:
0000a333 bde17c lda $7ce1, x // load the projectile slot
// referred to by x
0000a336 f0f0 beq $a328
loc_a328:
0000a328 60 rts // return if slot was empty
If the two projectile slots are empty, we'll bail out of the subroutine after executing only the code above. I'm assuming there are two projectile slots because there can be two players, but I'm not 100% sure.
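To make the control flow above easier to follow, here's a rough Python sketch of it. It assumes a slot value of 0 means "empty"; `slots` and `update_slot` are hypothetical names I made up for illustration, not anything from the ROM.

```python
def update_projectiles(slots, update_slot):
    """Visit slot 1, then slot 0, skipping any slot whose type byte is 0."""
    visited = []
    for x in (1, 0):            # ldx #$01 ... dex
        if slots[x] == 0:       # lda $7ce1,x / beq $a328 (rts)
            continue            # empty slot: nothing to draw
        update_slot(x)          # everything from $A338 onward
        visited.append(x)
    return visited
```

With only slot 1 occupied, `update_projectiles([0, 1], handler)` would call `handler(1)` once and skip slot 0.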
At any rate, if one of the projectile slots is not empty, we'll head here:
loc_a338:
0000a338 c903 cmp #$03 // branch to $A33F if type < 3
0000a33a 9003 bcc $a33f
I'm not sure what the other projectile types are, but I know from the debugger that fireballs are apparently #$01. As a result, we'll take the branch and head here:
loc_a33f:
0000a33f adfe05 lda data_5fe
0000a342 f01d beq $a361 // not sure, but we don't take this
loc_a344:
0000a344 a5ce lda data_ce
0000a346 d019 bne $a361 // don't take this, either
loc_a348:
0000a348 bde57c lda $7ce5, x // begin math to adjust x/y position of projectile
0000a34b 18 clc
0000a34c 6d5279 adc data_7952
0000a34f 9de57c sta $7ce5, x
0000a352 bde37c lda $7ce3, x
0000a355 18 clc
0000a356 6d5379 adc data_7953
0000a359 9de37c sta $7ce3, x
0000a35c 9003 bcc $a361 // skip next instruction if no carry
loc_a35e:
0000a35e fefa05 inc $05fa, x // add in carry if there was one
loc_a361:
0000a361 a4ce ldy data_ce
0000a363 d076 bne $a3db // don't take this
This code appears to adjust what I believe is the X and Y position of the projectile ($7CE5 for X, $7CE3 for Y). Remember that we're using the X register as an index here, so there are two values (one for each projectile slot) and we're only operating on one at a time here. I'm not too sure why we're adjusting it, but my guess is that it has something to do with the screen scrolling if the player is moving in a given direction (or if you're in a level where it automatically scrolls the screen for you).
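The Y adjustment at $A352 through $A35E reads like an 8-bit add whose carry spills into a second byte. Here's a minimal Python sketch of that pattern. Treating $05FA,x as the "high byte" of the Y position is my reading of the code, not something I've confirmed, and the function name is made up.

```python
def add_with_carry_to_high(lo, hi, delta):
    """Add an unsigned 8-bit delta to lo; bump hi if the add wrapped."""
    total = lo + delta              # clc / adc data_7953
    carry = total >> 8              # 1 if the result passed $FF
    new_lo = total & 0xFF           # sta $7ce3,x
    new_hi = (hi + carry) & 0xFF    # bcc skips, otherwise inc $05fa,x
    return new_lo, new_hi
```

Note that the X adjustment at $A348 does the same add but ignores the carry entirely.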
Next, we'll head here:
loc_a365:
0000a365 bde77c lda $7ce7, x // load projectile's y velocity
0000a368 acfc05 ldy data_5fc // load something (this seems to always match)
0000a36b f004 beq $a371 // take this branch
loc_a371:
0000a371 48 pha // save updated y velocity
0000a372 a000 ldy #$00 // y = #$00
0000a374 68 pla // load y velocity again
0000a375 1001 bpl $a378 // skip next instruction if not negative
loc_a377:
0000a377 88 dey // y = #$FF
loc_a378:
0000a378 18 clc
0000a379 7de37c adc $7ce3, x // add y velocity to y position
0000a37c 9de37c sta $7ce3, x // update y position
0000a37f 98 tya // not too sure what this stuff does
0000a380 7dfa05 adc $05fa, x
0000a383 9dfa05 sta $05fa, x
0000a386 feed7c inc $7ced, x // this value determines future y velocity later
0000a389 bde17c lda $7ce1, x // get projectile type again
0000a38c c902 cmp #$02 // check if it's #$02 (not sure what this is)
sub_a38e:
0000a38e d030 bne $a3c0 // take this branch because fireball is #$01
This code looks like it updates the projectile's Y velocity and uses it to update the projectile's Y position. There's some other math I'm not sure about, but I'm sure it's related. At the end, we'll check for projectile type #$02, which we aren't (fireball is #$01 as I said earlier), so we'll take the branch to here:
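One possible reading of the "other math" at $A371 through $A383: the Y register ends up as #$00 or #$FF depending on the sign of the velocity, which is the usual 6502 way to sign-extend an 8-bit value before adding it to a 16-bit one. Here's a small Python sketch under that assumption (again treating $05FA,x as the high byte of the Y position, which is a guess; the function name is hypothetical).

```python
def add_signed_velocity(y_lo, y_hi, vel):
    """Add a signed 8-bit velocity to a 16-bit position held in two bytes."""
    ext = 0xFF if vel & 0x80 else 0x00      # ldy #$00 / bpl / dey
    total = y_lo + vel                      # clc / adc $7ce3,x
    carry = total >> 8                      # carry out of the low-byte add
    new_lo = total & 0xFF                   # sta $7ce3,x
    new_hi = (y_hi + ext + carry) & 0xFF    # tya / adc $05fa,x / sta $05fa,x
    return new_lo, new_hi
```

For example, a velocity of #$FC (-4) applied to position $0102 gives $00FE, while #$04 applied to $00FE gives $0102.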
loc_a3c0:
0000a3c0 bde57c lda $7ce5, x // load x position of projectile
0000a3c3 18 clc
0000a3c4 7de97c adc $7ce9, x // add x velocity to x position
0000a3c7 9de57c sta $7ce5, x // update x position
0000a3ca bde77c lda $7ce7, x // if y velocity is 4, skip to $A3DB
0000a3cd c904 cmp #$04
0000a3cf f00a beq $a3db
loc_a3d1:
0000a3d1 bded7c lda $7ced, x // increase y velocity when ($7CED & #$03) == 0
0000a3d4 2903 and #$03
0000a3d6 d003 bne $a3db
loc_a3d8:
0000a3d8 fee77c inc $7ce7, x
loc_a3db:
0000a3db bde57c lda $7ce5, x // adjust x position and store in $0001
0000a3de 38 sec
0000a3df e5fd sbc data_fd // (not sure what we're subtracting)
0000a3e1 8501 sta data_1
0000a3e3 18 clc // jump to $A3F0 if (adjusted x + #$0B) >= #$13 (19)
0000a3e4 690b adc #$0b
0000a3e6 c913 cmp #$13
0000a3e8 b006 bcs $a3f0
loc_a3ea:
0000a3ea a900 lda #$00 // remove projectile and return
0000a3ec 9de17c sta $7ce1, x
0000a3ef 60 rts
This code looks like it uses the projectile's X velocity to update the projectile's X position. It also looks like it increases the projectile's Y velocity whenever the low two bits of the value ($7CED) we incremented earlier are zero, so once every four updates, unless the velocity is already 4. I'm guessing this is their implementation of "gravity" on the fireball? The rest of this code does an adjustment on the projectile's X position that I don't fully understand before removing the projectile entirely if the adjusted X position plus #$0B is under #$13 (19).
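Here's a short Python sketch of the two checks in this block, mostly to pin down the 8-bit arithmetic. The function names are mine, and `offset` stands in for whatever lives at $FD (horizontal scroll would be my guess, but that's an assumption).

```python
def apply_gravity(y_vel, counter):
    """Bump the Y velocity every fourth update unless it is already 4."""
    if y_vel != 0x04 and (counter & 0x03) == 0:   # cmp #$04 / and #$03
        y_vel = (y_vel + 1) & 0xFF                # inc $7ce7,x
    return y_vel


def should_remove(x_pos, offset):
    """True when the adjusted X position fails the check at $A3E8."""
    adjusted = (x_pos - offset) & 0xFF            # sec / sbc data_fd
    return ((adjusted + 0x0B) & 0xFF) < 0x13      # clc / adc #$0b / cmp #$13
```

Because the add wraps at 8 bits, the removal check fires for adjusted values 0 through 7 and also for #$F5 through #$FF, which looks like a way to catch a projectile that is just off either side of the screen.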
If the projectile's X position is high enough, we'll execute this code instead:
loc_a3f0:
0000a3f0 69f8 adc #$f8 // not sure what this does
0000a3f2 850d sta data_d
0000a3f4 bde17c lda $7ce1, x // jump to $A400 if we aren't a fireball (#$01)
0000a3f7 c901 cmp #$01
0000a3f9 d005 bne $a400
loc_a3fb:
0000a3fb bde77c lda $7ce7, x // get y velocity
0000a3fe 300e bmi $a40e // jump to $A40E if it is negative
loc_a400:
0000a400 bde37c lda $7ce3, x // get y position
0000a403 cd4305 cmp data_543 // check it against something
0000a406 bdfa05 lda $05fa, x // not sure what we're doing here
0000a409 ed4205 sbc data_542
0000a40c 30e1 bmi $a3ef
loc_a40e:
0000a40e 8a txa // loading a sprite's location
0000a40f 0a asl a
0000a410 0a asl a
0000a411 18 clc
0000a412 6d9505 adc data_595
0000a415 a8 tay
0000a416 a501 lda data_1
0000a418 990302 sta data_203, y // save sprite to sprite RAM
0000a41b bde37c lda $7ce3, x // get y position
0000a41e 38 sec
0000a41f ed4305 sbc data_543
0000a422 c9c0 cmp #$c0 // if y position - something is >= #$C0
0000a424 b0c4 bcs $a3ea // remove projectile (see above)
loc_a426:
0000a426 990002 sta data_200, y // save sprite to different sprite RAM location
0000a429 690e adc #$0e // not too sure what all this is below
0000a42b 850c sta data_c
0000a42d bde97c lda $7ce9, x
0000a430 4a lsr a
0000a431 2940 and #$40
0000a433 8502 sta data_2
0000a435 bde17c lda $7ce1, x
0000a438 c902 cmp #$02 // jump to $A471 if projectile type isn't #$02
0000a43a d035 bne $a471
I'm not too sure what half of this does, but I'm confident it's loading and setting sprites due to the references to the data_200 area. According to this page, this is DMA to sprite RAM.
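For context, each NES sprite takes four bytes in sprite RAM (OAM): Y position, tile index, attributes, and X position, in that order. That would explain why the code shifts the slot number left twice (times 4) and then writes to $0200 through $0203 with the same index. Here's a hedged Python sketch of that layout; `base` stands in for the value at $0595, which I'm assuming is a starting offset into the sprite page, and the function name is made up.

```python
def write_sprite(oam, slot, base, screen_x, screen_y, tile, attrs):
    """Write one 4-byte sprite entry into a 256-byte sprite page copy."""
    y = (slot * 4 + base) & 0xFF    # txa / asl a / asl a / clc / adc data_595 / tay
    oam[y + 0] = screen_y           # sta $0200,y  (sprite Y)
    oam[y + 1] = tile               # sta $0201,y  (tile index)
    oam[y + 2] = attrs              # sta $0202,y  (palette, priority, flip bits)
    oam[y + 3] = screen_x           # sta $0203,y  (sprite X)
    return y
```

This sketch assumes `base` is a multiple of 4 so the entry never runs past the end of the page.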
Since we're not projectile type #$02, we'll head here next:
loc_a471:
0000a471 ad6505 lda data_565 // calculate an index value
0000a474 4a lsr a
0000a475 4a lsr a
0000a476 2903 and #$03
0000a478 aa tax
0000a479 bd17a3 lda $a317, x // load the fireball we want from $A317
0000a47c 990102 sta data_201, y // save it to sprite RAM
0000a47f a502 lda data_2 // load a value we stored at $A433
0000a481 5d1ba3 eor $a31b, x // toggle a bit in it using the table at $A31B
// this bit appears to flip the fireball around
0000a484 18 clc // avoid the branch below at $A495
0000a485 ae8805 ldx data_588
0000a488 f002 beq $a48c // jump to $A48C for something? (take this)
loc_a48c:
0000a48c 990202 sta data_202, y // save value to sprite RAM
0000a48f a6cd ldx data_cd // load projectile slot
0000a491 a5ce lda data_ce // load something else
0000a493 d00d bne $a4a2 // don't take this branch
loc_a495:
0000a495 b003 bcs $a49a // don't take this either (carry cleared at $A484)
loc_a497:
0000a497 20a3a4 jsr sub_a4a3 // jump to what looks like hit detection maybe?
There's a lot more code that gets executed after this, but we don't care. We've found what we came for. The code above constitutes the remainder of what draws the sprite on the screen.
The specific problem I was asked to solve once I found this code was to make the fireballs stop rotating. To do this, it looks like you need to make the index value calculated at $A471 (top block) always be the same value. I'm not too sure what this is causing to be pulled out of $A317, but it definitely changes how the fireballs are displayed. I chose to patch the and instruction at $A476 to be 2900 for and #$00 to ensure this was the case.
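To see why changing the mask pins the index, here's a tiny Python model of the calculation at $A471 through $A478. I'm assuming the value at $0565 is some kind of running counter, which I haven't verified; `animation_index` is a made-up name.

```python
def animation_index(counter, mask=0x03):
    """Model of lda data_565 / lsr a / lsr a / and #mask / tax."""
    return (counter >> 2) & mask
```

With the original mask of #$03, the index steps through 0, 1, 2, 3 as the counter advances by 4 each time. With the patched mask of #$00, it is always 0, so the same entry at $A317 gets used on every update.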
Even after that, though, the fireball still flips around. It seems like the eor at $A481 is the culprit here. To fix this, I changed the lda at $A47F to a900 for lda #$00.
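If you'd rather script the two patches than make them in a hex editor, here's a minimal Python sketch. Instead of guessing where bank 7 lands in the ROM file, it searches for the instruction bytes shown in the listing above and patches relative to them. This is just one way to do it, and it assumes each byte pattern appears exactly once in the image, which I haven't checked against a real ROM. All of the names here are hypothetical.

```python
# Byte patterns copied from the disassembly listing.
INDEX_SIG = bytes.fromhex("4a4a2903aabd17a3")  # lsr / lsr / and #$03 / tax / lda $a317,x
FLIP_SIG = bytes.fromhex("a5025d1ba3")         # lda data_2 / eor $a31b,x


def patch_once(rom, signature, offset_in_sig, new_bytes):
    """Overwrite bytes inside a signature that must occur exactly once."""
    first = rom.find(signature)
    if first == -1 or rom.find(signature, first + 1) != -1:
        raise ValueError("signature not found exactly once")
    start = first + offset_in_sig
    rom[start:start + len(new_bytes)] = new_bytes
    return start


def stop_fireball_rotation(rom):
    """Apply both patches to a bytearray holding the ROM image."""
    patch_once(rom, INDEX_SIG, 2, bytes.fromhex("2900"))  # and #$03 -> and #$00
    patch_once(rom, FLIP_SIG, 0, bytes.fromhex("a900"))   # lda data_2 -> lda #$00
    return rom
```

You'd load the file into a `bytearray`, call `stop_fireball_rotation` on it, and write the result to a new file so the original ROM stays untouched.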
Once these two patches are made, the fireballs keep working, but stop rotating as they bounce around:
UPDATE (2016-10-07): Changed the image above to be an MP4, which is tiny in comparison. For posterity, here's the command I ran to convert the raw AVI file I'd exported from FCEUX to what I wanted:
ffmpeg -i capture.avi -vf scale=iw*4:ih*4 -vf setpts=4.0*PTS -ss 00:02:40 -t 00:00:04 -c:v h264 -an fireball.mp4
The part I wanted was 40 seconds in and 1 second long, but I slowed the video down by 4x with the setpts filter. Hence, the start and time options each needed to be multiplied by 4.
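The timestamp arithmetic is easy to get wrong, so here's a small Python helper that reproduces it. The function name is made up; it just multiplies a time in seconds by the slowdown factor and formats it the way the -ss and -t options above are written.

```python
def scaled_timestamp(seconds, slowdown):
    """Scale a time in seconds by a slowdown factor and format as HH:MM:SS."""
    total = int(seconds * slowdown)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

For the clip above, `scaled_timestamp(40, 4)` gives `00:02:40` and `scaled_timestamp(1, 4)` gives `00:00:04`, matching the values passed to ffmpeg.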