I was recently asked by dwangoAC to determine where the code that handles drawing the fireball sprite in Super Mario Bros. 3 for the NES is located. I wanted to quickly document the work I did in case it is useful for anyone else in the future. For the work below, I used FCEUX to execute my SMB3 ROM and step/trace what I needed. I also used Binary Ninja to disassemble the ROM and mark it up in a way that would help me understand what was going on.
UPDATE (2016-10-04): The code and analysis below are still useful, but I wanted to point out that there's little need to reverse engineer SMB3 anymore. I was just made aware of Southbird's SMB3 disassembly, which is amazing and incredibly detailed. He's got a nice little overview page as well that's pretty helpful. I would recommend that anyone doing anything with SMB3 look at this as they're going through the code. It's an invaluable resource and will get you quickly up to speed. Huge props to Southbird for the work and I really wish I'd seen it sooner...
By playing and stepping through in a debugger, I found what I believe to be the beginning of the projectile drawing code at $A329 in ROM bank 7:
sub_a329:
0000a329  a201               ldx     #$01    // ensures we run this code twice
                                             // with x == 1 and x == 0 (see $A330)
                                             // (assuming one slot per player?)
0000a32b  86cd               stx     data_cd
sub_a32d:
0000a32d  2033a3             jsr     sub_a333
0000a330  c6cd               dec     data_cd  // x = 0 now
0000a332  ca                 dex
sub_a333:
0000a333  bde17c             lda     $7ce1, x // load the projectile slot
                                              // referred to by x
0000a336  f0f0               beq     $a328
loc_a328:
0000a328  60                 rts              // return if slot was empty
If a projectile slot is empty, we bail out for that slot right away, so if both slots are empty, the code above is all that executes. I'm assuming there are two projectile slots because there can be two players, but I'm not 100% sure.
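As a rough sketch of that control flow (in Python, with names that are made up for illustration and not taken from the game), the slot routine runs once with X = 1 and then again with X = 0, skipping any slot whose type byte is zero. The slot count of two is an assumption based on the ldx #$01 above.

```python
# Hypothetical model of the loop at $A329-$A336; all names are illustrative.
NUM_SLOTS = 2  # assumption: slots 1 and 0, per the `ldx #$01` at $A329


def update_all_projectiles(projectile_type, update_slot):
    """Visit slot 1, then slot 0, skipping empty slots (type byte == 0)."""
    visited = []
    for slot in range(NUM_SLOTS - 1, -1, -1):  # X = 1, then X = 0
        if projectile_type[slot] == 0:         # beq $a328 -> rts
            continue
        update_slot(slot)                      # rest of the routine for this slot
        visited.append(slot)
    return visited
```

Here projectile_type stands in for the two bytes at $7CE1 and $7CE2, and update_slot for everything the routine does after the empty check.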
At any rate, if one of the projectile slots is not empty, we'll head here:
loc_a338:
0000a338  c903               cmp     #$03     // branch to $A33F if type < 3
0000a33a  9003               bcc     $a33f
I'm not sure what the other projectile types are, but I know from the debugger that fireballs are apparently #$01. As a result, we'll take the branch and head here:
loc_a33f:
0000a33f  adfe05             lda     data_5fe
0000a342  f01d               beq     $a361     // not sure, but we don't take this
loc_a344:
0000a344  a5ce               lda     data_ce
0000a346  d019               bne     $a361     // don't take this, either
loc_a348:
0000a348  bde57c             lda     $7ce5, x  // begin math to adjust x/y position of projectile
0000a34b  18                 clc
0000a34c  6d5279             adc     data_7952
0000a34f  9de57c             sta     $7ce5, x
0000a352  bde37c             lda     $7ce3, x
0000a355  18                 clc
0000a356  6d5379             adc     data_7953
0000a359  9de37c             sta     $7ce3, x
0000a35c  9003               bcc     $a361     // skip next instruction if no carry
loc_a35e:
0000a35e  fefa05             inc     $05fa, x  // add in carry if there was one
loc_a361:
0000a361  a4ce               ldy     data_ce
0000a363  d076               bne     $a3db     // don't take this
This code appears to adjust what I believe is the X and Y position of the projectile ($7CE5 for X, $7CE3 for Y). Remember that we're using the X register as an index here, so there are two values (one for each projectile slot) and we're only operating on one at a time here. I'm not too sure why we're adjusting it, but my guess is that it has something to do with the screen scrolling if the player is moving in a given direction (or if you're in a level where it automatically scrolls the screen for you).
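To make the carry handling concrete, here is a minimal Python sketch of the adjustment at $A348-$A35E. It assumes that $05FA, x acts as a high byte for the Y position, which is only my reading of the inc at $A35E; the function and parameter names are invented for illustration.

```python
def add_8bit(value, delta):
    """8-bit add as the 6502 does after CLC: returns (result, carry_out)."""
    total = value + delta
    return total & 0xFF, total >> 8


def apply_position_adjust(x_pos, y_lo, y_hi, dx, dy):
    """Model of $A348-$A35E for one projectile slot.

    dx and dy stand in for the bytes at $7952 and $7953.
    """
    x_pos, _ = add_8bit(x_pos, dx)      # $7CE5,x += $7952 (carry is ignored)
    y_lo, carry = add_8bit(y_lo, dy)    # $7CE3,x += $7953
    y_hi = (y_hi + carry) & 0xFF        # inc $05FA,x only when the add carried
    return x_pos, y_lo, y_hi
```

For example, apply_position_adjust(0xF0, 0xFE, 0x00, 0x20, 0x03) wraps the X byte to 0x10 and carries the Y add into the assumed high byte, giving (0x10, 0x01, 0x01).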
Next, we'll head here:
loc_a365:
0000a365  bde77c             lda     $7ce7, x  // load projectile's y velocity
0000a368  acfc05             ldy     data_5fc  // load something (this seems to always match)
0000a36b  f004               beq     $a371     // take this branch
loc_a371:
0000a371  48                 pha               // save updated y velocity
0000a372  a000               ldy     #$00      // y = #$00
0000a374  68                 pla               // load y velocity again
0000a375  1001               bpl     $a378     // skip next instruction if not negative
loc_a377:
0000a377  88                 dey               // y = #$FF
loc_a378:
0000a378  18                 clc
0000a379  7de37c             adc     $7ce3, x  // add y velocity to y position
0000a37c  9de37c             sta     $7ce3, x  // update y position
0000a37f  98                 tya               // a = #$00 or #$FF (sign of y velocity)
0000a380  7dfa05             adc     $05fa, x  // add it plus carry to $05FA, x
0000a383  9dfa05             sta     $05fa, x  // (looks like a high byte of y position)
0000a386  feed7c             inc     $7ced, x  // this value determines future y velocity later
0000a389  bde17c             lda     $7ce1, x  // get projectile type again
0000a38c  c902               cmp     #$02      // check if it's #$02 (not sure what this is)
sub_a38e:
0000a38e  d030               bne     $a3c0     // take this branch because fireball is #$01
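This code looks like it loads (and possibly adjusts) the projectile's Y velocity and uses it to update the projectile's Y position. The extra math with the Y register and $05FA looks like a sign extension: Y ends up as #$00 or #$FF depending on the sign of the velocity and is added, with carry, to what I assume is a high byte of the Y position. At the end, we'll check for projectile type #$02, which we aren't (fireball is #$01 as I said earlier), so we'll take the branch to here: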
loc_a3c0:
0000a3c0  bde57c             lda     $7ce5, x  // load x position of projectile
0000a3c3  18                 clc
0000a3c4  7de97c             adc     $7ce9, x  // add x velocity to x position
0000a3c7  9de57c             sta     $7ce5, x  // update x position
0000a3ca  bde77c             lda     $7ce7, x  // if y velocity is 4, skip to $A3DB
0000a3cd  c904               cmp     #$04
0000a3cf  f00a               beq     $a3db
loc_a3d1:
0000a3d1  bded7c             lda     $7ced, x  // increase y velocity when low 2 bits of $7CED are 0
0000a3d4  2903               and     #$03
0000a3d6  d003               bne     $a3db
loc_a3d8:
0000a3d8  fee77c             inc     $7ce7, x
loc_a3db:
0000a3db  bde57c             lda     $7ce5, x  // adjust x position and store in $0001
0000a3de  38                 sec
0000a3df  e5fd               sbc     data_fd   // (not sure what we're subtracting)
0000a3e1  8501               sta     data_1
0000a3e3  18                 clc               // jump to $A3F0 if adjusted x + #$0B >= #$13 (19)
0000a3e4  690b               adc     #$0b
0000a3e6  c913               cmp     #$13
0000a3e8  b006               bcs     $a3f0
loc_a3ea:
0000a3ea  a900               lda     #$00      // remove projectile and return
0000a3ec  9de17c             sta     $7ce1, x
0000a3ef  60                 rts
This code adds the projectile's X velocity to its X position. It also looks like it increases the projectile's Y velocity based on the counter ($7CED) we incremented earlier: whenever the low two bits of that counter are zero (every fourth pass), the Y velocity goes up by one until it reaches 4. I'm guessing this is their implementation of "gravity" on the fireball? The rest of this code does an adjustment on the projectile's X position that I don't fully understand before removing the projectile entirely if the adjusted X position plus #$0B is under #$13 (19).
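Putting the last two listings together, the per-frame motion can be sketched in Python as below. This is a model under assumptions, not the game's actual design: $05FA, x is treated as a high byte of the Y position, data_fd is treated as a horizontal scroll value (the listing does not confirm either), and every name is invented for illustration.

```python
MAX_FALL_SPEED = 4  # cmp #$04 at $A3CD


def add_signed_velocity(y_lo, y_hi, velocity):
    """Add an 8-bit two's-complement velocity byte to a 16-bit position."""
    ext = 0xFF if velocity & 0x80 else 0x00   # ldy #$00 / bpl / dey
    total = y_lo + velocity                   # clc / adc $7ce3,x
    new_lo = total & 0xFF
    carry = total >> 8
    new_hi = (ext + y_hi + carry) & 0xFF      # tya / adc $05fa,x
    return new_lo, new_hi


def step_gravity(y_velocity, counter):
    """Return the next Y velocity byte; counter models $7CED,x."""
    if y_velocity == MAX_FALL_SPEED:          # beq $a3db
        return y_velocity
    if counter & 0x03:                        # and #$03 / bne $a3db
        return y_velocity
    return (y_velocity + 1) & 0xFF            # inc $7ce7,x


def should_remove_x(x_pos, scroll_x):
    """True when the check at $A3DB-$A3E8 falls through to the removal code."""
    screen_x = (x_pos - scroll_x) & 0xFF      # sec / sbc data_fd
    return ((screen_x + 0x0B) & 0xFF) < 0x13  # clc / adc #$0b / cmp #$13 / bcs
```

With these definitions a velocity byte of 0xFC (-4) applied to position 0x0110 gives 0x010C, and a projectile is removed when its adjusted X byte lies in the small window from 0xF5 through 0x07.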
If the projectile's X position is high enough, we'll execute this code instead:
loc_a3f0:
0000a3f0  69f8               adc     #$f8       // not sure what this does
0000a3f2  850d               sta     data_d
0000a3f4  bde17c             lda     $7ce1, x   // jump to $A400 if we aren't a fireball (#$01)
0000a3f7  c901               cmp     #$01
0000a3f9  d005               bne     $a400
loc_a3fb:
0000a3fb  bde77c             lda     $7ce7, x   // get y velocity
0000a3fe  300e               bmi     $a40e      // jump to $A40E if it is negative
loc_a400:
0000a400  bde37c             lda     $7ce3, x   // get y position
0000a403  cd4305             cmp     data_543   // check it against something
0000a406  bdfa05             lda     $05fa, x   // not sure what we're doing here
0000a409  ed4205             sbc     data_542
0000a40c  30e1               bmi     $a3ef
loc_a40e:
0000a40e  8a                 txa                // loading a sprite's location
0000a40f  0a                 asl     a
0000a410  0a                 asl     a
0000a411  18                 clc
0000a412  6d9505             adc     data_595
0000a415  a8                 tay
0000a416  a501               lda     data_1
0000a418  990302             sta     data_203, y // save x position to sprite RAM (byte 3 of a sprite entry is X)
0000a41b  bde37c             lda     $7ce3, x    // get y position
0000a41e  38                 sec
0000a41f  ed4305             sbc     data_543
0000a422  c9c0               cmp     #$c0        // if y position - something is >= #$C0
0000a424  b0c4               bcs     $a3ea       // remove projectile (see above)
loc_a426:
0000a426  990002             sta     data_200, y // save y position to sprite RAM (byte 0 of a sprite entry is Y)
0000a429  690e               adc     #$0e        // not too sure what all this is below
0000a42b  850c               sta     data_c
0000a42d  bde97c             lda     $7ce9, x
0000a430  4a                 lsr     a
0000a431  2940               and     #$40
0000a433  8502               sta     data_2
0000a435  bde17c             lda     $7ce1, x
0000a438  c902               cmp     #$02        // jump to $A471 if projectile type isn't #$02
0000a43a  d035               bne     $a471
I'm not too sure what half of this does, but I'm confident it's loading and setting sprites due to the references to the data_200 area. According to this page, this area is what gets sent to sprite RAM via DMA.
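For reference, each NES sprite entry is four bytes: Y position, tile index, attributes, and X position, in that order. The Python sketch below shows how the writes to data_200 through data_203 line up with that layout. Treating data_595 as a base offset into the sprite buffer is my reading of the code at $A40E, and the function and parameter names are hypothetical.

```python
# Byte offsets within one 4-byte sprite entry.
OAM_Y, OAM_TILE, OAM_ATTR, OAM_X = 0, 1, 2, 3


def write_sprite(oam, base_offset, slot, screen_x, screen_y, tile, attr):
    """Fill one sprite entry in a 256-byte buffer that models $0200-$02FF.

    Assumes base_offset (data_595 in the listing) is a multiple of 4, so the
    entry never runs past the end of the buffer.
    """
    offset = (slot * 4 + base_offset) & 0xFF  # txa / asl / asl / clc / adc data_595
    oam[offset + OAM_Y] = screen_y            # sta data_200, y
    oam[offset + OAM_TILE] = tile             # sta data_201, y
    oam[offset + OAM_ATTR] = attr             # sta data_202, y
    oam[offset + OAM_X] = screen_x            # sta data_203, y
    return offset
```

Calling write_sprite(bytearray(256), 0x40, 1, 0x80, 0x70, 0x65, 0x40) fills the entry that starts at offset 0x44.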
Since we're not projectile type #$02, we'll head here next:
loc_a471:
0000a471  ad6505             lda     data_565    // calculate an index value
0000a474  4a                 lsr     a
0000a475  4a                 lsr     a
0000a476  2903               and     #$03
0000a478  aa                 tax
0000a479  bd17a3             lda     $a317, x    // load the fireball we want from $A317
0000a47c  990102             sta     data_201, y // save it to sprite RAM
0000a47f  a502               lda     data_2      // load a value we stored at $A433
0000a481  5d1ba3             eor     $a31b, x    // xor it with a value from the table at $A31B
                                                 // this appears to flip the fireball around
0000a484  18                 clc                 // avoid the branch below at $A495
0000a485  ae8805             ldx     data_588
0000a488  f002               beq     $a48c       // jump to $A48C for something? (take this)
loc_a48c:
0000a48c  990202             sta     data_202, y  // save value to sprite RAM
0000a48f  a6cd               ldx     data_cd      // load projectile slot
0000a491  a5ce               lda     data_ce      // load something else
0000a493  d00d               bne     $a4a2        // don't take this branch
loc_a495:
0000a495  b003               bcs     $a49a        // don't take this either (carry cleared at $A484)
loc_a497:
0000a497  20a3a4             jsr     sub_a4a3     // jump to what looks like hit detection maybe?
There's a lot more code that gets executed after this, but we don't care. We've found what we came for. The code above constitutes the remainder of what draws the sprite on the screen.
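The selection logic at $A42D-$A433 and $A471-$A481 can be modeled in a few lines of Python. This is a sketch under assumptions: data_565 is treated as some kind of running counter, the two tables stand in for the four bytes at $A317 and the four at $A31B, and the table contents used in the example are placeholders since the post never dumps them.

```python
def select_frame(counter, x_velocity, tiles, attr_xor):
    """Return (tile, attributes) for the fireball sprite.

    counter models data_565, x_velocity models $7CE9,x, and tiles / attr_xor
    model the 4-entry tables at $A317 / $A31B (contents are placeholders).
    """
    index = (counter >> 2) & 0x03      # lsr / lsr / and #$03
    flip = (x_velocity >> 1) & 0x40    # lsr / and #$40, stored in data_2
    return tiles[index], flip ^ attr_xor[index]   # eor $a31b, x


# Placeholder tables for illustration only; not the real ROM contents.
EXAMPLE_TILES = [0x10, 0x11, 0x12, 0x13]
EXAMPLE_ATTR_XOR = [0x00, 0x80, 0xC0, 0x40]
```

Because the index only changes when the counter crosses a multiple of four, each table entry is used for four consecutive counter values before moving to the next one. Bit #$40 of the result is set from the sign bit of the X velocity, which matches the fireball flipping when it changes direction.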
The specific problem I was asked to solve once I found this code was to make the fireballs stop rotating. To do this, it looks like you need to make the index value calculated at $A471 always be the same value. I'm not too sure what this is causing to be pulled out of $A317, but it definitely changes how the fireballs are displayed. I chose to patch the and instruction at $A476 from 2903 (and #$03) to 2900 (and #$00) so the index is always 0.
Even after that, though, the fireball still flips around. It seems like the eor at $A481 is the culprit here, since it combines the table value at $A31B with the value loaded from data_2. To fix this, I changed the lda at $A47F from a502 (lda data_2) to a900 (lda #$00).
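If you would rather script the two patches than hex edit them, a minimal Python sketch follows. The byte values come from the listings above, but the address-to-file-offset math is an assumption on my part (a 16-byte iNES header, 8 KiB PRG banks, and bank 7 mapped at $A000), so check it against your own ROM. The function names are hypothetical, and the sketch refuses to write anything unless the original bytes match.

```python
# (CPU address, expected original bytes, replacement bytes)
PATCHES = [
    (0xA476, bytes([0x29, 0x03]), bytes([0x29, 0x00])),  # and #$03  -> and #$00
    (0xA47F, bytes([0xA5, 0x02]), bytes([0xA9, 0x00])),  # lda data_2 -> lda #$00
]


def cpu_to_file_offset(cpu_addr, bank=7, bank_size=0x2000,
                       window_base=0xA000, header_size=16):
    """ASSUMED mapping from a CPU address in bank 7 to a ROM file offset."""
    return header_size + bank * bank_size + (cpu_addr - window_base)


def apply_patches(rom, patches=PATCHES, to_offset=cpu_to_file_offset):
    """Return a patched copy of rom, verifying the original bytes first."""
    patched = bytearray(rom)
    for cpu_addr, expected, replacement in patches:
        offset = to_offset(cpu_addr)
        found = bytes(patched[offset:offset + len(expected)])
        if found != expected:
            raise ValueError(
                f"${cpu_addr:04X}: expected {expected.hex()}, found {found.hex()}"
            )
        patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)
```

To use it, read the ROM with open(path, "rb"), pass the bytes to apply_patches, and write the result to a new file. If the offset assumption is wrong for your dump, the ValueError fires before anything is changed, and you can pass your own to_offset function instead.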
Once these two patches are made, the fireballs keep working, but stop rotating as they bounce around:
UPDATE (2016-10-07): Changed the image above to be an MP4, which is tiny in comparison. For posterity, here's the command I ran to convert the raw AVI file I'd exported from FCEUX to what I wanted:
ffmpeg -i capture.avi -vf "scale=iw*4:ih*4,setpts=4.0*PTS" -ss 00:02:40 -t 00:00:04 -c:v h264 -an fireball.mp4
The part I wanted was 40 seconds in and 1 second long, but I slowed the video down by 4x with the setpts filter. Hence, the -ss (start) and -t (duration) values each needed to be multiplied by 4.
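The arithmetic is simple, but here is a small Python helper (the name is made up) that converts a time in the original capture into the HH:MM:SS value to use after the slowdown, in case you want to reuse the approach with other clips.

```python
def slowed_timestamp(seconds, factor):
    """Scale a time in seconds by the slowdown factor and format as HH:MM:SS."""
    total = int(seconds * factor)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

With a factor of 4, a start of 40 seconds becomes 00:02:40 and a duration of 1 second becomes 00:00:04, which are the values in the command above.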
