ARM intro/notes

One of the first things to note is the relatively limited address handling. A branch carries a 24-bit word offset (+/-32MB), but a load can only reach a 12-bit offset (4K) from its base register. Anything else can be reconstituted using multiple instructions (in 8-bit chunks). Thankfully the compiler has a pretty good idea where all the (static and/or frame-relative) data is going to be before it has to emit the first instruction.

              
.data
szTitle:    .asciz "This is the title\n"    @ (say this is at #40123456)
.text
.global main 
iAdrszTitle:     .int szTitle

main:                           @ entry of program 
    ldr r0,iAdrszTitle          @ (similar to x86’s lea)
@ or (as sub-4K style)
    ldr r0,=szTitle
@ or (Phix-style - don’t panic!):
    mov r0,#40000000 @ (lsl 24)
    orr r0,#00120000 @ (lsl 16)
    orr r0,#00003400 @ (lsl 8)
    orr r0,#00000056 @ (lsl 0)
    bl displayMsg               @ (similar to x86’s call)

All you really need to know/accept from that is that values (often) need to be loaded in 8-bit chunks.
NB: in "as" (i.e. the RPi assembler) syntax, #NN is a decimal number, whereas pilasm.e treats #NN as hex.
For pilasm.e, I think I’ll allow an "lea [var]" and have it spit out three instructions.
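
The 8-bit chunking above can be sketched in a few lines of Python. This is only an illustration, not pilasm.e code: lea_chunks is a made-up name, and it prints the Phix-style (hex) #NN form used in the example above. Bytes that are zero are skipped, so small addresses need fewer instructions.

```python
def lea_chunks(reg, addr):
    """Return mov/orr lines that build the 32-bit value addr in reg,
    one byte-aligned 8-bit chunk at a time (hypothetical helper)."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    lines = []
    for shift in (24, 16, 8, 0):
        chunk = addr & (0xFF << shift)
        if chunk == 0:
            continue  # nothing to set in this byte
        op = "mov" if not lines else "orr"  # first chunk loads, the rest merge
        lines.append(f"{op} {reg},#{chunk:08X}")
    if not lines:
        lines.append(f"mov {reg},#00000000")  # addr was zero
    return lines


if __name__ == "__main__":
    for line in lea_chunks("r0", 0x40123456):
        print(line)
```

Run on #40123456 it produces the same mov/orr/orr/orr sequence as shown in main above.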

There are several other non-standard quirks of Phix’s (internal) ARM assembly language, such as:
"add r5, r4, r2, lsl 3" <===> "add r5, r4, r2 lsl 3" <===> "add r5, r4, r2<<3" <===> "add r5, r4, r2*8",
"ldr r3, [r5, 4]" <===> "ldr r3, [r5+4]".
In other words I’ll make up my own rules/syntax as I see fit, to preserve my own sanity.
Some will no doubt disagree, but for me the pre-UAL syntax is all round far nicer to deal with.
By default (in filedump.e, unless UsePhixSyntax is true) the disassembler emits a standard syntax, whereas when disassembling for a list.asm file it will try as best it can to match the #ilASM{[ARM]} source input.
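
To make the equivalences listed above concrete, here is a rough Python sketch that rewrites each alternative spelling to the first form of each pair. It is not how pilasm.e does it; normalise is a hypothetical name, and the sketch only covers lsl and positive decimal offsets.

```python
import re


def normalise(line):
    """Rewrite the alternative spellings to "rN, lsl K" and "[rN, K]"."""
    # r2<<3  ->  r2, lsl 3
    line = re.sub(r"(r\d+)\s*<<\s*(\d+)", r"\1, lsl \2", line)

    # r2*8  ->  r2, lsl 3 (the factor must be a power of two)
    def mul_to_shift(match):
        factor = int(match.group(2))
        if factor <= 0 or factor & (factor - 1):
            raise ValueError(f"{factor} is not a power of two")
        return f"{match.group(1)}, lsl {factor.bit_length() - 1}"

    line = re.sub(r"(r\d+)\s*\*\s*(\d+)", mul_to_shift, line)
    # r2 lsl 3  ->  r2, lsl 3
    line = re.sub(r"(r\d+)\s+lsl\s+(\d+)", r"\1, lsl \2", line)
    # [r5+4]  ->  [r5, 4]
    line = re.sub(r"\[\s*(r\d+)\s*\+\s*(\d+)\s*\]", r"[\1, \2]", line)
    return line


if __name__ == "__main__":
    for src in ("add r5, r4, r2*8", "ldr r3, [r5+4]"):
        print(src, "->", normalise(src))
```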

Stages/Progress:
1) Disassemble [IN PROGRESS]: There are 216 entries on rosettacode (just 4 selected for now) that I can copy, compile and test, then transfer the binary file and throw it at filedump.exw
2) Update platform(), format, etc. (pretty basic stuff)
3) Output the basic ELF ARM headers etc. ("")
4.1) Extend #ilASM{} with an [ARM] guard. (getting tricky)
-- #ilASM{ [32] ret [64] pop al } is the correct way to say "64bit not yet written",
4.2) Extend all 300+ uses of #ilASM in the backend with [ARM] sections (some dummy for now). (huge)
4.3) Extend pilx86.e/ilxlate() or write a separate include to emit ARM binary. (huge)
4.4) Merge/start from p2js (use new tokeniser/parse tree), add register allocation, etc.
The compiler must know when a register contains:
  the address of a variable,
  the ref of a variable (== value for ints), or
  the shifted ref of a variable,
and must track such states on entry to and exit from each basic block/sub-tree-node.
...
5) Test.
6) ex.err creation.
7) pGUI, mpfr, etc...
8) trace, profile, etc...

Of necessity 4.1 to 4.4 will intertwine and overlap, and will take many months, quite probably years, but with a bit of luck I just might uncover some shortcuts along the way.

fasmarm follows this principle: Find a way to encode it, the smaller the better

What this means is that even though you may write a particular opcode, fasmarm may assemble another as a replacement. This can only happen where the functionality is exactly the same. The reason for doing this is to increase the likelihood of a successful assembly. This will only affect you if you want to look at a disassembly listing and match it to the original source. Example:

        What you coded     | What is assembled  ;reason
        -------------------+---------------------------------------------------
                          ARM
        and r0,0xfffffff0  | bic r0,0x0000000f  ;immediate could not be encoded
        add r1,-4          | sub r1,4           ;immediate could not be encoded
        cmp r2,0xffffe500  | cmn r2,0x00001b00  ;immediate could not be encoded
        lsr r1,r2,r3       | mov r1,r2,lsr r3   ;ARM mode does not have LSR
        pop {r0-r3}        | ldmfd r13!,{r0-r3} ;ARM mode does not have POP
        mul r4,r4,r11      | mul r4,r11,r4      ;encoding restriction for
                           |                    ;CPU versions before v6

        mov r1,0x4567      | mov r1,0x4500      ;immediate could not be encoded
                           | orr r1,0x0067
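
The "immediate could not be encoded" rows come down to the ARM-mode rule that a data-processing immediate is an 8-bit value rotated right by an even amount. A minimal Python sketch of that check, and of the first three substitutions in the table, follows. This is my own illustration, not fasmarm's code: encodable, pick_encoding and the SWAPS table are assumed names, and only the and/add/cmp swaps from the table are covered.

```python
MASK = 0xFFFFFFFF


def rotl32(value, bits):
    """Rotate a 32-bit value left by bits."""
    bits %= 32
    return ((value << bits) | (value >> (32 - bits))) & MASK


def encodable(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK
    return any(rotl32(value, rot) <= 0xFF for rot in range(0, 32, 2))


# opcode -> (replacement opcode, how to transform the immediate)
SWAPS = {
    "and": ("bic", "not"),  # and rd,imm  ==  bic rd,~imm
    "add": ("sub", "neg"),  # add rd,imm  ==  sub rd,-imm
    "cmp": ("cmn", "neg"),  # cmp rn,imm  ==  cmn rn,-imm
}


def pick_encoding(op, imm):
    """Return (op, imm) as written if it encodes, else an equivalent swap."""
    imm &= MASK
    if encodable(imm):
        return op, imm
    if op in SWAPS:
        alt_op, how = SWAPS[op]
        alt = (~imm if how == "not" else -imm) & MASK
        if encodable(alt):
            return alt_op, alt
    raise ValueError(f"{op} #{imm:08X} needs more than one instruction")


if __name__ == "__main__":
    print(pick_encoding("cmp", 0xFFFFE500))
```

The last row of the table (mov r1,0x4567) is the case where this sketch gives up: neither form fits in one instruction, so the value has to be split into chunks.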

blx is not supported, but you can always replace it with a mov lr,pc / bx pair:

@   blx r4                               @ call of the function to be measured
    mov lr,pc                            @ pc is always +8!! (+4 in thumb mode)
    bx r4
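
Why the pair works is just address arithmetic: reading pc gives the address of the current instruction plus 8 (plus 4 in thumb mode), which is exactly the instruction after the bx. A tiny Python sketch of that sum, with made-up function names and an arbitrary example address, and modelling addresses only (not the thumb state bit):

```python
def pc_value_read(instr_addr, thumb=False):
    """Value an instruction at instr_addr sees when it reads pc."""
    return instr_addr + (4 if thumb else 8)


def blx_replacement_layout(mov_addr, thumb=False):
    """Return (bx address, address after the bx, value left in lr)."""
    size = 2 if thumb else 4          # instruction size in bytes
    bx_addr = mov_addr + size         # bx r4 follows mov lr,pc
    resume_addr = bx_addr + size      # where the callee should return to
    lr = pc_value_read(mov_addr, thumb)
    return bx_addr, resume_addr, lr


if __name__ == "__main__":
    bx_addr, resume_addr, lr = blx_replacement_layout(0x8000)
    print(hex(bx_addr), hex(resume_addr), hex(lr))
```

In both modes lr ends up equal to the address of the instruction following the bx, so nothing may be placed between the mov and the bx.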

nb: bx pc is undefined... (compiler error needed?)