= Hardware Motivation for Address Register Modifying Addressing Modes =
Apart from the software datastructure motivation,
there is a hardware motivation for addressing modes such as pre-increment or decrement.
In more generality - addressing modes that calculate an address, and then save the results of that calculation
in a register - typically one of the registers involved in the addressing mode.
Consider [[Mitch Alsup's Favorite Addressing Mode]],
the x86's [[Base+Scaled-Index+Offset]].
A succession of such instructions might look like:
;an [[RMW]]
r1 := load( M[ rB+rO*4+offset1 ] )
r1 := ... some calculation involving r1 ...
store( M[ rB+rO*4+offset1 ] := r1 )
or
; nearby loads
r1 := load( M[ rB+rO*4+offset1 ] )
r2 := load( M[ rB+rO*4+offset2 ] )
r3 := load( M[ rB+rO*4+offset3 ] )
Notice the possibility of [[CSE (Common Subexpression Elimination)]]:
For the [[RMW]] case, this is straightforward:
rAtmp := [[lea]]( M[ rB+rO*4+offset1 ] )
r1 := load( rAtmp ] )
r1 := ... some calculation involving r1 ...
store( M[ rAtmp ] := r1 )
Is this a win? It depends:
* in terms of performance, on a classic RISC, probably note: the extra instruction adds latency, probably costs a cycle.
* in terms of power, possibly
For the nearby load case, we could do:
rAtmp := lea( M[ rB+rO*4 ] )
r1 := load( M[ rAtmp+offset1 ] )
r2 := load( M[ rAtmp*4+offset2 ] )
r3 := load( M[ rAtmp*4+offset3 ] )
or
rAtmp := lea( M[ rB+rO*4+offset1 ] )
r1 := load( M[ rAtmp ] )
r2 := load( M[ rAtmp*4+(offset2-offset1) ] )
r3 := load( M[ rAtmp*4+(offset3-offset1) ] )
Is this a win? Again, it depends:
* probably not in performance
* possibly in terms of power.
- As an aside, let me note that if the base address rB+rO*4 is aligned,
then the subsequent addresses could use [[OR indexing]] rather than [[ADD indexing]].
Which may further save power.
At the cost of yet another piece of crap in the instruction set.
To avoid conundrums like this - is it better to [[CSE]] the addressing mode or not?
- processors like the [[AMD 29K]]
abandoned addressing modes except for [[absolute addressing mode]] and [[register indirect addressing mode]]
- so to form any nomn-trivial address you had to save the results to a register, and thereby were
encouraged to CSE it whenever possible.
Generalized pre-increment/decrement is a less RISCy approach to this.
Instead of encouraging CSE via separate instructions,
it encourages CSE by making it "free" to save the result ofthe addressing mode calculation.
:Let's call this [[Address Generation Saving]], or, if it modifies a register used in the addressing mode, [[Address Register Modification]]:
In our examples:
;RMW
rAtmp := [[lea]]( M[ rB+rO*4+offset1 ] )
r1 := load( rAtmp := rB+rO*4+offset1 ] )
r1 := ... some calculation involving r1 ...
store( M[ rAtmp ] := r1 )
; nearby load
r1 := load( M[ rAtnp :=rB+rO*4+offset1 ] )
r2 := load( M[ rAtmp*4+(offset2-offset1) ] )
r3 := load( M[ rAtmp*4+(offset3-offset1) ] )
Note that the nearby case that changes the offsets is simpler than the nearby case that saves only part ofthe address calculation:
;Using C's comma expression:
r1 := load( M[ (rAtmp := rB+rO*4, rAtmp+offset1) ] )
r2 := load( M[ rAtmp*4+offset2 ] )
r3 := load( M[ rAtmp*4+offset3 ] )
:: Again, [[OR indexing]] may benefit, for addressing mode [[(Base+Scaled-Index)}Offset]]
What is the probem with this?
* Writeback ports
Such an [[Address Generation Saving]] requires an extra writeback port, in addition to the result of an instruction like load.
For this reason, some instruction sets have proposed to NOT have [[address generation writeback]] for instructions that write other results, such as load,
but only to have [[address generation writeback]] for instructions that do not have a normal writeback, such as a store.
: Glew opinion: workable, minor benefit, but ugly.
Except that pre-increment/decre more general form:
Post-increment and decrement goes further, generalized in this way:
then the address calculation looks like
{ old := address_reg; address_reg := new_address; return old }
Not only does this require a writeback port for the updated address register,
but it also requires two paths from the AGU or RF to where the address is used
- one for the address, the other for the updated address_reg.
OVERALL OBSERVATION: addressing modes begin the slippery slope to VLIW.
No comments:
Post a Comment