Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Iagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Sunday, June 19, 2011

A vocabulary of terms for memory operation combining

There are many different terms for the common optimization of memory operation combining or coalescing.
See the discussions of the [[difference between write combining and write coalescing]],
as well as [[write combining]] and [[write coalescing]],
and [[NYU Ultracoomputer]] [[fetch-and-op]] [[combining networks]].

Here is a by no means complete list of such terms:

* [[combining]]
** [[write combining]]
*** [[write combining buffer]]s
*** [[write combining cache]]
** [[read combining]]
** [[fetch-and-op combining]]
*** [[NYU Ultracomputer]] [[fetch-and-op]] [[combining networks]]

* [[coalescing]]
:: on GPUs...
** [[write coalescing]]
** [[read coalescing]]

I have not heard anyone talk about fetch-and-op or atomic RMW coalescing
as distinct from the historic
[[NYU Ultracomputer]] [[fetch-and-op combining]].
But I suppose that inevitably this will arise.

* [[squashing]]
:: mainly refers to finding that an operation is unnecessary, and cancelling it - or at least the unnecessary part
** [[load squashing]]
::: On P6, referred to a cache miss finding that there was already a cache miss in flight for the same cache line. No need to issue a new bus request, but the squashed request was not completely cancelled - arrangement was made so that data return from the squashing cache miss would
** [[store squashing]]
::: I am not aware of this being done, but it could refer to comparing store addresses in a [[store buffer]], and determining that an older store is unnecessary since a later store completely overwrites it (and there would be no intervening operations that make it visible according to the [[memory ordering model]]). (Actually, I am not sure that there could be such, but I am leaving the possibility as a placeholder.)
::: [[Write combining]] accomplishes much the same thing, although [[store squashing]] in the store buffer gets it done earlier.
::: Note that this is similar to [[store buffer combining]] - combining entries in the store buffer, separate from a [[write combining buffer]].

Again, I have not heard of fetch-and-op squashing, although I am sure that it could be done, e.g. for lossy operations such as AND and OR (a later fetch-and-OR with a bitmask that completely includes an earlier...).

* [[snarfing]]
:: I have usually seen this term used in combination with a [[snoopy bus]], although it can also be used with other interconnects. [[Read snarfing]] means that a pending request from P1 that has not yet received the bus in arbitration observes the data for the same cachelines from a different processor P0, and "snarfs" the data as it goes by.
Depending on the cache protocol, it may be necessary to assert a signal to put data into [[shared (S) state]] rather than [[exclusive (E) state]].

I am not sure what [[write snarfing]] would look like.
An [[update cache]] is somewhat like, but it updates an existing cache line, not an existing request.
I.e. an update cache protocol snarfs write data from the bus or interconnect to update a cache line.
Whereas a pending load can snarf data either from read replies or write data transactions on the bus or interconnect.
I.e. a read can snarf from a read or a write.
But does a write "snarf"?
More like a write may combine with another write -
[[snoopy bus based write combining]],
as distinct from [[buffer based write combining]].

No comments: