Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Saturday, June 20, 2009

Blogging from ISCA: AMAS-BT: Pardo, SMC

Dave Keppel, Pardo, Google: Self Modifying Code

Everyone knows SMC is dead, but SMC is alive in the very tools that complain about it: dynamic optimizers, dynamic linkers, etc.

Pardo talks fast.

Detecting SMC via page protection. Slow if data and code in same page.
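A rough sketch of why the shared-page case hurts, as a pure-Python simulation rather than real page-protection calls. Everything here (ProtectedMemory, mark_code, run, the 4096-byte page) is a hypothetical name or assumption of mine, not something from the talk.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class ProtectedMemory:
    """Toy memory that write-protects pages holding translated code."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.protected = set()  # page numbers that currently hold translated code
        self.faults = 0         # simulated write-protection faults taken

    def mark_code(self, addr, length):
        """Translator calls this after translating [addr, addr + length)."""
        first = addr // PAGE_SIZE
        last = (addr + length - 1) // PAGE_SIZE
        self.protected.update(range(first, last + 1))

    def write(self, addr, value):
        page = addr // PAGE_SIZE
        if page in self.protected:
            # Simulated fault: translations for the page are dropped and the
            # page is unprotected so that the store can complete.
            self.faults += 1
            self.protected.discard(page)
        self.mem[addr] = value


def run(data_addr, iterations):
    """Alternate between executing code on page 0 and storing to data_addr."""
    m = ProtectedMemory(2 * PAGE_SIZE)
    for _ in range(iterations):
        m.mark_code(0, 64)     # code on page 0 is (re)translated and protected
        m.write(data_addr, 1)  # the program stores to a data variable
    return m.faults
```

With the data on the same page as the code, run(100, 10) takes a fault on every iteration even though no instruction ever changed. With the data on the next page, run(PAGE_SIZE + 100, 10) takes none.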

BitBLT - recompile every 10K instructions.

Debugger watchpoints. Change immediates in code.

Present in real commercial workloads.

Coherency events:

x86 - none.

Hardware instructions: ISCP "something changed". iflush addr. coherency(base,length).
Poor match between application and simulator/emulator. Need to detect what really changed.

Adaptive: default-write protect. Change strategy if too many faults. Fall back to default after a while.
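The adaptive policy might be sketched roughly like this. The talk did not say which strategy is switched to, so using self-checking as the alternative is my assumption, as are the thresholds and every name below (AdaptivePolicy, fault_limit, cooldown).

```python
class AdaptivePolicy:
    """Per-page choice between write protection and an alternative strategy."""

    def __init__(self, fault_limit=3, cooldown=5):
        self.fault_limit = fault_limit  # faults tolerated before switching
        self.cooldown = cooldown        # quiet executions before falling back
        self.mode = {}    # page -> "protect" or "self_check"
        self.faults = {}  # page -> write faults seen while in "protect" mode
        self.quiet = {}   # page -> consecutive executions with no change seen

    def strategy(self, page):
        # Write protection is the default for pages we know nothing about.
        return self.mode.get(page, "protect")

    def on_write_fault(self, page):
        self.faults[page] = self.faults.get(page, 0) + 1
        if self.faults[page] >= self.fault_limit:
            # Too many faults: stop protecting this page, check the code instead.
            self.mode[page] = "self_check"
            self.quiet[page] = 0

    def on_execute(self, page, code_changed=False):
        if self.strategy(page) != "self_check":
            return
        if code_changed:
            self.quiet[page] = 0  # still being modified, stay in this mode
            return
        self.quiet[page] += 1
        if self.quiet[page] >= self.cooldown:
            # Page has been quiet for a while: fall back to the default.
            self.mode[page] = "protect"
            self.faults[page] = 0
```

The point of falling back is that the alternative strategy costs something on every execution, while write protection costs nothing until a store hits the page.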

Self checking strategy: check current ibytes against saved copy of original ibytes.
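A minimal sketch of the self-checking idea, assuming guest memory is just a Python bytearray. The Translation class and still_valid method are hypothetical names for illustration, not any real translator's API.

```python
class Translation:
    """A translated block that remembers the instruction bytes it came from."""

    def __init__(self, memory, addr, length):
        self.addr = addr
        # Snapshot of the original ibytes, taken at translation time.
        self.saved = bytes(memory[addr:addr + length])

    def still_valid(self, memory):
        """Compare the current ibytes against the saved copy."""
        current = bytes(memory[self.addr:self.addr + len(self.saved)])
        return current == self.saved
```

A translation would run this check on entry. The cost is paid on every execution, but writes elsewhere on the same page cost nothing.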

Pardo noted that invalidated code often reappears - things like debugger watchpoints may change back to the original. "Revalidation". Another use for invalid cache entries.
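One way to picture revalidation, as a sketch under my own assumptions: key the translation cache on the instruction bytes as well as the address, so that an entry for the old bytes is kept around and can be reused if those bytes come back. TranslationCache and lookup are hypothetical names, and the "translation" is just a label string.

```python
class TranslationCache:
    """Cache keyed by (address, instruction bytes), so old entries can revive."""

    def __init__(self):
        self.entries = {}         # (addr, ibytes) -> translation
        self.translate_count = 0  # how many times we really had to translate

    def lookup(self, memory, addr, length):
        key = (addr, bytes(memory[addr:addr + length]))
        if key not in self.entries:
            # No translation matches the current bytes: translate from scratch.
            self.translate_count += 1
            self.entries[key] = f"translation#{self.translate_count}"
        return self.entries[key]
```

If a debugger patches a byte and later restores it, the third lookup finds the first translation again and no retranslation is needed.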

Shade. SPARC. Iflush addr. But, there were some applications that did not use iflush, but which worked on real hardware.

Transmeta Crusoe. Subpage write protection.
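To illustrate what finer-than-page protection buys, here is a generic sketch. It is not a description of Crusoe's actual hardware: the 128-byte granule, the bitmask layout, and the names SubpageProtection, protect and write_faults are all assumptions of mine.

```python
PAGE_SIZE = 4096  # assumed page size
GRANULE = 128     # assumed sub-page granule size


class SubpageProtection:
    """One protection bit per granule instead of one per page."""

    def __init__(self):
        self.masks = {}  # page number -> bitmask of protected granules

    def protect(self, addr, length):
        """Protect every granule overlapping [addr, addr + length)."""
        start = addr - addr % GRANULE
        for a in range(start, addr + length, GRANULE):
            page, offset = divmod(a, PAGE_SIZE)
            bit = 1 << (offset // GRANULE)
            self.masks[page] = self.masks.get(page, 0) | bit

    def write_faults(self, addr):
        """Would a store to addr hit a protected granule?"""
        page, offset = divmod(addr, PAGE_SIZE)
        return bool((self.masks.get(page, 0) >> (offset // GRANULE)) & 1)
```

Data that shares a page with code, but not a granule, can then be written without a fault.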

"Fetch immediates" - translate code, but fetch immediates that might have been patched.
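A toy contrast between baking an immediate into the translation and fetching it at run time. The guest "instruction" here is just "add the byte at imm_addr to x"; both function names are hypothetical and the closures stand in for generated code.

```python
def translate_baked(memory, imm_addr):
    """Ordinary translation: the immediate is copied at translation time."""
    imm = memory[imm_addr]
    return lambda x: x + imm


def translate_fetching(memory, imm_addr):
    """Fetch-immediates translation: re-read the immediate on every run."""
    return lambda x: x + memory[imm_addr]
```

If the program later patches the immediate, the baked translation goes stale and has to be invalidated, while the fetching one keeps working at the cost of an extra load per execution.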

Crusoe: lots of retranslations when falling through the gears.

Deoptimized translators: fetch immediates. Translation calls interpreter.

Bad: BT leads to more implementations, more chances of bugs, reduced test coverage.

Performance stability, lack of. Consistent sometimes better than fast.

SMC/ISC. Q: what does ISC stand for?

Hardware support:

Crusoe: 2 write protect bits per page. Subpage WP cache.

Shade: 100 instructions to translate an instruction.

Gill51 - universal simulator.
