Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Sunday, June 21, 2009

Blogging from ISCA: CARD: Panel Session, Multicore Programming

Arvind, David August, Keshav Pingali

Phrased as a debate between A and P

The motion: How should the potential of multicore processors be extracted by applications programmers?


Arvind: Intro



Implicit vs. Explicit

...composability...

Too much detail => programming too hard. But is there an appropriate level of detail?

(AFG: infinite fine grain CPUs. Expose memory, cache, line size.)

Is expressing non-determinism essential to ||?

Speculation? Or not?

Debaters: August for implicit, Pingali against implicit (i.e. for explicit).


August



NOT parallel programming

NOT parallel compilers

FOR implicitly parallel programming, + parallelizing compilers.

What is implicitly parallel programming?

E.g. "inline" directive on function. - explicit. Implicit inlining done automatically by compiler. Often better than human.

Tim Mattson's list of >150 explicit parallel programming languages/systems.

Q: Is explicit parallel programming inherently more difficult than implicit parallel programming?

Quotes Pingali at PLDI, quoting Tim Sweeney, who said that Unreal's || tripled SW development cost.

SQL: ... implicitly parallel. Another Pingali quote.


Example: commutative annotation, e.g. indicating that the order of rand calls across loop iterations doesn't matter. "It's okay if I get my random numbers out of order."
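A rough sketch of what such an annotation might look like; the pragma below is invented for illustration and is not any real compiler's syntax.

    #include <cstdlib>

    // Hypothetical annotation (commented out; no real compiler spells it
    // this way): declare that calls to rand() commute across iterations,
    // i.e. it is acceptable for iterations to draw random numbers in a
    // different order than the sequential program would.
    // #pragma commutative(rand)
    void fill_random(double* out, int n) {
        for (int i = 0; i < n; ++i) {
            // With the annotation, a parallelizing compiler could run these
            // iterations concurrently even though rand() has internal state.
            out[i] = static_cast<double>(std::rand()) / RAND_MAX;
        }
    }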

SPECint2000, modified ~50 lines of code out of 2M. "Restored the trend" ...

Is it important to be able to express non-determinism? YES.

Speculative Execution? YES

Is explicit || needed by anyone other than compiler writers and systems programmers? NO

The only explicit indications needed are of the form function/anti-function (add-to-list/remove-from-list).
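A minimal sketch of the function/anti-function idea as I understand it (my own code, invented names; a set rather than a list, so the inverse is clean): pair an operation with its inverse so a speculative or reordering runtime can undo it.

    #include <set>

    struct SpecSet {
        std::set<int> s;
        // "function": returns true if the insert actually changed the set
        bool add_to_set(int x) { return s.insert(x).second; }
        // "anti-function": undoes a prior add_to_set(x) that returned true
        void remove_from_set(int x) { s.erase(x); }
    };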

Allies: Church-Turing Thesis

Full Employment for Compiler Writer's Thesis.

Pingali



For explicit.

Necessary evil for applications programmers writing irregular programs, e.g. linked data structures.

YES to non-determinism. YES to speculation. (Data-value-dependent parallelism.)

Delaunay Mesh Refinement.

Don't-care non-determinism.

Parallelism depends on run-time analysis. Compiler can't...

(AFG: can compiler generate the code to do the analysis?)


Galois model (non-deterministic set iterators).
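A sketch of the idea behind a non-deterministic ("don't-care") set iterator, in plain sequential C++; this is my own illustration of the semantics, not the actual Galois API. In Delaunay mesh refinement, for example, the worklist holds bad triangles, and refining one may add new bad triangles.

    #include <cstddef>
    #include <vector>

    // Sequential reference semantics of an unordered worklist iterator:
    // process items in *some* order; the body may add new work. A parallel
    // runtime is free to pick any order and to run non-conflicting items
    // concurrently, speculating and rolling back on conflict.
    template <typename T, typename Body>
    void for_each_unordered(std::vector<T> worklist, Body body) {
        for (std::size_t i = 0; i < worklist.size(); ++i) {
            T item = worklist[i];   // copy: body may grow the vector
            body(item, worklist);   // body may push_back new work
        }
    }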

Collections approach: parallel libraries implemented by experts, used by higher-level programmers. E.g. database B-trees.

Problems: no fixed set of data structures.

Even generic data structures need to be tuned.

In many applications the API functions do little computation - little time is spent in the data structures. (AFG: but ||ing e.g. a map execute-over-all...)

August's cross-examination of Pingali



A: Are there more libraries than programs?

P: Doesn't matter. App programmer still needs to optimize data structure for his app.

A: since in Q&A, I won't explain how you were wrong.

A: Are there patterns? Clichés? Compiler support for same.
In the past, when compilers were a 15% solution, there wasn't enough money to develop that support.

(AFG: how many compilers handle even the Gang of 4 patterns? Let alone parallel programming.)

P: functional languages and their compilers haven't gone anywhere over the years.

A: analysis is the wrong crutch to lean on. E.g. perl (perlbmk). The parallelism is in the input set. No amount of analysis, by compiler or by Perl programmer, will know the ||ism.
Needs runtime analysis.

Audience



Mattav (sp?) UT prof: Parallelism not a problem. Implicit works. Explicit works. Tuning is the problem.

A: compiler needs to observe effects of transformation. Run-time feedback.

Mattav summarized P and A's response as "both of you are saying implicit for locality".

I asked my STL question.

To P: not using STL is grounds for firing. Customizing STL is grounds for firing. Maybe just fire them all / wait until a new generation is hired?

P: STL doesn't parallelize. It's just the data structure. (He didn't follow the map-applicator point.) Joe programmers: domain specific. Stephanie programmers: ||. (He didn't say ninja/padawan.)

To A: people using iterators, but not map application. Can your compiler convert?

A responded: there exist tools that change the data structure, e.g. list to vector, or to skip list.
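For the record, the map-applicator point I was getting at: a library-level apply-to-all can be parallelized under the hood, which a hand-written iterator loop cannot. The C++17 parallel algorithms (which postdate this panel) are exactly that interface shape:

    #include <algorithm>
    #include <execution>
    #include <vector>

    void scale_all(std::vector<double>& v, double k) {
        // std::execution::par lets the library run the body in parallel;
        // the order in which elements are visited is unspecified.
        std::for_each(std::execution::par, v.begin(), v.end(),
                      [k](double& x) { x *= k; });
    }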

...

Audience Q to A: where do you draw the line between implicit and explicit?

A: if the annotation brings information that cannot be determined automatically, that's implicit. (AFG: copout. That's explicit. But nevertheless still appropriate.)

Arvind's close



If a course is important enough to teach to grad students, why not teach to freshmen?

Should || programming be taught to freshmen?

A: explicit || programming taught to juniors/seniors (3rd/4th year)

P: Teach about parallelism in algorithms first. Later, implicit parallel at Joe level.
Later still, fully explicitly parallel.

Vote



Applications programmers do not need to write explicitly parallel programs.

The motion does not carry.
