Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Wednesday, September 14, 2016

Are there any good uses for multiple Perl fat commas in series ( a => b => 1 )? - Stack Overflow




Making a copy of my own post.

It sucks that I can't cut and paste pseudo-formatted text between sites like StackOverflow and Blogger. Wasn't HTML supposed to solve that?  Oh, yeah, scripting attacks.  Won't bother to fix the formatting. (Started by hand, but must stop wasting time. So much time is wasted fixing the formatting when copying between tools.)


**---+ BRIEF**



In addition to notation for graphs and paths (like Travelling Salesman, or critical path), multiple serial fat arrow/commas can be nice syntactic sugar for functions that you might call like



    # Writing: creating $node->{a}->{b}->{c} if it does not already exist

    assign_to_path($node=>a=>b=>c=>"value");

 

    # Reading

    my $cvalue = follow_path($node=>a=>b=>c=>"default value");



the latter being similar to



    my $cvalue = ($node->{a}->{b}->{c})//"default value";



although you can do more stuff in a pointer chasing / hashref path following function than you can with `//`.



It turned out that I already had such functions in my personal library, but I did not know that you could use `a=>b=>"value"` with them to make them look less ugly where used.



**---+ DETAIL**



I usually try not to answer my own questions on this forum, encouraging others to - but in this case, in addition to the contrived example I posted inside and shortly after the original question, I have since realized what I think is a completely legitimate use for multiple fat arrow/commas in series.



I would not complain if multiple fat arrows in series were disallowed, since they are quite often a real bug, but there are at least two places where they are appropriate.



**(1) Entering Graphs as Chains**



Reminder:  my first, totally contrived, use case for multiple fat arrow/commas in series was to make it easier to enter certain graphs by using "chains".  E.g. a classic deadlock graph would be, in pairs, `{ 1=>2, 2=>1 }`, and as a "chain" `(1=>2=>1)`.  If you want to show a graph that is one big cycle with a "chord" or shortcut, it might look like `([1=>2=>3=>4=>5=>6=>1],[3=>6])`.



Note that I used node numbers: if I wanted to use node names, I might have to do `(a=>b=>c=>undef)` to avoid having to quote the last node in a cycle, as in `(a=>b=>"c")`. This is because the fat comma implicitly quotes its left hand but not its right hand argument. Since you have to put up with undef to support node names anyway, one might just "flatten" `([1=>2=>3=>4=>5=>6=>1],[3=>6])` to `(1=>2=>3=>4=>5=>6=>1=>undef,3=>6=>undef)`. In the former, end of chain is indicated by end of array `[...]`.  In the latter, by undef.  Using undef puts all of the nodes at the left hand of a =>, so it is syntactically uniform.
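
A minimal sketch of how such a flattened, undef-terminated list of chains might be turned into edge pairs. The helper name `chains_to_edges` is hypothetical, made up for this illustration; it assumes that `undef` ends each chain and that node names are never undef.

```perl
use strict;
use warnings;

# Hypothetical helper: split a flat, undef-terminated list of chains
# into [from, to] edge pairs.
sub chains_to_edges {
    my @edges;
    my $prev;                      # previous node in the current chain
    for my $node (@_) {
        if ( !defined $node ) {    # undef marks the end of a chain
            undef $prev;
            next;
        }
        push @edges, [ $prev, $node ] if defined $prev;
        $prev = $node;
    }
    return @edges;
}

# One big cycle 1..6 plus the chord 3 -> 6
my @edges = chains_to_edges( 1=>2=>3=>4=>5=>6=>1=>undef, 3=>6=>undef );
print join( ' ', map { join( '->', @$_ ) } @edges ), "\n";
```

The same function accepts node names, e.g. `chains_to_edges( a=>b=>c=>undef )`, because every name then sits on the left of a `=>`.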



I admit that this is contrived - it was just the first thing that came to mind.



**(2) Paths as a data type**



Slightly less contrived: imagine that you are writing, using, or testing code that is seeking "paths" through a graph - e.g. Hamiltonians, Traveling Salesman, mapping, electronic circuit speed path analysis.  For that matter, any critical path analysis, or data flow analysis.



I have worked in 4 of the 6 areas I just listed.  Although I have never used Perl fat arrow/commas in such code (usually Perl is too slow for such code when I have been working on such tasks), I can certainly avow that, although it is GOOD ENOUGH to write (a,b,c,d,e) in a computer program, in my own notes I usually draw arrows (a->b->c->d->e).  I think that it would be quite pleasant to be able to code it as `(a=>b=>c=>d=>e=>undef)`, even with the ugly undefs.    `(a=>b=>c=>d=>e=>undef)` is preferable to  `qw(a b c d e)`, if I were trying to make the code resemble my thinking.



"Trying to make the code resemble my thinking" is often what I am doing.  I want to use the notations common to the problem area.  Sometimes I will use a DSL, sometimes write my own, sometimes just write some string or text parsing routines.  But if a language like Perl has a syntax that looks almost familiar, that's less code to write.



By the way, in C++ I often express chains or paths as



    Path p = Path().start("a").link_to("b").link_to("c").end("d");



This is unfortunately verbose, but it is almost self-explanatory.



Of course, such notations are just the programmer API: the actual data structure is usually well hidden, and is seldom the linear linked list that the above implies.



Anyway - if I need to write such "path-manipulating" code in Perl, I may use `(a=>b=>c=>undef)` as a notation --  particularly when passed to a constructor like Path(a=>b=>c=>undef) which creates the actual data structure.



There might even be some slightly more pleasant ways of dealing with the non-quoting of the fat arrow/comma's right hand side:   e.g. sometimes I might use a code like 0 or -1 to indicate closed loops (cycles) or paths that are not yet complete: `Path(a=>b=>c=>0)` is a cycle, `Path(a=>b=>c=>-1)` is not. 0 rather looks like a closed loop.  It is unfortunate that this would mean that you could not have numeric nodes.   Or one might leverage more Perl syntax:   `Path(a=>b=>c=>undef), Path(a=>b=>c=>[]), Path(a=>b=>c=>{})`.
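
A minimal sketch of such a constructor, assuming the convention above: the last argument is a terminator code, with 0 meaning a closed loop and anything else (undef, -1) meaning an open path. The returned hash layout (`nodes`, `closed`) is invented for this illustration; a real Path would hide its representation.

```perl
use strict;
use warnings;

# Hypothetical constructor: the last argument is a terminator code,
# 0 for a closed loop (cycle), anything else (e.g. undef or -1) for an open path.
sub Path {
    my @nodes  = @_;
    my $end    = pop @nodes;
    my $closed = ( defined $end && $end eq '0' ) ? 1 : 0;
    return { nodes => \@nodes, closed => $closed };
}

my $cycle = Path( a=>b=>c=>0 );
my $open  = Path( a=>b=>c=>undef );
printf "%s closed=%d\n", join( '=>', @{ $_->{nodes} } ), $_->{closed} for $cycle, $open;
```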



All we are doing here is using the syntax of the programming language to create notations that resemble the notation of the problem domain.





**(3) Finally, a use case that is more "native Perl"-ish.**



Have you ever wanted to access   `$node->{a}->{b}->{c}`, when it is not guaranteed that all of the elements of the path exist?



Sometimes one ends up writing code like



When writing:



    $node = {} if not defined $node;

    $node->{a} = {}  if not exists $node->{a};

    $node->{a}->{b} = {}  if not exists $node->{a}->{b};

    $node->{a}->{b}->{c} = 0;



When reading ... well, you can imagine. Before the introduction of the // operator, I would have been too lazy to enter it. With the // operator, such code might look like:



    my $value = $node->{a}->{b}->{c}//"default value if the path is incomplete";
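
One caveat with the `//` form, shown here as a small sketch: evaluating `$node->{a}->{b}->{c}` autovivifies the intermediate hashes, so even a read can create `$node->{a}` and `$node->{a}->{b}` when they are missing. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $node  = {};
my $value = $node->{a}->{b}->{c} // "default";

# The read created the intermediate levels, but not the leaf.
print "value = $value\n";
print "a exists: ",       ( exists $node->{a}           ? "yes" : "no" ), "\n";
print "a->b exists: ",    ( exists $node->{a}->{b}      ? "yes" : "no" ), "\n";
print "a->b->c exists: ", ( exists $node->{a}->{b}->{c} ? "yes" : "no" ), "\n";
```

A path following function that checks `exists` at each step avoids this side effect, which is one more thing it can do that `//` cannot.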



Yeah, yeah...  one should never expose that much detail of the datastructure.  Before writing code like the above, one should refactor to a nice set of object oriented APIs.   Etc.



Nevertheless, when you have to deal with somebody else's Perl code, you may run into the above.  Especially if that somebody else was an EE in a hurry, not a CS major.



Anyway: I have long had in my personal Perl library functions that encapsulate the above.



Historically, these have looked like:



    assign_to_hash_path( $node, "a", "b", "c", 0 )

    # sets $node->{a}->{b}->{c} = 0, creating all nodes as necessary

    # can follow or create arbitrarily long chains

    # the first argument is the base node,

    # the last is the value

    # any number of intermediate nodes are allowed.



or, more obviously an assignment:



    ${hash_path_lhs( $node, "a", "b", "c")} = 0

    # IIRC this is how I created a left-hand-side

    # by returning a ref that I then dereffed.



and for reading (now usually // for simple cases):



    my $cvalue = follow_hash_path_undef_if_cannot( $node, "a", "b", "c" );
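
For concreteness, a minimal sketch of what helpers with these names might look like. The bodies are guesses written for this illustration, not the actual library code; in particular, overwriting a non-hash intermediate value is an assumption about what "creating all nodes as necessary" should mean.

```perl
use strict;
use warnings;

# Return a reference to the slot at the end of the path, creating
# intermediate hashes as needed (a non-hash intermediate value is replaced).
sub hash_path_lhs {
    my ( $node, @path ) = @_;
    my $last = pop @path;
    for my $field (@path) {
        $node->{$field} = {} if ref $node->{$field} ne 'HASH';
        $node = $node->{$field};
    }
    return \$node->{$last};
}

# The last argument is the value; everything before it is base node and path.
sub assign_to_hash_path {
    my $value = pop @_;
    ${ hash_path_lhs(@_) } = $value;
    return $value;
}

# Follow the path without creating anything; undef if it is incomplete.
sub follow_hash_path_undef_if_cannot {
    my ( $node, @path ) = @_;
    for my $field (@path) {
        return undef if ref $node ne 'HASH' or not exists $node->{$field};
        $node = $node->{$field};
    }
    return $node;
}

my $node = {};
assign_to_hash_path( $node, "a", "b", "c", 0 );
${ hash_path_lhs( $node, "a", "b", "d" ) } = 1;
my $d = follow_hash_path_undef_if_cannot( $node, "a", "b", "d" );
print "d = $d\n";
```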



Since the simple case of reading is now usually handled by //, it is worth mentioning less simple cases, e.g. in a simulator where a lookup may also create entries (create, zero-fill, or copy-on-read), track stats, or modify state like LRU or history:



    my $cvalue = lookup( $bpred_top => path_history => $path_hash => undef );  

    my $cvalue = lookup( $bpred_top => gshare => hash($pc,$tnt_history) => undef );  
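
A minimal sketch of a `lookup` along these lines, assuming zero-fill on a miss and simple hit/miss counters. The `%stats` hash, the made-up `$pc` and `$history` values, and the XOR index are all stand-ins invented for this illustration, not a real predictor.

```perl
use strict;
use warnings;

my %stats = ( hit => 0, miss => 0 );   # hypothetical stats counters

# Hypothetical lookup: the last argument is the default value; a miss
# creates the missing levels, stores the default, and is counted.
sub lookup {
    my ( $node, @path ) = @_;
    my $default = pop @path;
    my $leaf    = pop @path;
    for my $field (@path) {
        $node = $node->{$field} //= {};
    }
    if ( exists $node->{$leaf} ) { $stats{hit}++ }
    else                          { $stats{miss}++; $node->{$leaf} = $default }
    return $node->{$leaf};
}

my $bpred_top = {};
my ( $pc, $history ) = ( 0x1000, 0b1011 );   # made-up values
my $index = ( $pc ^ $history ) & 0xff;       # stand-in for a real hash function
lookup( $bpred_top => gshare => $index => 0 ) for 1 .. 3;
print "hits=$stats{hit} misses=$stats{miss}\n";
```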



Basically, these libraries are the // operator on steroids, with a wider selection of what to do if the full path does not exist (or even if it does exist, e.g. count stats and cache).





They are slightly more pleasant using the quote operators, e.g.



    assign_to_hash_path( $node, qw{a b c}, 0);

    ${hash_path_lhs( $node, qw{a b c})} = 0;

    my $cvalue = follow_hash_path_undef_if_cannot( $node, qw{a b c});



But now that it has sunk into my thick head after many years of using perlobj, I think that fat arrow/commas may make these look much more pleasant:



    assign_to_hash_path( $node => a => b => c => 0);

    my $cvalue = follow_hash_path( $node => a => b => c => undef );



Unfortunately, the LHS function doesn't improve much because of the need to quote the last element of such a path:



    ${hash_path_lhs( $node=>a=>b=>"c" )} = 0;

    ${hash_path_lhs( $node=>a=>b=>c=>undef )} = 0;



so I would be tempted to give up on LHS, or use some mandatory final argument, like



    ${hash_path_lhs( $node=>a=>b=>c=>Create_As_Needed() )} = 0;

    ${hash_path_lhs( $node=>a=>b=>c=>Die_if_Path_Incomplete() )} = 0;



The LHS code looks ugly, but the other two look pretty good, expecting that the final element of such a chain would either be the value to be assigned, or the default value.



    assign_to_hash_path( $node => a => b => c => "value-to-be-assigned");

    my $cvalue = follow_hash_path( $node => a => b => c => "default-value" );



Unfortunately, there is no obvious place to pass keyword options - the following does not work because you cannot distinguish optional keywords from args, at either beginning or end:



    assign_to_hash_path( $node => a => b => c => 0);

    assign_to_hash_path( {warn_if_path_incomplete=>1}, $node => a => b => c => 0);

    my $cvalue = follow_hash_path( $node => a => b => c => undef );

    my $cvalue = follow_hash_path( $node => a => b => c => undef, {die_if_path_incomplete=>1} );



I have occasionally used a Keyword class, abbreviated KW, so that a type inquiry can tell us which is the keyword, but that is suboptimal - actually, it's not bad, but it is just that Perl has no single BKM (best known method; yeah, TMTOWTDI):



    assign_to_hash_path( $node => a => b => c => 0);

    assign_to_hash_path( KW(warn_if_path_incomplete=>1), $node => a => b => c => 0);

    my $cvalue = follow_hash_path( $node => a => b => c => undef );

    my $cvalue = follow_hash_path( KW(die_if_path_incomplete=>1), $node => a => b => c => undef );

    my $value = follow_hash_path( $node => a => b => c => undef, KW(die_if_path_incomplete=>1) );
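
A minimal sketch of how such a KW wrapper might work, assuming it is nothing more than a blessed hash that `ref` can recognize. The helper `split_keywords` and this body of `follow_hash_path` are invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical keyword wrapper: KW(name => value, ...) returns a blessed
# hash, so a type inquiry can tell it apart from path elements.
sub KW { return bless {@_}, 'KW' }

# Split an argument list into merged keyword options and everything else.
sub split_keywords {
    my ( %opt, @rest );
    for my $arg (@_) {
        if ( ref $arg eq 'KW' ) { %opt = ( %opt, %$arg ) }
        else                     { push @rest, $arg }
    }
    return ( \%opt, @rest );
}

# The last non-keyword argument is the default value.
sub follow_hash_path {
    my ( $opt, $node, @path ) = split_keywords(@_);
    my $default = pop @path;
    for my $field (@path) {
        if ( ref $node ne 'HASH' or not exists $node->{$field} ) {
            die "path incomplete at $field\n" if $opt->{die_if_path_incomplete};
            return $default;
        }
        $node = $node->{$field};
    }
    return $node;
}

my $node    = { a => { b => { c => 42 } } };
my $found   = follow_hash_path( $node => a => b => c => undef, KW( die_if_path_incomplete => 1 ) );
my $missing = follow_hash_path( KW( die_if_path_incomplete => 0 ), $node => a => x => c => "default" );
print "$found $missing\n";
```

Because the keywords are picked out by type, they can appear at the beginning or the end of the argument list. The cost is that a path element can never itself be a KW object.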



**Conclusion: Foo(a=>b=>c=>1) seems strange, but might be useful/nice syntactic sugar**



So: while I do rather wish that `use warnings` had warned me about `foo(a=>a=>1)`, when a keyword was duplicated by accident, I think that multiple fat arrow/commas in series might be useful in making some types of code more readable.



Although I haven't seen any real-world examples of this, usually if I can imagine something, a better and more perspicacious Perl programmer has already written it.



And I am considering reworking some of my legacy libraries to use it. In fact, I may not have to rework - the library that I designed to be called as



    assign_to_hash_path( $node, "a", "b", "c", 0 )



may already work if invoked as



    assign_to_hash_path( $node => a => b=> c => 0 )
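
A quick way to check this, as a sketch: the fat comma is a comma that quotes a bareword on its left, so both spellings should hand the function the same argument list. The function `args_seen` and the string standing in for `$node` are made up for this illustration.

```perl
use strict;
use warnings;

# Return the argument list as received, so two calls can be compared.
sub args_seen { return join ',', map { defined $_ ? $_ : 'undef' } @_ }

my $node      = 'NODE';   # stand-in for a real hashref
my $quoted    = args_seen( $node, "a", "b", "c", 0 );
my $fat_comma = args_seen( $node => a => b => c => 0 );
my $verdict   = $quoted eq $fat_comma ? "same argument list" : "different";
print "$verdict\n";
```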



**Simple Working Example**



For grins, an example of a simple path following function, that does a bit more error reporting than is convenient to do with //



    $ bash 1278 $>  cat example-Follow_Hashref_Path.pl
    use strict;
    use warnings;

    sub follow_path {
        my $node=shift;
        if( ref $node ne 'HASH' ) {
            print "Error: expected \$node to be a ref HASH,"
              ." instead got ".(
                  ref $node eq ''
                  ?"scalar $node"
                  :"ref ".(ref $node))
              ."\n";
            return;
        }
        my $path=q{node=>};
        my $full_path = $path . join('=>',@_);
        foreach my $field ( @_ ) {
            $path.="->{$field}";
            if( not exists $node->{$field} ) {
                print "stopped at path element $field"
                  ."\n    full_path = $full_path"
                  ."\n    path so far = $path"
                  ."\n";
                return;
            }
            $node = $node->{$field};
        }
    }

    my $node={a=>{b=>{c=>{}}}};

    follow_path($node=>a=>b=>c=>"end");
    follow_path($node=>A=>b=>c=>"end");
    follow_path($node=>a=>B=>c=>"end");
    follow_path($node=>a=>b=>C=>"end");
    follow_path({}=>a=>b=>c=>"end");
    follow_path(undef=>a=>b=>c=>"end");
    follow_path('string-value'=>a=>b=>c=>"end");
    follow_path('42'=>a=>b=>c=>"end");
    follow_path([]=>a=>b=>c=>"end");

and use:

    $ perl example-Follow_Hashref_Path.pl
    stopped at path element end
        full_path = node=>a=>b=>c=>end
        path so far = node=>->{a}->{b}->{c}->{end}
    stopped at path element A
        full_path = node=>A=>b=>c=>end
        path so far = node=>->{A}
    stopped at path element B
        full_path = node=>a=>B=>c=>end
        path so far = node=>->{a}->{B}
    stopped at path element C
        full_path = node=>a=>b=>C=>end
        path so far = node=>->{a}->{b}->{C}
    stopped at path element a
        full_path = node=>a=>b=>c=>end
        path so far = node=>->{a}
    Error: expected $node to be a ref HASH, instead got scalar undef
    Error: expected $node to be a ref HASH, instead got scalar string-value
    Error: expected $node to be a ref HASH, instead got scalar 42
    Error: expected $node to be a ref HASH, instead got ref ARRAY
    ✓
    $

**Another Example `($node->{a}->{B}->{c}//"premature end")`**


    $ bash 1291 $>  perl -e 'use warnings;my $node={a=>{b=>{c=>"end"}}}; print "followed path to the ".($node->{a}->{B}->{c}//"premature end")."\n"'

    followed path to the premature end

    $ bash 1292 $>  perl -e 'use warnings;my $node={a=>{b=>{c=>"end"}}}; print "followed path to the ".($node->{a}->{b}->{c}//"premature end")."\n"'

    followed path to the end

I admit that I have trouble keeping the binding strength of // in my head.

**Finally**


By the way, if anyone has examples of idioms using `//` and `->` that avoid the need to create library functions, especially for writes, I'd love to hear of them.

It's good to be able to create libraries to make stuff easier or more pleasant.

It is also good not to need to do so - as in `($node->{a}->{B}->{c}//"default")`.

Later:

At Stack Overflow: @mp3 pointed out that fat arrow/comma can be a terminator, e.g. `(a=>b=>c=>)`. Doesn't help much in general when you have multiple chains, or to separate keywords in `follow_path($node=>a=>b=>c=>"default",want_keyword=>1)`, but looks not-so-bad for `Path(a=>b=>c=>)`.
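
A tiny sketch of the terminator form. The trailing `=>` quotes the bareword on its left and then simply ends the list, since Perl allows a trailing comma.

```perl
use strict;
use warnings;

# A trailing => quotes the last bareword and then ends the list.
my @path = ( a => b => c => );
print scalar(@path), " elements: @path\n";
```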

Inspires EVIL PERL TRICK `print Do => You =>Like=>Barewords=>`

May not want to be associated with such evil. 

I have often thought that the reason that we don't actually use Perl as our interactive shell like bash is that bash defaults to barewords, whereas Perl usually requires quotes.

Methinks that it should be possible to create a single language, with the same keywords and operators, that can be turned "inside out":

One mode where strings require quotes:

    var a ="string-value"

a second mode where things are string by default, and it is the keywords and syntax that needs to be quoted (here by {}):

    {var} {a} {=} string-value

The latter might be useful in literate programming. Same programming language constructs, just inverted.  Although the embedded programming language syntax might be most like Perl interpolation - might need different quotes for code producing a value within the text, and code operating on the text.

The minimal aspect of command line shells, for the most part, is a hybrid: the first word on a line is special, a command - everything else is strings by default.