Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Friday, August 31, 2012

Brackets - XML, and non

I like XML's "brackets":  <longname attributes="attributes"> ... </longname>.

Makes XML much less vulnerable to mismatches like dropping a paren:

<A> <B> <C> <D> ... </D> </B> </A> - the missing </C> can be inferred, and possibly even repaired

(A: (B: (C: (D: .... ) ) )
which closing paren is missing? A, B, C, or D's?

But... sometimes I end up editing XML by hand. (Like on a wiki. Or in Blogger :-( )  And sometimes typing <longname>...</longname> is just too much.

Sometimes I would like the option of having a more compact, but less robust, bracket notation. Like the above mentioned (A: ... )

Or [A: ... ].  Or {B: ...}.  Or <C: ... > (although that latter is hard to tell from XML.)

Consider

<A> (B: {C: [D: ... ] ) </A>

it is still pretty easy to infer that the closing } for {C:  is missing.

Here's a thought: the matchingness of brackets is orthogonal to the actual name of the construct that uses the bracket - in XML parlance, the element.

LISP dialects that use only parens ( and ) are at one end of the spectrum.

XML, with arbitrary <tag> ... </tag> is at the other end.

And in between, we can have notations that use matching { and }, [ and ], < and >, << and >> - heck, Perl and regexps show that there don't even need to be matching pairs.  ' and '. " and ". Or ‘ and ’, etc.

These could be intermixed, as I have done before.

Indeed, the same entity, the same statement type, could use either bracketing form: <A> (A: ... ) </A>

I can imagine tools that translate between these forms.

From XML: <A> <A> ... </A> </A>

To strictly lisp-like parens: (A: (A: ... ) )

To forms that employ multiple bracket types for "clarity": (A: {A: ... } )
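
Such a translator could be sketched roughly as follows, in Python. This is only a sketch under assumptions: the names xml_to_compact and compact_to_xml are hypothetical, attributes are ignored, and text content is treated as whitespace-separated words. It also shows the robustness difference: the XML side can name the element whose closing tag is missing, while the paren side can only count how many closers are missing.

```python
import re

# Tokens: an opening or closing tag, or a bare word (attributes not handled).
XML_TOKEN = re.compile(r"</?[A-Za-z_][\w-]*>|[^\s<>]+")
# Tokens: "(name:", ")", or a bare word.
COMPACT_TOKEN = re.compile(r"\(([A-Za-z_][\w-]*):|(\))|([^\s()]+)")


def xml_to_compact(xml):
    """Translate <A> ... </A> into (A: ... ), checking that tags match."""
    out, stack = [], []
    for tok in XML_TOKEN.findall(xml):
        if tok.startswith("</"):
            name = tok[2:-1]
            if not stack or stack[-1] != name:
                expected = f"</{stack[-1]}>" if stack else "no closing tag"
                raise ValueError(f"expected {expected} but found {tok}")
            stack.pop()
            out.append(")")
        elif tok.startswith("<"):
            stack.append(tok[1:-1])
            out.append(f"({tok[1:-1]}:")
        else:
            out.append(tok)
    if stack:
        raise ValueError(f"missing </{stack[-1]}>")
    return " ".join(out)


def compact_to_xml(text):
    """Translate (A: ... ) into <A> ... </A>."""
    out, stack = [], []
    for match in COMPACT_TOKEN.finditer(text):
        name, close, word = match.groups()
        if name:
            stack.append(name)
            out.append(f"<{name}>")
        elif close:
            if not stack:
                raise ValueError("unmatched )")
            out.append(f"</{stack.pop()}>")
        else:
            out.append(word)
    if stack:
        # Parens carry no names, so we cannot say WHICH closer is missing.
        raise ValueError(f"{len(stack)} closing paren(s) missing")
    return " ".join(out)
```

Supporting { } and [ ] as well would mostly be a matter of widening the token patterns and remembering which opener each name was pushed with.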

 ---

I guess that I am using (A: ...) as the compact representation of <A>...</A>.

Q: what about attributes? (A at1="at1" at2="at2" ... : ...)

 ---

There seems to be little lossage here. Except for the lost opportunity to place attributes on closing XML tags, <tag attributes>...</tag closing-attributes> --- something that XML does not allow, but which I have supported in my own pseudo-XML dialects.

Thursday, August 30, 2012

The HTML Mediawiki passes through

http://en.wikipedia.org/wiki/Help:HTML_in_wikitext

I've ranted before about quotification.

In the link is described the HTML that Mediawiki allows in wiki pages - i.e. the stuff it passes through.

Now, Mediawiki has to have a white list, because it cannot allow arbitrary HTML constructs that a maluser might use, e.g. to inject malware onto wiki pages.  E.g. all, or mostly all, user provided JavaScript must not be allowed. Even basic formatting may not be allowed - e.g. an attacker may be able to use CSS styles creatively to render much of the text on a page invisible, thereby creating a phishing page with what remains. Plus the usual issues with tracking links, etc.

More basically, Mediawiki must not allow arbitrary user text. In particular, it must not allow text that would interfere with the HTML that Mediawiki is itself producing.

Now, that last is what my "modest proposal for quotification" specifically attacks: basically add a tag bit to every character that the user inserts, so that it can be distinguished from Mediawiki-added HTML.

But my "modest proposal" would disallow any user added HTML.  Except for the user added HTML that Mediawiki specifically whitelists, finds, filters, and explicitly removes the tag bit from. And, of course, any bugs in that procedure could lead to security holes...

At least HTML, XML, etc. is easy enough to parse that Mediawiki can filter out anything that isn't on the whitelist.  It does not have to worry TOO MUCH that new syntax will be added that it does not know about.  That's the joy about HTML/XML:  the syntax is (relatively) stable. Extensions can be added by adding new elements and attributes, but existing parsers can recognize such additions, and decide to pass them through or filter them out.

But, it would be nice if Mediawiki (or my own tools) could pass through much, much more HTML.  If instead of having to whitelist specific constructs, they could say "Evaluate all HTML that does not constitute a security risk".  And especially JavaScript.

E.g.

  • do not evaluate any HTML or JavaScript that opens client files (without special permission)
  • do not evaluate any HTML or JavaScript that renders existing page elements invisible (except for specially marked stuff.)  I.e. no switching to white on white zero size.
The above is a blacklist.  Or a whitelist:
  • only evaluate HTML or JavaScript or CSS that changes colors in a limited way, or fonts in a limited way, or text size in a limited way...
and so on.

I.e. it would be nice to be able to EVALUATE arbitrary HTML and JavaScript, in a sandbox whose capabilities are explicitly circumscribed.

Whitelist the capabilities. 

Not necessarily the text.

Whitelist what the code does.  Not the input.





Tuesday, August 14, 2012

windows hang, android bitches

I hate it when Windows (7)
stops accepting keyboard input & must be rebooted.

but at least i can bitch about it using my android.

Monitoring histamine like blood glucose

OK, I admit it in public: I have Type 2 diabetes. Controlled by diet.

Regularly monitoring my blood sugar with the home tests made  a great difference:
  • I learned how much of my mental state - alertness, fatigue, ability to concentrate - is related to blood sugar. Not all, but a lot.
  • I learned that what I thought was low blood sugar, hypoglycemia, is more often than not high blood sugar, hyperglycemia.
Regular tracking has helped me keep to my diet and exercise program.

--

Since tracking my blood sugar has helped so much, I wonder about tracking another health issue for me: allergies.

Since I have started tracking and logging and journaling more diligently, I have noticed what I call a "pre-allergy" pattern: there are days when I definitely have allergies, sniffling, etc. And days when I am definitely clear of allergies.  But there are also days when I don't feel clear of allergies, but when I don't really have allergy symptoms: instead of my nose being blocked (as it is today), I just feel a little tickle.

The funny thing is that these "pre-allergy" days seem to be the days that I am most likely to be irritable and anxious and have trouble concentrating.  Sure, sniffling all day long can be distracting, but I have started feeling relieved when I wake up sniffling, because that usually means I will have a relatively good day at work. Except for the sniffling.  A full blown allergy attack - constant sniffling, eyes sore, skin itching - is distracting, but a mild one can be worked through.

I call these days of low but present allergy symptoms "pre-allergy" because quite often after a few days of "pre-allergy" I will have more intense allergy symptoms.  Although sometimes it is post-allergy as well.  I imagine that it is the "shoulders" of my allergy intensity curve, at levels just below the levels that trigger full allergy attacks.

I wonder if there is a home blood test to measure allergic intensity, i.e. histamine levels?  I can see that there are lab tests for histamine, http://www.integrativepsychiatry.net/histamine_level_whole_blood.html.  But they are heavyweight - go to a lab.  Not something you can measure 2-3 times a day, as I measure my blood sugar.

--

OK, I admit it further: since I started tracking my blood sugar, I have gotten into Quantified Self / Personal Monitoring.

For manual tracking of ...probably too many things... I use KeepTrack on my Samsung Galaxy Player.

But it's best when the tracking is done automatically.  E.g. I splurged on the Withings wifi connected scale, http://www.withings.com/en/bodyscale, that uploads my weight to their website every time I weigh myself. Without me having to intervene.

I had hopes for MapMyHike, http://www.mapmyhike.com/.  It's great when it works.  But (1) the GPS on my Samsung device is unreliable, at least where I live, an area with deep ravines and canyons and poor sky exposure, and (2) even tolerating the GPS fragility, the MapMyHike app is much slower than most other apps, and often loses data that it has promised to upload later.

I wish there was an aggregator for these cloud based tracking tools.  I would like to see all of my stuff in one place.

etc, etc.



How to search within an Outlook message (hint: can't search in preview pane)


Quick stupid Outlook question:

How to search within an Outlook message?  So that I can jump quickly to the details for XXX in a long message.

I guess I could use Outlook Web Access, and just use ^F in my browser.  .. Well, I could, if OWA was working.

But there must be some way to do this in Outlook itself.

(I suspect that I have asked this question before, and forgotten the answer. For the life of me, can't see a button or menu item. Yes, I'm trying help...)

...

Ah.

F4 to search

- but it doesn't work in the preview or reading pane. Must open the message in a window of its own.

Google works better than MS help.  When something as basic as this needs a help page on about.com, it must be a UserInterface bug.

http://email.about.com/od/outlooktips/qt/et102904.htm

How to Search Inside a Message in Outlook

Finding messages is easy, accessible and reasonably fast in Outlook, but finding text inside a message I find more challenging. It can be done, though a few detours are involved.

Double-click the message to open it in its own window.
  • You cannot search inside a message shown in the Outlook preview pane.

---

Posting this on my blog so that I can find it quickly.

Monday, August 13, 2012

Ron Jeffries on Exceptions => I want the best of both worlds

In an earlier blog entry, I expressed my liking for Andrei Alexandrescu's presentation about the D programming language:
D tries to make the easiest code to write also correct. Handle errors. E.g. throwing exceptions, implicitly caught outside main, usual path for library error reporting. http://blog.andy.glew.ca/2009/07/andrei-alexandrescu-case-for-d-attended.html
I.e. it made me look more favorably than before on exceptions.

Moreover, I have long found that "stacked exceptions" are one of the best ways of providing useful error messages: throw an exception, with a string message explaining the context. If caught and handled by the caller, great. If not caught and handled by the caller, but caught somewhere higher up, then that place can either handle it - or can concatenate more context information and rethrow. And so on.

So, the exception either gets handled... or you get a useful set of nested, stacked, context error messages:
  • error: file Foo.tmp could not be created, already existed, not overwritable.
  • error: Unpacking archive Foo.tgz
  • error: installing software package Foo
I.e. not a useless error message like "could not open file of unrecognizable name".  Not a segmentation fault. And not a buffer overflow.
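
A minimal sketch of that stacking, in Python rather than C++ for brevity. The class StackedError and the three functions are hypothetical, made up only to reproduce the Foo example above; each level that cannot handle the error adds its own context and rethrows.

```python
class StackedError(Exception):
    """Hypothetical exception that accumulates context messages, innermost first."""

    def __init__(self, message, cause=None):
        super().__init__(message)
        if isinstance(cause, StackedError):
            inner = list(cause.messages)
        elif cause is not None:
            inner = [str(cause)]
        else:
            inner = []
        self.messages = inner + [message]

    def report(self):
        return "\n".join(f"error: {m}" for m in self.messages)


def create_file(name):
    # Stand-in for the lowest-level failure.
    raise StackedError(
        f"file {name} could not be created, already existed, not overwritable.")


def unpack_archive(archive):
    try:
        create_file("Foo.tmp")
    except StackedError as e:
        # Cannot handle it here: add context and pass it up.
        raise StackedError(f"Unpacking archive {archive}", e) from e


def install_package(package):
    try:
        unpack_archive(f"{package}.tgz")
    except StackedError as e:
        raise StackedError(f"installing software package {package}", e) from e


try:
    install_package("Foo")
except StackedError as e:
    final_report = e.report()
```

Keeping the messages as a list rather than one concatenated string leaves the top level free to print only the outermost line, or all of them.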
--
However, what Ron Jeffries wrote in the [XP] mailing list thread "Exceptions are evil?" also appeals:
    ** Exceptions are not an object oriented mechanism
    This is true, they are not OO, although one can implement an exception object.

    ** In reality, all exceptions are is a mechanism for a multi-level return.
    ** A method may return its intended value, or an exception.
    That's a bit like saying "all a nuclear weapon is is a big firecracker".
    It may be true but somehow it misses something.


    ** To me, calling something and getting an exception is NOT a bug,
    ** It is using an interface in a (hopefully) documented way.
    My concerns about them include:

    as used on the ground, the exception handling often occurs multiple levels up, and then it's not handled after all.

    Used in a single class or method, the good case and the bad case are separated in the code and even in a single class we often don't really know where the exception came from. Either way the code flow is inherently obfuscated.

    Exceptions are often used to pass the buck instead of dealing properly with an unusual condition at the level where it could best be dealt with.

    The alternative to exceptions is not "return a flag which you must then check all the time and that is a stupid pain in the ass".

    The often better alternative is "return a result which can be interrogated if you wish to try something different or which can be returned blindly to the next guy up the chain, who has the same option". If no one looks at the object and deals with it, it embodies sensible, generally mostly null behavior. This generates code that works better, is easier to read, and discourages laziness.

    Once you get into a language style that runs on exception thinking, you kind of have no choice. If you build the software better from the beginning, there is a choice and IMO it is much nicer.
Ron's point is valid. Exceptions are hard to follow.

But, on the other hand, exceptions work when callers buggily do not check for errors returned.

I want the best of both worlds.

This suggests using extended or wrapper types, like the Valid template I have used for years in simulators.  Let's be a bit more explicit, and call it Valid_or_Throw_Exception, although it is possible that there is no need for a difference - in fact, my Valids often throw exceptions if accessed when invalid.   The only big reason for a new type Valid_or_Throw_Exception is that you might want to carry an error message around, in my usual stacked error message approach - whereas my ordinary Valids are stripped down to be as efficient as possible.

Something like, in C++
template <typename T>
class Valid_or_Throw_Exception
     : public Wrapper_Type<T>
{
     bool valid = false;
     std::string errmsg;
public:
     // A good value: wrap it and mark it valid.
     Valid_or_Throw_Exception(const T& init) : Wrapper_Type<T>(init), valid(true) {}
     // No value yet: invalid until assigned.
     Valid_or_Throw_Exception() : valid(false), errmsg("uninitialized") {}
     // Build an invalid value that carries an error message.
     static Valid_or_Throw_Exception Error(std::string msg) {
          Valid_or_Throw_Exception ret;
          ret.valid = false;
          ret.errmsg = msg;
          return ret;
      }
private:
     // Called by Wrapper_Type before every forwarded access to the wrapped T.
     virtual void wrapper_pre_check() {
           if( ! this->valid ) {
                  throw std::string("accessing invalid data item returned as an error by ...") + this->errmsg;
           }
      }
};
Where Wrapper_Type is a template that wraps the base type T, and arranges for all of the methods, etc., of T to be callable on the object that is wrapping.  In this case, first calling wrapper_pre_check().

(I may have played a bit fast and loose here.  My Valids so far have always been performance critical, so I would never have called a virtual function inside them. (Yes, I have measured the performance - that's what I do.) But outside of performance critical code, this seems reasonable.)

This allows values to be returned.  If the values are good, then no worries.  If the values are bad, exceptional, indicating an error, then the caller can still check. Or arrange to pass up the call stack nicely without checking, copy around, etc.

But if somebody tries to use the value without handling a possible error that actually occurred, bang!

But at least then you get the stacked error messages, which are nicer than not.
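
To show the caller's side of this, here is a rough Python analogue. It is a sketch only: ValidOrThrow and parse_port are hypothetical names, and unlike the C++ wrapper above the value is fetched through an explicit get() instead of transparently forwarded methods.

```python
class ValidOrThrow:
    """Either a usable value or an error message; using an error value throws."""

    def __init__(self, value=None, errmsg=None):
        self._value = value
        self.errmsg = errmsg

    @classmethod
    def error(cls, msg):
        return cls(errmsg=msg)

    @property
    def valid(self):
        return self.errmsg is None

    def get(self):
        # The "bang": touching an error value nobody checked raises.
        if not self.valid:
            raise RuntimeError("accessing invalid data item: " + self.errmsg)
        return self._value


def parse_port(text):
    """Hypothetical example function returning a result that can be interrogated."""
    if text.isdecimal() and 0 < int(text) < 65536:
        return ValidOrThrow(int(text))
    return ValidOrThrow.error(f"bad port number {text!r}")


good = parse_port("8080")
bad = parse_port("http")

# A careful caller interrogates the result and picks a fallback.
port = bad.get() if bad.valid else 80
```

A careless caller who writes bad.get() without looking gets the exception, and with it the error message that was carried along.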

--

This makes me feel better about my nested exception string messages.  They are useful when you aren't really handling error exceptions, but are just trying to provide the most meaningful error message possible.  More context than "seg fault", but less context than a raw stack dump.  (I have always imagined a clickable interface, so that you could see the outermost, top level, error message, and then click deeper and deeper if necessary.)

My preference is to throw strings.  And only ever to throw strings, or possibly lists of strings in lieu of concatenation.

But this gets in the way of throwing proper exception objects for proper exception based error handling.

This hybrid approach gives both:  well-behaved error handling via the objects returned, and nested exception string contexts when the well-behaved stuff fails.

===

Cool: attempting to reply to the [XP] mailing list via Yahoo's web posting facility gave the error:

Post Message

Post Message Help


PythonError: exception.NameError at d8c9c0 

Which is an example both good and bad of exceptions.
It gets better -> cutting and pasting the error message into Blogger resulted in lossage because of angle brackets. That's the sort of "null behavior" that is annoying. I tweaked the error message.
First off, this is a fairly useless error message to provide to the user of a web page. I can only guess that NameError might be associated with my login name. Or maybe not.

Also, hex addresses like d8c9c0 are the sort of thing that make black hats drool. This *might* be a machine address.  It gives me hints as to what addresses I might put into a code packet, if I could find a buffer overflow. This isn't a buffer overflow, but now I can go and look up Python exploits.

So far, Ron wins.  It would be much nicer if the error message was something like

Bad login name: XXXXX
Turned out to be not having a profile. See how useless the Python exception is to the end user?
But... at least the error was checked and thrown. At least this was Python. In C++, it might have been unchecked, and there might have been a buffer overflow that a bad guy could use to break in.

It would be nice to have an appropriate error message. Ron's point about having the good and bad paths together wins: the programmer is much less likely to forget to handle the error if he can see "return[ed] a result which can be interrogated".

But sometimes these errors occur deep, deep in library functions.  Possibly libraries that presently assert. Sometimes changing the library is not an option, no matter how often one chants the XP "courage" mantra.

It's easy to #define assert to throw an exception. (Harder from a signal handler.)  This gets you to Andrei's place.

And the next step is to my nested or stacked error messages:

Post Message

Post Message Help


PythonError:  exception.NameError at d8c9c0 
User does not have Yahoo Profile: you must create one before posting.
Error trying to Post Message.

Not as good as what somebody following Ron's prescription could do. But better than nothing.

--

By the way, one of the big reasons why I was attracted to exceptions for error handling in C++ is that I found it made writing my own CppUnit equivalent (actually, not equivalent, minimalist) easy, especially when testing existing code that exit'ed or asserted on an error: rather than having to mess with multiple processes in ways that are often not portable, I could usually #define assert to throw and/or use one of the exit hooks.

I.e. I used exceptions to make the test jig easier in the presence of ill behaved legacy code.  And faster, fast enough to be used regularly - as opposed to forking processes around often very small test cases, which back in 1996, and even still today, produce horrible slowdowns, discouraging the team from using tests.


Sunday, August 12, 2012

Touchscreens make PDFs almost tolerable

I have long hated reading PDFs on my computer.  Especially 2 column papers. PDFs often require scrolling up and down, back and forth, in order to read.

On a traditional PC or laptop, this requires use of scrollbars and mouse, sometimes keys.

The PDF reader, e.g. Acrobat, sometimes allows you to just use arrow down in a text flow, but it often jumps around disconcertingly.

Reading it on my touchscreen tablet makes PDFs almost tolerable.  Direct manipulation to move what I want to look at around is much better than scrolling.

I still wish it was a single directional flow.  But I can live with this.

--

I conjecture that if the PDF reader just allowed mousing to drag around the viewport, like Google maps, it would be better, almost as good as touch.

Thursday, August 09, 2012

Every type or class should have a printer


My personal rule:

Whenever I define a type, a class, or whatever, I define either a print function, or a to_string function.

class Foo {
public:
      std::string to_string() const {
            std::string ret;
            ret += "";
            ret += fmt("field1=%d",this->field1);
            …
            ret += "\n";
            return ret;
      }
      // or
public:
      friend std::ostream& operator<< (std::ostream& ostream, const Foo& this_value) {
            ostream << ""
                    << "field1= " << this_value.field1
                    …
                    << "";
            return ostream;
      }
};

Sometimes I have to_string_compact(), to_string_verbose(), etc;
Unfortunately, it is harder to express that with the ostream operator<< syntax.

But the operator<< syntax can be more efficient – less allocation of temporary strings.

And the formatting can be easier.

--

The main reason why I don't use the operator<< everywhere is that I have run into far too many broken ostreams. Not so common any more, but it was common for a while.

--

I have several times written tools to automatically generate these.

--

It is straightforward to generate to_string from operator<<, and vice versa.  Since operator<< can be more efficient, usually best to do the former.  But some ostreams can be inefficient.

--

Rarely you may want to make the operator<< a template, accepting an OSTREAM type - because I have run into libraries that look like streams, but which are not in the same type hierarchy, and which do not inherit from std::ostream.

Monday, August 06, 2012

Photos - Google+

On cloud sites, like google docs (drive and picasa) - I want the ability to easily find docs, photos, etc, by permission.

I prefer not to make anything public unless really certain.

But I had accidentally done so.

Heck: I would actually like to have to jump through hoops like retyping a password to make something public.

Thursday, August 02, 2012

Lack of keyword parameters a big cause of programming .... errors, bugs, inefficiency

Sometimes the title says it.

I wonder how much inefficiency has been introduced into programming, how many bugs caused, because positional parameters became the default standard for programming languages long ago.

If not actual bugs because two parameters are flipped in a way that is not detected by the type system, how about inefficiency.  "I can't remember the order of the arguments, so I must look them up..."

Microsoft's IntelliSense helps ... but I'm usually an emacs user.  Don't have that.  IntelliSense has almost been enough to make me switch.

---

Bjarne says that keyword parameters almost made it into a C++ standard, and lost only because of legacy C issues. He suggests parameter objects.  Cool.  So, I want to automatically create a parameter object for each and every function.  

Hmm....  I think that I could write a preprocessor that did that, munging the symbol table.

Something like

ParameterObject.NamedParameter1()...

Unfortunately, parameter names do not get into the symbol table.

Chaining, or Why You Should Stop Returning Void - Zero Wind :: Jamie Wong

Chaining, or Why You Should Stop Returning Void - Zero Wind :: Jamie Wong:

Bjarne Stroustrup's recommendation on how to do keywords in C++:

FunctionFooParameterObject()
    .keyword_parameter1(val1)
    .keyword_parameter3(val3)
    .keyword_parameter5(val5)
    .call_it()
Works.

Only real hassle is that it is a lot of dang code to write.

Google "cloud" tool shortcomings

Are Google docs and blogger cloud? Whatever.

Google docs:

  • unintelligible URLs
  • not clear how indexed they are - I do know that docs that I have "cross posted" between my wiki (mediawiki) and Google docs have been indexed, by Google, on the wiki, but not on Google docs.
    • "It doesn't exist if you can't google it"
Blogger:

  • no formatting in comments
  • can't link to a particular comment
? permalinks ?

Jamie Wong's An Argument for Mutable Local History

http://jamie-wong.com/2012/05/25/an-argument-for-mutable-local-history/

Argues for mutable local history, and hence rebase.

I think all of these arguments apply also to global history, at whatever level.

But I want immutability as much as possible.

Again, I conclude that it is not the absolute history we want to mutate. It is our VIEW of the history.

(Although there are valid reasons to really want to edit history, e.g. to drop stuff that you are not legally allowed to use.)

Futures make sets safe

Set notation is very convenient.

E.g.  
index_set = { 1, 2, 3}
array[ index_set ] = 0
rather than a loop

But a problem with set notation is that intermediate sets can be very large.

E.g.  
index_set = 1, 2, 3 ...
array[index_set] = random()
array = array[i] where i < 5 
rearranging
index_set = 1, 2, 3 ...
small_index_set =  i IN index_set, where i < 5
array[small_index_set] = random()
If the array[index_set] = assignment is evaluated eagerly, actually materializing every element of the unbounded index_set, the statement is impractical. Whereas array[small_index_set] = is finite sized.

Futures make this sort of manipulation feasible.
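
A small Python sketch of the rearranged version. Lazy iterators stand in for futures here, which is my assumption about the mechanism and not the only way to do it; the point is only that nothing is computed until an element is actually demanded.

```python
import itertools
import random

# Conceptually infinite: index_set = 1, 2, 3 ...  Nothing is computed yet.
index_set = itertools.count(1)

# small_index_set = i IN index_set, where i < 5.  Still lazy.
# takewhile stops at the first failing element, which matches the "where"
# filter only because index_set is increasing.
small_index_set = itertools.takewhile(lambda i: i < 5, index_set)

# Only now are elements demanded, and only the finitely many we need.
array = {}
for i in small_index_set:
    array[i] = random.random()
```

Writing the eager version, a list of every index before the filter is applied, would never finish.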

Pattern: Ordered Lists, with Associative Lookup and Update

One of my friends (TC) complained that XML had too much order.  He said that XML structs should be unordered, like Perl hashes, associative arrays.  Indeed, JSON has unordered key=value pairs, and ordered lists: the former is not the latter.

I'm not so sure.  I keep running into situations where I want to have an ordered data structure, but where I also want an associative lookup and/or update.

For example, my shell PATH.  It is definitely order dependent.  But I often want to do operations such as

  • is DIR already in PATH
  • add DIR to PATH if it is not already present
    • prepend or append
  • add DIR to PATH next to DIR_already_in_path
    • just before, or just after
  • replace DIR0 in PATH (assuming it is already there) with DIR1
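
The PATH operations above can be sketched as a few Python helpers. The function names are hypothetical, and the sketch assumes a colon-separated PATH with exact string matching on directories.

```python
def path_contains(path, d):
    """Is DIR already in PATH?"""
    return d in path.split(":")


def path_add(path, d, prepend=False):
    """Add DIR to PATH if it is not already present."""
    parts = path.split(":") if path else []
    if d in parts:
        return path
    parts = [d] + parts if prepend else parts + [d]
    return ":".join(parts)


def path_insert_next_to(path, d, anchor, before=True):
    """Add DIR just before or just after a directory already in PATH."""
    parts = path.split(":")
    i = parts.index(anchor)  # raises ValueError if anchor is not in PATH
    parts.insert(i if before else i + 1, d)
    return ":".join(parts)


def path_replace(path, old, new):
    """Replace DIR0 in PATH (it must already be there) with DIR1."""
    parts = path.split(":")
    parts[parts.index(old)] = new  # raises ValueError if old is not in PATH
    return ":".join(parts)
```

Every one of these is an associative lookup followed by an order-preserving update, which is the combination a plain hash does not give you.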
For example, hook lists - like hook lists for emacs.

For example, like Bash PROMPT_COMMAND. This really isn't a hook list, but imagine that it is, using ; as the separator (gets more complicated with IFs, etc.).  Here I have been wanting a more general associative lookup, not just exact matching.
  • E.g. replace any element that looks like a command that invokes my logical-path program, i.e. that matches .*logical-path.*, with a different invocation of that program
Just like associative arrays, hashes, have greatly improved programmer productivity by being built in data structures in Perl and Python (and to a weaker extent in C++ STL std::map), it may be worth thinking about standardizing ordered associative arrays. Using Perl associative array syntax (not my favorite, but it's easy at hand):
$OAA{key}  # error if more than 1 match?
$OAA{key:1} $OAA{key:2} ... # first match, second match, ...
$OAA{key:#} # number of matches
$OAA{key} = value # replace. error if no match
$OAA{key} = {} # delete - or some other way of indicating droppage
$OAA{key} <<= value # insert just before
$OAA{key} >>= value # insert just after
$OAA{key:*} = ... # replace all occurrences
We might distinguish key from value, as in key=value.  Or we might allow the key and the value to be the same, i.e. the entire value might be matched against.

We might require key to be exactly matched - or we might allow regexps, or arbitrary expressions.
$OAA{i WHERE $OAA{i} < 2} = ... # replace all occurrences
We might allow the value to have structure, with fields.
$OAA{rec WHERE $OAA{rec}.age > 10 and $OAA{rec}.weight < 70}
$OAA{.age > 10 and .weight < 70}
Nothing that you can't do with map here - but the notation may be more pleasant. And compact, pleasant notation matters. (As opposed to compact unpleasant notation.)

Verges on relations.
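
For concreteness, a rough Python sketch of such an ordered associative array, matching whole elements against a regexp. The class OrderedAssoc and its method names are hypothetical, and where the notation above leaves the behavior open (e.g. more than one match on insert) the sketch simply picks the first match.

```python
import re


class OrderedAssoc:
    """Ordered list of strings with lookup and update by regexp match."""

    def __init__(self, items):
        self.items = list(items)

    def _hits(self, pattern):
        return [i for i, item in enumerate(self.items)
                if re.fullmatch(pattern, item)]

    def count(self, pattern):               # roughly $OAA{key:#}
        return len(self._hits(pattern))

    def get(self, pattern, n=1):            # roughly $OAA{key:1}, $OAA{key:2}, ...
        hits = self._hits(pattern)
        if n < 1 or n > len(hits):
            raise KeyError(f"no match number {n} for {pattern!r}")
        return self.items[hits[n - 1]]

    def replace_all(self, pattern, value):  # roughly $OAA{key:*} = value
        for i in self._hits(pattern):
            self.items[i] = value

    def insert_before(self, pattern, value):  # roughly $OAA{key} <<= value
        hits = self._hits(pattern)
        if not hits:
            raise KeyError(pattern)
        self.items.insert(hits[0], value)

    def insert_after(self, pattern, value):   # roughly $OAA{key} >>= value
        hits = self._hits(pattern)
        if not hits:
            raise KeyError(pattern)
        self.items.insert(hits[0] + 1, value)


# Treating a PROMPT_COMMAND-like hook list as ordered elements:
hooks = OrderedAssoc(["history -a", "logical-path --old", "date"])
hooks.replace_all(r".*logical-path.*", "logical-path --new")
hooks.insert_before(r"date", "echo before-date")
```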

---

Extend to multi dimensional?



Quoting is the source of much evil (or the lack thereof)

Quotification, or the lack thereof, is the source of much evil.  SQL injection type errors, etc. 

I've written already about a modest proposal, to apply a simple taint bit to bytes saying "This is data, not language syntax."

I hesitate to expose my own bugs, but what the heck:

Today I just had a bug that looked like
bash: syntax error near unexpected token `('
terminating .bashrc too early, before all path setup was done.

as I tried to make my UNIX and cygwin environments converge.

Bug was insufficient quoting in
eval $pathvar=$elem:"$pathval"
in some code manipulating the PATH. Had problems with filenames containing special characters like space and ()s.  Such filenames are relatively uncommon on *IX, but common on Windows, and Cygwin is both.

Fixed by adding quotes
eval $pathvar="'$elem'":"'$pathval'"
At least I hope it's fixed. What if pathvar has a special character? Nah...
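
For comparison, here is how the same string could be built with a library quoting function instead of hand-placed quotes. This is a Python sketch, not a change to my .bashrc: the helper name assignment_for_eval is hypothetical, it relies on the standard shlex.quote, and it rejects any pathvar that is not a plain shell variable name.

```python
import re
import shlex


def assignment_for_eval(pathvar, elem, pathval):
    """Build VAR=value text intended to be safe to hand to a POSIX shell eval."""
    # The variable name is syntax, not data: refuse anything unusual.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", pathvar):
        raise ValueError(f"not a plain variable name: {pathvar!r}")
    # The value is data: quote it once, as a whole.
    return f"{pathvar}={shlex.quote(elem + ':' + pathval)}"


line = assignment_for_eval("PATH", "/c/Program Files (x86)/bin", "/usr/bin")
```

This also covers an element containing a single quote, which my hand-written fix above does not.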


Part of the trouble with quotification is that you have to get it just right.  Too many quotes are bad, just like too few.

I might like to do

eval SAFE(SAFE(SAFE($pathvar)))=SAFE(SAFE(SAFE($elem)))::SAFE(SAFE(SAFE($pathval)))

which might really be considered

eval $pathvar=$elem:$pathval
where I have used green text background (on $pathvar, $elem, and $pathval) to indicate stuff that should have no syntax when eval'ed, and red (on = and :) to indicate syntax.

This works okay if the eval is immediate.   But if the results of eval are themselves evalled an unknown number of times, a simple bit is not sufficient.  Nesting...

---

khb commented:

So to be clear, you suggest that we have (for want of a better term) a bit to indicate that something should be treated as data? Was that the sort of thing the near legendary Intel 432 did? 
I'm not saying it's a bad idea, I'm just trying to map it into historical perspective ;>
Responding inline, because one of blogger's weaknesses is no formatting in comments. (Probably deliberate, to prevent complex conversations.)

  • I do want metadata associated with data
    • This might be a tag bit per memory location (but it doesn't have to be)
      • actually, I can't remember if the iAPX 432 had such tag bits.  It may not have needed them, because it treated everything as object oriented
      • one version of the i960, and the IBM AS/400, had such tag bits.  However, they used the tag bit for privilege - what they did would not have helped directly for SQL injection type attacks.
    • But it doesn't need to be a tag bit.  Much discussion on
Yes, it's tags.  But not necessarily HW tags like some old AI machines have.  SW tags are good enough.  And, in this case, only in strings that you are going to give to a language parser, compiler, evaluator. Basically, use wchar, wide char, 16 bit or wider characters, where some of the

Heck: I edit in gnu emacs, where every character in the buffer can have fairly arbitrary attributes - like 'keyword, 'variable-name, etc.  Why not make use of such attributes in the character set that the language parser accepts as input?  Especially if it can reduce security bugs.

Note: we might not yet be ready to jump to semantically enforced tags - where keywords like IF THEN ELSE would be *required* to have the 'keyword tag on them.  We probably still need to be able to accept untagged ASCII characters as input.  But, we might be ready for semantic hints - tagging something 'definitely-not-a-keyword.  If the parser tries to take a character tagged 'definitely-not-a-keyword as part of a keyword, that could be an error. (As opposed to making it part of an identifier.)
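
A toy Python sketch of the SW-tag idea. Everything here is hypothetical (the helper names, and using ; as the only piece of syntax); it just shows a parser step that refuses to treat data-tagged characters as syntax.

```python
def tagged(text, is_data):
    """Attach a one-bit tag to every character: True means data, not syntax."""
    return [(ch, is_data) for ch in text]


def split_statements(chars, separator=";"):
    """Split on the separator only where it is NOT tagged as data."""
    statements, current = [], []
    for ch, is_data in chars:
        if ch == separator and not is_data:
            statements.append("".join(current))
            current = []
        else:
            current.append(ch)
    statements.append("".join(current))
    return statements


user_input = "x; rm -rf ~"  # hostile "data" containing a separator
command = (tagged("echo ", False)
           + tagged(user_input, True)
           + tagged("; echo done", False))
statements = split_statements(command)
```

The user's semicolon stays inside the first statement as an ordinary character, with no quoting needed. Nested evaluation would need more than one bit, as noted above.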

abort: filename contains ':', which is reserved on Windows: ...

The usual: record one of my stupidities, in the hope that it will save someone else some time in the future - perhaps amnesiac me:

Just spent longer than I should have wondering why I was getting 

abort: filename contains ':', which is reserved on Windows: ...

errors - when all I was trying to do was hg update -r default, back to a version that I had been using just yesterday.

This on cygwin, on windows.

Turns out that my current revision had a broken ~/.bashrc, so instead of using cygwin mercurial, /bin/hg, I was using Windows Mercurial, /cygdrive/c/Program Files/TortoiseHg/hg

And the Windows Mercurial enforces Windows naming conventions.

Fix: just say /bin/hg ...