The content of this blog is my personal opinion only. Although I am an employee - currently of Imagination Technologies' MIPS group, in the past of other companies such as Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Friday, June 23, 2017

ISCA 2017: " Ignore the warnings about the certificate."

It seems wrong that the webpage for ISCA, the International Symposium on Computer Architecture, which has two sessions on security and also includes a workshop on Hardware and Architectural Support for Security and Privacy, cannot do certificate-based security properly.

I haven't looked at the certs, but I understand why: creating a cert for a subdomain with such a relatively short active life is a pain. Yes, even with the EFF's free certs.  (Note that conference websites often survive years, sometimes decades - but nobody gets credit for securing such a site after the conference ends. (Hmm, maybe bad guys who want to attack the sorts of people who go to such conferences should make such stale sites infectious.))

Still: computer architects should know enough to get valid certs.

And I wish that the world was more comfortable allowing domain owners to issue signing certs for subdomains, and so on.

Tuesday, June 06, 2017

(D)VCS branching models: notes in progress

I like branches.  But I don't like YOUR branches

I don't like git's branches. But then again, I don't really like Mercurial's branches. Or Bazaar's branches. Or Perforce/SVN/CVS/RCS branches. I may be polluted: I used CVS branches extensively, back in the day. Heck, I used RCS branches in RCS-wrapper-tools. I've used Mercurial and Bazaar branches extensively. I like what Brad Appleton has written about branches in
Streamed Lines: Branching Patterns for Parallel Software Development - but then again, Brad and friends are really talking about streams of software development, not branches.  This may really be the problem: several different concepts use the same underlying mechanism.

This post is work in progress. I want to make some notes about branches, often in reaction to statements in other webpages. I will try to properly reference those webpages - but I am more interested in evolving the ideas than being properly academic.

Why I am writing this

1) Writing stuff like this helps me understand the differences between tools, and adapt my work style to new tools.  Although I have been using git for more than 10 years, it has only recently become my primary VCS for personal stuff - and I have to use Perforce at work. Until recently I mainly used Bazaar for personal stuff, but bzr is declining, and Mercurial for some projects at work.  Plus, in the more distant past, SVN, CVS, RCS.

Similarly, I may not have noticed features added during git's evolution.

2) I am intrigued by analogies between version control software and OOO speculative hardware.  Git, in particular, is all about rewriting history to be "nice". OOO speculation is similarly about making execution appear to be in a serializable order.  Similarly, memory ordering hardware often performs operations in an order inconsistent with the architectural memory ordering model, but monitors and/or repairs the ordering so that it appears consistent.

3) I am just plain interested in version control.

3') I had started writing my own DVCS, which I abandoned when git came out.  Mine was intended to have better support for partial checkins and checkouts - not just of workspaces, but of entire repos.  It was intended to be able to handle repos for the same or overlapping source trees that had been created independently - i.e. that did not have common ancestors within recorded history. (Why?  Think about it...)

Immediate Trigger

I knew that git branches are really just refs to versions - with what others might call a branch being some form of transitive closure of ancestry. Not quite the same thing, but tolerable.
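This "branches are just refs" point can be demonstrated in a throwaway repo. A minimal sketch, assuming git is installed; the repo, commit message, and fake user identity are made up for illustration:

```shell
# Create a scratch repo with one commit.
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "first"

# The current branch is literally just a ref: a stored pointer to the tip.
branch=$(git symbolic-ref --short HEAD)   # "master" (or "main" on newer git)
git rev-parse "refs/heads/$branch"        # prints the tip commit hash...
git rev-parse HEAD                        # ...the same hash

# The "branch" as a set of commits is only implied: the ancestry of that tip.
git log --format=%s "$branch"
```

Everything git stores about the branch is that one pointer; the "transitive closure of ancestry" is computed on demand by commands like git log.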

Even with this, I felt strongly that git branches are not really first class.

I was flabbergasted when I learned that git branch descriptions are not transferred to remote repositories.

This suggests a requirement for a DVCS to treat something as a first class concept: the objects representing that concept should be versionable, and pushable to remote repositories.
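Branch descriptions illustrate the failure concretely: they live in the repo-local config file, which push and fetch never transfer. A sketch, assuming git is installed; the description text and fake identity are made up:

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "first"
branch=$(git symbolic-ref --short HEAD)

# Non-interactive equivalent of "git branch --edit-description":
git config "branch.$branch.description" "Task branch for issue #1234"

# The description exists only in .git/config - not in any object that
# push/fetch would transfer, so it is neither versioned nor pushable.
git config "branch.$branch.description"
```

By the first-class test above, branch descriptions fail on both counts: not versioned, not pushable.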

Random Note Snippings

---+ Branch Names and Versions

Several writers on DVCS, usually git advocates, have said that the problem with Mercurial style branches is that they are recorded in the commit history, and that this prevents deleting or renaming branches.

For example:
[Contreras 2011] In Mercurial, a branch is embedded in a commit; a commit done in the ‘do-test’ branch will always remain in such a branch. This means you cannot delete, or rename branches, because you would be changing the history of the commits on those branches. 
although I recall, but cannot find, a better statement.

(Yeah: Mercurial's obsession with immutable history tends to get in the way of clear thinking.  But HGers (huggers?) rewrite history all the time, e.g. via rebase. So imagine that we are talking about a hypothetical VCS that wants to keep some of the good things about CVS and HG and BZR style branches.)

So: branch names need to be deleted and renamed.  It would also be nice to be able to hide the branch names and the branch contents. But probably more important, branch names may need to be reused.  And quite likely different developers may want to have different branches that have the same name, i.e. different branches with the same name may need to be distinguished, especially if simultaneously active.

Below, I go on about naming conventions for contours (a set of file versions), and branches. E.g. names that are permanent, e.g. a contour name RELEASE-2017-06-15-03h13UDT_AFG, versus floating LATEST-RELEASE. Or a branch name, like a task branch BUGFIX-BRANCH-ISSUE#24334, versus a more longlasting branch or stream R1+BUGFIXES-MAINTENANCE-BRANCH.

Insight: whenever you are tempted to put uniqifying info like date or unique-number in a name, you are thinking about versioning.

Wait!  We are talking about version control systems!!!  VCSes are all about uniqifying different objects with the same name!   For that matter, so are hierarchical directory structures.  And so on, e.g. object labels and tags.

==> How to distinguish different branch objects with same name.

==> Encourage actions that help distinguish branches.

E.g. instead of saying "switch to branch BBB", where BBB is created if it does not already exist,

prefer "create new branch BBB", which may warn you if the name BBB already exists.
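In 2017-era git terms, "git checkout -b" is already the "create new branch" form: it errors out rather than silently reusing an existing name. A sketch, assuming git is installed; the branch name BBB and fake identity are made up:

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "first"

git checkout -q -b BBB            # "create new branch BBB": succeeds
git checkout -b BBB 2>&1 || true  # again: refuses, name already exists
```

What git lacks is the next step argued for above: a way to keep two distinct branch objects that happen to share the name BBB.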

==> PROBLEM:  tools that might simply go "merge branchname" might now have to say "branchname is not unique - which instance of branchname do you want to merge?"  Yet another error case, but not necessarily a real error, just an ambiguity.

How do we resolve such ambiguities?

a) query

b) priority - e.g. PATH-like for hierarchical names. Choose the branch that is "closest" to the guy doing the merge.

---+ Contours? Who needs Contours?

"Contour" is my old RCS-era name for a set of file versions.  

"Whole repo" VCSes don't need contours, since any commit implies state for all files.

Except... when a project is assembled from multiple repos.  Even here, VCSes that have subrepo support usually are smart enough to include the commit or checkin or version number of all subrepos in their top level commit.

But... subrepos don't scale all that well.  E.g. not so much for my personal library, where each directory node should be considered separately versionable.

---+ SVN / Perforce Style branches

As I say elsewhere:
VCSes are all about uniqifying different objects with the same name!   For that matter, so are hierarchical directory structures.  
SVN and Perforce subsume the former in the latter: branches are really just trees in the hierarchical directory structure.

...Pros/cons. Workspaces assembled from multiple branches.   Where does the branch level live? "Floating"

---+ Git Branch Descriptions are not first class

[StackOverflow 2012 - git - pushing branch descriptions to remote]  The description is stored in the config file (here, the local one, within your Git repo), then, no, branch descriptions aren't pushed. Config files are not pushed (ever). See "Is it possible to clone git config from remote location?"

Simple text files are, though, as my initial answer for branch description recommended at the time.
Branch descriptions are all about helping make a helpful message for publishing. Not for copying that message over to the other repos, which won't have to publish the same information/commits.
I can't criticize the guy who provided this answer, VonC, because earlier he discussed exactly this issue, proposing using text files to hold pushable branch descriptions - in exactly the same way that I have hacked branch descriptions before in other VCSes, and with exactly the same problems.

Using text files to hold branch descriptions is potentially an example of what I might call a file that wants to cross branch boundaries.  Or, a workspace that is mostly branched, but which usually contains the mainline of the branch description text file.

Sure, you may not always want that.  But it is nice to be able to do so.

---+  [StackOverflow 2009]

[StackOverflow 2009]: Git glossary defines "branch" as an active line of development. This idea is behind an implementation of branches in Git. ... The most recent commit on a branch is referred to as the tip of that branch. The tip of the branch is referenced by a branch head, which is just a symbolic name for this commit.

A single git repository can track an arbitrary number of branches, but your working tree (if you have any) is associated with just one of them (the "current" or "checked out" branch).

GLEW COMMENT: I have often wanted to create working trees which are composed of several branches. Yeah, yeah - you can simulate this by merges - but I want to make it convenient. 

E.g. say that a particular configuration = mainline of most code, but the FOO branch of some library libFoo.   Yes, this is almost equivalent to saying that this configuration is really all the FOO branch - but it provides more information, in saying "Yes, the configuration is FOO specific, but in general we expect only the libFoo library to be different with FOO."

My thoughts on partial checkins and checkouts often involve this. More: partial repositories. Referencing tools and repos that have separate version control systems.  libXXX may be checked into its own repo in isolation as that repo's mainline.   But from the point of view of some other tool that uses libXXX, say T, libXXX's mainline is not T's mainline. Yet(?).  A partial checkin of libXXX amounts to creating a CANDIDATE for T's mainline.  Once the candidate is tested, it becomes T's mainline, assuming tests pass.  But if tests fail, T's version of libXXX may lag, or may fork and diverge from libXXX's mainline.

This notion of "candidate" maps well to Git's model.  Such a candidate is just a HEAD. Once tested, the candidate label may go away, and no longer clutter our listings of branches and tags and other named references.
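A sketch of the candidate idea in git terms, assuming git is installed; the branch names and commit messages are made up. The candidate is just a ref, and once its commits are merged into T's mainline the label can be deleted without losing anything:

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "T mainline"
main=$(git symbolic-ref --short HEAD)

# A partial checkin of libXXX arrives as a candidate for T's mainline:
git checkout -q -b CANDIDATE-libXXX
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "libXXX update"

# Tests pass: the candidate becomes T's mainline, and the label goes away.
git checkout -q "$main"
git merge -q --ff-only CANDIDATE-libXXX
git branch -d CANDIDATE-libXXX   # no longer clutters branch listings
git log --format=%s              # the commits themselves remain
```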

---+ [Contreras 2011] and [Contreras 2012] 

[Contreras 2011] and [Contreras 2012] provided good comparisons of the Git and Mercurial branching mechanisms.  But Contreras is fairly rabid about git, and makes many statements of the form "Why would anyone ever need to do it that way?  There's a different way to do it in git. Or, you should not need to do it - I never have." That sort of statement pisses me off, even when I agree with it.

[Contreras 2011] Reacting to Google's analysis comparing Hg with Git, which says that History is Sacred.
This was an invalid argument from the beginning. Whether history is sacred or not depends on the project, many Git projects have such policy, and they don’t allow rebases of already published branches. You don’t need your SCM to be designed specifically to disallow your developers to do something (in fact rebases are also possible in Mercurial); this should be handled as a policy. If you really want to prevent your developers from doing this, it’s easy to do that with a Git hook. So really, Mercurial doesn’t have any advantage here.


(1) I agree: it MUST be possible to change history. 

(1.1) Or at least to be able to remove some things from the history, e.g. it must be possible to remove code that you do not have a license for, that was inappropriately checked into your repo.  Or possibly code that you HAD a license for at some point in time, but for which the license expired.

I would prefer it if the code with license problems was removed, but some sort of note left behind.  Possibly an automated note, e.g. with a crypto checksum/hash and other metadata, so that you could determine what the missing code should be if you ever again have a license.

But I can also imagine the need to hide one's tracks: to completely expunge all mention of the unlicensed code.  Trying to avoid lawsuits.

(1.2) Plus, I like the good history rewriting stuff like rebase.

(1.2') Even better if we can change our view of the history, without losing the history

BUT...  I really would prefer that rebase did not lose history.   I think that it can sometimes be useful to know that a branch started off with a different original base, and was rebased later.  If nothing else, it can explain bugs caused by the rebased code using an idiom that was otherwise eliminated between original base and the new rebase's origin.  I think of this as an original branch, and a rebase'd shadow of that original branch.

Yes, clutter:  But I think that we need to create a UI that hides such clutter, that presents only the clean history, but which remembers all the dirty details.

[Contreras 2011]   It’s all in the branches ... Say I have a colleague called Bob, and he is working on a new feature, and create a temporary branch called ‘do-test’, I want to merge his changes to my master branch, however, the branch is so simple that I would prefer it to be hidden from the history.

GLEW COMMENT:  so hide it already.   Hide = leave in the history, but don't show it by default.  As opposed to removing it from the history.

[Contreras 2011]  hg branch != git branch In Git, a branch is merely one of the many kinds of ‘refs’, and a ‘ref’ is simply a pointer to a commit. ... In Mercurial, a branch is embedded in a commit; a commit done in the ‘do-test’ branch will always remain in such a branch. This means you cannot delete, or rename branches, because you would be changing the history of the commits on those branches. You can ‘close’ branches though. As Jakub points out, these “named branches” can be better thought as “commit labels”.
GLEW COMMENT:  Key: git branches are just refs.  Specifically, the ref to the tip of what other models call a branch.  AFAICT there is not much distinguishing a git branch from other refs.  
There should be different types of ref.   E.g. a named ref, i.e. a VERSION of all files.   Some VERSIONS are intended to be fixed, immutable - e.g. "Passes-all-tests-date-YYYY-MM-DD-HH". Other VERSIONS "float" - e.g. "Passes-all-tests-LATEST".   
But such a version named ref is very different from a branch.  A branch is a set of versions, that probably have some parent-child relationship. I.e. a (contiguous) path through the DAG.
[Contreras 2011] In Mercurial, a branch is embedded in a commit; a commit done in the ‘do-test’ branch will always remain in such a branch. This means you cannot delete, or rename branches, because you would be changing the history of the commits on those branches. 
Bullshit. Obviously Mercurial has history rewriting tools, that can do things like deleting or renaming branches.
But, an important point underlies the git-centricity:  Mercurial records the branch a commit was made on in the commit metadata.  By default.  Obviously git can also do this - see [StackOverflow 2015 - add Git branch name to commit message] - but it does not do so by default.
"By default" matters.  One of Glew's Rules: First provide the capabilities. Then design the defaults. Git may provide the capabilities.  But many properties are implicit, convention, in git.  Not first class.
And, yes, branches may need to be renamed. (Although as usual I would like to be able to rename, but also remember the old name).   For gitters that have added branch names to the commit message, you could edit all the commit messages.  But if the branch name is typed metadata, standardized, it could be automatically recognized and renamed.
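The opt-in approach of [StackOverflow 2015 - add Git branch name to commit message] boils down to a hook. A minimal sketch, assuming git is installed; the "[branch: ...]" tag format and fake identity are my invention, and note that this is exactly the "implicit, convention" problem - the branch name ends up as free text, not typed metadata:

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo

# prepare-commit-msg runs for every commit; append the current branch
# name to the commit message file ($1).
cat > .git/hooks/prepare-commit-msg <<'EOF'
#!/bin/sh
branch=$(git symbolic-ref --short HEAD 2>/dev/null)
[ -n "$branch" ] && printf '\n[branch: %s]\n' "$branch" >> "$1"
EOF
chmod +x .git/hooks/prepare-commit-msg

git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "fix the bug"
git log -1 --format=%B   # message now carries the branch name
```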
GLEW COMMENT:  since Git's "branches" are really just the tips of a branch, the set of versions on the branch is really the set of ancestors. Whereas Mercurial's branches, labelled in the commit history, indicate the path taken, for different reasons.
[Contreras 2011]  I paraphrase: "Mercurial bookmarks are like git refs (but with no namespace support)."

One poster said that "Mercurial really wants a linear history".   But the git advocates' examples often rewrite a nonlinear history into a linear one (figures omitted).
Seems to me like the gitters want a linear history, and delete (not hide) the non-linearities.
TBD: put an example of what I mean: messy history, and linearized "clean" view.
GLEW COMMENT: I was pissed the first time I created a task branch in git, and then merged. In CVS and Mercurial (and probably others) I expected and wanted to see a node on the master saying "merged task branch".  Even if there had been no intervening changes on the master.  Instead git just pointed the master's HEAD to the task branch - i.e. the task branch lost its identity.  Better to have done [StackOverflow 2015 - add Git branch name to commit message]!!! - if the task branch name was the bug number.  (Yeah, yeah, you can just add a hook.  Everything can be hooked. Yeah, yeah.  (That's an example in English of a double affirmative being a mocking negative.))

(Eventually I learned about Git's --no-ff, disabling "fast forwarding" on merges.)
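With --no-ff, the merge node I wanted survives even when a fast-forward was possible. A sketch, assuming git is installed; BUGFIX-1234 is a made-up task branch name, as in the bug-number workflow above:

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b BUGFIX-1234
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "fix the bug"
git checkout -q "$main"

# Plain "git merge" here would just move HEAD (fast-forward) and the
# task branch's identity would vanish; --no-ff keeps a merge node.
git -c user.email=a@b.c -c user.name=a merge -q --no-ff --no-edit BUGFIX-1234
git log --format=%s   # top entry mentions BUGFIX-1234
```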

[Contreras 2012] The fundamental difference between mercurial and git branches can be visualized in this example:
[Merge example figure]
In which branches is the commit ‘Quick fix’ contained? Is it in ‘quick-fix’, or is it both in ‘quick-fix’ and master? In mercurial it would be the former, and in git the latter. (If you ask me, it doesn’t make any sense that the ‘Quick fix’ commit is only on the ‘quick-fix’ branch)
In mercurial a commit can be only on one branch, while in git, a commit can be in many branches (you can find out with ‘git branch --contains‘). Mercurial “branches” are more like labels, or tags, which is why you can’t delete them, or rename them; they are stored forever in posterity just like the commit message.
GLEW COMMENT: Yes, this is a key difference.
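The difference Contreras describes is observable with git branch --contains. A sketch, assuming git is installed; branch names follow Contreras's example, and the merge uses --no-ff so the quick-fix branch keeps a tip of its own:

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b quick-fix
git -c user.email=a@b.c -c user.name=a commit -q --allow-empty -m "Quick fix"
fix=$(git rev-parse HEAD)
git checkout -q "$main"
git -c user.email=a@b.c -c user.name=a merge -q --no-ff --no-edit quick-fix

# In git a commit is "in" every branch whose tip can reach it:
git branch --contains "$fix"   # lists both quick-fix and the mainline
```

In Mercurial the same commit would carry "quick-fix" in its metadata forever, and would be on that branch only.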
We might talk about branches and sub-branches.  'Quick-fix' is a sub-branch of 'master'.
There might be branches  or paths that start off in the 'master' branch, and end up in 'some-other-branch'. Such a "crossing-branch" is not really a sub-branch at all.
In fact, a branch that is merged and then terminated is no longer a branch at all.  At least, in trees, branches usually do not start off low down in the trunk, and then merge back into the trunk.  Although this can be arranged by grafting.  Hortitorture.
I would like to have better terms.  "Streams" can diverge and recombine, but "streams" are too dynamic. "Paths" may be a better term, although paths can be bidirectional, and version control systems usually go forward in time. Paths can fork and merge.  Paths may be created out of distinct stepping stone nodes.
(Hmm: railway "tracks" might be even better than paths. Similarly bidirectional. Tracks can fork and merge. Tracks can be shunts. Sidings. Tracks have railway ties => rather like nodes.)
(Or possibly roadways. Networks of one way streets.  Side streets, dead ends, cul de sacs.  Multiple lanes, that may be divided - rather like the parallel streams we so often see. Service roads running beside major highways. ...)

(Later: perhaps "routes", as in rock-climbing?  Rock-climbing routes are usually, mostly, one-way.  Although I like downclimbing, most people rappel down; and down-climbing is different enough that down-climbing routes are frequently not the same as up-climbing routes.)

(Or, how about ski-trails?  Again, mostly one-way, downhill in this case.)
But "branches" are the term most people use.  Even though many people have different ideas about what a branch means.
So back to talking about branches and sub-branches. 'Quick-fix' is a sub-branch of 'master'. 'Quick-fix' is one of the two paths that lead from the initial commit to the head of the master path above. The checkin "Quick fix" is on the branch(path) "quick-fix", and leads to the node "Merge branch quick-fix" on branch "master".
AFAICT git has no concept of a branch as a path, a contiguous directed linear subset of nodes, versus the set of all nodes/paths leading to a node.
Much of [Contreras 2012] amounts to confusion about these concepts.
And then piling on immutability, Mercurial's recording branch in the commit metadata.

[Merge example figure]
GLEW COMMENT: the way this graph is drawn is biased towards git's model, where the branch is designated by its youngest node.   TBD: draw with 2 or more nodes on each path.  Color the sets of nodes on each path as the branch.

[Contreras 2012] Anonymous heads are probably the most stupid idea ever; in mercurial a branch can have multiple heads. So you can’t just merge, or checkout a branch, or really do any operation that needs a single commit.
One of GLEW'S OBSERVATIONS: the most important thing is to be able to give something a name.  The next most important thing is to not be required to give it a name.
Mercurial's anonymous heads can be a pain.  Just like arithmetic zero.

[Contreras 2012] Git forces you to either merge, or rebase before you push, this ensures that nobody else would need to do that

[Contreras 2012]
I didn’t ask for a list of all the commits that are currently included in the head of the branch currently named ‘release’ that are not included in the head of the branch currently named ‘master’. I wanted to know what was the name of the branch on which the commit was made, at the time, and in the repository, where it was first introduced.
How convenient; now he doesn’t explain why he needs that information, he just says he needs it. ‘git log master..release‘ does what he said he was looking for.
Pissant arrogance, lack of imagination. Here's an example of why you might want the branch name: some workflows put a BugFix#, Issue#, or ECO# in the branch name.
Sure, there are other ways to do that, both in git and other VCSes.
But: it's a convention, as are, usually, those other ways.
Here's another way of thinking about compatibility between VCSes: it would be nice if procedures and concepts ported.  It would be nice if you could import from, say, Mercurial to git, and then export back to Mercurial, and get (almost) exactly the same repo.

Some articles and references

[StackOverflow 2009] := StackOverflow: Pros and Cons of Different Branching Models in DVCS

[Brad 1998] := Streamed Lines: Branching Patterns for Parallel Software Development TBD notes

[Contreras 2011] := Mercurial vs Git - It's All in the Branches.  Nice overview, although Git biased.

[Contreras 2012] := No, mercurial branches are still not better than git ones; response to jhw’s More On Mercurial vs. Git (with Graphs!)

[StackOverflow 2015 - add Git branch name to commit message]

[Stackoverflow 2009 Jakub] := Git and Mercurial - Compare and Contrast - much liked by [Contreras 2011]. TBD - notes

TBD: J. H. Woodyatt’s blog posts:
Why I Like Mercurial More Than Git, and More On Mercurial vs. Git (with Graphs!)

[StackOverflow 2010 - Branch descriptions in git] - especially interesting to me because, along with mention of the then new branch description feature, VonC discusses shortcomings of that feature, and use of text files as a not-really-satisfactory but possibly better alternative.

Thursday, June 01, 2017

Minimize AHK GUI window to taskbar (not desktop, not tray)

Took me a long time to find this, so posting for the benefit of future-forgetful-me, and possibly others.  (But not motivated enough to post to AHK fora or Stackexchange.)

I have an AutoHotKey script that creates several, possibly many, AHK GUI windows.  The number is not known in advance: they are created dynamically, when the user asks.

PROBLEM: when the script first started running, minimizing these GUI windows minimized them to the desktop, in a manner familiar to old X Windows users from the days predating taskbars and docks, etc.

It did NOT minimize to the taskbar.

Nor did they minimize to the tray.

They minimized to the desktop.

Screen clipping:

You can see the generic AHK taskbar button and the icon for my guix-sebp.ahk script, which I manually pinned to the taskbar.

But you also see the little non-taskbar minimized windows.  Here it happens to be above the taskbar - in actuality, they might appear randomly, in the middle of a different display. Augh!!!

Many web questions ask how to minimize to the tray; I did not find anyone talking about minimizing to the taskbar.

There is much posting about hooking the XXGuiSize() function, called when an AHK GUI window is resized, maximized, or minimized.  But XX here is the GUI window name.  Since I don't know in advance how many such windows an instance will have, nor their names, I would have to generate such functions on the fly.

There does not appear to be a way to have all GUI windows share a single GuiSize() hook.


The way to minimize to the taskbar, from the AHK Gui docs: "+E0x40000 would add the WS_EX_APPWINDOW style, which provides a taskbar button for a window that would otherwise lack one."  'via Blog this'


Thursday, May 11, 2017

Review of my Surface Book with Performance Base

Overall I like my SurfBook, my Surface Book with Performance Base. But ...

I purchased it in late January, but delivery was delayed until late February.  I did not really start using it until late March, and only made it my main machine in April, replacing my old MacBook Pro Retina 15" mid-2014. Reason for delay in starting to use: projects at work - the delay in shipment missed a window of opportunity.

I suspect that delivery was delayed because I wanted the 1TB SSD.  To be honest, I actually wanted to purchase an ordinary, non-book Surface with a 1TB SSD, but that seems not to be available. :-(


Overall, I am happy - I rate it 4.5 out of 5.

Happier than I was with my MacBook

I love the convertible touch tablet.

I hate the hinge. The hinge scares me.

I hate the fact that the pen keeps falling off.

Pre-Purchase Rationale

Why 1TB?
  • This is my main machine. I am not a big streaming video user or anything like that, but I do play around with OSes:
  • I am currently using 461GB: Windows, Cygwin, and I have barely started, not fully installed, Ubuntu / Windows Subsystem for Linux
  • On my MacBook I had 756GB in use. Much of that was the Parallels VM to run Windows apps like FrameMaker.  I was able to reduce that dramatically when I migrated. 
  • Nevertheless, buying a new laptop with a fraction of the diskspace seems retrograde, a time-waster
  • I hate disk wars
Why 1TB SSD?
  • Do I really need to explain?
  • I did consider non-SurfBook convertibles, some with 1TB rotating disk.  I could not find a reasonable hybrid configuration with a "large enough" SSD cache.
Why switch away from MacBook?
  • No touchscreen for MacBook. I love touch, I love tablet, I love pen. I considered the MacBook with TouchBar, but it is not big enough, and apparently not easily customizable 
  • Work makes me use Windows, for FrameMaker in particular.  Using the Parallels VM was always a hassle.
  • Similarly, Microsoft Outlook runs better on Windows, even though available on Mac and iPhone. Features such as conversation mode are only fully supported on Windows.  
  • I will miss MacOS being a real Unix-family OS.  Historically I have used Cygwin on Windows, but within the first few days it was obvious how much slower Cygwin was for things like starting shells than MacOS. Hence my interest in Ubuntu / Windows Subsystem for Linux, although its unsupportedness is a worry.
Why not Linux? ChromeBook? etc 
  • I want touchscreen/tablet. I like pen.
  • Windows definitely seems to be the leader in convertible laptop / touch tablet.  Especially Microsoft Surface and Surface Book, but also other Wintel manufacturers like Dell and HP.
  • ChromeBook not available as a convertible tablet, AFAIK.  Who wants a touchscreen clamshell that cannot act as a tablet?
  • Uncertain how good Linux support for touchscreen convertibles is.  I will probably try when this SurfBook nears EOL.
Why not an iPad?
  • I frequently use my machine without network connectivity. It must be freestanding.
Why tablet / convertible?
  • I really, Really, REALLY want a portable computer that I can use for real work on an airplane in economy.  (Not a problem for rich people who can fly non-sardine classes.)
  • The problem is the touchpad, which adds 3-4 inches of mostly unnecessary depth.
  • Certainly not my 15" MacBook.  Even a 13" clamshell is not so good.
  • Whereas with a tablet I can fold away the keyboard, and get stuff done.  Screen keyboard not as nice as real keyboard, but can do a lot just by touch and pen.
  • I also sometimes use a tiny separate keyboard on plane, with Surface or SurfBook in tablet mode.
  • BTW, the non-book Surface is, surprisingly, not so good on plane. Kickstand and touchpad on cover take up too much depth. Have tried touchpadless keyboards with a slot to hold Surface...
Why touch?
  • I have long used "GUI Extenders" to increase my (in)efficiency with apps like Outlook email. E.g. keyboard and mouse shortcuts, and systems of menus and buttons, for commonly used commands. Sometimes joystick and game controller shortcuts.
  • On MacBook I used apps such as Quadro, which allow an external iPad to provide touchscreen buttons for MacOS.  I also wrote my own (Python AppleScript), using Duet Display to give me a touchscreen for my MacBook.   It was a pain to have to deal with the external iPad - more to carry. Plus, although I liked Quadro, it was obviously consumer grade software, not power user friendly.  No version control. No diff. Etc.
  • I have written AutoHotKey gUIx SEBP (Graphical User Interface eXtender, Self-Editing Button Pad).  Does most of what I used Quadro for, plus is real software that I can manage.  And the touchscreen is always present.
  • Right now, I am working while walking on my treadmill desk, with my SurfBook in outward facing mode (clamshell, screen reversed) on a tray above my keyboard, with 3 external monitors.
  • I also use my ahk-guix-sebp on the Surfbook by itself. I split the screen, with Outlook occupying 80% of the width to the left, and my touch buttonpad on the right few inches.
External Monitors:
  • Did I mention that my MacBook could only handle two external monitors?   While my SurfBook can handle 3 external monitors, just as well as the Thinkpad touch that I used before the MacBook. (30" 2560x1600 mini-DP, with 2 1200x1920 24" on either side using USB display adapters; + SurfBook LCD 3000x2000 between keyboard and 30")
  • This was a surprise - I would have expected Apple to be better at multi-monitor support.

And now, the rest of the review story

Most of the above were my pre-purchase rationale, with some minor feedback on usage.

Review after almost a month of use.

+ SurfBook is good on plane
   + First time I have been able to empty my Outlook Inbox on a plane in years

+ Touchscreen button pads work nicely
   + Using AHK (AutoHotKey)

- I still hate the Surface Book hinge, that does not close completely
  - I am constantly worried that it will get crushed in backpack.

+ I like the fact that the SurfBook hinge works without kickstand
   + I constantly use it sitting in a chair with the clamshell on my knees, in situations where the kickstand Surface was inconvenient
   + A few days later: spent much of the day sysadminning my daughter's non-book Surface Pro 3. Drove home how nice it is to be able to adjust the screen angle on the SurfBook.

- Cannot detach display from power/base when power is low or off

   I have come to hate and fear the following error message, when I try to detach the SurfBook screen to reverse it:
Tablet Battery is Low
Please charge the battery now and try detaching later.
(which I would clip and insert, except Google Blogger won't let me - I think because of system font size) 
This usually happens when I have been using the SurfBook as a clamshell laptop at kitchen table, and then want to plug in at treadmill - since I use in reverse clamshell at treadmill.

Or it happens when I have been using as a tablet, folded over on top of keyboard, and I want to switch back to being able to use as a laptop.

Or it happens when I have run out of battery, and want to plug it in to use it.

Moral: I often want to flip display when power is low, but cannot detach when power is low.

I almost never use the display detached without the keyboard attached. 

But Microsoft seems to assume that the only reason to detach is to use keyboardless.  Not just to reverse the display.

- When using in tablet mode, I tend to prefer landscape to portrait. And I prefer to use with keyboard attached. With hinge away from my body (because of its thickness), display edge opposite hinge resting against my body. But the SurfBook's two edge buttons live on the edge that presses into my body - power, and whatever I have bound volume up/down to. You can imagine the problems - powering off by accident, etc.  
+ Not a problem if keyboard / power base detached :-).
- But then, I have to carry my backpack around. :-( 
- When the "power base" / keyboard is detached, I worry that the exposed male connector "prongs" are likely to break, since they stick out at a funny angle and are likely to catch on something (and then MS will blame me, rather than their industrial designer)
+? Overall, I think this detachable hinge might almost work when left at a desk
- But it doesn't work for me, since I am constantly carrying it between home and work.

- As everyone knows, Windows 10 "tablet mode" GUI is suboptimal. Very poor use of pixels
   - also, not available when connected to multiple displays
   - oddly enough, I think that I most want tablet mode when connected to multiple displays
   + I have bought a 3rd-party tiling window manager...
- But the Windows "desktop mode" is really bad for touch
   - buttons are far too small, far too likely to mishit
+ My ahk-guix-sebp buttonpad helps me use the touchscreen in desktop mode
- but it would be nicer not to have to write code for everything I want to touch in desktop mode
+ At least I *can* use touch mode in clamshell / laptop physical configuration

+ I like the SurfBook magnetic pen
- I dislike that pen is constantly falling off.  E.g. when in my backpack.
- I usually find it after a few days, but seldom have it when I want it.  E.g. on an airplane.
- The stick-on loops that came with the pen for my daughter's non-book Surface also failed after a few months.
- I wish that it had a physical dock or slot for the pen.  Possibly in the space left because the frigging hinge doesn't close?

+ I use clamshell laptop mode a lot (display facing keyboard)
+ I use reverse clamshell mode a lot (display facing away from keyboard, angled circa 75 degrees)
+ I use tablet clamshell mode moderately (display facing away from keyboard, closed to hide keyboard)
   - see note about power button getting hit by accident
- I almost never use "detached" screen from powerbase/keyboard

- I dislike Windows Hello face recognition login.  I do not trust it. In fact, I disable the cameras in the BIOS and tape them over.
- I prefer fingerprint, like on the old Surface Pro 4 cover, or on my iPhone and newer Macs.
- But all biometrics are problematic, security risks. Fingerprints can be lifted. Face recognition can be faked out by photos or masks.
- I want fingerprint as an easy way of keeping myself logged on - e.g. set a short timeout to locking the screen, that can be unlocked with a fingerprint.  But I want to be required to enter a longer password at least once a day, or every few hours.
- Face recognition could be used same way, even less intrusively than fingerprint. But the face recognition camera can be used to spy on you.  Not just a privacy risk - it can probably see enough muscle movement to infer your password.

- Similarly, microphones are also a security risk. They can also infer keystrokes, e.g. passwords.
- I started off disabling both cameras and microphones. I disable Cortana voice recognition.
- But I have grown to like some MacOS "say" commands, integrated in my shell. Mainly, to alert me when a long running shell command has finished.
- Unfortunately, the SurfBook BIOS cannot enable speaker output while disabling microphone input. (I know, any speaker can also be used as a microphone. But I would still like separate disables. And HW that did not allow input from the speaker.)

Irony: I have one of the earliest patents on the webcam, but I would prefer my computer not to have one. I'll go further: laptops and tablets should not have cameras and microphones: leave that to the phone. Phones should have Faraday cage cases that block sound, light, and EM.

'via Blog this'

Yes, really, I did (predict IME security bugs)

I usually try to resist the temptation to say "I told them so".  Saying "I told you so" doesn't make you popular with the people you told.  It is better to move forward and get things done, with the cooperation of those people, than it is to be proven right but shunned.

But in the case of the Intel IME / IAMT / ARC embedded CPU inside the master CPU/chipset, I am going to give in.  Because, yes, really, I did predict such security bugs. 

Darn...  I wrote a moderately long diatribe.  But lost it because "Blogger is not a content management system, and does not provide versioning."  (I could swear Blogger used to version - perhaps I am misremembering Google Docs.)


Many folks believe that an "assistant" embedded processor is the answer to every problem in computer architecture: I/O; DMA; block memory copies; ... manageability; security.


The "assistant" embedded processor often has a less evolved security architecture than the "master" CPU.  Processor hardware, OS, SW.   Often the reason that the assistant embedded processor is more efficient is that it has thrown out advanced security. E.g. virtual memory page protection. E.g. privilege levels.

Oftentimes, the people advocating such "assistant" embedded processors say "It doesn't matter, we are only using it in narrowly constrained ways.  We can prove the code secure."  Yeah, right.

Over time, such "assistant" processors have added more and more security features.

But even when the assistant embedded processor has security features comparable to the master CPU, it still sucks that it is different.

By construction there will be more code in such a system to code review and security audit.  Even if functional code is the same, there is boot code.

Worse, the assistant and master may have comparable but slightly different security models.  E.g. one may have page permissions RWX-RW-RX but not execute-only or read-only, while the other may have all the relevant permissions RWX-RW-RX-R-X. Just a small difference, but it can matter.

Worse yet, there is a limited supply of experts for code reviews and security audits.  Even for one system, typically the master.   Even more so if they must be expert in both master and assistant processors.  Moreover, even if you can find such experts, it is damned hard to keep both security models in your head at the same time. It is common to say "oh, that's okay on x86 but not on ARC (or MIPS, or ARM, or your favorite embedded processor)", and later realize that you have it wrong.  Or vice versa.

Worse yet if similar but slightly different OSes and SW libraries are available on master and assistant.

Why heterogeneous, given the risks?

Given the security risks of heterogeneous systems with different master and assistant processors, why do them?

Apart from the "my embedded processor is inherently more efficient than your master CPU" wars - which are usually wrong, although possibly true in certain instances (certainly for specialized I/O processors such as GPUs and DSPs) - the biggest reason to have embedded processors separate from the master CPU is:

OS independence.

If the embedded processor provides a high-enough-level API that it can be used without substantial modification for many different OS clients, then you may have less code to maintain (and code review, and security audit).

Furthermore, the very functions that you wish to provide using the assistant embedded processor may need to be protected from the OS.  E.g. scanning for malware.

Plus, in a minor way, the fact that the assistant embedded processor is running a different ISA than the mass-market main x86 CPU makes it a little bit less likely that Joe Random Cracker has the skillz to break in.  But I would not count on that.  (Besides, you can get ISA encoding diversity while preserving ISA semantic consistency. See below.)


Dedicate parts of Multiprocessors: I like the way that IBM mainframes do manageability: in an LPAR, typically one of the mainframe CPUs is reserved to run the manageability tasks.  Same CPU architecture.  Sometimes even the same OS architecture, although running with hypervisor (MVME) privilege.

You could have different microarchitectures with the same ISA and privilege architecture.   But many companies cannot afford to create such a diversity of microarchitectures, while customers may want to license only one instance, typically the most popular but also the most expensive. (IMHO licensing should encourage such consistency, rather than discourage it.)

Shared memory is a security hole:   We live in a world of web services, SAAS (Software As A Service) - get used to it. If the heterogeneous "assistants" in your SOC interface via message passing, or even as if over a local TCP/IP network (hopefully a high performance, NATed, local network) - well, at least you are in the range of things that we should know how to secure.  Even though we often fail.

Trouble is, this only works for services that can be decoupled via message passing.  It doesn't work for a service that requires access by the assistant embedded processor to master CPU memory, e.g. for virus scans of active DRAM, or for DRAM deduplication.  But it can work well enough for things like virus scanning of data as it flows from NIC to ME to CPU. If it flowed that way...

Message passing not fast enough?  We know how to make it faster.  Zero copy tricks.   Direct modification of master CPU page tables.   Hardware TLB shootdown mechanisms.  Easier when all parties have similar virtual memory architecture.
      Trouble is, most modern CPUs can only vaguely hope to accelerate message passing interfaces - via a "Smart" embedded assistant processor.
      In some ways I advocate standardizing message passing, possibly using risky shared memory tricks - so as to reduce the need to use such shared memory tricks to interface to other "Smart" devices.

Even higher performance message passing could directly access processor registers.   But that is even harder to do heterogeneously, although I can imagine APIs.

But...  the infamous Intel AMT bug occurred because the "smart" embedded processor was running a poorly secured webserver.  Yep... there's no helping it.   Although we should know how to secure a webserver, we keep f**king up.
      But...  a bug in an SOC embedded webserver would not have been so bad if isolated to a subsystem that could only access 1 NIC in a multi-NIC system. It is the fact that the subsystem that was compromised has such global access, to everything, more than an OS or hypervisor - that is the really bad thing.

IMHO the Principle of Least Privilege is not just a good idea - it should be the law.  As in, if a company designs a system with blatant disregard for the PLP, and if somebody sustains losses as a result of a security compromise, then they should be liable. And more.
     All of these mitigations are really just ways of saying "Principle of Least Privilege".
If you have to allow the "smart" assistant processors shared memory access to master CPU memory: Hardware level firewalls. Standard control and status register space and enumeration, so that you can verify that they are correctly configured.   Unchangeable hardware IDs as well as software IDs on bus transactions, so that you can easily create invariants like "the network packet filter is never allowed to directly access the audio microphones".  (Of course, you might want to allow the NIC to access the microphones for ultra-low-power voice conferencing. And the packet filter, for NSA-like stuff.)

Hardware Page Protection Keys:  associated with physical addresses, or at least the addresses seen on the bus. In addition to virtual address based protection.

Memory Encryption:  may not prevent access by malware running on an embedded processor to CPU storage, or vice versa.  But may prevent secrets leaking, and prevent malware crafting attack packets, whether code or data, to compromise further.

I suppose that I should also mention capability systems for privilege management, both OS/SW, but possibly also hardware to shared memory. Fine grain.   But I butted my head against that wall, most recently with what became Intel MPX.


"Smart" assistant embedded processors have security risks.  

We really want to prevent "smart" assistant embedded processors from being "smart-ass".  Or, worse, "evil genius".

Intel IAMT bug: strncmp(trusted,untrusted,strlen(untrusted))

Intel's embarrassingly negligent IAMT bug seems ... easy to imagine how it happened.

Embedi's analysis shows that the bug is in code that looks like
if( strncmp(computed_response, user_response, response_length) )
using the user_response length to limit the length of the string comparison, rather than the expected length (which I believe is, in this case, constant 16, 128-bits, the length of the MD5 hash used in Kerberos authentication).

Immediately I thought of
strncmp(computed_response, user_response, strlen(user_response))
which inspired the riff below

Pay no attention to the riff below

Embedi's writeup indicates that the user_response is actually
    char *str;
    int len;
which probably invalidates any supposition that strlen may at one time have been involved.

Nevertheless, the riff is fun, and although inaccurate in detail, probably has some aspects of truth.

Riffing on strcmp(trusted,untrusted,bad_length)

When I heard and saw
strncmp(computed_response, user_response, response_length)
Immediately I thought
    strncmp(computed_response, user_response, strlen(user_response))
I imagined the original code was
strcmp(computed_response, user_response)
I guessed that some "security audit" might have said
Security Auditor: strcmp is insecure, this code must be changed to use strncmp
Might not have been a human security audit.  Might have been a secure-lint tool.


The programmer who made the change to use strncmp looked around for a size_t to use as the maximum length to compare.  If the strings were "naked" null terminated C strings, he may just have guessed wrong, choosing the second rather than the first.

 Embedi's writeup indicates that the user_response is not a naked C string, but is actually
    char *str;
    int len;
Hey, wait, there's a length here we can pass to strncmp!  It just happens to be the user response length. The wrong string length.


Perhaps it was not obvious (to the programmer making the change) what the buffer length of the computed_response is.  BTW, it is really a buffer length, not a string length.  It might be declared as
typedef uint32_t MD5_hash[MD5_SIZE_IN_WORDS];
typedef uint8_t MD5_hash[MD5_SIZE_IN_BYTES];
Or it might have been malloc'ed.

The code might have made some provision for changing eventually to a non-MD5 hash, so you would not want to hard-wire the MD5_SIZE into the code. The code doing the comparison might not need to be aware that the hash involved was MD5.


Possibly the hypothetical strcmp code was actually more secure than the strncmp code - so long as both hashes were guaranteed to be null terminated.  So long as the user response was guaranteed not to overflow any buffer allocated for it.


But come to think of it, optimized network code probably does not copy the user response from the header into a separate newly allocated string.  It probably sets AUTH_HEAD_VALUE.str to point directly into the buffer containing the headers.

(Or at least the buffer containing part of the headers.  If the headers are split into several buffers... well, that's a bug that has been seen before.)

So, it is probably not "naked null-terminated C string" data.  Probably not:
strcmp(computed_response, user_response)
But if it were, then
strncmp(computed_response, user_response,
            max(strlen(computed_response),strlen(user_response)) )
might have been better.  Really that is equivalent to strcmp - but at least it might silence the security audit tool's warning about strcmp.  Though it might just replace it with equally annoying warnings about strlen instead of strnlen.
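A quick sketch (my own) checking that equivalence claim: bounding strncmp by the longer of the two string lengths can never cut the comparison short, so it agrees in sign with strcmp.

```c
#include <string.h>

/* strcmp/strncmp only guarantee the sign of the result, so normalize. */
static int sgn(int x) { return (x > 0) - (x < 0); }

/* strncmp bounded by max(strlen(a), strlen(b)).  Any difference,
   including one string being a strict prefix of the other, appears at
   an index below that bound, so this agrees in sign with strcmp. */
static int cmp_via_strncmp(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    return strncmp(a, b, la > lb ? la : lb);
}
```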


But - the code audit / lint tools that might have triggered this may not have been security oriented. They may have been using a buffer overflow detector like valgrind or purify. These may have warned about read accesses beyond the memory allocated to hold the hashes.

Strictly speaking, neither strcmp nor strlen need to perform buffer overflow read memory accesses, if given properly sized null terminated C string arguments.  But... if an "optimized" strcmp or strlen is used, it is common to process things 4 or 8 bytes at a time - the number of bytes that fit into a 32-bit or 64-bit register - in which case the code might read beyond the end of the memory allocated for the string, past the terminating null byte.  In past lives, when I was more the "low level assembly code optimization guy" rather than a security guy, I have written such code.  It is hard to write optimized strcmp and strlen code that doesn't go past the terminating null, that still runs faster than doing it a byte at a time. Even fixed-length strnlen, strncmp, bcopy, memcpy are hard to write using registers wider than the official granule size, without going past the end.  Which is one reason why I advocate special instruction hardware support, whether RISC or CISCy microcode.
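To make the "reads past the NUL" point concrete, here is a sketch of the kind of word-at-a-time strlen I mean (my own illustration, not any particular libc's). It can read up to 7 bytes beyond the terminating null - though never past the enclosing aligned 8-byte word, so never across a page boundary - which is precisely the access pattern that buffer-overflow-read detectors flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Word-at-a-time strlen sketch: process 8 bytes per iteration. */
static size_t strlen_wordwise(const char *s)
{
    const char *p = s;

    /* Scan byte-wise up to the first 8-byte-aligned address. */
    while (((uintptr_t)p & 7) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Classic bit trick: (x - 0x01..01) & ~x & 0x80..80 is nonzero
       iff some byte of x is zero.  Note the aligned 8-byte load may
       touch bytes past the terminating NUL. */
    const uint64_t *w = (const uint64_t *)(const void *)p;
    while (((*w - 0x0101010101010101ULL) & ~*w & 0x8080808080808080ULL) == 0)
        w++;

    /* Locate the exact NUL within the word that contains it. */
    p = (const char *)w;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Writing this so that it never touches a byte past the NUL, while still beating the byte-at-a-time loop, is the hard part - hence the appeal of dedicated string instructions.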

Examining my motivation

When I heard about the IAMT bug, my first reaction was "I told them so - I told them that the ARC embedded CPU would lead to security bugs."  Yes, really, I did (see another post).

But I also said to myself "That's what you get when you have the B-team working on an important feature".

Face it: at Intel, at least when the manageability push started, the A-team worked on CPUs, hardware and software.  The guys working on chipsets and I/O devices were usually understaffed and in a hurry.

I don't like thinking such uncharitable thoughts. Moreover, chipsets and I/O devices are more and more important.  One of the smartest guys I know told me that he decided to go into chipsets because there were too many guys working in Intel CPUs, too much bureaucracy, while in chipsets he could innovate more easily.

So I imagined these scenarios, about code reviews and security audits and lint tools
Security Auditor: strcmp is insecure, this code must be changed to use strncmp
and making changes in a hurry, as a way of imagining how such bugs could have happened.

Doesn't excuse the bugs.  But may be more productive in determining how to prevent such bugs in the future, than simply saying (as I heard one security podcaster say): "This is unimaginably bad code.  One wonders how Intel could ever have allowed such code to pass code reviews and security audits.  One wonders if one should ever trust Intel CPU security again."


My overall point is that code reviews, security audits, and tools such as security lints or buffer overflow detectors, may have triggered code changes that introduced this bug to originally correct code.

This is no excuse.  It is even more important to review code changes than original code, since bug density is higher.

Of course, it is possible that the bug was present since the code was originally written.

'via Blog this'