Related Blog Pages
The original page, http://blog.andy.glew.ca/2016/01/dynamic-keypad-user-interface-elements.html, is more of a shopping project - a page where I take notes, a sort of informal review. Posted in the hope that others may find it useful. It soon became obvious that I was very hopeful about Quadro, a particular iPhone remote control facility, with multiple buttons, whose main claim to fame seems to be using AppleScript/OSA commands. I was hopeful this might make it more reliable than the many, many macro facilities I have used in the past. Perhaps; but Quadro still has reliability problems. Split into a separate page:
Quadro - I love it when it works, but it is often unreliable.
http://blog.andy.glew.ca/2016/01/scripting-raw-io-macros-versus.html is a somewhat generic discussion of the pros and cons of the various levels of abstraction found in such tools. One of my big frustrations with Quadro, second to reliability, is that I cannot drive all user interface actions because Quadro does not seem to provide all raw keyboard and mouse events.
http://blog.andy.glew.ca/2016/01/notes-and-thoughts-about-dynamic-keypad.html for still more generic thought.
For the umpteenth time, I realized that I should have written this in a wiki rather than a blog. Google's blogspot/blogger blog really does not provide much support for pages that need to evolve by being split and otherwise refactored. Let alone transclusion.
Software Running on the Target
Many, IMHO most, such remote macro facilities require some special software "server" running on the target machine. This scares people, because such software can do pretty much anything. It is a security vulnerability. Especially for closed source software.
Such concerns are somewhat allayed for software such as Android Keypad, where the software running on the target is just well-known software such as VNC - essentially a remote terminal facility.
In theory, users could monitor the VNC traffic, and verify that no commands they do not want are being inserted.
In practice, I suspect that few users do this.
Even more specific: the client could appear to the target as an ordinary Human Input Device (HID), connected via USB or BlueTooth. Once again, in theory a monitor could be inserted: in practice, few probably do.
Communications Channel
A communications channel is needed to connect the extended input device to the target. It could be a standard I/O protocol, such as USB or BlueTooth. Probably with the constraints of a "profile", such as a HID (Human Interface Device), constraining what actions can be performed.
But: the extended input device can also benefit from feedback from the target, e.g. to monitor errors, or to do what Quadro does, showing what appears to be an OSA/AppleScript menu command hierarchy to the user. Constraining to a HID profile might prohibit this.
An arbitrary, device defined, protocol can be created. But then users get scared off by the security implications.
A bidirectional protocol such as VNC could be used. Standard, inspires a bit more confidence about security. But VNC is not defined at a level that makes it easy to detect higher level semantics such as "command finished".
"Extended Input Device"
I am beginning to use the term "Extended Input Device" for the class of devices I am talking about here. Just because I need a term to think about. Not restricted to physical hardware: could be software. Not restricted to software running on a client device physically separate from the target: could be software running on the same OS. Not restricted to input - the "extended input device" client may want to be able to receive feedback from its "server" on the target, for purposes such as handling errors, deciding what optimized commands to send, etc. But certainly, for the purposes of this discussion my main emphasis is to produce new, alternative, interface devices. Possibly auxiliary. (Conversely, VNC is both input and output, but essentially extends existing HIDs, rather than creating new types of HID.)
Multithreading
A single extended input device client may talk to a single server on the target. Multiple client devices may talk to the same server. E.g. multiple keypads: a small iPhone keypad, and a larger iPad. Extends to speech, gesture, etc., extended input devices.
The "single server on the target" may talk to multiple applications, running concurrently.
For these reasons it is probably important that the target server be multithreaded, or otherwise capable of handling asynchronous concurrent events.
Similarly, if the client interacts, receiving replies such as for errors, it needs to be able to disambiguate replies from multiple asynchronous target applications.
I conjecture that lack of this sort of multithreading/multiplexing capability is one of the reasons that my trial of Quadro has experienced so many problems. On OSes such as MacOS-X, error dialog boxes probably disrupt the expected sequencing of a single threaded conversation between client, target server, and target app.
Perhaps blocking a single thread is acceptable.
But it would be nice to be able to switch to a different thread, so as to be able to close such dialog boxes, kill and restart target server, etc.
Interestingly, "send and forget" macro packages may be more robust in this context. Raw input event macros are unreliable, since errors may result in raw input events being sent with unexpected consequences. But raw input events can always be sent, and will always do something - so long as they can be sent to a thread that is not blocked, and so long as they are not blocked in a stalled queue along the way.
Another aspect of reliability is the queue structure.
The sort of thing that the X Windows system provides. The sort of thing that used to be provided by some character terminal packages on serial lines, modems, etc., so that you could run several ttys across a single phone line.
VNC, however, is not multiplexed like this. Therefore using VNC as the communications channel between client and server is a step backward.
Essentially, this is the sort of thing that networking, TCP/IP, provide. Or that SSH allows, so that you can do X Windows across an SSH connection. Essentially, a tunnel.
So, why not use existing SW packages, like SSH, for such tunneling? Why think about making it part of the "extended input device client-server communications channel link"?
Mainly, ease of use. It's a pain to have to set up SSH separate from the extended input device, whether for security or multiplexing. Even more of a pain to have to manually connect different instances.
Moreover, the sort of extended input device system we are contemplating is dynamic. Essentially every remote application is a separate thread, requiring its own multiplexing, and thread handling on either end.
Therefore, while it may be fine to reuse existing software such as SSH internally, it would be bad to reuse at the application level, and require the user to set up such connections manually.
The extended input device client/server should be setting up such a multiplexed channel, forking it as needed, and otherwise providing management facilities. At the lowest level, all traffic may indeed go through a separate client and server thread at either end, but such a receiver should drain its buffers as quickly as possible and forward on to the actual worker threads. In a non-blocking manner, potentially discarding transmitted input if the receiver does not have room to accept it.
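A minimal sketch of that draining receiver, in Python. Everything here is hypothetical (the `Demultiplexer` class, the channel names, the tiny capacity): it only illustrates one receiver forwarding into bounded per-channel queues without ever blocking, and counting what it has to throw away.

```python
import queue
from collections import defaultdict

class Demultiplexer:
    """Forwards frames from one link to bounded per-channel queues, never blocking."""

    def __init__(self, per_channel_capacity=4):
        self.capacity = per_channel_capacity
        self.queues = {}                  # channel id -> queue.Queue
        self.dropped = defaultdict(int)   # channel id -> number of discarded frames

    def open_channel(self, channel_id):
        """Create the queue that a worker thread for this channel would read from."""
        self.queues[channel_id] = queue.Queue(maxsize=self.capacity)
        return self.queues[channel_id]

    def dispatch(self, channel_id, payload):
        """Called by the single receiver thread. Returns False if the frame was dropped."""
        q = self.queues.get(channel_id)
        if q is None:
            self.dropped[channel_id] += 1
            return False
        try:
            q.put_nowait(payload)
            return True
        except queue.Full:
            # The worker is not keeping up: discard rather than stall the link.
            self.dropped[channel_id] += 1
            return False

demux = Demultiplexer(per_channel_capacity=2)
editor = demux.open_channel("editor")
browser = demux.open_channel("browser")

# Pretend the "editor" worker is stuck behind an error dialog and never drains.
results = [demux.dispatch("editor", f"key-{i}") for i in range(4)]

# Traffic for "browser" still gets through.
browser_ok = demux.dispatch("browser", "click")
```

Here the first two frames for the stalled channel are queued and the next two are discarded, while the other channel is unaffected. Whether silently discarding is acceptable, or whether the sender should be told, is a design choice this sketch leaves open.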
Conceptually, multiplexing IP, TCP, or SSH across such a communications channel might be easiest in terms of reusing existing code. Of course, the overhead of such full-ish network tunneling might be too much.
TBD: are there any other multiplexed protocols, reasonably standard, that could be leveraged?
TBD: when we talk about SSH or TCP/IP, we usually imagine that we are communicating across wired Ethernet or WiFi. This is good: such networks are reasonably ubiquitous. But it would also be good to be able to network across BlueTooth or USB. Of course people already send TCP/IP across such links, but there may also be suitable multiplexed protocols that can also be used.
CURRENT THOUGHTS:
- multiplex across SSH and TCP/IP most general
- possibly SSL / TLS
- may also want to talk more primitive, possibly mostly unidirectional, HID (Human Interface Device) BlueTooth/USB - just to get ubiquity.
SSL / TLS: can imagine the server on the target system exposing an HTTPS interface. ??????
Security, Authentication
Quadro: the QuadroSync server on the target just provides a TCP/IP address and port for the client to connect to. Scary!!! :-( The Keypad app connects via low security VNC. Of course, should send VNC across SSH. IIRC at least one of these apps provided a primitive two factor: displaying a code on the target, to be typed into the client, or vice versa.
I think that I definitely want such an extended input device to have such a separate authentication channel. And to similarly have nonced encrypted communications, set up by default.
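A rough sketch of the "code displayed on the target, typed into the client" idea, combined with a nonce so that an old answer cannot be replayed. This is an assumption about how such pairing could look, not how Quadro or Keypad do it; the function names are made up. It covers authentication only, not encryption, and a six-digit code by itself is guessable offline, so a real design would want something stronger underneath.

```python
import hashlib
import hmac
import secrets

def new_pairing_code():
    """Six-digit code the target would display for the user to type into the client."""
    return f"{secrets.randbelow(10**6):06d}"

def new_nonce():
    """Fresh random challenge, issued by the target for each connection attempt."""
    return secrets.token_bytes(16)

def prove(code, nonce):
    """Proof that the sender knows the code, bound to one particular nonce."""
    return hmac.new(code.encode(), nonce, hashlib.sha256).digest()

def verify(code, nonce, proof):
    """Constant-time check of a proof against the expected value."""
    return hmac.compare_digest(prove(code, nonce), proof)

# Target side: show a code and issue a nonce.
code_on_target = new_pairing_code()
nonce = new_nonce()

# Client side: the user typed the code; answer the challenge.
proof = prove(code_on_target, nonce)
accepted = verify(code_on_target, nonce, proof)

# A captured proof is useless against a fresh nonce.
replay_accepted = verify(code_on_target, new_nonce(), proof)
```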
One "advantage" of using BlueTooth is that the existing BT security infrastructure can be used. Although BT has had security problems, surely they have solved them, so that wireless BT mice and keyboards can be used securely.
One advantage of using wired USB is that at least wireless man-in-the-middle attacks are impossible.
Error Handling
Always a pain.
Want to be able to reset parts - an individual app, etc. - without having to reset whole.
If an app spawns an error dialog, ideally would not have the input stream go and press buttons in that dialog, unless the dialog was expected. Or possibly direct, and hope that nothing causes the dialog to do bad stuff.
Want to be able to create a separate client-server thread to go and handle the dialog errors. Probably default to discarding any pended input blocked by the dialog, although sometimes such dialogs allow pended macros to complete once some button like "are you sure" is pressed.
KLUGE IDEA: scripting interfaces like AppleScript provide lousy handling of errors, and of things like startup delays: any script that contains "delay(5s) /* give app chance to start */" is IMHO bad.
If the script interface does not support this, may be able to kluge by sending simply observable commands - such as a command that opens an expected dialog, that we can look for, on successful completion.
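A sketch of the alternative to a fixed delay, in Python: poll for something observable, with a timeout. `wait_until` and `FakeApp` are hypothetical stand-ins; in practice the condition would be whatever check the scripting interface allows, such as "is the expected dialog open".

```python
import time

def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns true or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

class FakeApp:
    """Stand-in for a target app whose expected dialog appears some time after launch."""

    def __init__(self, startup_seconds):
        self.ready_at = time.monotonic() + startup_seconds

    def dialog_is_open(self):
        return time.monotonic() >= self.ready_at

# A quick app is detected as soon as it is ready, not after a fixed 5 seconds.
fast = wait_until(FakeApp(0.1).dialog_is_open, timeout=2.0)

# A hung app yields an explicit failure instead of blindly sending more input.
never = wait_until(FakeApp(10.0).dialog_is_open, timeout=0.2)
```

The point is that the caller gets a yes/no answer it can act on, rather than hoping the delay was long enough.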
Sketching a Protocol
Across any channel on the multiplexed link, we want to be able to send
- raw input events
  - sequences of keystrokes and modifier keys
  - ditto mouse events
  - possibly low level key down/up events
- commands, whether
  - locally unique menu commands
  - menu paths
  - or fully functional, sufficiently unique, commands
Each such being addressed to a particular thread and/or a particular multiplex.
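To make the list above concrete, here is one possible shape for such messages, sketched with Python dataclasses and JSON. All names and fields (`KeyEvent`, `MouseEvent`, `MenuCommand`, `Envelope`, `channel`) are invented for illustration; JSON is just the simplest encoding to show, not a recommendation.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class KeyEvent:
    key: str
    modifiers: tuple = ()
    action: str = "press"    # "press", or low level "down" / "up"

@dataclass
class MouseEvent:
    x: int
    y: int
    button: str = "left"
    action: str = "click"

@dataclass
class MenuCommand:
    path: tuple              # e.g. ("File", "Save As...")

@dataclass
class Envelope:
    channel: int             # which multiplexed channel / target thread
    kind: str                # name of the message type carried in body
    body: dict

def wrap(channel, message):
    """Address a message to one channel."""
    return Envelope(channel, type(message).__name__, asdict(message))

def encode(envelope):
    """Serialize an envelope to bytes for the link."""
    return json.dumps(asdict(envelope), separators=(",", ":")).encode()

def decode(data):
    """Parse bytes from the link back into an envelope."""
    return Envelope(**json.loads(data))

wire = encode(wrap(3, KeyEvent("s", modifiers=("cmd", "shift"))))
received = decode(wire)
```

Note that tuples come back as lists after the JSON round trip, so a receiver would rebuild the typed message from `kind` and `body` if it cares about the difference.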
I highly suspect that it would be good to identify recursive, nested, transaction boundaries, for error handling. Not because I expect to be able to provide atomicity, but so that extra queued up commands can be discarded when an error occurs.
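One possible policy, sketched in Python: when a command fails, discard the rest of the innermost enclosing transaction and carry on after it. The `BEGIN`/`END` markers, the `run` function and the command strings are all hypothetical, and "innermost only" is just one choice; discarding outward to some higher level would be equally plausible.

```python
from collections import deque

BEGIN = object()   # hypothetical marker: a transaction opens here
END = object()     # hypothetical marker: the innermost open transaction closes here

def run(commands, execute):
    """Execute queued commands; on failure, drop the rest of the innermost transaction.

    A failure outside any transaction drops everything still queued.
    Returns (executed, discarded).
    """
    pending = deque(commands)
    executed, discarded = [], []
    while pending:
        item = pending.popleft()
        if item is BEGIN or item is END:
            continue
        try:
            execute(item)
            executed.append(item)
        except RuntimeError:
            depth = 0   # nesting depth of transactions opened while skipping
            while pending:
                skipped = pending.popleft()
                if skipped is BEGIN:
                    depth += 1
                elif skipped is END:
                    if depth == 0:
                        break          # reached the end of the failed transaction
                    depth -= 1
                else:
                    discarded.append(skipped)
    return executed, discarded

def flaky_execute(command):
    """Stand-in for the target: typing the name hits an unexpected dialog."""
    if command == "type:name":
        raise RuntimeError("unexpected dialog")

queued = ["open-app", BEGIN, "menu:File>Export",
          BEGIN, "type:name", "press:Enter", END,
          "press:OK", END, "close-app"]
executed, discarded = run(queued, flaky_execute)
```

In this run only "press:Enter" is discarded; the commands after the inner transaction still execute. No atomicity is attempted: whatever ran before the failure stays done.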
New threads/multiplexes are created whenever talking to a new application. Possibly to each instance of an application, each window, etc.
Requests sent one way may get replies sent the other. Because we want non-blocking by default, transaction IDs are needed to match them up. Unidirectional messages may not require replies, but the sender should be able to handle unexpected replies.
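A small Python sketch of matching replies to requests by transaction ID, without blocking the sender. The `Correlator` class and the dictionary message layout are assumptions made for the example; the transport itself is left out.

```python
import itertools

class Correlator:
    """Matches asynchronous replies to the requests that caused them."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}      # transaction id -> callback awaiting a reply
        self.unexpected = []   # replies nobody was waiting for

    def send(self, channel, payload, on_reply=None):
        """Build a request; register a callback only if a reply is wanted."""
        txid = next(self._ids)
        if on_reply is not None:
            self.pending[txid] = on_reply
        return {"txid": txid, "channel": channel, "payload": payload}

    def receive(self, reply):
        """Route a reply to its callback, or record it as unexpected."""
        callback = self.pending.pop(reply["txid"], None)
        if callback is None:
            self.unexpected.append(reply)
            return False
        callback(reply)
        return True

c = Correlator()
log = []
save = c.send(1, "menu:File>Save", on_reply=lambda r: log.append(("save", r["status"])))
quit_app = c.send(2, "menu:App>Quit", on_reply=lambda r: log.append(("quit", r["status"])))
key = c.send(1, "key:Enter")   # unidirectional: no reply requested

# Replies arrive out of order, plus one that was never asked for.
c.receive({"txid": quit_app["txid"], "status": "ok"})
c.receive({"txid": save["txid"], "status": "error"})
c.receive({"txid": key["txid"], "status": "ok"})
```

The unrequested reply ends up in `unexpected` rather than raising an error, which is the tolerant behavior the paragraph above asks for.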