Michael Roberts
September 28, 2024
Status: Draft
This is a proposal for a simple scheme for coordinating actions between pinball simulator programs like VPX, and "front-end" menu systems, such as PinballX, Pinup Popper, and PinballY. It's designed for inter-operation between any simulator and any front end that wishes to implement the mechanism. The goal is to smooth out some common rough edges in the interactions between front ends and simulators that arise because of the way the front ends currently have to "trick" the simulators into doing what they want by simulating user input actions. This proposal aims to replace that rather jury-rigged, ad hoc approach with a well-defined protocol for requesting specific actions from the simulator.
The motivation was the frequent problems I hear about from PinballY users trying to get VPX to close properly when exiting a game. PinballY and other front ends act as replacements for the Windows desktop shell, their main function being to launch games. The front ends all aim to create a "kiosk" type of environment that hides the Windows desktop UI from the user, and a big part of that lies in controlling the lifetime of the launched program through the front end's UI. The obstacle has always been that the Windows desktop doesn't think in terms of controlling launched programs' lifecycles; the desktop's involvement in a program's lifecycle ends as soon as the program is launched, and from that point on, the desktop just thinks of the program as a peer running alongside it asynchronously. There's no concept in the Windows desktop of "returning to the shell". Instead, it's up to the user to close the program through the program's user interface when they're done with it. Given that the desktop shell has no need for a programmatic action to "close the launched program and return to the shell", Windows doesn't define any APIs along those lines at the system level. So the front ends have to resort to rather crude ad hoc approaches, some of which have to be tailored to each individual program they know how to launch. In some cases, the launcher can simply close the program's windows, but in other cases the launcher has to send it bespoke sequences of synthetic key events, or even invoke the nuclear option of TerminateProcess().
This proposal takes a pretty obvious approach to solving this: given that Windows doesn't define a standard way for one program to politely ask another program to exit, let's just define one of our own, as a published inter-process communication protocol that any simulator can implement, and that any front end can invoke. The first command in the protocol is one asking the simulator to "please exit now", but as long as we're at it, we can make the protocol extensible to other common points of interaction between simulators and launchers, to handle other common tasks in a more structured and reliable way.
The term "inter-process communication protocol" makes this sound like something scarily complex, but the scheme here is really ridiculously simple. Don't be put off yet.
The core of this scheme is a Win32 "registered window message" that a pinball simulator can handle in its message loop, and which launchers can send to the simulators they launch. Windows provides a mechanism that allows an application to register a custom window message by name. Registering the name assigns the message a 16-bit integer identifier within the WM_xxx message space, in a sub-range reserved for this purpose. The name is a string that serves as a globally unique ID for the message. It's meant to have the same degree of uniqueness and permanence as a GUID, but it doesn't have the rigid format of a GUID; it's just an arbitrary string chosen by the application developer. The integer ID assigned upon registration is arbitrarily chosen by the operating system, but once registered, it's associated with the name string globally for the duration of the Windows session (that is, until the next system reboot), so unrelated processes that register the same name will get the same numeric ID back. This allows one process to send the registered message to a separate process, with certainty that the receiving process will understand the message ID as having the same meaning.
Registered window messages are specifically designed for inter-process communications between applications that don't need to know anything about one another except for the name and semantics of the custom message. Any application can use SendMessage() or PostMessage() to send the named message ID to any window from any other application, knowing that the message ID is uniquely defined across the entire system. Receiver applications process the named message in their window procedures, alongside the usual set of WM_xxx messages.
Registered window messages are extremely simple for the sender and receiver to use. The registration is accomplished with a single API call, and there are no associated resources to manage. The receiver can handle the message with one additional case in its main window procedure switch on message ID. All of this makes the mechanism easy to retrofit into an existing application, with zero architectural impact; the only changes needed are the new code related to the new message handling.
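To make the registration step concrete, here's a minimal C++ sketch. The helper names `IsRegisteredMessageId` and `GetFrontEndControlsMessageId` are made up for this example and aren't part of the proposal. The 0xC000 to 0xFFFF range is the range Windows documents for `RegisterWindowMessage` results, and the name string is the message name proposed in this document. The Win32 call is compiled only on Windows.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Range from which Windows assigns registered message IDs
// (the documented return range of RegisterWindowMessage).
constexpr std::uint32_t kRegisteredMsgFirst = 0xC000;
constexpr std::uint32_t kRegisteredMsgLast = 0xFFFF;

// Hypothetical helper: true if 'id' lies in the registered-message range.
constexpr bool IsRegisteredMessageId(std::uint32_t id)
{
    return id >= kRegisteredMsgFirst && id <= kRegisteredMsgLast;
}

#ifdef _WIN32
// Hypothetical helper: registers the message name on first use and caches
// the ID. Every process that registers the same name during the same
// Windows session gets the same value back. Returns 0 if registration fails.
inline UINT GetFrontEndControlsMessageId()
{
    static const UINT id = RegisterWindowMessageW(L"PinSim::FrontEndControls");
    return id;
}
#endif
```

Both the front end and the simulator would make the same registration call, each in its own process. There's nothing to release afterwards.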
A major limitation of the registered window message as an inter-process communications mechanism is that it can only send a very small amount of data per message. The payload is limited to the WPARAM and LPARAM message parameters, which are only 32 bits each on an x86 Windows system and 64 bits each on an x64 system - so the maximum usable payload to remain compatible with all Windows systems is only 64 bits total. Importantly, SendMessage() doesn't marshal pointers across process boundaries for custom messages, so the WPARAM and LPARAM can't be used to pass pointers to structures or even strings; all you get is the two integer slots. It's possible for cooperating applications to use the WPARAM/LPARAM as seed data for some additional IPC mechanism that can exchange larger objects, such as shared memory or pipes, but anything like that has to be defined at the application level; it's not part of the registered message scheme itself. For our current purposes, though, the WPARAM/LPARAM payload is sufficient, and we can always elaborate it in the future with additional layers if the need ever arises.
To allow for easy future extensions to this proposal without the need to add new registered messages, we define our message structure so that the WPARAM contains a sub-command code, which specifies the concrete action that the simulator is meant to perform. The LPARAM is left as an argument value whose meaning varies by sub-command.
The sub-command WPARAM codes are pre-defined, so that all participating applications can agree on the meanings of all codes.
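As a sketch of how a simulator might decode the WPARAM/LPARAM pair, the C++ fragment below separates a pure decoding step from the window procedure that acts on it. Everything named here is an assumption for illustration: the numeric sub-command values, the `kCmdQuerySupport` and `kCmdCloseApp` names, the reply value of 1 for the close and foreground commands, the `g_gameWindow` global, and the choice of posting `WM_CLOSE` to start shutdown. Only `GAME_TO_FOREGROUND` and `QUERY_GAME_HWND` are names used in this proposal.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins for the Win32 integer types so the portable part compiles anywhere.
using WPARAM = std::uintptr_t;
using LPARAM = std::intptr_t;
using LRESULT = std::intptr_t;
#endif

// HYPOTHETICAL sub-command codes, for illustration only. The real values
// are whatever the proposal's list of codes assigns.
constexpr WPARAM kCmdQuerySupport = 1;      // name and value assumed
constexpr WPARAM kCmdCloseApp = 2;          // name and value assumed
constexpr WPARAM kCmdGameToForeground = 3;  // value assumed
constexpr WPARAM kCmdQueryGameHwnd = 4;     // value assumed

constexpr LRESULT kInterfaceVersion = 1;    // reply to the query command

// Side effect the window procedure should carry out after decoding.
enum class Action { None, BeginExit, ActivateGameWindow };

struct Decoded {
    Action action;  // what the caller should do
    LRESULT reply;  // value to return from the window procedure
};

// Pure decoding step: maps the sub-command to an action and a reply.
// 'gameWindow' is the game window handle as an integer, or 0 if no game is running.
inline Decoded DecodeFrontEndCommand(WPARAM subCommand, LPARAM /*arg*/, LRESULT gameWindow)
{
    switch (subCommand) {
    case kCmdQuerySupport:     return { Action::None, kInterfaceVersion };
    case kCmdCloseApp:         return { Action::BeginExit, 1 };           // reply assumed
    case kCmdGameToForeground: return { Action::ActivateGameWindow, 1 };  // reply assumed
    case kCmdQueryGameHwnd:    return { Action::None, gameWindow };
    default:                   return { Action::None, 0 };  // unknown code: not supported
    }
}

#ifdef _WIN32
static HWND g_gameWindow = nullptr;  // hypothetical: the player window, if any

// Windows-only sketch of the extra test in the simulator's window procedure.
LRESULT CALLBACK MainWndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    static const UINT frontEndMsg = RegisterWindowMessageW(L"PinSim::FrontEndControls");
    if (frontEndMsg != 0 && msg == frontEndMsg) {
        const Decoded d = DecodeFrontEndCommand(
            wParam, lParam, reinterpret_cast<LRESULT>(g_gameWindow));
        if (d.action == Action::BeginExit)
            PostMessageW(hwnd, WM_CLOSE, 0, 0);  // start shutdown, then return at once
        else if (d.action == Action::ActivateGameWindow && g_gameWindow != nullptr)
            SetForegroundWindow(g_gameWindow);
        return d.reply;
    }
    return DefWindowProcW(hwnd, msg, wParam, lParam);
}
#endif
```

One practical detail: because the registered ID is only known at run time, it can't be a `case` label in the usual `switch (msg)`, so the sketch tests for it with an `if` ahead of the normal message handling.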
See Names and Numbers below for a list of sub-command codes.
The front-end program (e.g., PinballX, Pinup Popper) sends the pinball simulator program the registered message through a designated open window. The simulator might have multiple windows, though, so the front end must figure out which window to address as the handler.
The front end figures out which simulator window handles the message as follows:
Simulators aren't obligated to handle the message through more than one of their windows. For example, VPX might wish to handle it only through its main designer MDI frame window. If a simulator does handle the message in multiple windows, it shouldn't matter which window the front-end program addresses for a given message, since most of the messages are meant to have a global effect on the whole simulator application. (The command description will say otherwise if we ever come up with messages that are meant to have effects confined to a specific target window.)
"PinSim::FrontEndControls"
A registered message name has essentially the same purpose as a GUID, of serving as a permanent, universally unique, public identifier. A GUID gets its uniqueness from randomness: a developer rolling up a GUID chooses a large number of bits at random to form a GUID, and the format contains enough bits to make the "birthday problem" odds of any two developers accidentally choosing the same GUID vanishingly small across all time and space. Registered message names are chosen by hand by human authors, rather than at random, and they're generally chosen to be printable and somewhat descriptive, but like a GUID, they must be chosen to make the odds of a collision with other developers' names effectively zero. This can best be accomplished by making the string fairly long, and by choosing a name that contains plenty of descriptive information on at least a few axes, such as the originating organization or author name, the product name, and the specific function for whatever's being identified. It's become somewhat conventional in such hand-picked unique names (which show up in other contexts, such as Java packages) to use Internet domain-like naming that includes the originating organization's domain name as a qualifier. In our case, we use the abstraction "PinSim" rather than any particular organization, but the intention is the same. (And even though Visual Pinball is the only simulator that's currently contemplating implementing this, we thought it better not to imply through the name that this is a Visual Pinball-specific feature, since it's meant to apply equally well to any other simulator that wishes to implement it.) We use the C++-like double-colon notation to suggest a namespace related to pinball simulation, although the notation means nothing special to the Windows registration mechanism, which treats the whole string as an opaque series of bytes with no internal structure.
This command queries whether or not the target window processes the registered window message. If so, the window procedure returns 1.
We plan to use the return code as an interface version number. If a set of new commands is added in the future, we'll bump this to 2 to indicate that the new commands are available, as well as the original commands. The front-end caller can check the return value to determine if the simulator will accept the newer commands from the v2 set. Adding another new set of commands after that would bump it to version 3, and so on.
Note that not implementing this message in the window procedure, by letting it fall through to the system default window procedure, will have the effect of answering "not supported" by returning 0. The system default window procedure always returns 0 for an unhandled message, so windows that don't handle the registered message don't need any handling for it. This default handling also means that a front end can try querying an application's windows for participation even if the application isn't known to be a participant, because Windows conventions require applications to ignore messages in the registered message range that they don't subscribe to, by simply returning 0.
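Here's a small C++ sketch of how a front end might interpret the reply to this query. The helper names and the numeric value of `kCmdQuerySupport` are hypothetical. The comparison follows the versioning rule described above: a reply of N means command sets 1 through N are available, and 0 means the window doesn't take part.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
// Stand-in for the Win32 result type so the portable part compiles anywhere.
using LRESULT = std::intptr_t;
#endif

// True if a window that answered 'reply' accepts the commands introduced
// in interface version 'requiredVersion' (1 = the original command set).
constexpr bool SupportsCommandSet(LRESULT reply, LRESULT requiredVersion)
{
    return requiredVersion >= 1 && reply >= requiredVersion;
}

#ifdef _WIN32
constexpr WPARAM kCmdQuerySupport = 1;  // HYPOTHETICAL name and value

// Windows-only: asks one window for its interface version. A window that
// lets the message fall through to DefWindowProc answers 0.
inline LRESULT QueryFrontEndControlsVersion(HWND hwnd)
{
    static const UINT msg = RegisterWindowMessageW(L"PinSim::FrontEndControls");
    if (msg == 0)
        return 0;  // registration failed; treat as not supported
    return SendMessageW(hwnd, msg, kCmdQuerySupport, 0);
}
#endif
```

A front end would typically call `SupportsCommandSet(QueryFrontEndControlsVersion(hwnd), 1)` before sending any other command to that window.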
This command closes the simulator application. It should end the game, close all simulator windows, and exit the simulator process.
If possible, the program termination should proceed asynchronously from the message handler. That is, initiate the termination and then immediately return from the message handler. If the front end wants to wait for the program to fully exit, it can either wait for its last window to close, or wait for the process to terminate.
Bring the game window to the foreground and take focus, if possible. The front end sends this to the simulator when it wants to switch back to the game after another application (such as the front end itself) was temporarily brought into the foreground, interrupting the game.
This message lets the simulator select which of its windows should have focus when the game is brought back into the foreground after another program takes focus temporarily in the middle of a game session. Simulators often display multiple windows while a game is running, and some of the game windows might belong to other cooperating processes (this is the case with VP). This makes it all but impossible for a front end to guess which window should have focus when the game is in the foreground. The simulator, in contrast, usually knows exactly which window should have focus during play. This command eliminates the need for jury-rigged heuristics in the front end to figure out which window to activate, by delegating the decision to the competent authority.
This command is only likely to work when sent by the parent process that launched the simulator, and only while the parent process itself is in the foreground (or when the simulator is already the foreground process). This is because Windows restricts the use of the SetForegroundWindow() API, which is the typical way that the simulator would accomplish the requested context switch programmatically, to certain conditions, one of which is that the current process or its parent is currently in the foreground. If the front-end program's process didn't launch the simulator directly, it might not be possible for the simulator process to effect the foreground window change on its own while the front end is currently in the foreground. In that case, the front end can use QUERY_GAME_HWND to obtain the window handle to the simulator's main game window, and perform the SetForegroundWindow() call itself, which will succeed as long as the front-end program is currently in the foreground, no matter the relationship between the two processes. The basic logic seems to be that a program is allowed to cede its place in the foreground, by moving any other process to the foreground while it's in control, but it can't seize foreground status when it's in the background, with the special exception that a child can take the foreground from its parent process whenever the parent is in the foreground. The restrictions are obviously intended to prevent an application from suddenly snatching away focus while the user is in the middle of work in another program, and the special exception for parent/child processes is probably made on the theory that such process pairs are tightly coupled enough that they'll only make such switches among themselves when it's appropriate for the UI state in both programs, which is certainly true for our use case.
Note that front ends aren't expected to send this message during launch. Front ends generally assume that Windows will automatically bring the simulator to the foreground at launch as part of the normal OS-level application launching procedure, and that the simulator will make sure that focus lands in the correct one of its windows when the game starts. We mention this only to clarify that simulators shouldn't rely on receiving this message as part of the normal game startup process.
This message lets the front-end program obtain the window handle to the simulator's main game window, for operations such as switching the game window to the foreground. If a game is in progress, the simulator should return the window handle (HWND) of its main player window. This is the window that should have input focus during normal game play, and it's the same window that the simulator would bring to the foreground in response to a GAME_TO_FOREGROUND command.
If no game is in progress, or there's no suitable HWND to designate as the foreground window, return 0 (a null HWND).
This command is meant to serve as an alternative to GAME_TO_FOREGROUND for situations where the caller (the front-end program) wants or needs to handle the application context switch itself. One case where this is necessary is when the front end doesn't have a simple parent/child relationship with the simulator process, because Windows doesn't allow a background process to move itself into the foreground unless it's a child of the current foreground process.
Note: Windows HWND objects are system-wide global values that have the same meanings across all processes on the local machine, so these can be freely passed across process boundaries with SendMessage() without any marshaling. (This is in contrast to file system HANDLE objects, which are only meaningful within a single process.)
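The choice between the two foreground commands can be sketched in C++ as follows. This is only an illustration under stated assumptions: the numeric sub-command values, the `ForegroundMethod` enum, and both helper functions are hypothetical, and the front end is assumed to know whether it launched the simulator directly and to be in the foreground when it makes the call.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Which side performs the foreground switch.
enum class ForegroundMethod { AskSimulator, SwitchFromFrontEnd };

// If the front end is the simulator's direct parent, the simulator is allowed
// to take the foreground from it, so GAME_TO_FOREGROUND is enough. Otherwise
// the front end fetches the HWND and makes the switch itself.
constexpr ForegroundMethod ChooseForegroundMethod(bool frontEndIsDirectParent)
{
    return frontEndIsDirectParent ? ForegroundMethod::AskSimulator
                                  : ForegroundMethod::SwitchFromFrontEnd;
}

#ifdef _WIN32
constexpr WPARAM kCmdGameToForeground = 3;  // HYPOTHETICAL value
constexpr WPARAM kCmdQueryGameHwnd = 4;     // HYPOTHETICAL value

// Windows-only: 'simWindow' is a simulator window known to handle the message.
// Returns false if the message can't be registered or no game window exists.
inline bool BringGameToForeground(HWND simWindow, bool frontEndIsDirectParent)
{
    static const UINT msg = RegisterWindowMessageW(L"PinSim::FrontEndControls");
    if (msg == 0)
        return false;

    if (ChooseForegroundMethod(frontEndIsDirectParent) == ForegroundMethod::AskSimulator) {
        SendMessageW(simWindow, msg, kCmdGameToForeground, 0);
        return true;
    }

    // The HWND comes back in the message result; no marshaling is needed.
    HWND game = reinterpret_cast<HWND>(SendMessageW(simWindow, msg, kCmdQueryGameHwnd, 0));
    return game != nullptr && SetForegroundWindow(game) != FALSE;
}
#endif
```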