Date: Mon, 24 Feb 2014 13:33:56 -0500
Post by Eli Zaretskii
I don't think this matters. This convention is used between Guile and
itself: in scm_kill we tell the process being killed to terminate with
status of 0xC0000200 + SIGNAL, and then we extract the signal number
in status:term-sig. So this is just something private to Guile, [...]
I don't like this. I think our 'status:*' procedures should follow a
There is none that I'm aware of.
or simply punt on it if there's no convention.
Why punt if we can do better? The code I proposed works for me and
others for many years in ported Findutils and elsewhere. I hate it
when a ported program loses features for no good reason.
What exactly is the nature of your objections to using this working
"When an unhandled fatal signal terminates a process,
the exit code is 3."
If this is wrong, can you help me understand what they might have been
thinking when they wrote it?
They were thinking about a single use case: a program that calls
'abort'. Such a program on Windows exits with a status code of 3.
But that is not very interesting in the case in point, and besides,
some programs exit with code 3 for other reasons. One such example is
wget, and there are others.
More to the point, when a Windows program is terminated with an
unhandled fatal exception, it rarely if ever terminates via 'abort'.
"The signal that terminated a process is not known posthum."
Well, since signals don't really exist on Windows, it's hard to tell
anything about this statement. On Windows, a program is terminated by
an exception, not by a signal. And that exception is visible in the
exit status of the process. For example, the exception equivalent to
SIGSEGV (which is also reported as SIGSEGV by GDB on Windows) causes
the exit status of 0xC0000005 to be returned. These 0xC... values are
documented on Microsoft headers, and we can use them to report the
equivalent of fatal signals. Thus the code I submitted.
and on that basis, they make WTERMSIG(x) simply return SIGTERM. To my
mind, this seems better than making up our own convention that nobody
See above: it is not something I invented.
The only invention here is the 0xC0000200 + signo thing, see below.
and assuming that the subprocess is another guile process.
No, this is a misunderstanding: the subprocess does not have to be
Guile in any way. The second argument passed to the TerminateProcess
function is the exit code to be used by the process being terminated,
So by passing this value to TerminateProcess, we guarantee that the
subsequent call to waitpid will return to us this very number, and
allow Guile to report that the program was indeed killed with the
signal we intended. IOW, this value is not interpreted in any way by
the subprocess, it is simply returned back to us.
If you think that the Gnulib sys_wait module is "dead wrong for
Windows", then please post a message to the Gnulib mailing list and make
your case. If there's truly a problem, then we need to get it fixed in
Gnulib. I'd also like to see how they respond to your claims.
These are not "claims", this code works. I tested it, and not only in
As for talking to gnulib maintainers, I see no reason to do that until
the other issues I submitted there start moving in some constructive
direction. Given the examples I've shown you, I have all but given up
on talking to them.
Post by Eli Zaretskii Post by Mark H Weaver
For example, if it is true that we can avoid all the nasty problems with
filename encoding by using the native Windows APIs that use UTF-16 for
filenames, then I'd be in favor of doing that.
What nasty problems do you have in mind? I implemented something like
this in Emacs not long ago, so perhaps I can help here.
The nasty problem is that POSIX uses sequences of bytes for filenames,
although conceptually filenames are character strings, and in fact
virtually all user interfaces treat filenames as character strings.
I will respond to this in a separate message.
Post by Eli Zaretskii Post by Mark H Weaver
On the other hand, if a program _does_ try to do one of those things, it
might be important that the job be done right. For example, a program
might be trying to reduce its privileges. Doing nothing while silently
pretending that the operation succeeded is a good way to cause security
flaws to go unnoticed, or to make a programmer waste hours of time
trying to debug a problem that he should have been notified about
immediately with an error message.
We are talking about operations that have no equivalent meaning on
Windows. they only have meaning in Posix worldview. Why does it make
sense to fail an operation that is meaningless? What useful purpose
could this possibly serve, except having a subtly broken Guile module?
As I said before, useful purposes include (1) to alert the user to a
potential security flaw
Windows does that automatically, no need for Guile to do anything.
and (2) to save someone from a possibly long
debugging session when they should have gotten an error message.
Not sure I understand what you mean here. In what way is this
different from (1) above?
In general, I think it's a serious mistake to assume that what the user
asked to do was unimportant, and can therefore be silently ignored.
It is not unimportant, but it simply makes no sense on Windows. You
cannot change your user or group ID on Windows without typing that
user's password and firing up that user's interactive environment as
Does Windows really not have a concept of different security contexts,
with different privileges?
It has, but those concepts are implemented in a way that does not fit
into the Posix scheme. Privileges are requested and granted one by
one, not by changing the UID. When an operation is a privileged one,
and the user doesn't have that privilege granted, Windows
automatically pops up an elevation prompt. Moreover, the program
cannot elevate itself, except by re-executing, which I hope you will
agree is not a good idea for Guile. There's no way that I know of to
remove a privilege from the set that is granted to the process by the
OS, except if the process acquired such a privilege by issuing a
system call from application code; you certainly cannot do that by
changing your UID. Etc. etc.: the Windows concepts of these are so
alien to what is in Guile now that it is impossible to have them, if
at all, without a complete redesign of how the program treats these
privileged operations. I hope you aren't going to ask for that,
because IMO it would be an unreasonable request.
There are dozens if not hundreds of good ports of Posix software to
Windows which make these operations no-ops. They do this for a good
reason: privileged operations are generally reserved on Windows for
system-level chores, such as installing software. Users don't like to
get elevation prompts while running a "normal" interactive program.
So I don't understand why Guile should be different from all the rest.
I find that hard to believe, but if it's true, then I could perhaps
be convinced that we should make up a uid/gid when the user asks for
If you want Guile to returns a true UID/GID on Windows, it's possible.
I can write that code, sure. But they are not simple scalar numbers
on Windows, they are variable-length arrays of several potentially
very large numbers (the so-called SIDs, or Security Identifiers), you
can read about them here:
I hope you will agree with me that using these SIDs where Guile
expects small integers is unwieldy. So we will have to somehow pack
these arrays into scalars, and invent a scheme that will not cause 2
different SIDs map to the same number. Or we could use only the last
component of the SID, and accept the risk of collisions every now and
Of course, Posix-style functions like 'stat' don't returns these SIDs
in the st_uid and st_gid members of 'struct stat', so if Guile would
like to check whether a given file belongs to the current Guile user,
it will think that it doesn't. So we will need to replace 'stat' with
a version that does return these UID/GID numbers. And that is just
the tip of an iceberg.
But suppose we did all that -- what does this gain us in practical
terms that my simplistic solution doesn't? 500 and 513 are the last
components of the "well-known SIDs" of the Local Administrator user
and the "None" group; in the vast majority of cases, a Windows user on
her private machine will use these UID and GID, so in many cases this
is the exact result. In other cases, it is not really correct, but
will this really hamper Guile? I very much doubt that.
By contrast, what you suggest will either (a) gain us marginally more
accurate numbers that will sometimes still be incorrect, or (b) give
us the full SID support, but for a price of much more coding and
refactoring all over the place -- all this when most Windows users
don't even know their user and group SIDs and never saw them. I'm
sorry, but this makes very little sense to me.
It's not that I'm terribly smart; I simply faced these problems in
different programs and resolved them many times. In some cases, it
indeed made sense to go some of the way and implement parts of the
above; in others, it doesn't. If it turns out that Guile indeed needs
more elaborate emulation of these Posix concepts, believe me I'm well
equipped to implement them. For now, my analysis indicated that no
such complications are required. I hope you will accept my judgment,
or describe specific use cases that contradict it.
However, I will strongly oppose turning setuid or setgid into silent
I'm sorry to hear that. Perhaps what I say above will convince you.
Or maybe you could simply trust me, as someone who worked on these
issues for several years, even if you are not entirely convinced. If
not, I guess I should stop submitting Windows-related patches to
Guile, because we obviously have irreconcilable disagreements that
will continue fueling prolonged arguments such as this one.
I've since learned more details about what he did. He built
pthreads-w32 2.9.1 <https://www.sourceware.org/pthreads-win32/>
from source code. He also built libgc-7.2e.
As I said, I've built GC as well. I will try to build pthreads-w32
when I have time, and see if that helps. Thanks for the info.
Post by Eli Zaretskii Post by Mark H Weaver
If you will be comfortable with a lot more Widows specific stuff, I
can work on that.
OK, but please don't block this patch while you wait for me to do
that. It is certainly much better to have this functionality with
restrictions than not at all.
I'd rather wait until we have a proper implementation that works with
I'm sorry to hear this.
Please try to look at this from my POV. I submitted these changes in
June, 8 months ago. I re-submitted them again now because Ludovic
specifically asked me to do that. And look where I am after again
investing some non-trivial effort: back where I was 8 months ago. I
am once again asked to go through gnulib, which didn't work then, and
my patches are put on hold. This is all very disheartening, I must
say. Sorry, but I no longer understand why I was asked to re-submit
the patches and why I was given write access to the repo. I thought
it was because the patches are going to be accepted with minor
changes, now I'm confused.