Discussion:
[PATCH v2] Improved ^c support for gdb/guile
(too old to reply)
Doug Evans
2014-02-17 20:59:22 UTC
Permalink
[+ guile-devel]
Date: Mon, 17 Feb 2014 15:26:27 -0500
Unworkable-as-is optimization trying to avoid queueing asyncs. Blech.
I'm still seeing intermittent testsuite failures because Guile is
getting an uncaught SIGINT.
Bother: is this the only way to fix these issues (whatever they are)?
Because you are quickly getting into non-Windows lands, so there's a
real risk that the Guile support will not be very useful on Windows
(except with Cygwin).
The Guile developers are looking into providing a better way to do
this for 2.2 so all is not lost. :-)
+void
+gdbscm_initialize_sigint (void)
+{
+ siscm_sigint_pipe[0] = siscm_sigint_pipe[1] = -1;
+
+ if (!SCM_USE_PTHREAD_THREADS)
+ {
+ warning (_("Guile does not have pthreads support."));
+ warning (_("Proper SIGINT handling for Guile will be unavailable."));
+ return;
+ }
The above is what worries me. Guile currently doesn't work in the
native MinGW build if configured with threads (it crashes, hangs,
etc.). Can't we have decent SIGINT handling without pthreads?
With 2.0.x, no.
I'm ok with changing the warning, e.g., not printing it at all on
systems where it would otherwise always be printed, and instead
documenting the issue for such systems.

The downside is that while Scheme code is running SIGINT is ignored
(unless one is in the repl, or sets up a SIGINT handler oneself).
For lots of uses of Guile in GDB it's not a total loss.
Eli Zaretskii
2014-02-17 21:13:57 UTC
Permalink
Date: Mon, 17 Feb 2014 12:59:22 -0800
+void
+gdbscm_initialize_sigint (void)
+{
+ siscm_sigint_pipe[0] = siscm_sigint_pipe[1] = -1;
+
+ if (!SCM_USE_PTHREAD_THREADS)
+ {
+ warning (_("Guile does not have pthreads support."));
+ warning (_("Proper SIGINT handling for Guile will be unavailable."));
+ return;
+ }
The above is what worries me. Guile currently doesn't work in the
native MinGW build if configured with threads (it crashes, hangs,
etc.). Can't we have decent SIGINT handling without pthreads?
With 2.0.x, no.
I'm ok with changing the warning, e.g., not printing it at all on
systems where it would otherwise always be printed, and instead
documenting the issue for such systems.
The downside is that while Scheme code is running SIGINT is ignored
(unless one is in the repl, or sets up a SIGINT handler oneself).
Ignored why? because GDB sets the handler to SIG_IGN? Or for some
other reason?
Doug Evans
2014-02-18 00:37:30 UTC
Permalink
Post by Eli Zaretskii
Date: Mon, 17 Feb 2014 12:59:22 -0800
+void
+gdbscm_initialize_sigint (void)
+{
+ siscm_sigint_pipe[0] = siscm_sigint_pipe[1] = -1;
+
+ if (!SCM_USE_PTHREAD_THREADS)
+ {
+ warning (_("Guile does not have pthreads support."));
+ warning (_("Proper SIGINT handling for Guile will be unavailable."));
+ return;
+ }
The above is what worries me. Guile currently doesn't work in the
native MinGW build if configured with threads (it crashes, hangs,
etc.). Can't we have decent SIGINT handling without pthreads?
With 2.0.x, no.
I'm ok with changing the warning, e.g., not printing it at all on
systems where it would otherwise always be printed, and instead
documenting the issue for such systems.
The downside is that while Scheme code is running SIGINT is ignored
(unless one is in the repl, or sets up a SIGINT handler oneself).
Ignored why? because GDB sets the handler to SIG_IGN? Or for some
other reason?
A better way to phrase that is the SIGINT is deferred until the call
out to Guile returns.
Why? Because Guile's SIGINT handling in 2.0.x requires a separate
thread: that's how all async signals are handled in Guile.
ref: guile-2.0.9/libguile/scmsigs.c
I'll let guile-devel take over at this point - I understand the code,
but may miss something noteworthy.
There is code in scmsigs.c to handle the non-pthread case but it's not
clear how much is exported nor how well it works.
Ludovic Courtès
2014-02-18 11:20:39 UTC
Permalink
Post by Doug Evans
Post by Eli Zaretskii
Date: Mon, 17 Feb 2014 12:59:22 -0800
+void
+gdbscm_initialize_sigint (void)
+{
+ siscm_sigint_pipe[0] = siscm_sigint_pipe[1] = -1;
+
+ if (!SCM_USE_PTHREAD_THREADS)
+ {
+ warning (_("Guile does not have pthreads support."));
+ warning (_("Proper SIGINT handling for Guile will be unavailable."));
+ return;
+ }
The above is what worries me. Guile currently doesn't work in the
native MinGW build if configured with threads (it crashes, hangs,
etc.). Can't we have decent SIGINT handling without pthreads?
I don’t remember, Eli: do you have patches pending review for these
issues and other MinGW issues in Guile?
Post by Doug Evans
Post by Eli Zaretskii
With 2.0.x, no.
I'm ok with changing the warning, e.g., not printing it at all on
systems where it would otherwise always be printed, and instead
documenting the issue for such systems.
The downside is that while Scheme code is running SIGINT is ignored
(unless one is in the repl, or sets up a SIGINT handler oneself).
Ignored why? because GDB sets the handler to SIG_IGN? Or for some
other reason?
A better way to phrase that is the SIGINT is deferred until the call
out to Guile returns.
Why? Because Guile's SIGINT handling in 2.0.x requires a separate
thread: that's how all async signals are handled in Guile.
ref: guile-2.0.9/libguile/scmsigs.c
Right, when Guile is built with pthread support, it has a signal
delivery thread. The actual SIGINT handler (‘take_signal’ in scmsigs.c)
just write one byte to a pipe; the signal delivery thread reads from
that pipe, and queues an async in the destination thread for later
execution.
Post by Doug Evans
I'll let guile-devel take over at this point - I understand the code,
but may miss something noteworthy.
There is code in scmsigs.c to handle the non-pthread case but it's not
clear how much is exported nor how well it works.
The non-pthread code is used when Guile is built without pthread
support. In that case, the async is queued directly from the signal
handler.

(I think we should aim to get rid of the signal-delivery thread
eventually, and I remember Mark mentioned it before too.)

HTH,
Ludo’.
Eli Zaretskii
2014-02-18 16:01:48 UTC
Permalink
Date: Tue, 18 Feb 2014 12:20:39 +0100
=20
=20
I don=E2=80=99t remember, Eli: do you have patches pending review f=
or these
issues and other MinGW issues in Guile?
I don't know, you tell me. I sent several changesets in June,
in these messages:

http://lists.gnu.org/archive/html/guile-user/2013-06/msg00031.html
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00032.html
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00033.html
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00036.html
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00037.html
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00039.html

In this message:

http://lists.gnu.org/archive/html/guile-user/2013-06/msg00057.html

you have requested a copyright assignment for applying my patches;
that paperwork was done long ago, so the changes can be admitted. I
don't know if they were, though. One thing I do know is that the
request to gnulib maintainers to include hstrerror, which I posted, a=
t
your request, here

http://lists.gnu.org/archive/html/bug-gnulib/2013-06/msg00042.html

was left without any followups.

Also, since the only way I could get a functional MinGW Guile was to
configure it without threads, I would suggest that this be the defaul=
t
for MinGW, but that isn't a big deal.
The non-pthread code is used when Guile is built without pthread
support. In that case, the async is queued directly from the signa=
l
handler.
So why cannot this code be used by GDB?
(I think we should aim to get rid of the signal-delivery thread
eventually, and I remember Mark mentioned it before too.)
Right, which raises again the question why use in GDB something that
is slated for deletion.

Btw, where does the value of SCM_USE_PTHREAD_THREADS come from? Is i=
t
something defined by the installed Guile headers?
Ludovic Courtès
2014-02-18 16:45:27 UTC
Permalink
Post by Eli Zaretskii
Date: Tue, 18 Feb 2014 12:20:39 +0100
I don’t remember, Eli: do you have patches pending review for these
issues and other MinGW issues in Guile?
I don't know, you tell me. I sent several changesets in June,
OK, will follow-up on guile-devel.
Post by Eli Zaretskii
The non-pthread code is used when Guile is built without pthread
support. In that case, the async is queued directly from the signal
handler.
So why cannot this code be used by GDB?
Because GDB uses whichever Guile is available. If the user has Guile
built with pthread support, then that’s what GDB uses.
Post by Eli Zaretskii
(I think we should aim to get rid of the signal-delivery thread
eventually, and I remember Mark mentioned it before too.)
Right, which raises again the question why use in GDB something that
is slated for deletion.
I think there’s a misunderstanding. Doug’s signal-delivery thread will
work no matter what strategy Guile uses internally. My comment above
was referring to Guile’s internal implementation of signal delivery,
which does not affect what GDB does.
Post by Eli Zaretskii
Btw, where does the value of SCM_USE_PTHREAD_THREADS come from? Is it
something defined by the installed Guile headers?
Yes, and determined at Guile configure time.

Ludo’.
Eli Zaretskii
2014-02-18 16:56:46 UTC
Permalink
.org
Date: Tue, 18 Feb 2014 17:45:27 +0100
=20
Post by Eli Zaretskii
I don=E2=80=99t remember, Eli: do you have patches pending revie=
w for these
Post by Eli Zaretskii
issues and other MinGW issues in Guile?
I don't know, you tell me. I sent several changesets in June,
=20
OK, will follow-up on guile-devel.
Thanks.
Post by Eli Zaretskii
The non-pthread code is used when Guile is built without pthread
support. In that case, the async is queued directly from the si=
gnal
Post by Eli Zaretskii
handler.
So why cannot this code be used by GDB?
=20
Because GDB uses whichever Guile is available. If the user has Gui=
le
built with pthread support, then that=E2=80=99s what GDB uses.
Sorry, I meant why that code couldn't be used when Guile was built
without pthreads.
Ludovic Courtès
2014-02-18 17:45:53 UTC
Permalink
Post by Eli Zaretskii
Date: Tue, 18 Feb 2014 17:45:27 +0100
[...]
Post by Eli Zaretskii
Post by Eli Zaretskii
Post by Ludovic Courtès
The non-pthread code is used when Guile is built without pthread
support. In that case, the async is queued directly from the signal
handler.
So why cannot this code be used by GDB?
Because GDB uses whichever Guile is available. If the user has Guile
built with pthread support, then that’s what GDB uses.
Sorry, I meant why that code couldn't be used when Guile was built
without pthreads.
Because with the patch Doug posted, both the SIGINT thread and GDB’s
main thread would call libguile.

A different strategy would need to be used when Guile lacks pthread
support.

Ludo’.
Eli Zaretskii
2014-02-18 17:59:01 UTC
Permalink
.org
Date: Tue, 18 Feb 2014 18:45:53 +0100
=20
Sorry, I meant why that code couldn't be used when Guile was buil=
t
without pthreads.
=20
Because with the patch Doug posted, both the SIGINT thread and GDB=
=E2=80=99s
main thread would call libguile.
=20
A different strategy would need to be used when Guile lacks pthread
support.
Could you perhaps suggest such a different strategy?
Ludovic Courtès
2014-02-18 23:08:27 UTC
Permalink
Post by Eli Zaretskii
Date: Tue, 18 Feb 2014 18:45:53 +0100
Post by Eli Zaretskii
Sorry, I meant why that code couldn't be used when Guile was built
without pthreads.
Because with the patch Doug posted, both the SIGINT thread and GDB’s
main thread would call libguile.
A different strategy would need to be used when Guile lacks pthread
support.
Could you perhaps suggest such a different strategy?
Something similar to Guile’s own strategy when compiled without pthread
support.

However, Doug is proposing on guile-devel an approach that may solve
that more elegantly.

Thanks,
Ludo’.
Ludovic Courtès
2014-02-18 17:32:34 UTC
Permalink
Post by Eli Zaretskii
Date: Tue, 18 Feb 2014 12:20:39 +0100
I don’t remember, Eli: do you have patches pending review for these
issues and other MinGW issues in Guile?
I don't know, you tell me. I sent several changesets in June,
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00031.html
Already applied.
Post by Eli Zaretskii
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00032.html
Already applied.
Post by Eli Zaretskii
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00033.html
open-process without fork, see below.
Post by Eli Zaretskii
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00036.html
Already applied.
Post by Eli Zaretskii
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00037.html
What about exposing %shell-command-name and %shell-command-switch as you
suggested back then, and using that in popen.scm?
Post by Eli Zaretskii
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00039.html
h_error, see below.
Post by Eli Zaretskii
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00057.html
you have requested a copyright assignment for applying my patches;
that paperwork was done long ago, so the changes can be admitted.
Yes, modulo the comments in that message: commenting where it is due,
and moving the MinGW-specific code to its own function.

Could you resubmit this patch in ‘git format-patch’ format, with a
ChangeLog-style commit log, and in a separate message?

For convenience and to hopefully smooth the process, I’ve added you to
the Savannah group. Please post here for review before pushing.
Post by Eli Zaretskii
I don't know if they were, though. One thing I do know is that the
request to gnulib maintainers to include hstrerror, which I posted, at
your request, here
http://lists.gnu.org/archive/html/bug-gnulib/2013-06/msg00042.html
was left without any followups.
Could you ping them again?

It would be really ideal for this to go into Gnulib, rather than
duplicating it in each project.
Post by Eli Zaretskii
Also, since the only way I could get a functional MinGW Guile was to
configure it without threads, I would suggest that this be the default
for MinGW, but that isn't a big deal.
Is it something that can fixed? Does libgc pass its tests on MinGW?

Please use one thread per patch, so that the discussion remains focused.

Thanks for persevering, and sorry again for dropping the ball!

Ludo’.
Eli Zaretskii
2014-02-18 18:16:41 UTC
Permalink
Date: Tue, 18 Feb 2014 18:32:34 +0100
=20
I don't know if they were, though. One thing I do know is that t=
he
request to gnulib maintainers to include hstrerror, which I poste=
d, at
your request, here
http://lists.gnu.org/archive/html/bug-gnulib/2013-06/msg00042.h=
tml
was left without any followups.
=20
Could you ping them again?
Just did, but I'm not holding my breath. Perhaps you could also add
your voice.
It would be really ideal for this to go into Gnulib, rather than
duplicating it in each project.
I agree.
Also, since the only way I could get a functional MinGW Guile was=
to
configure it without threads, I would suggest that this be the de=
fault
for MinGW, but that isn't a big deal.
=20
Is it something that can fixed?
Maybe, but I don't know how. There was a long discussion about the
symptoms, starting here:

http://lists.gnu.org/archive/html/guile-user/2013-08/msg00095.html

but it didn't get anywhere, and no one of the Guile developers was
able to help me find the reason.
Does libgc pass its tests on MinGW?
Yes, with flying colors.
Please use one thread per patch, so that the discussion remains foc=
used.

Will do.
Eli Zaretskii
2014-02-22 10:33:33 UTC
Permalink
This patch removes several "#ifdef HAVE_POSIX" conditionals that
unnecessarily prevent useful Guile functions from showing up in the
MinGW build on MS-Windows.

Remove unneeded HAVE_POSIX conditionals

* libguile/filesys.c (fsync, link) [__MINGW32__]: Redirect Posix
functions to their Windows equivalents.
(scm_chown, scm_open_fdes, scm_open, scm_close, scm_close_fdes)
(scm_link, scm_chdir, set_element, fill_select_type)
(get_element, retrieve_select_type, scm_select, scm_fcntl)
(scm_fsync, scm_symlink, scm_readlink, scm_copy_file, scm_sendfile)
(scm_dir_print, scm_dir_free): These functions are no longer
wholesale ifdef'ed away if HAVE_POSIX is not defined.
(scm_init_filesys): Don't ifdef away parts of the function when
HAVE_POSIX is not defined.

diff --git a/libguile/filesys.c b/libguile/filesys.c
index aa3e671..441ced8 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -111,7 +111,12 @@

/* Some more definitions for the native Windows port. */
#ifdef __MINGW32__
-# define fsync(fd) _commit (fd)
+# define fsync(fd) _commit (fd)
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+/* FIXME: Should use 'link' module from gnulib. */
+# define link(f1,f2) CreateHardLink(f2, f1, NULL)
+# define HAVE_LINK 1
#endif /* __MINGW32__ */


@@ -146,8 +151,6 @@



-#ifdef HAVE_POSIX
-
/* {Permissions}
*/

@@ -323,8 +326,6 @@ SCM_DEFINE (scm_close_fdes, "close-fdes", 1, 0, 0,
}
#undef FUNC_NAME

-#endif /* HAVE_POSIX */
-

/* {Files}
*/
@@ -590,7 +591,6 @@ SCM_DEFINE (scm_lstat, "lstat", 1, 0, 0,
#endif /* HAVE_LSTAT */


-#ifdef HAVE_POSIX

/* {Modifying Directories}
*/
@@ -1222,8 +1222,6 @@ SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
}
#undef FUNC_NAME

-#endif /* HAVE_POSIX */
-

/* Essential procedures used in (system base compile). */

@@ -1848,7 +1846,6 @@ SCM_DEFINE (scm_closedir, "closedir", 1, 0, 0,
#undef FUNC_NAME


-#ifdef HAVE_POSIX
static int
scm_dir_print (SCM exp, SCM port, scm_print_state *pstate SCM_UNUSED)
{
@@ -1869,14 +1866,12 @@ scm_dir_free (SCM p)
closedir ((DIR *) SCM_SMOB_DATA_1 (p));
return 0;
}
-#endif



void
scm_init_filesys ()
{
-#ifdef HAVE_POSIX
scm_tc16_dir = scm_make_smob_type ("directory", 0);
scm_set_smob_free (scm_tc16_dir, scm_dir_free);
scm_set_smob_print (scm_tc16_dir, scm_dir_print);
@@ -1945,7 +1940,6 @@ scm_init_filesys ()
#ifdef FD_CLOEXEC
scm_c_define ("FD_CLOEXEC", scm_from_int (FD_CLOEXEC));
#endif
-#endif /* HAVE_POSIX */

/* `access' symbols. */
scm_c_define ("R_OK", scm_from_int (R_OK));
Mark H Weaver
2014-02-22 14:52:06 UTC
Permalink
Post by Eli Zaretskii
This patch removes several "#ifdef HAVE_POSIX" conditionals that
unnecessarily prevent useful Guile functions from showing up in the
MinGW build on MS-Windows.
I think perhaps we should simply remove the --disable-posix configure
option in master, since it is apparently no longer needed on Windows.
Of course this patch would be part of that.

If we decide to keep --disable-posix (which should be the case on
stable-2.0 regardless), then I think we should not apply this patch.
Instead, we should just recommend that MinGW builds be done without
--disable-posix.
Post by Eli Zaretskii
diff --git a/libguile/filesys.c b/libguile/filesys.c
index aa3e671..441ced8 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -111,7 +111,12 @@
/* Some more definitions for the native Windows port. */
#ifdef __MINGW32__
-# define fsync(fd) _commit (fd)
+# define fsync(fd) _commit (fd)
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+/* FIXME: Should use 'link' module from gnulib. */
+# define link(f1,f2) CreateHardLink(f2, f1, NULL)
+# define HAVE_LINK 1
#endif /* __MINGW32__ */
Rather than including Windows-specific code in Guile and this FIXME,
let's just add the 'link' Gnulib module, as you suggest. I already have
a list of some more modules to add, so I'll do that in the next day or
so.

What do you think?

Thanks,
Mark
Eli Zaretskii
2014-02-22 15:41:48 UTC
Permalink
Date: Sat, 22 Feb 2014 09:52:06 -0500
=20
=20
Post by Eli Zaretskii
This patch removes several "#ifdef HAVE_POSIX" conditionals that
unnecessarily prevent useful Guile functions from showing up in t=
he
Post by Eli Zaretskii
MinGW build on MS-Windows.
=20
I think perhaps we should simply remove the --disable-posix configu=
re
option in master, since it is apparently no longer needed on Window=
s.
Of course this patch would be part of that.
=20
If we decide to keep --disable-posix (which should be the case on
stable-2.0 regardless), then I think we should not apply this patch=
.
Instead, we should just recommend that MinGW builds be done without
--disable-posix.
That's OK with me, but then the handfull of other instances of
HAVE_POSIX should be removed as well.
Post by Eli Zaretskii
diff --git a/libguile/filesys.c b/libguile/filesys.c
index aa3e671..441ced8 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -111,7 +111,12 @@
=20
/* Some more definitions for the native Windows port. */
#ifdef __MINGW32__
-# define fsync(fd) _commit (fd)
+# define fsync(fd) _commit (fd)
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+/* FIXME: Should use 'link' module from gnulib. */
+# define link(f1,f2) CreateHardLink(f2, f1, NULL)
+# define HAVE_LINK 1
#endif /* __MINGW32__ */
=20
Rather than including Windows-specific code in Guile and this FIXME=
,
let's just add the 'link' Gnulib module, as you suggest. I already=
have
a list of some more modules to add, so I'll do that in the next day=
or
so.
=20
What do you think?
That's OK with me, but then also perhaps add the fsync module from
gnulib.
Ludovic Courtès
2014-02-26 21:42:46 UTC
Permalink
Post by Mark H Weaver
I think perhaps we should simply remove the --disable-posix configure
option in master,
Agreed.
Post by Mark H Weaver
If we decide to keep --disable-posix (which should be the case on
stable-2.0 regardless), then I think we should not apply this patch.
Instead, we should just recommend that MinGW builds be done without
--disable-posix.
Yes.

Thanks,
Ludo’.
Eli Zaretskii
2014-02-22 10:50:51 UTC
Permalink
This patch allows the MinGW build of Guile to have the process related
functions (open-process, kill, waitpid, status:exit-val, etc.).

Implement open-process and related functions on MinGW

* libguile/filesys.c (fsync, link) [__MINGW32__]: Redirect Posix
functions to their Windows equivalents.
(scm_chown, scm_open_fdes, scm_open, scm_close, scm_close_fdes)
(scm_link, scm_chdir, set_element, fill_select_type)
(get_element, retrieve_select_type, scm_select, scm_fcntl)
(scm_fsync, scm_symlink, scm_readlink, scm_copy_file, scm_sendfile)
(scm_dir_print, scm_dir_free): These functions are no longer
wholesale ifdef'ed away if HAVE_POSIX is not defined.
(scm_init_filesys): Don't ifdef away parts of the function when
HAVE_POSIX is not defined.

diff --git a/libguile/posix.c b/libguile/posix.c
index 0443f95..69652a3 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -85,6 +85,27 @@
#if HAVE_SYS_WAIT_H
# include <sys/wait.h>
#endif
+#ifdef __MINGW32__
+# define WEXITSTATUS(stat_val) ((stat_val) & 255)
+/* Windows reports exit status from fatal exceptions as 0xC0000nnn
+ codes, see winbase.h. We usurp codes above 0xC0000200 for SIGxxx
+ signals. */
+# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
+# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
+# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_val - 0xC0000200 : stat_val)
+# define WIFSTOPPED(stat_val) (0)
+# define WSTOPSIG(stat_var) (0)
+# include <process.h>
+# define waitpid(p,s,o) _cwait(s, p, WAIT_CHILD)
+# define HAVE_WAITPID 1
+# define getuid() (500) /* Local Administrator */
+# define getgid() (513) /* None */
+# define setuid(u) (0)
+# define setgid(g) (0)
+# define pipe(f) _pipe(f, 0, O_NOINHERIT)
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+#endif
#ifndef WEXITSTATUS
# define WEXITSTATUS(stat_val) ((unsigned)(stat_val) >> 8)
#endif
@@ -674,6 +695,25 @@ SCM_DEFINE (scm_kill, "kill", 2, 0, 0,
goto err;
}
}
+#ifdef __MINGW32__
+ else
+ {
+ HANDLE ph = OpenProcess (PROCESS_TERMINATE, 0, scm_to_int (pid));
+ int s = scm_to_int (sig);
+
+ if (!ph)
+ {
+ errno = EPERM;
+ goto err;
+ }
+ if (!TerminateProcess (ph, 0xC0000200 + s))
+ {
+ errno = EINVAL;
+ goto err;
+ }
+ CloseHandle (ph);
+ }
+#endif /* __MINGW32__ */
#endif
return SCM_UNSPECIFIED;
}
@@ -720,6 +760,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
{
int i;
int status;
+#ifndef __MINGW32__
int ioptions;
if (SCM_UNBNDP (options))
ioptions = 0;
@@ -728,6 +769,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
/* Flags are interned in scm_init_posix. */
ioptions = scm_to_int (options);
}
+#endif
SCM_SYSCALL (i = waitpid (scm_to_int (pid), &status, ioptions));
if (i == -1)
SCM_SYSERROR;
@@ -736,7 +778,6 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
#undef FUNC_NAME
#endif /* HAVE_WAITPID */

-#ifndef __MINGW32__
SCM_DEFINE (scm_status_exit_val, "status:exit-val", 1, 0, 0,
(SCM status),
"Return the exit status value, as would be set if a process\n"
@@ -787,7 +828,6 @@ SCM_DEFINE (scm_status_stop_sig, "status:stop-sig", 1, 0, 0,
return SCM_BOOL_F;
}
#undef FUNC_NAME
-#endif /* __MINGW32__ */

#ifdef HAVE_GETPPID
SCM_DEFINE (scm_getppid, "getppid", 0, 0, 0,
@@ -802,7 +842,6 @@ SCM_DEFINE (scm_getppid, "getppid", 0, 0, 0,
#endif /* HAVE_GETPPID */


-#ifndef __MINGW32__
SCM_DEFINE (scm_getuid, "getuid", 0, 0, 0,
(),
"Return an integer representing the current real user ID.")
@@ -932,7 +971,6 @@ SCM_DEFINE (scm_setegid, "setegid", 1, 0, 0,

}
#undef FUNC_NAME
-#endif


#ifdef HAVE_GETPGRP
@@ -1248,6 +1286,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
return scm_from_int (pid);
}
#undef FUNC_NAME
+#endif /* HAVE_FORK */

/* Since Guile uses threads, we have to be very careful to avoid calling
functions that are not async-signal-safe in the child. That's why
@@ -1321,7 +1360,130 @@ scm_open_process (SCM mode, SCM prog, SCM args)
}
#endif

+#ifdef HAVE_FORK
pid = fork ();
+#elif defined(__MINGW32__)
+ {
+ int save_stdin = -1, save_stdout = -1;
+ int errno_save;
+
+ if (reading)
+ {
+ save_stdout = dup (1);
+ errno_save = errno;
+ if (save_stdout == -1)
+ {
+ close (c2p[0]);
+ close (c2p[1]);
+ free (exec_file);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ }
+
+ if (writing)
+ {
+ save_stdin = dup (0);
+ errno_save = errno;
+ if (save_stdin == -1)
+ {
+ if (reading)
+ close (save_stdout);
+ close (p2c[0]);
+ close (p2c[1]);
+ free (exec_file);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ }
+
+ if (reading)
+ {
+ close (1);
+ if (dup (c2p[1]) != 1)
+ {
+ errno_save = errno;
+ close (save_stdout);
+ close (c2p[0]);
+ close (c2p[1]);
+ if (writing)
+ {
+ close (save_stdin);
+ close (p2c[0]);
+ close (p2c[1]);
+ }
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (c2p[1]);
+ }
+ if (writing)
+ {
+ close (0);
+ if (dup (p2c[0]) != 0)
+ {
+ errno_save = errno;
+ close (save_stdin);
+ close (p2c[0]);
+ close (p2c[1]);
+ if (reading)
+ {
+ close (save_stdout);
+ close (c2p[0]);
+ close (c2p[1]);
+ }
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (p2c[0]);
+ }
+
+ pid = spawnvp (P_NOWAIT, exec_file, exec_argv);
+ errno_save = errno;
+
+ if (reading)
+ {
+ close (1);
+ if (dup (save_stdout) != 1)
+ {
+ if (writing)
+ close (save_stdin);
+ close (save_stdout);
+ close (p2c[1]);
+ close (c2p[0]);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (save_stdout);
+ }
+ if (writing)
+ {
+ close (0);
+ if (dup (save_stdin) != 0)
+ {
+ if (reading)
+ close (save_stdout);
+ close (save_stdin);
+ close (p2c[1]);
+ close (c2p[0]);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (save_stdin);
+ }
+
+ if (pid < 0)
+ errno = errno_save;
+ }
+#else
+ close (c2p[0]);
+ close (c2p[1]);
+ close (p2c[0]);
+ close (p2c[1]);
+ free (exec_file);
+ errno = ENOSYS;
+ SCM_SYSERROR;
+#endif /* HAVE_FORK */

if (pid == -1)
{
@@ -1330,14 +1492,28 @@ scm_open_process (SCM mode, SCM prog, SCM args)
if (reading)
{
close (c2p[0]);
+#ifdef HAVE_FORK
close (c2p[1]);
+#endif
}
if (writing)
{
+#ifdef HAVE_FORK
close (p2c[0]);
+#endif
close (p2c[1]);
}
errno = errno_save;
+
+#ifndef HAVE_FORK
+ /* The exec failed! There is nothing sensible to do. */
+ if (err > 0)
+ {
+ char *msg = strerror (errno);
+ fprintf (fdopen (err, "a"), "In spawnvp of %s: %s\n",
+ exec_file, msg);
+ }
+#endif
SCM_SYSERROR;
}

@@ -1363,7 +1539,8 @@ scm_open_process (SCM mode, SCM prog, SCM args)
return scm_values
(scm_list_3 (read_port, write_port, scm_from_int (pid)));
}
-
+
+#ifdef HAVE_FORK
/* The child. */
if (reading)
close (c2p[0]);
@@ -1412,16 +1589,16 @@ scm_open_process (SCM mode, SCM prog, SCM args)
if (err > 0)
{
char *msg = strerror (errno);
- fprintf (fdopen (err, "a"), "In execlp of %s: %s\n",
+ fprintf (fdopen (err, "a"), "In execvp of %s: %s\n",
exec_file, msg);
}

_exit (EXIT_FAILURE);
/* Not reached. */
return SCM_BOOL_F;
+#endif /* HAVE_FORK */
}
#undef FUNC_NAME
-#endif /* HAVE_FORK */

#ifdef __MINGW32__
# include "win32-uname.h"
@@ -2239,13 +2416,11 @@ SCM_DEFINE (scm_gethostname, "gethostname", 0, 0, 0,
#endif /* HAVE_GETHOSTNAME */


-#ifdef HAVE_FORK
static void
scm_init_popen (void)
{
scm_c_define_gsubr ("open-process", 2, 0, 1, scm_open_process);
}
-#endif

void
scm_init_posix ()
@@ -2338,11 +2513,11 @@ scm_init_posix ()

#ifdef HAVE_FORK
scm_add_feature ("fork");
+#endif /* HAVE_FORK */
scm_c_register_extension ("libguile-" SCM_EFFECTIVE_VERSION,
"scm_init_popen",
(scm_t_extension_init_func) scm_init_popen,
NULL);
-#endif /* HAVE_FORK */
}

/*
Mark H Weaver
2014-02-22 14:59:39 UTC
Permalink
Post by Eli Zaretskii
This patch allows the MinGW build of Guile to have the process related
functions (open-process, kill, waitpid, status:exit-val, etc.).
Implement open-process and related functions on MinGW
* libguile/filesys.c (fsync, link) [__MINGW32__]: Redirect Posix
functions to their Windows equivalents.
Gnulib has an 'fsync' module that apparently implements it on MinGW.
I think we should use that instead, no?
Post by Eli Zaretskii
(scm_chown, scm_open_fdes, scm_open, scm_close, scm_close_fdes)
(scm_link, scm_chdir, set_element, fill_select_type)
(get_element, retrieve_select_type, scm_select, scm_fcntl)
(scm_fsync, scm_symlink, scm_readlink, scm_copy_file, scm_sendfile)
(scm_dir_print, scm_dir_free): These functions are no longer
wholesale ifdef'ed away if HAVE_POSIX is not defined.
As I wrote in my other recent message, I think we should simply
recommend that MinGW builds be done without "--disable-posix", since we
have at least one report that it works now.
Post by Eli Zaretskii
(scm_init_filesys): Don't ifdef away parts of the function when
HAVE_POSIX is not defined.
diff --git a/libguile/posix.c b/libguile/posix.c
index 0443f95..69652a3 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -85,6 +85,27 @@
#if HAVE_SYS_WAIT_H
# include <sys/wait.h>
#endif
+#ifdef __MINGW32__
+# define WEXITSTATUS(stat_val) ((stat_val) & 255)
+/* Windows reports exit status from fatal exceptions as 0xC0000nnn
+ codes, see winbase.h. We usurp codes above 0xC0000200 for SIGxxx
+ signals. */
+# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
+# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
+# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_val - 0xC0000200 : stat_val)
+# define WIFSTOPPED(stat_val) (0)
+# define WSTOPSIG(stat_var) (0)
+# include <process.h>
+# define waitpid(p,s,o) _cwait(s, p, WAIT_CHILD)
+# define HAVE_WAITPID 1
+# define getuid() (500) /* Local Administrator */
+# define getgid() (513) /* None */
+# define setuid(u) (0)
+# define setgid(g) (0)
+# define pipe(f) _pipe(f, 0, O_NOINHERIT)
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+#endif
#ifndef WEXITSTATUS
# define WEXITSTATUS(stat_val) ((unsigned)(stat_val) >> 8)
#endif
@@ -674,6 +695,25 @@ SCM_DEFINE (scm_kill, "kill", 2, 0, 0,
goto err;
}
}
+#ifdef __MINGW32__
+ else
+ {
+ HANDLE ph = OpenProcess (PROCESS_TERMINATE, 0, scm_to_int (pid));
+ int s = scm_to_int (sig);
+
+ if (!ph)
+ {
+ errno = EPERM;
+ goto err;
+ }
+ if (!TerminateProcess (ph, 0xC0000200 + s))
+ {
+ errno = EINVAL;
+ goto err;
+ }
+ CloseHandle (ph);
+ }
+#endif /* __MINGW32__ */
#endif
return SCM_UNSPECIFIED;
}
@@ -720,6 +760,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
{
int i;
int status;
+#ifndef __MINGW32__
int ioptions;
if (SCM_UNBNDP (options))
ioptions = 0;
@@ -728,6 +769,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
/* Flags are interned in scm_init_posix. */
ioptions = scm_to_int (options);
}
+#endif
SCM_SYSCALL (i = waitpid (scm_to_int (pid), &status, ioptions));
if (i == -1)
SCM_SYSERROR;
@@ -736,7 +778,6 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
#undef FUNC_NAME
#endif /* HAVE_WAITPID */
-#ifndef __MINGW32__
SCM_DEFINE (scm_status_exit_val, "status:exit-val", 1, 0, 0,
(SCM status),
"Return the exit status value, as would be set if a process\n"
@@ -787,7 +828,6 @@ SCM_DEFINE (scm_status_stop_sig, "status:stop-sig", 1, 0, 0,
return SCM_BOOL_F;
}
#undef FUNC_NAME
-#endif /* __MINGW32__ */
#ifdef HAVE_GETPPID
SCM_DEFINE (scm_getppid, "getppid", 0, 0, 0,
@@ -802,7 +842,6 @@ SCM_DEFINE (scm_getppid, "getppid", 0, 0, 0,
#endif /* HAVE_GETPPID */
-#ifndef __MINGW32__
SCM_DEFINE (scm_getuid, "getuid", 0, 0, 0,
(),
"Return an integer representing the current real user ID.")
@@ -932,7 +971,6 @@ SCM_DEFINE (scm_setegid, "setegid", 1, 0, 0,
}
#undef FUNC_NAME
-#endif
#ifdef HAVE_GETPGRP
@@ -1248,6 +1286,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
return scm_from_int (pid);
}
#undef FUNC_NAME
+#endif /* HAVE_FORK */
/* Since Guile uses threads, we have to be very careful to avoid calling
functions that are not async-signal-safe in the child. That's why
@@ -1321,7 +1360,130 @@ scm_open_process (SCM mode, SCM prog, SCM args)
}
#endif
+#ifdef HAVE_FORK
pid = fork ();
+#elif defined(__MINGW32__)
+ {
+ int save_stdin = -1, save_stdout = -1;
+ int errno_save;
+
+ if (reading)
+ {
+ save_stdout = dup (1);
+ errno_save = errno;
+ if (save_stdout == -1)
+ {
+ close (c2p[0]);
+ close (c2p[1]);
+ free (exec_file);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ }
+
+ if (writing)
+ {
+ save_stdin = dup (0);
+ errno_save = errno;
+ if (save_stdin == -1)
+ {
+ if (reading)
+ close (save_stdout);
+ close (p2c[0]);
+ close (p2c[1]);
+ free (exec_file);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ }
+
+ if (reading)
+ {
+ close (1);
+ if (dup (c2p[1]) != 1)
+ {
+ errno_save = errno;
+ close (save_stdout);
+ close (c2p[0]);
+ close (c2p[1]);
+ if (writing)
+ {
+ close (save_stdin);
+ close (p2c[0]);
+ close (p2c[1]);
+ }
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (c2p[1]);
+ }
+ if (writing)
+ {
+ close (0);
+ if (dup (p2c[0]) != 0)
+ {
+ errno_save = errno;
+ close (save_stdin);
+ close (p2c[0]);
+ close (p2c[1]);
+ if (reading)
+ {
+ close (save_stdout);
+ close (c2p[0]);
+ close (c2p[1]);
+ }
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (p2c[0]);
+ }
+
+ pid = spawnvp (P_NOWAIT, exec_file, exec_argv);
+ errno_save = errno;
+
+ if (reading)
+ {
+ close (1);
+ if (dup (save_stdout) != 1)
+ {
+ if (writing)
+ close (save_stdin);
+ close (save_stdout);
+ close (p2c[1]);
+ close (c2p[0]);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (save_stdout);
+ }
+ if (writing)
+ {
+ close (0);
+ if (dup (save_stdin) != 0)
+ {
+ if (reading)
+ close (save_stdout);
+ close (save_stdin);
+ close (p2c[1]);
+ close (c2p[0]);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (save_stdin);
+ }
+
+ if (pid < 0)
+ errno = errno_save;
+ }
+#else
+ close (c2p[0]);
+ close (c2p[1]);
+ close (p2c[0]);
+ close (p2c[1]);
+ free (exec_file);
+ errno = ENOSYS;
+ SCM_SYSERROR;
+#endif /* HAVE_FORK */
if (pid == -1)
{
@@ -1330,14 +1492,28 @@ scm_open_process (SCM mode, SCM prog, SCM args)
if (reading)
{
close (c2p[0]);
+#ifdef HAVE_FORK
close (c2p[1]);
+#endif
}
if (writing)
{
+#ifdef HAVE_FORK
close (p2c[0]);
+#endif
close (p2c[1]);
}
errno = errno_save;
+
+#ifndef HAVE_FORK
+ /* The exec failed! There is nothing sensible to do. */
+ if (err > 0)
+ {
+ char *msg = strerror (errno);
+ fprintf (fdopen (err, "a"), "In spawnvp of %s: %s\n",
+ exec_file, msg);
+ }
+#endif
SCM_SYSERROR;
}
@@ -1363,7 +1539,8 @@ scm_open_process (SCM mode, SCM prog, SCM args)
return scm_values
(scm_list_3 (read_port, write_port, scm_from_int (pid)));
}
-
+
+#ifdef HAVE_FORK
/* The child. */
if (reading)
close (c2p[0]);
@@ -1412,16 +1589,16 @@ scm_open_process (SCM mode, SCM prog, SCM args)
if (err > 0)
{
char *msg = strerror (errno);
- fprintf (fdopen (err, "a"), "In execlp of %s: %s\n",
+ fprintf (fdopen (err, "a"), "In execvp of %s: %s\n",
exec_file, msg);
}
_exit (EXIT_FAILURE);
/* Not reached. */
return SCM_BOOL_F;
+#endif /* HAVE_FORK */
}
#undef FUNC_NAME
-#endif /* HAVE_FORK */
#ifdef __MINGW32__
# include "win32-uname.h"
@@ -2239,13 +2416,11 @@ SCM_DEFINE (scm_gethostname, "gethostname", 0, 0, 0,
#endif /* HAVE_GETHOSTNAME */
-#ifdef HAVE_FORK
static void
scm_init_popen (void)
{
scm_c_define_gsubr ("open-process", 2, 0, 1, scm_open_process);
}
-#endif
void
scm_init_posix ()
@@ -2338,11 +2513,11 @@ scm_init_posix ()
#ifdef HAVE_FORK
scm_add_feature ("fork");
+#endif /* HAVE_FORK */
scm_c_register_extension ("libguile-" SCM_EFFECTIVE_VERSION,
"scm_init_popen",
(scm_t_extension_init_func) scm_init_popen,
NULL);
-#endif /* HAVE_FORK */
}
/*
Eli Zaretskii
2014-02-22 15:43:33 UTC
Permalink
Date: Sat, 22 Feb 2014 09:59:39 -0500
=20
=20
This patch allows the MinGW build of Guile to have the process re=
lated
functions (open-process, kill, waitpid, status:exit-val, etc.).
Implement open-process and related functions on MinGW
* libguile/filesys.c (fsync, link) [__MINGW32__]: Redirect Posix
functions to their Windows equivalents.
=20
Gnulib has an 'fsync' module that apparently implements it on MinGW=
.
I think we should use that instead, no?
Yes, see my other message.
(scm_chown, scm_open_fdes, scm_open, scm_close, scm_close_fdes)
(scm_link, scm_chdir, set_element, fill_select_type)
(get_element, retrieve_select_type, scm_select, scm_fcntl)
(scm_fsync, scm_symlink, scm_readlink, scm_copy_file, scm_sendf=
ile)
(scm_dir_print, scm_dir_free): These functions are no longer
wholesale ifdef'ed away if HAVE_POSIX is not defined.
=20
As I wrote in my other recent message, I think we should simply
recommend that MinGW builds be done without "--disable-posix", sinc=
e we
have at least one report that it works now.
OK.
Mark H Weaver
2014-02-22 15:54:16 UTC
Permalink
Hi Eli,

My last response to this was not finished yet. Please disregard it.
(C-c C-c is bound to 'diff-goto-source' in diff-mode, which is what I
meant to do :-)
Post by Eli Zaretskii
This patch allows the MinGW build of Guile to have the process related
functions (open-process, kill, waitpid, status:exit-val, etc.).
Implement open-process and related functions on MinGW
* libguile/filesys.c (fsync, link) [__MINGW32__]: Redirect Posix
functions to their Windows equivalents.
Gnulib has 'fsync' and 'link' modules that apparently implement them on
MinGW. I think we should use those instead, no? I'll import those
modules soon, and some others.
Post by Eli Zaretskii
(scm_chown, scm_open_fdes, scm_open, scm_close, scm_close_fdes)
(scm_link, scm_chdir, set_element, fill_select_type)
(get_element, retrieve_select_type, scm_select, scm_fcntl)
(scm_fsync, scm_symlink, scm_readlink, scm_copy_file, scm_sendfile)
(scm_dir_print, scm_dir_free): These functions are no longer
wholesale ifdef'ed away if HAVE_POSIX is not defined.
As I wrote in my other recent message, I think we should simply
recommend that MinGW builds be done without "--disable-posix", since we
have at least one report that it works now.
Post by Eli Zaretskii
(scm_init_filesys): Don't ifdef away parts of the function when
HAVE_POSIX is not defined.
diff --git a/libguile/posix.c b/libguile/posix.c
index 0443f95..69652a3 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -85,6 +85,27 @@
#if HAVE_SYS_WAIT_H
# include <sys/wait.h>
#endif
+#ifdef __MINGW32__
+# define WEXITSTATUS(stat_val) ((stat_val) & 255)
+/* Windows reports exit status from fatal exceptions as 0xC0000nnn
+ codes, see winbase.h. We usurp codes above 0xC0000200 for SIGxxx
+ signals. */
+# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
+# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
+# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_val - 0xC0000200 : stat_val)
+# define WIFSTOPPED(stat_val) (0)
+# define WSTOPSIG(stat_var) (0)
Definitions for these on MinGW are available in Gnulib's 'sys_wait'
module. I'll import it soon.
Post by Eli Zaretskii
+# include <process.h>
+# define waitpid(p,s,o) _cwait(s, p, WAIT_CHILD)
+# define HAVE_WAITPID 1
Gnulib has a 'waitpid' module that implements it on MinGW. Can we just
use that, and then assume it is there instead of setting HAVE_WAITPID?

I'll add it to my list of modules to import, along with the others
mentioned in this message.
Post by Eli Zaretskii
+# define getuid() (500) /* Local Administrator */
+# define getgid() (513) /* None */
+# define setuid(u) (0)
+# define setgid(g) (0)
Hmm. I'm not sure about these. If we can't do better than this, I
think we should arrange to not bind these functions in MinGW, and not
call them in our code. What do you think?
Post by Eli Zaretskii
+# define pipe(f) _pipe(f, 0, O_NOINHERIT)
Gnulib has a 'pipe' module that implements it on MinGW.
Post by Eli Zaretskii
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+#endif
#ifndef WEXITSTATUS
# define WEXITSTATUS(stat_val) ((unsigned)(stat_val) >> 8)
#endif
@@ -674,6 +695,25 @@ SCM_DEFINE (scm_kill, "kill", 2, 0, 0,
goto err;
}
}
+#ifdef __MINGW32__
+ else
+ {
+ HANDLE ph = OpenProcess (PROCESS_TERMINATE, 0, scm_to_int (pid));
+ int s = scm_to_int (sig);
+
+ if (!ph)
+ {
+ errno = EPERM;
+ goto err;
+ }
+ if (!TerminateProcess (ph, 0xC0000200 + s))
+ {
+ errno = EINVAL;
+ goto err;
+ }
+ CloseHandle (ph);
+ }
+#endif /* __MINGW32__ */
#endif
return SCM_UNSPECIFIED;
}
This change is not mentioned in the commit log.

Can you use the GNU coding conventions for placement of brackets?

What is the meaning of 0xC0000200? It would be good to add a comment
explaining what that means, or better yet use preprocessor defines
(if they are available in a header).

Ideally this should be implemented in Gnulib, but I'm okay with
including this in Guile for now.
Post by Eli Zaretskii
@@ -720,6 +760,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
{
int i;
int status;
+#ifndef __MINGW32__
int ioptions;
if (SCM_UNBNDP (options))
ioptions = 0;
@@ -728,6 +769,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
/* Flags are interned in scm_init_posix. */
ioptions = scm_to_int (options);
}
+#endif
SCM_SYSCALL (i = waitpid (scm_to_int (pid), &status, ioptions));
if (i == -1)
SCM_SYSERROR;
Gnulib has a 'waitpid' module that apparently implements it on MinGW.
Can we use that?

Also, this change is not mentioned in the commit log.
Post by Eli Zaretskii
@@ -736,7 +778,6 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
#undef FUNC_NAME
#endif /* HAVE_WAITPID */
-#ifndef __MINGW32__
SCM_DEFINE (scm_status_exit_val, "status:exit-val", 1, 0, 0,
(SCM status),
"Return the exit status value, as would be set if a process\n"
@@ -787,7 +828,6 @@ SCM_DEFINE (scm_status_stop_sig, "status:stop-sig", 1, 0, 0,
return SCM_BOOL_F;
}
#undef FUNC_NAME
-#endif /* __MINGW32__ */
#ifdef HAVE_GETPPID
SCM_DEFINE (scm_getppid, "getppid", 0, 0, 0,
@@ -802,7 +842,6 @@ SCM_DEFINE (scm_getppid, "getppid", 0, 0, 0,
#endif /* HAVE_GETPPID */
-#ifndef __MINGW32__
SCM_DEFINE (scm_getuid, "getuid", 0, 0, 0,
(),
"Return an integer representing the current real user ID.")
@@ -932,7 +971,6 @@ SCM_DEFINE (scm_setegid, "setegid", 1, 0, 0,
}
#undef FUNC_NAME
-#endif
#ifdef HAVE_GETPGRP
Looks good.
Post by Eli Zaretskii
@@ -1248,6 +1286,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
return scm_from_int (pid);
}
#undef FUNC_NAME
+#endif /* HAVE_FORK */
/* Since Guile uses threads, we have to be very careful to avoid calling
functions that are not async-signal-safe in the child. That's why
@@ -1321,7 +1360,130 @@ scm_open_process (SCM mode, SCM prog, SCM args)
}
#endif
+#ifdef HAVE_FORK
pid = fork ();
+#elif defined(__MINGW32__)
+ {
+ int save_stdin = -1, save_stdout = -1;
+ int errno_save;
+
+ if (reading)
+ {
+ save_stdout = dup (1);
+ errno_save = errno;
+ if (save_stdout == -1)
+ {
+ close (c2p[0]);
+ close (c2p[1]);
+ free (exec_file);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ }
+
+ if (writing)
+ {
+ save_stdin = dup (0);
+ errno_save = errno;
+ if (save_stdin == -1)
+ {
+ if (reading)
+ close (save_stdout);
+ close (p2c[0]);
+ close (p2c[1]);
+ free (exec_file);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ }
+
+ if (reading)
+ {
+ close (1);
+ if (dup (c2p[1]) != 1)
+ {
+ errno_save = errno;
+ close (save_stdout);
+ close (c2p[0]);
+ close (c2p[1]);
+ if (writing)
+ {
+ close (save_stdin);
+ close (p2c[0]);
+ close (p2c[1]);
+ }
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (c2p[1]);
+ }
+ if (writing)
+ {
+ close (0);
+ if (dup (p2c[0]) != 0)
+ {
+ errno_save = errno;
+ close (save_stdin);
+ close (p2c[0]);
+ close (p2c[1]);
+ if (reading)
+ {
+ close (save_stdout);
+ close (c2p[0]);
+ close (c2p[1]);
+ }
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (p2c[0]);
+ }
+
+ pid = spawnvp (P_NOWAIT, exec_file, exec_argv);
+ errno_save = errno;
+
+ if (reading)
+ {
+ close (1);
+ if (dup (save_stdout) != 1)
+ {
+ if (writing)
+ close (save_stdin);
+ close (save_stdout);
+ close (p2c[1]);
+ close (c2p[0]);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (save_stdout);
+ }
+ if (writing)
+ {
+ close (0);
+ if (dup (save_stdin) != 0)
+ {
+ if (reading)
+ close (save_stdout);
+ close (save_stdin);
+ close (p2c[1]);
+ close (c2p[0]);
+ errno = errno_save;
+ SCM_SYSERROR;
+ }
+ close (save_stdin);
+ }
+
+ if (pid < 0)
+ errno = errno_save;
+ }
+#else
+ close (c2p[0]);
+ close (c2p[1]);
+ close (p2c[0]);
+ close (p2c[1]);
+ free (exec_file);
+ errno = ENOSYS;
+ SCM_SYSERROR;
+#endif /* HAVE_FORK */
Thanks for working on this, but in a multithreaded program, it's no good
to change the file descriptors in the main program temporarily before
spawning, and then restore them afterwards. We'll have to find another
way of doing this.
Post by Eli Zaretskii
if (pid == -1)
{
@@ -1330,14 +1492,28 @@ scm_open_process (SCM mode, SCM prog, SCM args)
if (reading)
{
close (c2p[0]);
+#ifdef HAVE_FORK
close (c2p[1]);
+#endif
}
if (writing)
{
+#ifdef HAVE_FORK
close (p2c[0]);
+#endif
close (p2c[1]);
}
errno = errno_save;
+
+#ifndef HAVE_FORK
+ /* The exec failed! There is nothing sensible to do. */
+ if (err > 0)
+ {
+ char *msg = strerror (errno);
+ fprintf (fdopen (err, "a"), "In spawnvp of %s: %s\n",
+ exec_file, msg);
+ }
+#endif
SCM_SYSERROR;
}
@@ -1363,7 +1539,8 @@ scm_open_process (SCM mode, SCM prog, SCM args)
return scm_values
(scm_list_3 (read_port, write_port, scm_from_int (pid)));
}
-
+
+#ifdef HAVE_FORK
/* The child. */
if (reading)
close (c2p[0]);
@@ -1412,16 +1589,16 @@ scm_open_process (SCM mode, SCM prog, SCM args)
if (err > 0)
{
char *msg = strerror (errno);
- fprintf (fdopen (err, "a"), "In execlp of %s: %s\n",
+ fprintf (fdopen (err, "a"), "In execvp of %s: %s\n",
exec_file, msg);
}
Oops, good catch!
Post by Eli Zaretskii
_exit (EXIT_FAILURE);
/* Not reached. */
return SCM_BOOL_F;
+#endif /* HAVE_FORK */
}
#undef FUNC_NAME
-#endif /* HAVE_FORK */
#ifdef __MINGW32__
# include "win32-uname.h"
@@ -2239,13 +2416,11 @@ SCM_DEFINE (scm_gethostname, "gethostname", 0, 0, 0,
#endif /* HAVE_GETHOSTNAME */
-#ifdef HAVE_FORK
static void
scm_init_popen (void)
{
scm_c_define_gsubr ("open-process", 2, 0, 1, scm_open_process);
}
-#endif
void
scm_init_posix ()
@@ -2338,11 +2513,11 @@ scm_init_posix ()
#ifdef HAVE_FORK
scm_add_feature ("fork");
+#endif /* HAVE_FORK */
scm_c_register_extension ("libguile-" SCM_EFFECTIVE_VERSION,
"scm_init_popen",
(scm_t_extension_init_func) scm_init_popen,
NULL);
-#endif /* HAVE_FORK */
}
/*
Eli Zaretskii
2014-02-22 16:35:25 UTC
Permalink
Date: Sat, 22 Feb 2014 10:54:16 -0500
=20
Post by Eli Zaretskii
diff --git a/libguile/posix.c b/libguile/posix.c
index 0443f95..69652a3 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -85,6 +85,27 @@
#if HAVE_SYS_WAIT_H
# include <sys/wait.h>
#endif
+#ifdef __MINGW32__
+# define WEXITSTATUS(stat_val) ((stat_val) & 255)
+/* Windows reports exit status from fatal exceptions as 0xC0000n=
nn
Post by Eli Zaretskii
+ codes, see winbase.h. We usurp codes above 0xC0000200 for SI=
Gxxx
Post by Eli Zaretskii
+ signals. */
+# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) =3D=
=3D 0)
Post by Eli Zaretskii
+# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) !=3D 0=
)
Post by Eli Zaretskii
+# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_v=
al - 0xC0000200 : stat_val)
Post by Eli Zaretskii
+# define WIFSTOPPED(stat_val) (0)
+# define WSTOPSIG(stat_var) (0)
=20
Definitions for these on MinGW are available in Gnulib's 'sys_wait'
module. I'll import it soon.
Please don't: that gnulib module is dead wrong for Windows. If we us=
e
it, we will be unable to report even the simplest error exits, let
alone programs that crashed due to a fatal signal.
Post by Eli Zaretskii
+# include <process.h>
+# define waitpid(p,s,o) _cwait(s, p, WAIT_CHILD)
+# define HAVE_WAITPID 1
=20
Gnulib has a 'waitpid' module that implements it on MinGW. Can we =
just
use that, and then assume it is there instead of setting HAVE_WAITP=
ID?

The gnulib waitpid module does exactly what I did above. If you are
more comfortable with including it than with a single trivial macro,
then go ahead. But in general, due to the known problems with gettin=
g
Windows-specific patches to gnulib approved and admitted, I'd advise
to stay away of gnulib as much as possible, when Windows is
concerned. E.g., what if tomorrow we would need to make the waitpid
emulation more sophisticated?
I'll add it to my list of modules to import, along with the others
mentioned in this message.
=20
Post by Eli Zaretskii
+# define getuid() (500) /* Local Administrator */
+# define getgid() (513) /* None */
+# define setuid(u) (0)
+# define setgid(g) (0)
=20
Hmm. I'm not sure about these. If we can't do better than this, I
think we should arrange to not bind these functions in MinGW, and n=
ot
call them in our code. What do you think?
Which functions are you talking about, specifically?

In general, where the Windows equivalent is to do nothing, my advice
would be to have a no-op implementation, rather than an unbound
function. The former approach makes it much easier to write portable
Guile scripts, instead of spreading boundness checks all over the
place. FWIW, Emacs takes the former approach with very good results.
Post by Eli Zaretskii
+# define pipe(f) _pipe(f, 0, O_NOINHERIT)
=20
Gnulib has a 'pipe' module that implements it on MinGW.
Same response as for waitpid.
Post by Eli Zaretskii
+#ifdef __MINGW32__
+ else
+ {
+ HANDLE ph =3D OpenProcess (PROCESS_TERMINATE, 0, scm_to_in=
t (pid));
Post by Eli Zaretskii
+ int s =3D scm_to_int (sig);
+
+ if (!ph)
+=09{
+=09 errno =3D EPERM;
+=09 goto err;
+=09}
+ if (!TerminateProcess (ph, 0xC0000200 + s))
+=09{
+=09 errno =3D EINVAL;
+=09 goto err;
+=09}
+ CloseHandle (ph);
+ }
+#endif=09/* __MINGW32__ */
#endif
return SCM_UNSPECIFIED;
}
=20
This change is not mentioned in the commit log.
Sorry, I sent a wrong commit log. Here's the correct one:

=09* posix.c (WEXITSTATUS, WIFEXITED, WIFSIGNALED, WTERMSIG)
=09(WIFSTOPPED, WSTOPSIG) [__MINGW32__]: Define for MinGW.
=09(waitpid, getuid, getgid, setuid, setgid, pipe) [__MINGW32__]:
=09Define replacements for MinGW.
=09(scm_kill) [__MINGW32__]: Implement killing a process on
=09MS-Windows.
=09(scm_waitpid) [__MINGW32__]: Ignore the options argument.
=09(scm_status_exit_val, scm_getuid): No longer ifdef'ed away for Min=
GW.
=09(scm_open_process) [__MINGW32__]: Implement starting a process
=09on MS-Windows. Fix a typo in an error message (execlp instead
=09of execvp).
=09(scm_init_popen): No longer ifdef'ed away when HAVE_FORK is not
=09defined.
=09(scm_init_posix): The scm_init_popen feature is now available
=09even if HAVE_FORK is not defined.
Can you use the GNU coding conventions for placement of brackets?
Sorry, I don't follow: which part is not according to the GNU coding
conventions?
What is the meaning of 0xC0000200? It would be good to add a comme=
nt
explaining what that means, or better yet use preprocessor defines
(if they are available in a header).
It's explained where WEXIT* macros are defined. I will add another
comment here pointing there.
Ideally this should be implemented in Gnulib, but I'm okay with
including this in Guile for now.
Thanks. As I wrote above, gnulib doesn't have a functional
implementation of these macros and the related functionality.
Post by Eli Zaretskii
@@ -720,6 +760,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
{
int i;
int status;
+#ifndef __MINGW32__
int ioptions;
if (SCM_UNBNDP (options))
ioptions =3D 0;
@@ -728,6 +769,7 @@ SCM_DEFINE (scm_waitpid, "waitpid", 1, 1, 0,
/* Flags are interned in scm_init_posix. */
ioptions =3D scm_to_int (options);
}
+#endif
SCM_SYSCALL (i =3D waitpid (scm_to_int (pid), &status, ioption=
s));
Post by Eli Zaretskii
if (i =3D=3D -1)
SCM_SYSERROR;
=20
Gnulib has a 'waitpid' module that apparently implements it on MinG=
W.
Can we use that?
See above.
Also, this change is not mentioned in the commit log.
See above.
Thanks for working on this, but in a multithreaded program, it's no=
good
to change the file descriptors in the main program temporarily befo=
re
spawning, and then restore them afterwards. We'll have to find ano=
ther
way of doing this.
Threads don't work in the MinGW build anyway, and no one was able to
help me understand why (see the discussions last August). As long as
this limitation exists, this is a non-issue, and certainly better tha=
n
not having this functionality available at all.

The only other way I know of is to bypass Posix-like functions like
'open', 'close', and 'dup', and go to the level of Win32 API, where a
function that creates a child subprocess can accept handles for the
child's standard I/O. If you will be comfortable with a lot more
Widows specific stuff, I can work on that.

Thanks.
Mark H Weaver
2014-02-23 05:50:16 UTC
Permalink
Hi Eli,
Date: Sat, 22 Feb 2014 10:54:16 -0500
Post by Eli Zaretskii
diff --git a/libguile/posix.c b/libguile/posix.c
index 0443f95..69652a3 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -85,6 +85,27 @@
#if HAVE_SYS_WAIT_H
# include <sys/wait.h>
#endif
+#ifdef __MINGW32__
+# define WEXITSTATUS(stat_val) ((stat_val) & 255)
+/* Windows reports exit status from fatal exceptions as 0xC0000nnn
+ codes, see winbase.h. We usurp codes above 0xC0000200 for SIGxxx
+ signals. */
+# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
+# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
+# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_val - 0xC0000200 : stat_val)
+# define WIFSTOPPED(stat_val) (0)
+# define WSTOPSIG(stat_var) (0)
Definitions for these on MinGW are available in Gnulib's 'sys_wait'
module. I'll import it soon.
Please don't: that gnulib module is dead wrong for Windows. If we use
it, we will be unable to report even the simplest error exits, let
alone programs that crashed due to a fatal signal.
Hmm. This is rather uncomfortable. Did you discuss these problems with
the Gnulib developers? If so, can you provide a link to the relevant
threads?

Your comment above "We usurp codes above 0xC0000200 for SIGxxx signals"
makes me wonder: How widely used is this convention? What other
software packages use it?
Post by Eli Zaretskii
+# include <process.h>
+# define waitpid(p,s,o) _cwait(s, p, WAIT_CHILD)
+# define HAVE_WAITPID 1
Gnulib has a 'waitpid' module that implements it on MinGW. Can we just
use that, and then assume it is there instead of setting HAVE_WAITPID?
The gnulib waitpid module does exactly what I did above. If you are
more comfortable with including it than with a single trivial macro,
then go ahead.
In general, I would prefer to use Gnulib, except in cases where their
MinGW wrappers are not good enough to get the job done properly.

My reasons for this preference are (1) to keep the Guile code cleaner,
and (2) if some deficiency is found in the wrapper, or if Windows
introduces an improved API some day, the changes have to be made in only
one place: Gnulib.
But in general, due to the known problems with getting
Windows-specific patches to gnulib approved and admitted, I'd advise
to stay away of gnulib as much as possible, when Windows is concerned.
Can you provide some links to mailing list threads where you've tried to
get things fixed in Gnulib and encountered resistance?
E.g., what if tomorrow we would need to make the waitpid emulation
more sophisticated?
I suppose I would prefer to get the problem fixed in Gnulib. If there
is truly a problem with the way Gnulib is being maintained w.r.t. to
Windows, perhaps I will make an effort to get that larger problem fixed.

Having said all of this, I'm not necessarily opposed to using Windows
APIs directly when there is a clear benefit to doing so.

For example, if it is true that we can avoid all the nasty problems with
filename encoding by using the native Windows APIs that use UTF-16 for
filenames, then I'd be in favor of doing that.

I'd also be in favor of using native Windows APIs to spawn new processes
if that's the only way to do it properly. (See below)
Post by Eli Zaretskii
+# define getuid() (500) /* Local Administrator */
+# define getgid() (513) /* None */
+# define setuid(u) (0)
+# define setgid(g) (0)
Hmm. I'm not sure about these. If we can't do better than this, I
think we should arrange to not bind these functions in MinGW, and not
call them in our code. What do you think?
Which functions are you talking about, specifically?
The ones that use the macros you've defined above,
e.g. {get,set}{uid,gid,euid,egid}.
In general, where the Windows equivalent is to do nothing, my advice
would be to have a no-op implementation, rather than an unbound
function. The former approach makes it much easier to write portable
Guile scripts, instead of spreading boundness checks all over the
place.
I think this is a very bad idea. First of all, it's not very common for
Guile programs to query or set the uid/gid, so I don't see much benefit.

On the other hand, if a program _does_ try to do one of those things, it
might be important that the job be done right. For example, a program
might be trying to reduce its privileges. Doing nothing while silently
pretending that the operation succeeded is a good way to cause security
flaws to go unnoticed, or to make a programmer waste hours of time
trying to debug a problem that he should have been notified about
immediately with an error message.
FWIW, Emacs takes the former approach with very good results.
Emacs is primarily an interactive editor, and therefore it is more
acceptable for it to try to make guesses and paper over problems for the
sake of convenience. Guile is a general-purpose programming language,
and we must consider the needs of a larger set of users, for example the
authors of server software where security is important.
Post by Eli Zaretskii
+# define pipe(f) _pipe(f, 0, O_NOINHERIT)
Gnulib has a 'pipe' module that implements it on MinGW.
Same response as for waitpid.
Same reply as for waitpid.
Can you use the GNU coding conventions for placement of brackets?
Sorry, I don't follow: which part is not according to the GNU coding
conventions?
Nevermind, my mistake. Your use of tabs in the indentation combined
with the prefix characters in the email caused some (but not all) lines
to move to the right. I've long been using (setq indent-tabs-mode nil)
globally, for that reason.

[...]
Thanks for working on this, but in a multithreaded program, it's no good
to change the file descriptors in the main program temporarily before
spawning, and then restore them afterwards. We'll have to find another
way of doing this.
Threads don't work in the MinGW build anyway, and no one was able to
help me understand why (see the discussions last August). As long as
this limitation exists, this is a non-issue, and certainly better than
not having this functionality available at all.
As I wrote elsewhere, madsy on #guile reported having successfully built
a recent Guile snapshot with both threads and posix enabled.

I know that you asked for more specifics about this, but I don't know
the answers. I asked madsy to reply to you directly when he has some
free time. However, I remember that he cross-built from Ubuntu, and
iirc he built libgc (and several other dependencies) from source code.
The only other way I know of is to bypass Posix-like functions like
'open', 'close', and 'dup', and go to the level of Win32 API, where a
function that creates a child subprocess can accept handles for the
child's standard I/O.
This sounds to me like the right approach in this case.
If you will be comfortable with a lot more Widows specific stuff, I
can work on that.
Please do!

Thanks,
Mark
Eli Zaretskii
2014-02-23 17:54:24 UTC
Permalink
Date: Sun, 23 Feb 2014 00:50:16 -0500
Post by Mark H Weaver
Definitions for these on MinGW are available in Gnulib's 'sys_wait'
module. I'll import it soon.
Please don't: that gnulib module is dead wrong for Windows. If we use
it, we will be unable to report even the simplest error exits, let
alone programs that crashed due to a fatal signal.
Hmm. This is rather uncomfortable. Did you discuss these problems with
the Gnulib developers? If so, can you provide a link to the relevant
threads?
I'm sorry, but I seem to be unable to efficiently discuss with gnulib
developers anything that is related to MinGW and Windows in general.
Any such discussion at best drags on for months, and at worst simply
dies because there's no response. See a few examples below. And no,
I didn't discuss this specific issue with gnulib developers.
Your comment above "We usurp codes above 0xC0000200 for SIGxxx signals"
makes me wonder: How widely used is this convention? What other
software packages use it?
I don't think this matters. This convention is used between Guile and
itself: in scm_kill we tell the process being killed to terminate with
status of 0xC0000200 + SIGNAL, and then we extract the signal number
in status:term-sig. So this is just something private to Guile,
that's all; its only purpose is to report the signal with which we
killed the process. Other applications don't need to know about this.
My reasons for this preference are (1) to keep the Guile code cleaner,
and (2) if some deficiency is found in the wrapper, or if Windows
introduces an improved API some day, the changes have to be made in only
one place: Gnulib.
As long as this is practical, I won't argue with you. However, my
experience with gnulib is significantly different (see below), and
therefore my personal preferences are also different.
But in general, due to the known problems with getting
Windows-specific patches to gnulib approved and admitted, I'd advise
to stay away of gnulib as much as possible, when Windows is concerned.
Can you provide some links to mailing list threads where you've tried to
get things fixed in Gnulib and encountered resistance?
Certainly. My first example is with canonicalize_filename, which
started here:

http://lists.gnu.org/archive/html/bug-gnulib/2012-01/msg00253.html

and then promptly died. My pings here

http://lists.gnu.org/archive/html/bug-gnulib/2012-06/msg00222.html
http://lists.gnu.org/archive/html/bug-gnulib/2012-07/msg00004.html
http://lists.gnu.org/archive/html/bug-gnulib/2012-07/msg00239.html

were left without any response whatsoever. Finally, you intervened
here:

http://lists.gnu.org/archive/html/bug-gnulib/2012-11/msg00021.html

and, lo and behold, 3 weeks later the changes were finally committed,
more than 10 months after the initial submittal.

This is a successful story, mind you. Here's a couple of unsuccessful
ones:

http://lists.gnu.org/archive/html/bug-gnulib/2013-06/msg00042.html
http://lists.gnu.org/archive/html/bug-gnulib/2014-01/msg00072.html

Who knows, perhaps a year from now these will also become "successes"?
One can hope, can't one?
E.g., what if tomorrow we would need to make the waitpid emulation
more sophisticated?
I suppose I would prefer to get the problem fixed in Gnulib. If there
is truly a problem with the way Gnulib is being maintained w.r.t. to
Windows, perhaps I will make an effort to get that larger problem fixed.
See above.
For example, if it is true that we can avoid all the nasty problems with
filename encoding by using the native Windows APIs that use UTF-16 for
filenames, then I'd be in favor of doing that.
What nasty problems do you have in mind? I implemented something like
this in Emacs not long ago, so perhaps I can help here.
In general, where the Windows equivalent is to do nothing, my advice
would be to have a no-op implementation, rather than an unbound
function. The former approach makes it much easier to write portable
Guile scripts, instead of spreading boundness checks all over the
place.
I think this is a very bad idea. First of all, it's not very common for
Guile programs to query or set the uid/gid, so I don't see much benefit.
Unless you remove them from Guile, I don't think this argument stands:
as long as the functions exist, some Guile program will certainly use
them. For example, those servers where security is important, as you
mention.
On the other hand, if a program _does_ try to do one of those things, it
might be important that the job be done right. For example, a program
might be trying to reduce its privileges. Doing nothing while silently
pretending that the operation succeeded is a good way to cause security
flaws to go unnoticed, or to make a programmer waste hours of time
trying to debug a problem that he should have been notified about
immediately with an error message.
We are talking about operations that have no equivalent meaning on
Windows. they only have meaning in Posix worldview. Why does it make
sense to fail an operation that is meaningless? What useful purpose
could this possibly serve, except having a subtly broken Guile module?
FWIW, Emacs takes the former approach with very good results.
Emacs is primarily an interactive editor
Oh, Emacs is much more than that, believe me.
Guile is a general-purpose programming language, and we must
consider the needs of a larger set of users, for example the authors
of server software where security is important.
Server software cannot be written portably anyway. I can believe you
that Guile has enough infrastructure to support writing a server on
Posix platforms, but it certainly does _not_ have the infrastructure
for doing that on Windows; using setuid etc., or setting privileges,
will be the least of your worries, if you try doing that. So this is
a red herring, I think. By contrast, calls to setuid are quite common
in modern programs that needs to run on Posix platforms.
Post by Mark H Weaver
Can you use the GNU coding conventions for placement of brackets?
Sorry, I don't follow: which part is not according to the GNU coding
conventions?
Nevermind, my mistake. Your use of tabs in the indentation combined
with the prefix characters in the email caused some (but not all) lines
to move to the right. I've long been using (setq indent-tabs-mode nil)
globally, for that reason.
Using tabs like I did is the Emacs default. Is there a requirement in
Guile to use only spaces? If so, I probably missed it.
Post by Mark H Weaver
Thanks for working on this, but in a multithreaded program, it's no good
to change the file descriptors in the main program temporarily before
spawning, and then restore them afterwards. We'll have to find another
way of doing this.
Threads don't work in the MinGW build anyway, and no one was able to
help me understand why (see the discussions last August). As long as
this limitation exists, this is a non-issue, and certainly better than
not having this functionality available at all.
As I wrote elsewhere, madsy on #guile reported having successfully built
a recent Guile snapshot with both threads and posix enabled.
I know that you asked for more specifics about this, but I don't know
the answers. I asked madsy to reply to you directly when he has some
free time. However, I remember that he cross-built from Ubuntu
Cross-building on GNU/Linux is probably the cause of success: the
problematic steps run on a Posix platform using Posix APIs, I presume?
Otherwise, the issues are subtle and only show when running certain
Guile programs that use threading infrastructure, as discussed and
described back in August.
and iirc he built libgc (and several other dependencies) from source
code.
I've built libgc myself as well (and it passed all the tests), so this
cannot be the reason. I also built libgmp.
The only other way I know of is to bypass Posix-like functions like
'open', 'close', and 'dup', and go to the level of Win32 API, where a
function that creates a child subprocess can accept handles for the
child's standard I/O.
This sounds to me like the right approach in this case.
OK.
If you will be comfortable with a lot more Widows specific stuff, I
can work on that.
Please do!
OK, but please don't block this patch while you wait for me to do
that. It is certainly much better to have this functionality with
restrictions than not at all. At the very least, we could condition
redirection of standard handles on !SCM_USE_PTHREAD_THREADS, and
otherwise support only child processes that don't require redirection.
This is already much better than what we have today, even if Guile
with threads is indeed workable on Windows.
Mark H Weaver
2014-02-24 18:33:56 UTC
Permalink
Post by Eli Zaretskii
Date: Sun, 23 Feb 2014 00:50:16 -0500
Post by Mark H Weaver
Definitions for these on MinGW are available in Gnulib's 'sys_wait'
module. I'll import it soon.
Please don't: that gnulib module is dead wrong for Windows. If we use
it, we will be unable to report even the simplest error exits, let
alone programs that crashed due to a fatal signal.
Hmm. This is rather uncomfortable. Did you discuss these problems with
the Gnulib developers? If so, can you provide a link to the relevant
threads?
I'm sorry, but I seem to be unable to efficiently discuss with gnulib
developers anything that is related to MinGW and Windows in general.
Any such discussion at best drags on for months, and at worst simply
dies because there's no response. See a few examples below. And no,
I didn't discuss this specific issue with gnulib developers.
Your comment above "We usurp codes above 0xC0000200 for SIGxxx signals"
makes me wonder: How widely used is this convention? What other
software packages use it?
I don't think this matters. This convention is used between Guile and
itself: in scm_kill we tell the process being killed to terminate with
status of 0xC0000200 + SIGNAL, and then we extract the signal number
in status:term-sig. So this is just something private to Guile, [...]
I don't like this. I think our 'status:*' procedures should follow a
common convention, or simply punt on it if there's no convention.

The Gnulib sys_wait module says in its comments:

"When an unhandled fatal signal terminates a process,
the exit code is 3."

If this is wrong, can you help me understand what they might have been
thinking when they wrote it?

The Gnulib comments also say:

"The signal that terminated a process is not known posthum."

and on that basis, they make WTERMSIG(x) simply return SIGTERM. To my
mind, this seems better than making up our own convention that nobody
else follows, and assuming that the subprocess is another guile process.

If you think that the Gnulib sys_wait module is "dead wrong for
Windows", then please post a message to the Gnulib mailing list and make
your case. If there's truly a problem, then we need to get it fixed in
Gnulib. I'd also like to see how they respond to your claims.
Post by Eli Zaretskii
For example, if it is true that we can avoid all the nasty problems with
filename encoding by using the native Windows APIs that use UTF-16 for
filenames, then I'd be in favor of doing that.
What nasty problems do you have in mind? I implemented something like
this in Emacs not long ago, so perhaps I can help here.
The nasty problem is that POSIX uses sequences of bytes for filenames,
although conceptually filenames are character strings, and in fact
virtually all user interfaces treat filenames as character strings.

Guile uses character strings for filenames (the only sane thing to do),
and it would be good to build Guile on system APIs that also use
character strings for filenames, instead of having to guess how to
encode the characters into bytes.

We don't have a fully satisfactory solution to this problem on POSIX,
but I guess we do on Windows, if we use the native Windows APIs.

BTW, the same problems exist for command-line arguments, environment
variables, the hostname, etc. All of these are sequences of bytes in
POSIX, but conceptually they should be character strings.

If you'd like to work on a patch to have Guile use the native Windows
APIs (that use UTF-16) for these things, I think that would be very
useful and worthy of inclusion.
Post by Eli Zaretskii
In general, where the Windows equivalent is to do nothing, my advice
would be to have a no-op implementation, rather than an unbound
function. The former approach makes it much easier to write portable
Guile scripts, instead of spreading boundness checks all over the
place.
I think this is a very bad idea. First of all, it's not very common for
Guile programs to query or set the uid/gid, so I don't see much benefit.
[...]
Post by Eli Zaretskii
On the other hand, if a program _does_ try to do one of those things, it
might be important that the job be done right. For example, a program
might be trying to reduce its privileges. Doing nothing while silently
pretending that the operation succeeded is a good way to cause security
flaws to go unnoticed, or to make a programmer waste hours of time
trying to debug a problem that he should have been notified about
immediately with an error message.
We are talking about operations that have no equivalent meaning on
Windows. they only have meaning in Posix worldview. Why does it make
sense to fail an operation that is meaningless? What useful purpose
could this possibly serve, except having a subtly broken Guile module?
As I said before, useful purposes include (1) to alert the user to a
potential security flaw, and (2) to save someone from a possibly long
debugging session when they should have gotten an error message.

In general, I think it's a serious mistake to assume that what the user
asked to do was unimportant, and can therefore be silently ignored.

Does Windows really not have a concept of different security contexts,
with different privileges? I find that hard to believe, but if it's
true, then I could perhaps be convinced that we should make up a uid/gid
when the user asks for one.

However, I will strongly oppose turning setuid or setgid into silent
no-ops.
Post by Eli Zaretskii
Emacs is primarily an interactive editor
Oh, Emacs is much more than that, believe me.
Yes, I know that. You're talking to someone who does almost everything
inside of Emacs. I run my interactive shells in shell-mode, use gnus
for mail/news, info and woman for reading docs, docview for viewing
PDFs, emacs-w3m for web browsing, erc+bitlbee for irc+jabber, magit for
git, sound editing with my own custom elisp mode, etc. It's actually
quite rare for any other window to be open in my X session.
Post by Eli Zaretskii
Using tabs like I did is the Emacs default. Is there a requirement in
Guile to use only spaces? If so, I probably missed it.
No, it's not currently a requirement for C code, but IMO the Emacs
default should be changed.
Post by Eli Zaretskii
Post by Mark H Weaver
Thanks for working on this, but in a multithreaded program, it's no good
to change the file descriptors in the main program temporarily before
spawning, and then restore them afterwards. We'll have to find another
way of doing this.
Threads don't work in the MinGW build anyway, and no one was able to
help me understand why (see the discussions last August). As long as
this limitation exists, this is a non-issue, and certainly better than
not having this functionality available at all.
As I wrote elsewhere, madsy on #guile reported having successfully built
a recent Guile snapshot with both threads and posix enabled.
I know that you asked for more specifics about this, but I don't know
the answers. I asked madsy to reply to you directly when he has some
free time. However, I remember that he cross-built from Ubuntu
I've since learned more details about what he did. He built
pthreads-w32 2.9.1 <https://www.sourceware.org/pthreads-win32/>
from source code. He also built libgc-7.2e. He said that the
MinGW cross-compiler identified itself as follows:

"i686-w64-mingw32-gcc --version" says: i686-w64-mingw32-gcc (GCC) 4.6.3
Post by Eli Zaretskii
Cross-building on GNU/Linux is probably the cause of success: the
problematic steps run on a Posix platform using Posix APIs, I presume?
Possibly. Anyway, the point is that madsy built a working Guile with
thread support on Windows, and it seems to work, so your claim that
"Threads don't work in the MinGW build anyway" seems to be false.
Post by Eli Zaretskii
The only other way I know of is to bypass Posix-like functions like
'open', 'close', and 'dup', and go to the level of Win32 API, where a
function that creates a child subprocess can accept handles for the
child's standard I/O.
This sounds to me like the right approach in this case.
OK.
If you will be comfortable with a lot more Widows specific stuff, I
can work on that.
Please do!
OK, but please don't block this patch while you wait for me to do
that. It is certainly much better to have this functionality with
restrictions than not at all.
I'd rather wait until we have a proper implementation that works with
multi-threaded programs.

Thanks,
Mark
Eli Zaretskii
2014-02-24 21:06:37 UTC
Permalink
Date: Mon, 24 Feb 2014 13:33:56 -0500
Post by Eli Zaretskii
I don't think this matters. This convention is used between Guile and
itself: in scm_kill we tell the process being killed to terminate with
status of 0xC0000200 + SIGNAL, and then we extract the signal number
in status:term-sig. So this is just something private to Guile, [...]
I don't like this. I think our 'status:*' procedures should follow a
common convention
There is none that I'm aware of.
or simply punt on it if there's no convention.
Why punt if we can do better? The code I proposed works for me and
others for many years in ported Findutils and elsewhere. I hate it
when a ported program loses features for no good reason.

What exactly is the nature of your objections to using this working
code?
"When an unhandled fatal signal terminates a process,
the exit code is 3."
If this is wrong, can you help me understand what they might have been
thinking when they wrote it?
They were thinking about a single use case: a program that calls
'abort'. Such a program on Windows exits with a status code of 3.

But that is not very interesting in the case in point, and besides,
some programs exit with code 3 for other reasons. One such example is
wget, and there are others.

More to the point, when a Windows program is terminated with an
unhandled fatal exception, it rarely if ever terminates via 'abort'.
"The signal that terminated a process is not known posthum."
Well, since signals don't really exist on Windows, it's hard to tell
anything about this statement. On Windows, a program is terminated by
an exception, not by a signal. And that exception is visible in the
exit status of the process. For example, the exception equivalent to
SIGSEGV (which is also reported as SIGSEGV by GDB on Windows) causes
the exit status of 0xC0000005 to be returned. These 0xC... values are
documented on Microsoft headers, and we can use them to report the
equivalent of fatal signals. Thus the code I submitted.
and on that basis, they make WTERMSIG(x) simply return SIGTERM. To my
mind, this seems better than making up our own convention that nobody
else follows,
See above: it is not something I invented.

The only invention here is the 0xC0000200 + signo thing, see below.
and assuming that the subprocess is another guile process.
No, this is a misunderstanding: the subprocess does not have to be
Guile in any way. The second argument passed to the TerminateProcess
function is the exit code to be used by the process being terminated,
see

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686714%28v=vs.85%29.aspx

So by passing this value to TerminateProcess, we guarantee that the
subsequent call to waitpid will return to us this very number, and
allow Guile to report that the program was indeed killed with the
signal we intended. IOW, this value is not interpreted in any way by
the subprocess, it is simply returned back to us.
If you think that the Gnulib sys_wait module is "dead wrong for
Windows", then please post a message to the Gnulib mailing list and make
your case. If there's truly a problem, then we need to get it fixed in
Gnulib. I'd also like to see how they respond to your claims.
These are not "claims", this code works. I tested it, and not only in
Guile.

As for talking to gnulib maintainers, I see no reason to do that until
the other issues I submitted there start moving in some constructive
direction. Given the examples I've shown you, I have all but given up
on talking to them.
Post by Eli Zaretskii
Post by Mark H Weaver
For example, if it is true that we can avoid all the nasty problems with
filename encoding by using the native Windows APIs that use UTF-16 for
filenames, then I'd be in favor of doing that.
What nasty problems do you have in mind? I implemented something like
this in Emacs not long ago, so perhaps I can help here.
The nasty problem is that POSIX uses sequences of bytes for filenames,
although conceptually filenames are character strings, and in fact
virtually all user interfaces treat filenames as character strings.
I will respond to this in a separate message.
Post by Eli Zaretskii
Post by Mark H Weaver
On the other hand, if a program _does_ try to do one of those things, it
might be important that the job be done right. For example, a program
might be trying to reduce its privileges. Doing nothing while silently
pretending that the operation succeeded is a good way to cause security
flaws to go unnoticed, or to make a programmer waste hours of time
trying to debug a problem that he should have been notified about
immediately with an error message.
We are talking about operations that have no equivalent meaning on
Windows. they only have meaning in Posix worldview. Why does it make
sense to fail an operation that is meaningless? What useful purpose
could this possibly serve, except having a subtly broken Guile module?
As I said before, useful purposes include (1) to alert the user to a
potential security flaw
Windows does that automatically, no need for Guile to do anything.
and (2) to save someone from a possibly long
debugging session when they should have gotten an error message.
Not sure I understand what you mean here. In what way is this
different from (1) above?
In general, I think it's a serious mistake to assume that what the user
asked to do was unimportant, and can therefore be silently ignored.
It is not unimportant, but it simply makes no sense on Windows. You
cannot change your user or group ID on Windows without typing that
user's password and firing up that user's interactive environment as
result.
Does Windows really not have a concept of different security contexts,
with different privileges?
It has, but those concepts are implemented in a way that does not fit
into the Posix scheme. Privileges are requested and granted one by
one, not by changing the UID. When an operation is a privileged one,
and the user doesn't have that privilege granted, Windows
automatically pops up an elevation prompt. Moreover, the program
cannot elevate itself, except by re-executing, which I hope you will
agree is not a good idea for Guile. There's no way that I know of to
remove a privilege from the set that is granted to the process by the
OS, except if the process acquired such a privilege by issuing a
system call from application code; you certainly cannot do that by
changing your UID. Etc. etc.: the Windows concepts of these are so
alien to what is in Guile now that it is impossible to have them, if
at all, without a complete redesign of how the program treats these
privileged operations. I hope you aren't going to ask for that,
because IMO it would be an unreasonable request.

There are dozens if not hundreds of good ports of Posix software to
Windows which make these operations no-ops. They do this for a good
reason: privileged operations are generally reserved on Windows for
system-level chores, such as installing software. Users don't like to
get elevation prompts while running a "normal" interactive program.
So I don't understand why Guile should be different from all the rest.
I find that hard to believe, but if it's true, then I could perhaps
be convinced that we should make up a uid/gid when the user asks for
one.
If you want Guile to returns a true UID/GID on Windows, it's possible.
I can write that code, sure. But they are not simple scalar numbers
on Windows, they are variable-length arrays of several potentially
very large numbers (the so-called SIDs, or Security Identifiers), you
can read about them here:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa379571%28v=vs.85%29.aspx

I hope you will agree with me that using these SIDs where Guile
expects small integers is unwieldy. So we will have to somehow pack
these arrays into scalars, and invent a scheme that will not cause 2
different SIDs map to the same number. Or we could use only the last
component of the SID, and accept the risk of collisions every now and
then.

Of course, Posix-style functions like 'stat' don't returns these SIDs
in the st_uid and st_gid members of 'struct stat', so if Guile would
like to check whether a given file belongs to the current Guile user,
it will think that it doesn't. So we will need to replace 'stat' with
a version that does return these UID/GID numbers. And that is just
the tip of an iceberg.

But suppose we did all that -- what does this gain us in practical
terms that my simplistic solution doesn't? 500 and 513 are the last
components of the "well-known SIDs" of the Local Administrator user
and the "None" group; in the vast majority of cases, a Windows user on
her private machine will use these UID and GID, so in many cases this
is the exact result. In other cases, it is not really correct, but
will this really hamper Guile? I very much doubt that.

By contrast, what you suggest will either (a) gain us marginally more
accurate numbers that will sometimes still be incorrect, or (b) give
us the full SID support, but for a price of much more coding and
refactoring all over the place -- all this when most Windows users
don't even know their user and group SIDs and never saw them. I'm
sorry, but this makes very little sense to me.

It's not that I'm terribly smart; I simply faced these problems in
different programs and resolved them many times. In some cases, it
indeed made sense to go some of the way and implement parts of the
above; in others, it doesn't. If it turns out that Guile indeed needs
more elaborate emulation of these Posix concepts, believe me I'm well
equipped to implement them. For now, my analysis indicated that no
such complications are required. I hope you will accept my judgment,
or describe specific use cases that contradict it.
However, I will strongly oppose turning setuid or setgid into silent
no-ops.
I'm sorry to hear that. Perhaps what I say above will convince you.
Or maybe you could simply trust me, as someone who worked on these
issues for several years, even if you are not entirely convinced. If
not, I guess I should stop submitting Windows-related patches to
Guile, because we obviously have irreconcilable disagreements that
will continue fueling prolonged arguments such as this one.
I've since learned more details about what he did. He built
pthreads-w32 2.9.1 <https://www.sourceware.org/pthreads-win32/>
from source code. He also built libgc-7.2e.
As I said, I've built GC as well. I will try to build pthreads-w32
when I have time, and see if that helps. Thanks for the info.
Post by Eli Zaretskii
Post by Mark H Weaver
If you will be comfortable with a lot more Widows specific stuff, I
can work on that.
Please do!
OK, but please don't block this patch while you wait for me to do
that. It is certainly much better to have this functionality with
restrictions than not at all.
I'd rather wait until we have a proper implementation that works with
multi-threaded programs.
I'm sorry to hear this.

Please try to look at this from my POV. I submitted these changes in
June, 8 months ago. I re-submitted them again now because Ludovic
specifically asked me to do that. And look where I am after again
investing some non-trivial effort: back where I was 8 months ago. I
am once again asked to go through gnulib, which didn't work then, and
my patches are put on hold. This is all very disheartening, I must
say. Sorry, but I no longer understand why I was asked to re-submit
the patches and why I was given write access to the repo. I thought
it was because the patches are going to be accepted with minor
changes, now I'm confused.

Thanks.
Mark H Weaver
2014-02-28 07:22:19 UTC
Permalink
Hi Eli,
Post by Eli Zaretskii
Post by Mark H Weaver
"When an unhandled fatal signal terminates a process,
the exit code is 3."
If this is wrong, can you help me understand what they might have been
thinking when they wrote it?
They were thinking about a single use case: a program that calls
'abort'. Such a program on Windows exits with a status code of 3.
But that is not very interesting in the case in point, and besides,
some programs exit with code 3 for other reasons. One such example is
wget, and there are others.
More to the point, when a Windows program is terminated with an
unhandled fatal exception, it rarely if ever terminates via 'abort'.
Thanks for this explanation. You have convinced me that we should not
import the current 'sys_wait' or 'waitpid' Gnulib modules, and that we
should instead use something closer to what you wrote.

However, I don't want to use the "0xC0000200 + SIGNAL" convention
"between Guile and itself" that you've invented. Instead, based on what
you wrote, I guess that we should make WIFSIGNALED(x) return true for
any status code of the form 0xC0000xxx, or maybe some larger set. I
guess that WIFEXITED(x) should return the logical not of WIFSIGNALED(x).
WIFSTOPPED(x) presumably should always return false.
Post by Eli Zaretskii
# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
WIFSIGNALED(x) returns true if either of two high bits are 1. This
seems a bit too loose to me. Would it be better to include more bits in
the mask, and to compare it with a specific value instead of simply
checking that at least one of those high bits is 1?
Post by Eli Zaretskii
# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_val - 0xC0000200 : stat_val)
I'd prefer to define WTERMSIG differently that this, which would return
very large values like 0xC0000005 for the Windows equivalent of SIGSEGV.
I think we should try to translate these until something that will make
more sense to a program written for POSIX.

IMO, it would be reasonable to just return SIGTERM in all cases, like
Gnulib does, but perhaps we should map some known values to specific
signals. You mentioned that the equivalent of SIGSEGV on Windows
results in a 0xC0000005 status code. Any others worth mapping?

What do you think?

I will respond to the other subthreads at a later date.

Thanks,
Mark
Eli Zaretskii
2014-02-28 09:10:25 UTC
Permalink
Date: Fri, 28 Feb 2014 02:22:19 -0500
However, I don't want to use the "0xC0000200 + SIGNAL" convention
"between Guile and itself" that you've invented. Instead, based on what
you wrote, I guess that we should make WIFSIGNALED(x) return true for
any status code of the form 0xC0000xxx, or maybe some larger set. I
guess that WIFEXITED(x) should return the logical not of WIFSIGNALED(x).
WIFSTOPPED(x) presumably should always return false.
Post by Eli Zaretskii
# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
WIFSIGNALED(x) returns true if either of two high bits are 1. This
seems a bit too loose to me.
Strictly speaking, any of these two bits or both of them could be set,
although an exit code due to a fatal exception normally has both of
them set. The individual bits are used by exceptions related to
debugging a program, like the equivalent of SIGTRAP. In general, a
program launched by Guile could have a debugger attached, and then it
could potentially get an exception like 0x40010005 (Ctrl-C was pressed
into a debugged program). But I don't think these statuses will ever
be propagated back to Guile as a parent process, they are only sent to
the attached debugger, which must handle them.

With this in mind, do you like the below variant of WIFSIGNALED
better?

#define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) == 0xC0000000)

All of the fatal exceptions that are of interest exit with a status
that has these two bits set.

As for the lower bits, I didn't see anywhere documentation of which
ones can or cannot be set, and newer versions of Windows enlarge the
maximum value there, although the interesting values I do see have
non-zero bits only in the lower 12 bits. So if you prefer something
like this:

#define WIFSIGNALED(stat_val) \
(((stat_val) & 0xC0000000) == 0xC0000000 && ((stat_val) & 0xFFF) != 0)

then we could go with that as well. I would actually prefer not to
test those lower bits, since we don't really know which values there
are legitimate, and OTOH the chance that an application deliberately
chooses to exit with any of these large values for any purpose other
than reporting a fatal exception are slim at best.
Post by Eli Zaretskii
# define WTERMSIG(stat_val) ((stat_val > 0xC0000200) ? stat_val - 0xC0000200 : stat_val)
I'd prefer to define WTERMSIG differently that this, which would return
very large values like 0xC0000005 for the Windows equivalent of SIGSEGV.
I think we should try to translate these until something that will make
more sense to a program written for POSIX.
IMO, it would be reasonable to just return SIGTERM in all cases, like
Gnulib does, but perhaps we should map some known values to specific
signals. You mentioned that the equivalent of SIGSEGV on Windows
results in a 0xC0000005 status code. Any others worth mapping?
Are you still talking about a program that crashed, or are you talking
about a program that Guile forcibly killed? SIGTERM only makes sense
in the latter case. My reasoning for not producing SIGTERM
unconditionally was that it might be confusing for a Guile program
that requested termination via some signal other than SIGTERM to see
that the program was terminated by SIGTERM. I wanted WTERMSIG to
return the same signal that was used to kill the program. Since
terminating a program on Windows provides an opportunity to specify
the exit code, we might as well use that opportunity to our advantage
in this case.

As for mapping these values to Posix signals: yes, this is possible.
Here's the suggested mapping of values I consider useful:

0xC0000005 (access to invalid address) SIGSEGV
0xC0000008 (invalid handle) SIGSEGV
0xC000001D (illegal instruction) SIGILL
0xC0000025 (non-continuable exception) SIGILL
0xC000008C (array bounds exceeded) SIGSEGV
0xC000008D (float denormal) SIGFPE
0xC000008E (float divide by zero) SIGFPE
0xC000008F (float inexact) SIGFPE
0xC0000090 (float invalid operation) SIGFPE
0xC0000091 (float overflow) SIGFPE
0xC0000092 (float stack check) SIGFPE
0xC0000093 (float underflow) SIGFPE
0xC0000094 (integer divide by zero) SIGFPE
0xC0000095 (integer overflow) SIGFPE
0xC0000096 (privileged instruction) SIGILL
0xC00000FD (stack overflow) SIGSEGV
0xC000013A (Ctrl-C exit) SIGINT

(The 0xC0000200+signo trick was just a simple way around a proper
mapping, and also a way to keep out of the OS codes. Also, that code
was originally written years ago, when I don't think I knew all of the
above.)

You will see that there's no mapping for SIGTERM here. If having that
is important, we could map the last one to SIGTERM, since a program is
frequently forcibly terminated on Windows by simulating a Ctrl-C
keypress to it. (As for SIGKILL, Windows doesn't have its definition
in its headers, AFAIK.)
I will respond to the other subthreads at a later date.
Thanks.
Mark H Weaver
2014-02-28 10:20:11 UTC
Permalink
Post by Eli Zaretskii
Date: Fri, 28 Feb 2014 02:22:19 -0500
However, I don't want to use the "0xC0000200 + SIGNAL" convention
"between Guile and itself" that you've invented. Instead, based on what
you wrote, I guess that we should make WIFSIGNALED(x) return true for
any status code of the form 0xC0000xxx, or maybe some larger set. I
guess that WIFEXITED(x) should return the logical not of WIFSIGNALED(x).
WIFSTOPPED(x) presumably should always return false.
Post by Eli Zaretskii
# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
WIFSIGNALED(x) returns true if either of two high bits are 1. This
seems a bit too loose to me.
Strictly speaking, any of these two bits or both of them could be set,
[...]
Post by Eli Zaretskii
In general, a
program launched by Guile could have a debugger attached, and then it
could potentially get an exception like 0x40010005 (Ctrl-C was pressed
into a debugged program).
Okay, in that case perhaps the definitions above are the best ones.
I'll leave it to your best judgment what definitions are most likely to
be "future proof", i.e. to correctly distinguish normal exits from
abnormal termination in future versions of Windows.
Post by Eli Zaretskii
IMO, it would be reasonable to just return SIGTERM in all cases, like
Gnulib does, but perhaps we should map some known values to specific
signals. You mentioned that the equivalent of SIGSEGV on Windows
results in a 0xC0000005 status code. Any others worth mapping?
Are you still talking about a program that crashed, or are you talking
about a program that Guile forcibly killed?
I think we should try to consider both of these cases, and also a third
case: programs that are forcibly killed by something other than Guile.
Post by Eli Zaretskii
My reasoning for not producing SIGTERM
unconditionally was that it might be confusing for a Guile program
that requested termination via some signal other than SIGTERM to see
that the program was terminated by SIGTERM. I wanted WTERMSIG to
return the same signal that was used to kill the program.
I suspect that this is relatively unimportant. I suspect that it's more
important to do our best to follow the status code conventions used on
Windows, because:

* When Guile kills a process, the status code of the killed process
should make sense to its parent, even if that parent is not Guile.

* When a Guile subprocess is killed by something other than Guile,
The Guile program should be able to make sense of the status code.
Post by Eli Zaretskii
As for mapping these values to Posix signals: yes, this is possible.
0xC0000005 (access to invalid address) SIGSEGV
0xC0000008 (invalid handle) SIGSEGV
0xC000001D (illegal instruction) SIGILL
0xC0000025 (non-continuable exception) SIGILL
0xC000008C (array bounds exceeded) SIGSEGV
0xC000008D (float denormal) SIGFPE
0xC000008E (float divide by zero) SIGFPE
0xC000008F (float inexact) SIGFPE
0xC0000090 (float invalid operation) SIGFPE
0xC0000091 (float overflow) SIGFPE
0xC0000092 (float stack check) SIGFPE
0xC0000093 (float underflow) SIGFPE
0xC0000094 (integer divide by zero) SIGFPE
0xC0000095 (integer overflow) SIGFPE
0xC0000096 (privileged instruction) SIGILL
0xC00000FD (stack overflow) SIGSEGV
0xC000013A (Ctrl-C exit) SIGINT
This looks good to me. We'll also have to decide what WTERMSIG should
return by default, when the status code is none of the values above.

Also, what if the top bits are something other than 0xCxxx? Should we
mask those off before looking up the code in the table above?
Post by Eli Zaretskii
You will see that there's no mapping for SIGTERM here. If having that
is important, we could map the last one to SIGTERM, since a program is
frequently forcibly terminated on Windows by simulating a Ctrl-C
keypress to it.
Yes, I think that's probably wise.

BTW, on the stable-2.0 branch in git, I recently imported the following
modules from Gnulib:

link fsync readlink rename mkdir rmdir unistd

and removed the corresponding "#ifdef HAVE_xxx" around the associated
Scheme procedure definitions. For details, see:

http://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.0&id=3243fffcc19144fc0b30e983c3e50d1d1fc19ef4
http://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.0&id=ca6adcc6df462f325dfa7b099295fd6212d02b43

I nearly imported the 'pipe-posix' Gnulib module as well, but I noticed
Post by Eli Zaretskii
# define pipe(f) _pipe(f, 0, O_NOINHERIT)
Whereas they do: _pipe(f, 4096, _O_BINARY).

Can you help me understand the pros and cons to these two approaches?
One thing: I'm fairly sure that we'll want _O_BINARY, and I think we
should also make sure that the pipe buffer is no less than 4096 bytes.

What do you think?

Regards,
Mark
Eli Zaretskii
2014-02-28 11:26:05 UTC
Permalink
Date: Fri, 28 Feb 2014 05:20:11 -0500
Post by Eli Zaretskii
Post by Mark H Weaver
Post by Eli Zaretskii
# define WIFEXITED(stat_val) (((stat_val) & 0xC0000000) == 0)
# define WIFSIGNALED(stat_val) (((stat_val) & 0xC0000000) != 0)
WIFSIGNALED(x) returns true if either of two high bits are 1. This
seems a bit too loose to me.
Strictly speaking, any of these two bits or both of them could be set,
[...]
Post by Eli Zaretskii
In general, a
program launched by Guile could have a debugger attached, and then it
could potentially get an exception like 0x40010005 (Ctrl-C was pressed
into a debugged program).
Okay, in that case perhaps the definitions above are the best ones.
I'll leave it to your best judgment what definitions are most likely to
be "future proof", i.e. to correctly distinguish normal exits from
abnormal termination in future versions of Windows.
I think I will go with enforcing 0xC, since the other possible bit
patterns are only supposed to be exposed to debuggers.
Post by Eli Zaretskii
Post by Mark H Weaver
IMO, it would be reasonable to just return SIGTERM in all cases, like
Gnulib does, but perhaps we should map some known values to specific
signals. You mentioned that the equivalent of SIGSEGV on Windows
results in a 0xC0000005 status code. Any others worth mapping?
Are you still talking about a program that crashed, or are you talking
about a program that Guile forcibly killed?
I think we should try to consider both of these cases, and also a third
case: programs that are forcibly killed by something other than Guile.
The last case will look to us as a program that exited. I don't think
we can know that it was killed, on Windows.
Post by Eli Zaretskii
My reasoning for not producing SIGTERM
unconditionally was that it might be confusing for a Guile program
that requested termination via some signal other than SIGTERM to see
that the program was terminated by SIGTERM. I wanted WTERMSIG to
return the same signal that was used to kill the program.
I suspect that this is relatively unimportant.
Perhaps so, but it's easy to produce the same signal, so I think it's
worthwhile. We could use the values mapped from the Posix signals, if
you like that better.
I suspect that it's more important to do our best to follow the
* When Guile kills a process, the status code of the killed process
should make sense to its parent, even if that parent is not Guile.
The only conventions I know of are those codes that start with 0xC.
Not many programs interpret these special codes, but those which do
will be able to make sense of them.
* When a Guile subprocess is killed by something other than Guile,
The Guile program should be able to make sense of the status code.
As I write above, I don't think this goal is achievable on Windows.
Post by Eli Zaretskii
0xC0000005 (access to invalid address) SIGSEGV
0xC0000008 (invalid handle) SIGSEGV
0xC000001D (illegal instruction) SIGILL
0xC0000025 (non-continuable exception) SIGILL
0xC000008C (array bounds exceeded) SIGSEGV
0xC000008D (float denormal) SIGFPE
0xC000008E (float divide by zero) SIGFPE
0xC000008F (float inexact) SIGFPE
0xC0000090 (float invalid operation) SIGFPE
0xC0000091 (float overflow) SIGFPE
0xC0000092 (float stack check) SIGFPE
0xC0000093 (float underflow) SIGFPE
0xC0000094 (integer divide by zero) SIGFPE
0xC0000095 (integer overflow) SIGFPE
0xC0000096 (privileged instruction) SIGILL
0xC00000FD (stack overflow) SIGSEGV
0xC000013A (Ctrl-C exit) SIGINT
This looks good to me.
OK, I will rewrite WTERMSIG and scm_kill to use this mapping.
We'll also have to decide what WTERMSIG should return by default,
when the status code is none of the values above.
I don't see how we can return anything but zero, because we cannot
know what was the signal then, or indeed whether the program was
terminated by a signal.
Also, what if the top bits are something other than 0xCxxx? Should we
mask those off before looking up the code in the table above?
I'm not sure I follow: when a program terminates due to fatal
exception, it _always_ returns one of these codes (there are a few
others, but they don't map to Posix signals, so I left them out).
Therefore I think we should compare the exit code literally to these
values.
I nearly imported the 'pipe-posix' Gnulib module as well, but I noticed
Post by Eli Zaretskii
# define pipe(f) _pipe(f, 0, O_NOINHERIT)
Whereas they do: _pipe(f, 4096, _O_BINARY).
Can you help me understand the pros and cons to these two approaches?
One thing: I'm fairly sure that we'll want _O_BINARY, and I think we
should also make sure that the pipe buffer is no less than 4096 bytes.
First, the binary and the no-inherit flags can be ORed, they are not
mutually exclusive.

I used the no-inherit flag because otherwise the descriptors will be
inherited by the subprocesses. It seemed to me that preventing their
inheritance is a better way. It's not really important in
open-process, because the code I wrote duplicates the handles (which
on Windows strips the no-inheritance bit). But it might be important
in uses of scm_pipe. I think you are in a better position to judge
whether the descriptors should be inherited or not. (Is there a way
in Guile to make a descriptor non-inheritable after it was created?)

As for using O_BINARY, I think using it is wrong in Guile: most
programs that will communicate with us via pipes will use text-mode
I/O on Windows, so reading their output in binary mode would mean we
get CRLF end-of-lines verbatim, and Guile programs that do this will
need special code to get rid of the CR characters. Can Guile code
control this aspect of pipe I/O after the pipe was created?

As for the 4096 value, zero is documented as meaning "use the
default", but I couldn't find anywhere what that means. So I
preferred an explicit value.

Why do you think 4K might be insufficient?

Eli Zaretskii
2014-02-24 21:12:19 UTC
Permalink
Date: Mon, 24 Feb 2014 13:33:56 -0500
Post by Eli Zaretskii
Post by Mark H Weaver
For example, if it is true that we can avoid all the nasty problems with
filename encoding by using the native Windows APIs that use UTF-16 for
filenames, then I'd be in favor of doing that.
What nasty problems do you have in mind? I implemented something like
this in Emacs not long ago, so perhaps I can help here.
The nasty problem is that POSIX uses sequences of bytes for filenames,
although conceptually filenames are character strings, and in fact
virtually all user interfaces treat filenames as character strings.
Guile uses character strings for filenames (the only sane thing to do),
and it would be good to build Guile on system APIs that also use
character strings for filenames, instead of having to guess how to
encode the characters into bytes.
We don't have a fully satisfactory solution to this problem on POSIX,
but I guess we do on Windows, if we use the native Windows APIs.
BTW, the same problems exist for command-line arguments, environment
variables, the hostname, etc. All of these are sequences of bytes in
POSIX, but conceptually they should be character strings.
If you'd like to work on a patch to have Guile use the native Windows
APIs (that use UTF-16) for these things, I think that would be very
useful and worthy of inclusion.
This issue needs to be carefully designed first. File names are easy,
as long as Guile and the OS are concerned. Environment variables and
command-line arguments likewise. But once you need to display those
file names or variables, or ask the user to type them, there are
problems that don't have good solutions yet, at least not in Guile
applications that use the text terminal for display.

First, you need to bypass the usual stdio output routines and use
special APIs. And after you've done that, you'll bump into the fact
that Windows console devices are limited in their ability to support
Unicode characters outside of the system locale; basically anything
beyond European scripts is not supported. (Emacs avoids this problem
because its usual UI is a graphical one, where fonts and layout
engines are available that support almost any script in existence.)
Likewise for keyboard input: typing non-ASCII text into the Windows
console outside of the current console codepage is a tricky business;
basically, you need to completely bypass the "normal" stdio functions
and use Windows specific console APIs and Windows input methods.

There's also the issue of invoking other programs with arguments that
include Unicode characters. Most programs that Guile will invoke on
Windows do not support that, they are "normal" console programs that
only support characters encoded in the current console codepage.
Windows will transparently convert from Unicode to the codepage
encoding, but if there are characters outside of that codepage, they
will be omitted or replaced by placebos, which might cause strange
failures.

There are also complications when calling functions from external
libraries that accept file names: those libraries will not normally
support Unicode characters in file names. But this problem can be
solved by a known trick of using the 8+3 short aliases of the file
names, which use only ASCII characters.

So to provide something useful in this department, we need to discuss
what portions of Guile it is sensible and practical to convert to
Unicode, and how to treat those areas where we won't. I will
certainly need some insider's help in this.
Eli Zaretskii
2014-02-22 18:20:46 UTC
Permalink
Date: Sat, 22 Feb 2014 10:54:16 -0500
=20
Thanks for working on this, but in a multithreaded program, it's no=
good
to change the file descriptors in the main program temporarily befo=
re
spawning, and then restore them afterwards. We'll have to find ano=
ther
way of doing this.
Btw, how does the Posix build work reliably when it forks after
several threads are already running? I don't see any calls to
pthread_atfork or any similar machinery in place. What am I missing?
Mark H Weaver
2014-02-22 22:02:35 UTC
Permalink
Post by Eli Zaretskii
Date: Sat, 22 Feb 2014 10:54:16 -0500
Thanks for working on this, but in a multithreaded program, it's no good
to change the file descriptors in the main program temporarily before
spawning, and then restore them afterwards. We'll have to find another
way of doing this.
Btw, how does the Posix build work reliably when it forks after
several threads are already running? I don't see any calls to
pthread_atfork or any similar machinery in place. What am I missing?
It's safe to fork a multithreaded program without using pthread_atfork
if only async-signal-safe functions are called before the exec. In
fact, that's why Andy rewrote 'open-process' in C, as it says in the
comment above that function.

"If a multi-threaded process calls fork(), the new process shall
contain a replica of the calling thread and its entire address space,
possibly including the states of mutexes and other resources.
Consequently, to avoid errors, the child process may only execute
async-signal-safe operations until such time as one of the exec
functions is called."

http://pubs.opengroup.org/onlinepubs/000095399/functions/fork.html

Mark
Eli Zaretskii
2014-02-23 03:45:02 UTC
Permalink
Date: Sat, 22 Feb 2014 17:02:35 -0500
=20
=20
Post by Eli Zaretskii
Date: Sat, 22 Feb 2014 10:54:16 -0500
=20
Thanks for working on this, but in a multithreaded program, it's=
no good
Post by Eli Zaretskii
to change the file descriptors in the main program temporarily b=
efore
Post by Eli Zaretskii
spawning, and then restore them afterwards. We'll have to find =
another
Post by Eli Zaretskii
way of doing this.
Btw, how does the Posix build work reliably when it forks after
several threads are already running? I don't see any calls to
pthread_atfork or any similar machinery in place. What am I miss=
ing?
=20
It's safe to fork a multithreaded program without using pthread_atf=
ork
if only async-signal-safe functions are called before the exec.
You may know what your code does between fork and exec, but you don't
know what other parts do, like pthreads or the application that calle=
d
Guile as a library.
Mark H Weaver
2014-02-23 04:40:20 UTC
Permalink
Post by Eli Zaretskii
Date: Sat, 22 Feb 2014 17:02:35 -0500
Post by Eli Zaretskii
Date: Sat, 22 Feb 2014 10:54:16 -0500
Thanks for working on this, but in a multithreaded program, it's no good
to change the file descriptors in the main program temporarily before
spawning, and then restore them afterwards. We'll have to find another
way of doing this.
Btw, how does the Posix build work reliably when it forks after
several threads are already running? I don't see any calls to
pthread_atfork or any similar machinery in place. What am I missing?
It's safe to fork a multithreaded program without using pthread_atfork
if only async-signal-safe functions are called before the exec.
You may know what your code does between fork and exec, but you don't
know what other parts do, like pthreads or the application that called
Guile as a library.
I'm not sure I understand what you mean here. The relevant code here is
between line 1366 (/* The child. */) and the call to execvp on line
1408. I see calls to 'close', 'open', 'dup', and 'dup2'.

How could "pthreads" or "the application that called Guile" cause
anything else to happen between fork and exec in the new single-thread
child process?

Mark
Eli Zaretskii
2014-02-23 17:46:19 UTC
Permalink
Date: Sat, 22 Feb 2014 23:40:20 -0500
Post by Eli Zaretskii
Post by Mark H Weaver
It's safe to fork a multithreaded program without using pthread_atfork
if only async-signal-safe functions are called before the exec.
You may know what your code does between fork and exec, but you don't
know what other parts do, like pthreads or the application that called
Guile as a library.
I'm not sure I understand what you mean here. The relevant code here is
between line 1366 (/* The child. */) and the call to execvp on line
1408. I see calls to 'close', 'open', 'dup', and 'dup2'.
How could "pthreads" or "the application that called Guile" cause
anything else to happen between fork and exec in the new single-thread
child process?
Pthreads could have locked some mutex before you fork, for example, in
which case the child starts with a locked mutex and no live thread
that will release it any time soon. The application could have done
similar things, especially if it also uses threads, or some library
that uses shared memory.

Btw, I didn't write above anything that is not described here:

http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html
Mark H Weaver
2014-02-23 20:14:34 UTC
Permalink
Post by Eli Zaretskii
Date: Sat, 22 Feb 2014 23:40:20 -0500
Post by Eli Zaretskii
Post by Mark H Weaver
It's safe to fork a multithreaded program without using pthread_atfork
if only async-signal-safe functions are called before the exec.
You may know what your code does between fork and exec, but you don't
know what other parts do, like pthreads or the application that called
Guile as a library.
I'm not sure I understand what you mean here. The relevant code here is
between line 1366 (/* The child. */) and the call to execvp on line
1408. I see calls to 'close', 'open', 'dup', and 'dup2'.
How could "pthreads" or "the application that called Guile" cause
anything else to happen between fork and exec in the new single-thread
child process?
Pthreads could have locked some mutex before you fork, for example, in
which case the child starts with a locked mutex and no live thread
that will release it any time soon.
This is precisely why it's important to use only async-signal-safe
functions between fork and exec in the child process. Such functions,
by contract, already have to cope with data structures (such as mutexes)
being in an inconsistent state that will never become consistent no
longer how long they wait, because the thread that has the mutex locked
(or is in the middle of locking/unlocking a mutex) might be the same
thread that the signal handler is running in. They also cannot use
mutexes at all.
Post by Eli Zaretskii
The application could have done similar things, especially if it also
uses threads, or some library that uses shared memory.
Any async-signal-safe function has to cope with this kind of thing, by
contract.

I don't know what else to say. We have followed the widely recognized
guidelines for how fork+exec from a multithreaded program, as documented
in the POSIX spec and elsewhere. If you still think there's a problem
here, please do the research and bring us something more specific, or
construct a test case that demonstrates a problem.

Thanks,
Mark
Eli Zaretskii
2014-02-22 10:57:34 UTC
Permalink
Date: Tue, 18 Feb 2014 18:32:34 +0100
=20
http://lists.gnu.org/archive/html/guile-user/2013-06/msg00037.h=
tml
=20
What about exposing %shell-command-name and %shell-command-switch a=
s you
suggested back then, and using that in popen.scm?
That'd be fine too.

What about the -export-dynamic switch to the linker, as mentioned in
the same message -- was that issue resolved in the meantime?
Could you resubmit this patch in =E2=80=98git format-patch=E2=80=
=99 format, with a
ChangeLog-style commit log, and in a separate message?
Done (I hope). It's not exactly =E2=80=98git format-patch=E2=80=99, =
but it should be
close enough.

For my convenience, I prefer to prepare a real ChangeLog entry, then
copy that to a commit message, which leaves a ChangeLog file around.
May I add ChangeLog to libguile/.gitignore, so that ChangeLog is not
flagged by "git status"?
Neil Jerram
2014-02-22 12:23:26 UTC
Permalink
Post by Eli Zaretskii
For my convenience, I prefer to prepare a real ChangeLog entry, then
copy that to a commit message, which leaves a ChangeLog file around.
May I add ChangeLog to libguile/.gitignore, so that ChangeLog is not
flagged by "git status"?
In case you don't already know: to-be-ignored files can also be added to
.git/info/exclude. In that case the ignoring is part of your own local
metadata, and will usefully apply retrospectively to your whole
repository, rather than being committed at a particular point to the
repository content.

Regards,
Neil
Eli Zaretskii
2014-02-22 13:02:22 UTC
Permalink
Date: Sat, 22 Feb 2014 12:23:26 +0000
=20
In case you don't already know: to-be-ignored files can also be add=
ed to
.git/info/exclude. In that case the ignoring is part of your own l=
ocal
metadata, and will usefully apply retrospectively to your whole
repository, rather than being committed at a particular point to th=
e
repository content.
Thanks.

My question still stands, though, because others might want to ignore
ChangeLog files in subdirectories.
Eli Zaretskii
2014-02-22 11:06:23 UTC
Permalink
Date: Tue, 18 Feb 2014 18:32:34 +0100
=20
For convenience and to hopefully smooth the process, I=E2=80=99ve a=
dded you to
the Savannah group. Please post here for review before pushing.
Thanks, but I have one question after reading HACKING: is it required
to rebase changes before pushing? Say, if I commit locally, then,
while waiting for review, someone else pushes to master -- can I then
push once the patch is approved, without rebasing it?
Mark H Weaver
2014-02-22 15:59:17 UTC
Permalink
Post by Eli Zaretskii
Date: Tue, 18 Feb 2014 18:32:34 +0100
For convenience and to hopefully smooth the process, I’ve added you to
the Savannah group. Please post here for review before pushing.
Thanks, but I have one question after reading HACKING: is it required
to rebase changes before pushing? Say, if I commit locally, then,
while waiting for review, someone else pushes to master -- can I then
push once the patch is approved, without rebasing it?
We generally rebase before pushing, to keep the git history as linear as
reasonably possible.

Thanks,
Mark
Eli Zaretskii
2014-02-22 16:36:26 UTC
Permalink
X-Spam-Status: No, score=3D-0.0 required=3D5.0 tests=3DBAYES_40 aut=
olearn=3Ddisabled
=09version=3D3.3.2
Date: Sat, 22 Feb 2014 10:59:17 -0500
=20
=20
Date: Tue, 18 Feb 2014 18:32:34 +0100
=20
For convenience and to hopefully smooth the process, I=E2=80=
=99ve added you to
the Savannah group. Please post here for review before pushing.
Thanks, but I have one question after reading HACKING: is it requ=
ired
to rebase changes before pushing? Say, if I commit locally, then=
,
while waiting for review, someone else pushes to master -- can I =
then
push once the patch is approved, without rebasing it?
=20
We generally rebase before pushing, to keep the git history as line=
ar as
reasonably possible.
May I suggest to have this in HACKING, including the (probably
necessary) "pull --rebase" that should go with it?
Mark H Weaver
2014-02-23 14:21:12 UTC
Permalink
Post by Eli Zaretskii
Post by Mark H Weaver
We generally rebase before pushing, to keep the git history as linear as
reasonably possible.
May I suggest to have this in HACKING, including the (probably
necessary) "pull --rebase" that should go with it?
Sure, that makes sense. Would you like to propose a patch?

Thanks,
Mark
Eli Zaretskii
2014-02-23 18:06:05 UTC
Permalink
Date: Sun, 23 Feb 2014 09:21:12 -0500
Post by Eli Zaretskii
Post by Mark H Weaver
We generally rebase before pushing, to keep the git history as linear as
reasonably possible.
May I suggest to have this in HACKING, including the (probably
necessary) "pull --rebase" that should go with it?
Sure, that makes sense. Would you like to propose a patch?
I could try, but I'm still largely a git newbie (my dVCS of choice is
bzr), so perhaps it would be better if someone else did that.
Mark H Weaver
2014-02-19 07:50:36 UTC
Permalink
Post by Eli Zaretskii
Also, since the only way I could get a functional MinGW Guile was to
configure it without threads, I would suggest that this be the default
for MinGW, but that isn't a big deal.
FWIW, the situation seems to have improved since you last looked. In
the last couple of weeks, madsy on #guile reported cross-building a
recent Guile snapshot (stable-2.0 branch) using MinGW, with thread
support enabled and without --disable-posix, and it seems to work
reasonably well. It runs the REPL without problems and passes much of
the test suite.

Mark
Eli Zaretskii
2014-02-19 16:42:50 UTC
Permalink
Date: Wed, 19 Feb 2014 02:50:36 -0500
=20
Also, since the only way I could get a functional MinGW Guile was=
to
configure it without threads, I would suggest that this be the de=
fault
for MinGW, but that isn't a big deal.
=20
FWIW, the situation seems to have improved since you last looked. =
In
the last couple of weeks, madsy on #guile reported cross-building a
recent Guile snapshot (stable-2.0 branch) using MinGW
Which MinGW? It sounds like nowadays one needs to distinguish betwee=
n
the various flavors. The exact port of pthreads and gc might also
matter.
with thread support enabled and without --disable-posix, and it
seems to work reasonably well. It runs the REPL without problems
and passes much of the test suite.
Good to know; I hope to see a release some place near me in a
not-so-distant future, and will test this at that time.
Doug Evans
2014-02-18 17:31:00 UTC
Permalink
Post by Ludovic Courtès
Right, when Guile is built with pthread support, it has a signal
delivery thread. The actual SIGINT handler ('take_signal' in scmsigs.c)
just write one byte to a pipe; the signal delivery thread reads from
that pipe, and queues an async in the destination thread for later
execution.
Post by Doug Evans
I'll let guile-devel take over at this point - I understand the code,
but may miss something noteworthy.
There is code in scmsigs.c to handle the non-pthread case but it's not
clear how much is exported nor how well it works.
The non-pthread code is used when Guile is built without pthread
support. In that case, the async is queued directly from the signal
handler.
(I think we should aim to get rid of the signal-delivery thread
eventually, and I remember Mark mentioned it before too.)
Note that Python queues the asyncs directly from the signal handler,
even when it has thread support.
I'm not sure if there are any problems in Python's implementation;
asyncs can be queued from any thread but only the main thread runs them.

Guile would need to come up with its own implementation of course;
plus Guile can direct signals to any thread.
Ludovic Courtès
2014-02-18 17:42:36 UTC
Permalink
(Dropping GDB.)
[...]
Post by Doug Evans
Post by Ludovic Courtès
(I think we should aim to get rid of the signal-delivery thread
eventually, and I remember Mark mentioned it before too.)
Note that Python queues the asyncs directly from the signal handler,
even when it has thread support.
I'm not sure if there are any problems in Python's implementation;
asyncs can be queued from any thread but only the main thread runs them.
Guile would need to come up with its own implementation of course;
plus Guile can direct signals to any thread.
There are two difficulties:

1. The signal handler can only call async-signal-safe functions, which
complicates things.

2. In Guile, the ‘sigaction’ procedure can specify the thread that
will run the async.

Ludo’.
Doug Evans
2014-02-18 17:56:49 UTC
Permalink
Post by Ludovic Courtès
(Dropping GDB.)
[...]
Post by Doug Evans
Post by Ludovic Courtès
(I think we should aim to get rid of the signal-delivery thread
eventually, and I remember Mark mentioned it before too.)
Note that Python queues the asyncs directly from the signal handler,
even when it has thread support.
I'm not sure if there are any problems in Python's implementation;
asyncs can be queued from any thread but only the main thread runs them.
Guile would need to come up with its own implementation of course;
plus Guile can direct signals to any thread.
1. The signal handler can only call async-signal-safe functions, which
complicates things.
That's why I'm curious whether Python's solution has any problems.
Someone (potentially not a Guile developer - not sure how to address
copyright issues) should look into that. I will I hope, dunno how
soon though.
Post by Ludovic Courtès
2. In Guile, the 'sigaction' procedure can specify the thread that
will run the async.
Ludo'.
Guile solves that by registering ahead of time which thread.
I don't see it as an added difficulty.

btw, I'm testing a patch that proposes augmenting Guile's signal
handling support.
sigaction tries to do too much. For some apps libguile needs to
provide something with a finer granularity, so to speak.

I see (at least) two high level problems.

1) Some apps need ability to use their own signal handlers.
2) Remove need for a separate thread.

I'm not addressing (2) in this patch, though I am allowing for the day
when the signal delivery thread is gone (the API needn't change).

I'm addressing (1) by exporting two things:
a) ability to record a signal with Guile in an async-safe way (for the
present implementation that means basically by exporting the
pipe-writing part of take_signal).
b) ability to specify in advance which function to call and on which
thread to process the signal, basically by exporting install_handler.

I need to make sure these play well with sigaction. I haven't come
across any problems, but it's early. :-)

Thoughts?
Ludovic Courtès
2014-02-18 23:06:15 UTC
Permalink
Post by Doug Evans
I see (at least) two high level problems.
1) Some apps need ability to use their own signal handlers.
You mean the “real” signal handler, right?
Post by Doug Evans
2) Remove need for a separate thread.
I'm not addressing (2) in this patch, though I am allowing for the day
when the signal delivery thread is gone (the API needn't change).
a) ability to record a signal with Guile in an async-safe way (for the
present implementation that means basically by exporting the
pipe-writing part of take_signal).
b) ability to specify in advance which function to call and on which
thread to process the signal, basically by exporting install_handler.
Sounds like a plan. I wonder if this might expose too much, but we’ll
see with the patch. :-)

Thanks for looking into this!

Ludo’.
Doug Evans
2014-02-19 01:58:37 UTC
Permalink
Post by Doug Evans
I see (at least) two high level problems.
1) Some apps need ability to use their own signal handlers.
You mean the "real" signal handler, right?
Right.
Post by Doug Evans
2) Remove need for a separate thread.
I'm not addressing (2) in this patch, though I am allowing for the day
when the signal delivery thread is gone (the API needn't change).
a) ability to record a signal with Guile in an async-safe way (for the
present implementation that means basically by exporting the
pipe-writing part of take_signal).
b) ability to specify in advance which function to call and on which
thread to process the signal, basically by exporting install_handler.
Sounds like a plan. I wonder if this might expose too much, but we'll
see with the patch. :-)
If we can agree that libguile should support an app installing it's
own "real" signal handler then (a) falls out from that, and (b)
doesn't export anything that sigaction doesn't already export.

I'll hopefully have a patch ready in a few days (it's trivial, but
I've owe other folks some patches :-)).
Thanks for looking into this!
Ludo'.
Continue reading on narkive:
Loading...