Commit ebf5ebe

Ingo Molnar authored and Linus Torvalds committed
[PATCH] signal-fixes-2.5.59-A4
this is the current threading patchset, which accumulated during the past two weeks. It consists of a big set of changes from Roland, to make threaded signals work. There were still tons of testcases and boundary conditions (mostly in the signal/exit/ptrace area) that we did not handle correctly.

Roland's thread-signal semantics/behavior/ptrace fixes:

- fix signal delivery race with do_exit() => signals are re-queued to the 'process' if do_exit() finds pending unhandled ones. This prevents signals getting lost upon thread-sys_exit().

- a non-main thread has died on one processor and gone to TASK_ZOMBIE, but before it's gotten to release_task a sys_wait4 on the other processor reaps it. It's only because it's ptraced that this gets through eligible_child. Somewhere in there the main thread is also dying so it reparents the child thread to hit that case. This means that there is a race where P might be totally invalid.

- forget_original_parent is not doing the right thing when the group leader dies, i.e. reparenting threads to init when there is a zombie group leader. Perhaps it doesn't matter for any practical purpose without ptrace, though it makes for ppid=1 for each thread in core dumps, which looks funny. Incidentally, SIGCHLD here really should be p->exit_signal.

- one of the gdb tests makes a questionable assumption about what kill will do when it has some threads stopped by ptrace and others running.

exit races:

1. Processor A is in sys_wait4 case TASK_STOPPED considering task P. Processor B is about to resume P and then switch to it.

   While A is inside that case block, B starts running P and it clears P->exit_code, or takes a pending fatal signal and sets it to a new value. Depending on the interleaving, the possible failure modes are:

   a. A gets to its put_user after B has cleared P->exit_code => returns with WIFSTOPPED, WSTOPSIG==0

   b. A gets to its put_user after B has set P->exit_code anew => returns with e.g. WIFSTOPPED, WSTOPSIG==SIGKILL

   A can spend an arbitrarily long time in that case block, because there's getrusage and put_user that can take page faults, and write_lock'ing of the tasklist_lock that can block. But even if it's short the race is there in principle.

2. This is new with NPTL, i.e. CLONE_THREAD. Two processors A and B are both in sys_wait4 case TASK_STOPPED considering task P.

   Both get through their tests and fetches of P->exit_code before either gets to P->exit_code = 0. => two threads return the same pid from waitpid.

   In other interleavings where one processor gets to its put_user after the other has cleared P->exit_code, it's like case 1(a).

3. SMP races with stop/cont signals

   First, take:

       kill(pid, SIGSTOP);
       kill(pid, SIGCONT);

   or:

       kill(pid, SIGSTOP);
       kill(pid, SIGKILL);

   It's possible for this to leave the process stopped with a pending SIGCONT/SIGKILL. That's a state that should never be possible. Moreover, kill(pid, SIGKILL) without any repetition should always be enough to kill a process. (Likewise SIGCONT, when you know it's sequenced after the last stop signal, must be sufficient to resume a process.)

4. take:

       kill(pid, SIGKILL); // or any fatal signal
       kill(pid, SIGCONT); // or SIGKILL

   it's possible for this to cause pid to be reaped with status 0 instead of its true termination status. The equivalent scenario happens when the process being killed is in an _exit call or a trap-induced fatal signal before the kills.

plus i've done stability fixes for bugs that popped up during beta-testing, and minor tidying of Roland's changes:

- a rare tasklist corruption during exec, causing some very spurious and colorful crashes.

- a copy_process()-related dereference of already freed thread structure if hit with a SIGKILL in the wrong moment.

- SMP spinlock deadlocks in the signal code

this patchset has been tested quite well in the 2.4 backport of the threading changes - and i've done some stresstesting on 2.5.59 SMP as well, and did an x86 UP testcompile + testboot as well.
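The stop/kill sequence from race 3 can be probed from user space. The following is only a minimal sketch, assuming a POSIX system with fork() and waitpid(); the helper name stop_then_kill is hypothetical and is not part of this patch. It stops a child, confirms the stop, sends a single SIGKILL and reports how the child terminated. A single run cannot prove the SMP race is gone; it only checks the expected outcome.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: stop a child, then kill it once.
 * Returns the terminating signal of the child, or -1 if any
 * step did not behave as expected.
 */
int stop_then_kill(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		for (;;)
			pause();	/* child: just wait for signals */
	}
	assert(pid > 0);		/* only the parent gets here */

	if (kill(pid, SIGSTOP) < 0)
		goto fail;
	/* WUNTRACED makes waitpid report the stop, not only the exit. */
	if (waitpid(pid, &status, WUNTRACED) != pid)
		goto fail;
	if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
		goto fail;

	/* One SIGKILL, without repetition, must be enough. */
	if (kill(pid, SIGKILL) < 0)
		goto fail;
	if (waitpid(pid, &status, 0) != pid || !WIFSIGNALED(status))
		return -1;
	return WTERMSIG(status);

fail:
	/* Best-effort cleanup so no stopped child is left behind. */
	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
	return -1;
}
```

On a kernel that handles the sequence correctly, stop_then_kill() is expected to return SIGKILL; a stress test would call it in a loop on an SMP machine.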
1 parent 44a5a59 commit ebf5ebe

6 files changed: +779 -470 lines changed

fs/exec.c

Lines changed: 4 additions & 2 deletions
@@ -587,7 +587,7 @@ static inline int de_thread(struct signal_struct *oldsig)
 		return -EAGAIN;
 	}
 	oldsig->group_exit = 1;
-	__broadcast_thread_group(current, SIGKILL);
+	zap_other_threads(current);

 	/*
 	 * Account for the thread group leader hanging around:
@@ -659,7 +659,8 @@ static inline int de_thread(struct signal_struct *oldsig)
 		current->ptrace = ptrace;
 		__ptrace_link(current, parent);
 	}
-	
+
+	list_del(&current->tasks);
 	list_add_tail(&current->tasks, &init_task.tasks);
 	current->exit_signal = SIGCHLD;
 	state = leader->state;
@@ -680,6 +681,7 @@ static inline int de_thread(struct signal_struct *oldsig)
 	newsig->group_exit = 0;
 	newsig->group_exit_code = 0;
 	newsig->group_exit_task = NULL;
+	newsig->group_stop_count = 0;
 	memcpy(newsig->action, current->sig->action, sizeof(newsig->action));
 	init_sigpending(&newsig->shared_pending);

include/linux/sched.h

Lines changed: 8 additions & 2 deletions
@@ -235,6 +235,9 @@ struct signal_struct {
 	int group_exit;
 	int group_exit_code;
 	struct task_struct *group_exit_task;
+
+	/* thread group stop support, overloads group_exit_code too */
+	int group_stop_count;
 };

 /*
@@ -508,7 +511,6 @@ extern int in_egroup_p(gid_t);
 extern void proc_caches_init(void);
 extern void flush_signals(struct task_struct *);
 extern void flush_signal_handlers(struct task_struct *);
-extern void sig_exit(int, int, struct siginfo *);
 extern int dequeue_signal(sigset_t *mask, siginfo_t *info);
 extern void block_all_signals(int (*notifier)(void *priv), void *priv,
 			      sigset_t *mask);
@@ -525,7 +527,7 @@ extern void do_notify_parent(struct task_struct *, int);
 extern void force_sig(int, struct task_struct *);
 extern void force_sig_specific(int, struct task_struct *);
 extern int send_sig(int, struct task_struct *, int);
-extern int __broadcast_thread_group(struct task_struct *p, int sig);
+extern void zap_other_threads(struct task_struct *p);
 extern int kill_pg(pid_t, int, int);
 extern int kill_sl(pid_t, int, int);
 extern int kill_proc(pid_t, int, int);
@@ -590,6 +592,8 @@ extern void exit_files(struct task_struct *);
 extern void exit_sighand(struct task_struct *);
 extern void __exit_sighand(struct task_struct *);

+extern NORET_TYPE void do_group_exit(int);
+
 extern void reparent_to_init(void);
 extern void daemonize(void);
 extern task_t *child_reaper;
@@ -762,6 +766,8 @@ static inline void cond_resched_lock(spinlock_t * lock)
 extern FASTCALL(void recalc_sigpending_tsk(struct task_struct *t));
 extern void recalc_sigpending(void);

+extern void signal_wake_up(struct task_struct *t, int resume_stopped);
+
 /*
  * Wrappers for p->thread_info->cpu access. No-op on UP.
  */

kernel/exit.c

Lines changed: 124 additions & 24 deletions
@@ -647,7 +647,7 @@ NORET_TYPE void do_exit(long code)
 	exit_namespace(tsk);
 	exit_thread();

-	if (current->leader)
+	if (tsk->leader)
 		disassociate_ctty(1);

 	module_put(tsk->thread_info->exec_domain->module);
@@ -657,8 +657,31 @@ NORET_TYPE void do_exit(long code)
 	tsk->exit_code = code;
 	exit_notify();
 	preempt_disable();
-	if (current->exit_signal == -1)
-		release_task(current);
+	if (signal_pending(tsk) && !tsk->sig->group_exit
+	    && !thread_group_empty(tsk)) {
+		/*
+		 * This occurs when there was a race between our exit
+		 * syscall and a group signal choosing us as the one to
+		 * wake up. It could be that we are the only thread
+		 * alerted to check for pending signals, but another thread
+		 * should be woken now to take the signal since we will not.
+		 * Now we'll wake all the threads in the group just to make
+		 * sure someone gets all the pending signals.
+		 */
+		struct task_struct *t;
+		read_lock(&tasklist_lock);
+		spin_lock_irq(&tsk->sig->siglock);
+		for (t = next_thread(tsk); t != tsk; t = next_thread(t))
+			if (!signal_pending(t) && !(t->flags & PF_EXITING)) {
+				recalc_sigpending_tsk(t);
+				if (signal_pending(t))
+					signal_wake_up(t, 0);
+			}
+		spin_unlock_irq(&tsk->sig->siglock);
+		read_unlock(&tasklist_lock);
+	}
+	if (tsk->exit_signal == -1)
+		release_task(tsk);
 	schedule();
 	BUG();
 	/*
@@ -710,31 +733,44 @@ task_t *next_thread(task_t *p)
 }

 /*
- * this kills every thread in the thread group. Note that any externally
- * wait4()-ing process will get the correct exit code - even if this
- * thread is not the thread group leader.
+ * Take down every thread in the group. This is called by fatal signals
+ * as well as by sys_exit_group (below).
  */
-asmlinkage long sys_exit_group(int error_code)
+NORET_TYPE void
+do_group_exit(int exit_code)
 {
-	unsigned int exit_code = (error_code & 0xff) << 8;
-
-	if (!thread_group_empty(current)) {
-		struct signal_struct *sig = current->sig;
+	BUG_ON(exit_code & 0x80); /* core dumps don't get here */

+	if (current->sig->group_exit)
+		exit_code = current->sig->group_exit_code;
+	else if (!thread_group_empty(current)) {
+		struct signal_struct *const sig = current->sig;
+		read_lock(&tasklist_lock);
 		spin_lock_irq(&sig->siglock);
-		if (sig->group_exit) {
-			spin_unlock_irq(&sig->siglock);
-
-			/* another thread was faster: */
-			do_exit(sig->group_exit_code);
-		}
+		if (sig->group_exit)
+			/* Another thread got here before we took the lock. */
+			exit_code = sig->group_exit_code;
+		else {
 		sig->group_exit = 1;
 		sig->group_exit_code = exit_code;
-		__broadcast_thread_group(current, SIGKILL);
+		zap_other_threads(current);
+		}
 		spin_unlock_irq(&sig->siglock);
+		read_unlock(&tasklist_lock);
 	}

 	do_exit(exit_code);
+	/* NOTREACHED */
+}
+
+/*
+ * this kills every thread in the thread group. Note that any externally
+ * wait4()-ing process will get the correct exit code - even if this
+ * thread is not the thread group leader.
+ */
+asmlinkage long sys_exit_group(int error_code)
+{
+	do_group_exit((error_code & 0xff) << 8);
 }

 static int eligible_child(pid_t pid, int options, task_t *p)
@@ -800,6 +836,8 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
 		int ret;

 		list_for_each(_p,&tsk->children) {
+			int exit_code;
+
 			p = list_entry(_p,struct task_struct,sibling);

 			ret = eligible_child(pid, options, p);
@@ -813,20 +851,69 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
 					continue;
 				if (!(options & WUNTRACED) && !(p->ptrace & PT_PTRACED))
 					continue;
+				if (ret == 2 && !(p->ptrace & PT_PTRACED) &&
+				    p->sig && p->sig->group_stop_count > 0)
+					/*
+					 * A group stop is in progress and
+					 * we are the group leader. We won't
+					 * report until all threads have
+					 * stopped.
+					 */
+					continue;
 				read_unlock(&tasklist_lock);

 				/* move to end of parent's list to avoid starvation */
 				write_lock_irq(&tasklist_lock);
 				remove_parent(p);
 				add_parent(p, p->parent);
+
+				/*
+				 * This uses xchg to be atomic with
+				 * the thread resuming and setting it.
+				 * It must also be done with the write
+				 * lock held to prevent a race with the
+				 * TASK_ZOMBIE case (below).
+				 */
+				exit_code = xchg(&p->exit_code, 0);
+				if (unlikely(p->state > TASK_STOPPED)) {
+					/*
+					 * The task resumed and then died.
+					 * Let the next iteration catch it
+					 * in TASK_ZOMBIE. Note that
+					 * exit_code might already be zero
+					 * here if it resumed and did
+					 * _exit(0). The task itself is
+					 * dead and won't touch exit_code
+					 * again; other processors in
+					 * this function are locked out.
+					 */
+					p->exit_code = exit_code;
+					exit_code = 0;
+				}
+				if (unlikely(exit_code == 0)) {
+					/*
+					 * Another thread in this function
+					 * got to it first, or it resumed,
+					 * or it resumed and then died.
+					 */
+					write_unlock_irq(&tasklist_lock);
+					continue;
+				}
+				/*
+				 * Make sure this doesn't get reaped out from
+				 * under us while we are examining it below.
+				 * We don't want to keep holding onto the
+				 * tasklist_lock while we call getrusage and
+				 * possibly take page faults for user memory.
+				 */
+				get_task_struct(p);
 				write_unlock_irq(&tasklist_lock);
 				retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0;
 				if (!retval && stat_addr)
-					retval = put_user((p->exit_code << 8) | 0x7f, stat_addr);
-				if (!retval) {
-					p->exit_code = 0;
+					retval = put_user((exit_code << 8) | 0x7f, stat_addr);
+				if (!retval)
 					retval = p->pid;
-				}
+				put_task_struct(p);
 				goto end_wait4;
 			case TASK_ZOMBIE:
 				/*
@@ -841,6 +928,13 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
 				state = xchg(&p->state, TASK_DEAD);
 				if (state != TASK_ZOMBIE)
 					continue;
+				if (unlikely(p->exit_signal == -1))
+					/*
+					 * This can only happen in a race with
+					 * a ptraced thread dying on another
+					 * processor.
+					 */
+					continue;
 				read_unlock(&tasklist_lock);

 				retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0;
@@ -857,11 +951,17 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
 					retval = p->pid;
 				if (p->real_parent != p->parent) {
 					write_lock_irq(&tasklist_lock);
+					/* Double-check with lock held. */
+					if (p->real_parent != p->parent) {
 					__ptrace_unlink(p);
-					do_notify_parent(p, SIGCHLD);
+					do_notify_parent(
+						p, p->exit_signal);
 					p->state = TASK_ZOMBIE;
+					p = NULL;
+					}
 					write_unlock_irq(&tasklist_lock);
-				} else
+				}
+				if (p != NULL)
 					release_task(p);
 				goto end_wait4;
 			default:
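The TASK_STOPPED branch of sys_wait4 in this file now claims the stop code with xchg before reporting it, so that only one waiter can observe a nonzero value. The idea can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the kernel code: claim_stop_code and report_stop are hypothetical names, atomic_exchange stands in for the kernel's xchg, and the tasklist_lock and task state checks are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdatomic.h>

/*
 * Atomically take the pending stop code and leave 0 behind.
 * Whoever gets a nonzero value owns the report; every later
 * caller sees 0 and must skip the task.
 */
int claim_stop_code(atomic_int *exit_code)
{
	assert(exit_code != NULL);
	return atomic_exchange(exit_code, 0);
}

/*
 * Build the wait status for a stopped task the same way the patch
 * does, (exit_code << 8) | 0x7f, but only from the privately
 * claimed value. Returns -1 if another waiter got there first.
 */
int report_stop(atomic_int *exit_code)
{
	int code = claim_stop_code(exit_code);

	if (code == 0)
		return -1;	/* lost the race, or the task resumed */
	return (code << 8) | 0x7f;
}
```

Because the status word is built from the local copy rather than by re-reading the shared field, a second caller of report_stop() on the same variable gets -1 instead of a duplicate or zeroed report, which is the behaviour races 1 and 2 in the commit message call for.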

kernel/fork.c

Lines changed: 21 additions & 3 deletions
@@ -680,6 +680,7 @@ static inline int copy_sighand(unsigned long clone_flags, struct task_struct * t
 	sig->group_exit = 0;
 	sig->group_exit_code = 0;
 	sig->group_exit_task = NULL;
+	sig->group_stop_count = 0;
 	memcpy(sig->action, current->sig->action, sizeof(sig->action));
 	sig->curr_target = NULL;
 	init_sigpending(&sig->shared_pending);
@@ -801,7 +802,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	spin_lock_init(&p->alloc_lock);
 	spin_lock_init(&p->switch_lock);

-	clear_tsk_thread_flag(p,TIF_SIGPENDING);
+	clear_tsk_thread_flag(p, TIF_SIGPENDING);
 	init_sigpending(&p->pending);

 	p->it_real_value = p->it_virt_value = p->it_prof_value = 0;
@@ -910,6 +911,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	 */
 	if (sigismember(&current->pending.signal, SIGKILL)) {
 		write_unlock_irq(&tasklist_lock);
+		retval = -EINTR;
 		goto bad_fork_cleanup_namespace;
 	}

@@ -934,6 +936,17 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 		}
 		p->tgid = current->tgid;
 		p->group_leader = current->group_leader;
+
+		if (current->sig->group_stop_count > 0) {
+			/*
+			 * There is an all-stop in progress for the group.
+			 * We ourselves will stop as soon as we check signals.
+			 * Make the new thread part of that group stop too.
+			 */
+			current->sig->group_stop_count++;
+			set_tsk_thread_flag(p, TIF_SIGPENDING);
+		}
+
 		spin_unlock(&current->sig->siglock);
 	}

@@ -1036,8 +1049,13 @@ struct task_struct *do_fork(unsigned long clone_flags,
 			init_completion(&vfork);
 		}

-		if (p->ptrace & PT_PTRACED)
-			send_sig(SIGSTOP, p, 1);
+		if (p->ptrace & PT_PTRACED) {
+			/*
+			 * We'll start up with an immediate SIGSTOP.
+			 */
+			sigaddset(&p->pending.signal, SIGSTOP);
+			set_tsk_thread_flag(p, TIF_SIGPENDING);
+		}

 		wake_up_forked_process(p);	/* do this last */
 		++total_forks;
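The copy_process() hunk above makes a thread created during a group stop join that stop by bumping group_stop_count and flagging the child as having a signal pending. A toy model of that bookkeeping might look like the sketch below. All toy_* names are hypothetical, and the rule that each thread decrements the counter when it actually stops is an assumption inferred from the sys_wait4 comment ("We won't report until all threads have stopped"); the code that does this lives in files not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for signal_struct and task_struct. */
struct toy_group {
	int group_stop_count;	/* threads that still have to stop */
};

struct toy_thread {
	struct toy_group *grp;
	bool sigpending;	/* models TIF_SIGPENDING */
	bool stopped;
};

/* Start a group stop that nr_running threads must take part in. */
void toy_begin_group_stop(struct toy_group *g, int nr_running)
{
	g->group_stop_count = nr_running;
}

/* Models the new copy_process() logic for CLONE_THREAD children. */
void toy_clone_thread(struct toy_group *g, struct toy_thread *child)
{
	child->grp = g;
	child->sigpending = false;
	child->stopped = false;
	if (g->group_stop_count > 0) {
		/* A stop is in progress: the new thread must join it. */
		g->group_stop_count++;
		child->sigpending = true;
	}
}

/* Assumed behaviour: a thread that stops drops the counter by one. */
void toy_thread_stops(struct toy_thread *t)
{
	assert(!t->stopped);
	t->stopped = true;
	t->grp->group_stop_count--;
}

/* The group stop is reportable once nobody is left to stop. */
bool toy_group_stop_reportable(const struct toy_group *g)
{
	return g->group_stop_count == 0;
}
```

Without the increment in toy_clone_thread(), the counter would reach zero while the new thread was still running, and the stop would be reported to the parent too early.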
