Pthreads，第 2 部分：实践中的用法 · UIUC CS241 系统编程中文讲义

# Pthreads，第 2 部分：实践中的用法 > 原文：<https://github.com/angrave/SystemProgramming/wiki/Pthreads%2C-Part-2%3A-Usage-in-Practice> ## 更多 pthread 功能 ## 如何创建 pthread？参见 [Pthreads 第 1 部分](https://github.com/angrave/SystemProgramming/wiki/Pthreads,-Part-1:-Introduction)，其中介绍了`pthread_create`和`pthread_join` ## 如果我两次调用`pthread_create`，我的进程有多少栈？您的进程将包含三个栈 - 每个线程一个栈。第一个线程是在进程启动时创建的，您还创建了另外两个。实际上可以有更多的栈，但是现在让我们忽略这种复杂性。重要的想法是每个线程都需要一个栈，因为栈包含自动变量和旧的 CPU PC 寄存器，因此它可以在函数完成后返回执行调用函数。 ## 完整进程和线程之间有什么区别？此外，与进程不同，同一进程中的线程可以共享相同的全局内存（数据和堆段）。 ## `pthread_cancel`有什么作用？停止一个帖子。请注意，线程可能实际上不会立即停止。例如，它可以在线程进行操作系统调用时终止（例如`write`）。在实践中，`pthread_cancel`很少使用，因为它不会给线程提供自我清理的机会（例如，它可能已经打开了一些文件）。另一种实现是使用 boolean（int）变量，其值用于通知其他线程它们应该完成并清理。 ## `exit`和`pthread_exit`有什么区别？ `exit(42)`退出整个过程并设置进程退出值。这相当于 main 方法中的`return 42`。进程内的所有线程都将停止。 `pthread_exit(void *)`仅停止调用线程，即调用`pthread_exit`后线程永不返回。如果没有其他线程在运行，pthread 库将自动完成该过程。 `pthread_exit(...)`相当于从线程的函数返回;两者都完成线程并设置线程的返回值（void *指针）。在`main`线程中调用`pthread_exit`是简单程序确保所有线程完成的常用方法。例如，在以下程序中，`myfunc`线程可能没有时间开始。 ```c int main() { pthread_t tid1, tid2; pthread_create(&tid1, NULL, myfunc, "Jabberwocky"); pthread_create(&tid2, NULL, myfunc, "Vorpel"); exit(42); //or return 42; // No code is run after exit } ``` 接下来的两个程序将等待新线程完成 - ```c int main() { pthread_t tid1, tid2; pthread_create(&tid1, NULL, myfunc, "Jabberwocky"); pthread_create(&tid2, NULL, myfunc, "Vorpel"); pthread_exit(NULL); // No code is run after pthread_exit // However process will continue to exist until both threads have finished } ``` 或者，我们在从 main（或 call exit）返回之前加入每个线程（即等待它完成）。 ```c int main() { pthread_t tid1, tid2; pthread_create(&tid1, NULL, myfunc, "Jabberwocky"); pthread_create(&tid2, NULL, myfunc, "Vorpel"); // wait for both threads to finish : void* result; pthread_join(tid1, &result); pthread_join(tid2, &result); return 42; } ``` 注意 pthread_exit 版本会创建线程僵尸，但这不是一个长时间运行的进程，所以我们不在乎。 ## 如何终止一个线程？ * 从线程函数返回 * 致电`pthread_exit` * 用`pthread_cancel`取消线程 * 终止进程（例如 SIGTERM）;出口（）;从`main`返回 ## pthread_join 的目的是什么？ * 等待线程完成 * 清理线程资源 * 获取线程的返回值 ## 如果你不打电话给`pthread_join`会怎么样？完成的线程将继续消耗资源。最终，如果创建了足够的线程，`pthread_create`将失败。实际上，这只是长时间运行进程的一个问题，但对于简单，短暂的进程来说不是问题，因为当进程退出时，所有线程资源都会自动释放。 ## 我应该使用`pthread_exit`还是`pthread_join`？ `pthread_exit`和`pthread_join`都会让其他线程自己完成（即使在主线程中调用）。但是，当指定的线程完成时，只有`pthread_join`会返回给您。 `pthread_exit`不会等待并立即结束你的线程并且没有机会继续执行。 ## 你可以将指针从一个线程传递到另一个线程吗？是。但是，您需要非常小心栈变量的生命周期。 ``` pthread_t start_threads() { int start = 42; pthread_t tid; pthread_create(&tid, 0, myfunc, &start); // ERROR! return tid; } ``` 上面的代码无效，因为函数`start_threads`可能会在`myfunc`开始之前返回。该函数传递`start`的地址，但是在`myfunc`执行时，`start`不再在范围内，其地址将重新用于另一个变量。以下代码有效，因为栈变量的生命周期比后台线程长。 ``` void start_threads() { int start = 42; void *result; pthread_t tid; pthread_create(&tid, 0, myfunc, &start); // OK - start will be valid! pthread_join(tid, &result); } ``` ## 比赛条件简介 ## 如何创建具有不同起始值的十个线程。以下代码应该启动十个线程，其值为 0,1,2,3，... 9 然而，当运行时打印出`1 7 8 8 8 8 8 8 8 10`！你能明白为什么吗？ ```c #include <pthread.h> void* myfunc(void* ptr) { int i = *((int *) ptr); printf("%d ", i); return NULL; } int main() { // Each thread gets a different value of i to process int i; pthread_t tid; for(i =0; i < 10; i++) { pthread_create(&tid, NULL, myfunc, &i); // ERROR } pthread_exit(NULL); } ``` 上面的代码遭受`race condition` - i 的值正在改变。新线程稍后启动（在示例输出中，最后一个线程在循环结束后启动）。为了克服这种竞争条件，我们将为每个线程指定一个指向它自己的数据区域的指针。例如，对于每个线程，我们可能希望存储 id，起始值和输出值： ```c struct T { pthread_t id; int start; char result[100]; }; ``` 这些可以存储在一个数组中 - ``` struct T *info = calloc(10 , sizeof(struct T)); // reserve enough bytes for ten T structures ``` 并且每个数组元素都传递给每个线程 - ``` pthread_create(&info[i].id, NULL, func, &info[i]); ``` ## 为什么有些功能，例如 asctime，getenv，strtok，strerror 不是线程安全的吗？要回答这个问题，让我们看一个简单的函数，它也不是“线程安全的” ```c char *to_message(int num) { char static result [256]; if (num < 10) sprintf(result, "%d : blah blah" , num); else strcpy(result, "Unknown"); return result; } ``` 在上面的代码中，结果缓冲区存储在全局内存中。这很好 - 我们不希望返回指向栈上无效地址的指针，但整个内存中只有一个结果缓冲区。如果两个线程同时使用它，那么一个会破坏另一个： | 时间 | 线程 1 | 线程 2 | 评论 | | --- | --- | --- | --- | | 1 | to_m（5） | | | | 2 | | to_m（99） | 现在两个线程都会在结果缓冲区中看到“Unknown” | ## 什么是条件变量，信号量，互斥量？这些是同步锁，用于防止竞争条件并确保在同一程序中运行的线程之间正确同步。另外，这些锁在概念上与内核中使用的原语相同。 ## 在分叉过程中使用线程有什么好处吗？是!在线程之间共享信息很容易，因为（同一进程的）线程存在于同一个虚拟内存空间中。此外，创建线程比创建（分叉）进程要快得多。 ## 使用线程而不是分叉过程有什么缺点吗？是!没有隔离！当线程存在于同一进程中时，一个线程可以访问与其他线程相同的虚拟内存。单个线程可以终止整个过程（例如，通过尝试读取地址零）。 ## 你能用多个线程分叉一个进程吗？是!但是子进程只有一个线程（它是调用`fork`的线程的一个克隆。我们可以看到这是一个简单的例子，后台线程永远不会在子进程中打印出第二条消息。 ```c #include <pthread.h> #include <stdio.h> #include <unistd.h> static pid_t child = -2; void *sleepnprint(void *arg) { printf("%d:%s starting up...\n", getpid(), (char *) arg); while (child == -2) {sleep(1);} /* Later we will use condition variables */ printf("%d:%s finishing...\n",getpid(), (char*)arg); return NULL; } int main() { pthread_t tid1, tid2; pthread_create(&tid1,NULL, sleepnprint, "New Thread One"); pthread_create(&tid2,NULL, sleepnprint, "New Thread Two"); child = fork(); printf("%d:%s\n",getpid(), "fork()ing complete"); sleep(3); printf("%d:%s\n",getpid(), "Main thread finished"); pthread_exit(NULL); return 0; /* Never executes */ } ``` ``` 8970:New Thread One starting up... 8970:fork()ing complete 8973:fork()ing complete 8970:New Thread Two starting up... 8970:New Thread Two finishing... 8970:New Thread One finishing... 8970:Main thread finished 8973:Main thread finished ``` 实际上，在分叉之前创建线程可能会导致意外错误，因为（如上所示）其他线程在分叉时会立即终止。另一个线程可能只是锁定互斥锁（例如通过调用 malloc）并且永远不会再次解锁它。高级用户可能会发现`pthread_atfork`有用但我们建议您通常尽量避免在分叉之前创建线程，除非您完全理解此方法的局限性和困难。 ## 还有其他原因使`fork`可能比创建线程更可取。创建单独的进程很有用 * 需要更高安全性时（例如，Chrome 浏览器对不同的标签使用不同的进程） * 运行现有的完整程序时，需要一个新的过程（例如，启动'gcc'） * 当您遇到同步原语并且每个进程正在对系统中的某些操作进行操作时 ## 我怎样才能找到更多？请参阅[手册页](http://man7.org/linux/man-pages/man3/pthread_create.3.html)中的完整示例和 [pthread 参考指南](http://man7.org/linux/man-pages/man7/pthreads.7.html)另外：[简明的第三方示例代码解释创建，加入和退出](http://www.thegeekstuff.com/2012/04/terminate-c-thread/)