数据的抽象 · 面向指针编程

## 数据的抽象当你具备了一些 C 编程基础，并且能够理解上文中的内容，那么你就可以对各种类型的数据进行抽象了。我们为什么要对数据进行抽象？《计算机程序的构造和解释》的第 2 章的导言部分给出了很好的答案，即：许多程序在设计时就是为了模拟复杂的现象，因为它们就常常需要构造出一些运算对象，为了能够模拟真实世界中的现象的各个方面，需要将运算对象表示为一些组件的复合结构。下面来对自行车链的任意一个链节进行模拟： ![](https://box.kancloud.cn/2015-12-28_5680d099601b7.jpg) ~~~ struct chain_node { struct chain_node *prev; struct chain_node *next; void *shape; }; ~~~ 然后我们可以造出 3 个链节，然后可以造出世界上最短的车链： ~~~ struct chain_node a, b, c; a.next = &b; b.prev = &a; b.next = &c; c.prev = &b; c.next = &a; a.prev = &c; ~~~ 如果再多造一些链节，就可以得到周长大一些的车链，也能够制造出各种形状的多边形，但是最好是借助无名的内存空间。下面的代码可以创建一条具有 1000 个链节的链条： ~~~ struct chain_node *head = malloc(sizeof(struct chain_node)); struct chain_node *tail = head; for (int i = 0; i < 1000; i++) { struct chain_node *new_tail = malloc(sizeof(struct chain_node)); tail->next = new_tail; new_tail->prev = tail; tail = new_tail; } tail->next = head; head->prev = tail; ~~~ 如果我们将前面那个示例中的 `a`，`b`, `c` 视为三角形的三个顶点，那么我们所创造的三个链节构成的链条就变成了一个三角形。同理，上述所创建的 1000 个链节的链条就变成了一个 1000 条边首尾相接的多边形。如果学过拓扑学，那么自然可以发现任何与圆环同胚的结构都可以基于 `struct chain_node` 这种数据结构模拟出来，而我们所仰仗的东西仅仅是将三个指针封装到一个结构体中。事实上，`struct chain_node` 中的第三个指针 `void *shape` 还没被用到。这是一个 `void *` 类型的指针，是喜欢用 C 代码玩各种抽象的程序猿的最爱，因为它能引用任何类型数据所在内存空间的基地址。这就意味着 `struct chain_node` 可以借助 `shape` 指针获得强大的扩展能力。现在，我要制造一种很简陋的链节，它的形状仅仅是一个矩形的小铁片，上面打了两个小圆孔。我将它的数据结构设计为： ~~~ struct point { double x; double y; }; struct rectangle { double width; double height; }; struct circle { struct point *center; double radius; }; struct chain_node_shape { struct rectangle *body; struct circle *holes[2] ; }; ~~~ 基于这些数据结构，我就可以写出一个专门用来制造矩形小铁片的函数： ~~~ struct chain_node_shape * create_chain_node_shape(struct circle *c1, struct circle *c2, struct rectangle *rect) { struct chain_node_shape *ret = malloc(sizeof(struct chain_node_shape)); ret->body = rect; ret->holes[0] = c1; ret->holes[1] = c2; return ret; } ~~~ 然后再为 `create_chain_node_shape` 所接受的两种参数写出相应的构造函数： ~~~ struct circle * create_circle(struct point *center, double radius) { struct circle *ret = malloc(sizeof(struct circle)); ret->center = center; ret->radius = radius; return ret; } struct rectangle * create_rectangle(double w, double h) { struct rectangle *ret = malloc(sizeof(struct rectangle)); ret->width = w; ret->height = h; return ret; } ~~~ 为了让 `create_circle` 更方便使用，最好再创建一个 `struct point` 的构造函数： ~~~ struct point * create_point(double x, double y) { struct point *ret = malloc(sizeof(struct point)); ret->x = x; ret->y = y; return ret; } ~~~ 一切所需要的构件都已准备完毕，现在可以开始生产某种特定型号的链节了，即： ~~~ struct chain_node * create_chain_node(void) { double radius = 0.5; double left_x = 1.0; double left_y = 1.0; struct point *left_center = create_point(left_x, left_y); struct circle *left_hole = create_circle(left_center, radius); double right_x = 9.0; double right_y = 1.0; struct point *right_center = create_point(right_x, right_y); struct circle *right_hole = create_circle(right_center, radius); struct rectangle *body = create_rectangle(10.0, 2.0); struct chain_node *ret = malloc(sizeof(struct chain_node)); ret->prev = NULL; ret->next = NULL; ret->shape = create_chain_node_shape(left_hole, right_hole, body); return ret; } ~~~ 最后再将制造链条的代码略作修改： ~~~ struct chain_node *head = create_chain_node(); struct chain_node *tail = head; for (int i = 0; i < 1000; i++) { struct chain_node *new_tail = create_chain_node(); tail->next = new_tail; new_tail->prev = tail; tail = new_tail; } tail->next = head; head->prev = tail; ~~~ 现在我们所模拟的车链与现实中的车链已经有些形似了。上述代码虽然有些冗长，下文会对其进行重构，现在先来总结一下上述代码中指针的用法。仔细观察上述代码中我们所定义的结构体，它们的共同特征是：所有非 C 内建的数据类型都是结构体类型，当它们作为某个结构体成员类型时均被声明为指针类型。为什么要这样？如果你真的打算问这个问题，那么就请你观察一下上述的 5 个`create_xxx` 函数，你会发现这些 `create` 函数的参数与返回值也都是结构体类型的指针。将这些现象综合起来，可以得出以下结论： 1. 将结构体指针作为函数的参数与返回值，可以避免函数调用时发生过多的内存复制。 2. 当一个结构体类型作为其他结构体的成员类型时，将前者声明为指针类型，可以在后者的 `create` 函数中避免繁琐的解引用。 3. `void *` 指针可以引用任意类型的数据存储空间的基地址。例如在 `create_chain_node` 函数的定义中，我们将一个`struct chain_node_shape` 类型的指针赋给了 `void *` 类型的指针 `shape`。这三条结论是指针在数据抽象中的惯用手法，它不仅关系到数据结构的设计，也关系到数据结构的构造与销毁函数的设计。（上述代码为了省事，没有定义数据结构的销毁函数）