# Parameters Tuning

This page contains parameter tuning guides for LightGBM.

**List of other helpful links**

* [Parameters](./Parameters.rst)
* [Python API](./Python-API.rst)

## Tune Parameters for the Leaf-wise (Best-first) Tree

LightGBM uses the [leaf-wise](./Features.rst#leaf-wise-best-first-tree-growth) tree growth algorithm, while many other popular tools use depth-wise tree growth. Compared with depth-wise growth, the leaf-wise algorithm can converge much faster. However, a leaf-wise tree may over-fit if not used with the appropriate parameters.

To get good results using a leaf-wise tree, these are some important parameters:

1. `num_leaves`. This is the main parameter to control the complexity of the tree model. Theoretically, we can set `num_leaves = 2^(max_depth)` to obtain the same number of leaves as a depth-wise tree. However, this simple conversion is not good in practice. The reason is that, for the same number of leaves, a leaf-wise tree is typically much deeper than a depth-wise tree, so it may over-fit. Thus, when tuning `num_leaves`, we should let it be smaller than `2^(max_depth)`. For example, when `max_depth=7` a depth-wise tree can get good accuracy, but setting `num_leaves` to `127` may cause over-fitting, while setting it to `70` or `80` may give better accuracy than the depth-wise tree. Actually, the concept of `depth` can be forgotten in a leaf-wise tree, since there is no correct mapping from `leaves` to `depth`.
2. `min_data_in_leaf`. This is a very important parameter to prevent over-fitting in a leaf-wise tree. Its optimal value depends on the number of training samples and `num_leaves`. Setting it to a large value can avoid growing too deep a tree, but may cause under-fitting. In practice, setting it to hundreds or thousands is enough for a large dataset.
3. `max_depth`. You can also use `max_depth` to limit the tree depth explicitly.
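As a concrete illustration of the three parameters above, here is a minimal Python sketch on synthetic data. The dataset shape and the exact values (`num_leaves=80`, `min_data_in_leaf=100`, 200 boosting rounds) are assumptions chosen for the example, not tuned recommendations.

```python
import numpy as np
import lightgbm as lgb

# Toy binary-classification data; stands in for a real training set.
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=5000) > 0).astype(int)

train_set = lgb.Dataset(X[:4000], label=y[:4000])
valid_X, valid_y = X[4000:], y[4000:]

params = {
    "objective": "binary",
    "metric": "binary_logloss",
    # Keep num_leaves well below 2^max_depth (2^7 = 128 here) so the
    # leaf-wise tree does not grow far deeper than a depth-wise one.
    "max_depth": 7,
    "num_leaves": 80,
    # Require a minimum number of samples per leaf to curb over-fitting;
    # hundreds to thousands is typical for large datasets.
    "min_data_in_leaf": 100,
    "verbosity": -1,
}

booster = lgb.train(params, train_set, num_boost_round=200)

# Evaluate on the held-out slice to check for over-fitting.
pred = booster.predict(valid_X)
accuracy = ((pred > 0.5) == valid_y).mean()
print(f"held-out accuracy: {accuracy:.3f}")
```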
## For Faster Speed

* Use bagging by setting `bagging_fraction` and `bagging_freq`
* Use feature sub-sampling by setting `feature_fraction`
* Use a small `max_bin`
* Use `save_binary` to speed up data loading in future learning
* Use parallel learning; refer to the [Parallel Learning Guide](./Parallel-Learning-Guide.rst)

## For Better Accuracy

* Use a large `max_bin` (may be slower)
* Use a small `learning_rate` with a large `num_iterations`
* Use a large `num_leaves` (may cause over-fitting)
* Use bigger training data
* Try `dart`

## Deal with Over-fitting

* Use a small `max_bin`
* Use a small `num_leaves`
* Use `min_data_in_leaf` and `min_sum_hessian_in_leaf`
* Use bagging by setting `bagging_fraction` and `bagging_freq`
* Use feature sub-sampling by setting `feature_fraction`
* Use bigger training data
* Try `lambda_l1`, `lambda_l2` and `min_gain_to_split` for regularization
* Try `max_depth` to avoid growing a deep tree
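The sketch below turns the three checklists above into parameter presets. All the parameter names are real LightGBM parameters, but the specific values are assumptions chosen as plausible starting points, not tuned recommendations; see [Parameters](./Parameters.rst) for each parameter's meaning.

```python
# Preset mirroring "For Faster Speed".
faster_speed = {
    "bagging_fraction": 0.8,   # train each iteration on 80% of the rows...
    "bagging_freq": 5,         # ...re-sampled every 5 iterations
    "feature_fraction": 0.8,   # consider 80% of the features per tree
    "max_bin": 63,             # coarser histograms build faster
}
# For faster data loading across runs, Dataset.save_binary("train.bin")
# writes a binary cache that lgb.Dataset("train.bin") can reload directly.

# Preset mirroring "For Better Accuracy".
better_accuracy = {
    "max_bin": 511,            # finer histograms (may be slower)
    "learning_rate": 0.02,     # small step size...
    "num_iterations": 2000,    # ...paired with many boosting rounds
    "num_leaves": 255,         # more capacity (watch for over-fitting)
    "boosting": "dart",        # try dropout-style boosting
}

# Preset mirroring "Deal with Over-fitting".
deal_with_overfitting = {
    "max_bin": 63,
    "num_leaves": 31,
    "min_data_in_leaf": 200,
    "min_sum_hessian_in_leaf": 10.0,
    "bagging_fraction": 0.8,
    "bagging_freq": 5,
    "feature_fraction": 0.8,
    "lambda_l1": 0.1,          # L1 regularization
    "lambda_l2": 0.1,          # L2 regularization
    "min_gain_to_split": 0.1,  # skip splits with little gain
    "max_depth": 8,            # hard cap on tree depth
}

# Merge a preset into a base config before calling lgb.train, e.g.:
# params = {"objective": "binary", **faster_speed}
```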