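Tuning XGBoost hyperparameters with Ray Tune

This tutorial demonstrates how to optimize XGBoost models using Ray Tune. You will learn:

The basics of XGBoost and its key hyperparameters
How to train a simple XGBoost classifier (without hyperparameter tuning)
How to use Ray Tune to find optimal hyperparameters
Advanced techniques such as early stopping and GPU acceleration

XGBoost is currently one of the most popular machine learning algorithms. It performs very well on a wide range of tasks and has been a key ingredient in many winning Kaggle competition entries.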

Note

To run this tutorial, you need to install the following dependencies:

$ pip install -q "ray[tune]" scikit-learn xgboost

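What is XGBoost

XGBoost (eXtreme Gradient Boosting) is a powerful and efficient implementation of gradient boosted decision trees. It has become one of the most popular machine learning algorithms because of its:

Performance: consistently strong results across many types of problems
Speed: a highly optimized implementation that can take advantage of GPU acceleration
Flexibility: works with many types of prediction problems (classification, regression, ranking)

Key concepts

Uses an ensemble of simple decision trees
Trees are built sequentially, with each tree correcting the errors of the previous ones
Uses gradient descent to minimize a loss function
While individual trees may have high bias, a boosted ensemble can produce better predictions with reduced bias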

Figure: Single vs. ensemble learning.

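A boosting algorithm starts with a single small decision tree and evaluates how well it predicts the given examples. When the next tree is built, samples that were previously misclassified have a higher chance of being used to grow it. This is useful because it avoids overfitting to samples that are easy to classify and instead tries to build models that can also classify hard examples. See here for a more detailed introduction to bagging and boosting algorithms.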
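The sequential idea can be sketched in a few lines of plain Python. This is a minimal illustration, assuming squared-error loss and one-split trees ("stumps") that fit the residuals of the ensemble so far, rather than the sample reweighting described above. The helper names `fit_stump`, `predict_stump`, and `boost` are hypothetical and are not part of XGBoost.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    # Candidate thresholds are the observed x values (except the smallest),
    # so both sides of every split contain at least one sample.
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x < threshold else right_value


def boost(xs, ys, num_rounds, eta):
    """Fit stumps one after another, each on the errors left by the previous ones."""
    predictions = [0.0] * len(xs)
    stumps, losses = [], []
    for _ in range(num_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Add the new stump's (damped) output to the running prediction.
        predictions = [
            p + eta * predict_stump(stump, x) for p, x in zip(predictions, xs)
        ]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return stumps, losses


# Toy one-dimensional regression data (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
stumps, losses = boost(xs, ys, num_rounds=20, eta=0.5)
print([round(loss, 3) for loss in losses[:5]])
```

The mean squared error recorded in `losses` never increases from one round to the next, which is the "each tree corrects the previous ones" behavior in its simplest form.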

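There are many boosting algorithms, and at their core they are all very similar. XGBoost uses second-order derivatives of the loss to find splits that maximize the gain (the reduction in loss, so a higher gain means a lower loss), hence the "gradient" in its name. In practice, XGBoost usually shows strong performance compared with other boosting algorithms, although LightGBM tends to be faster and more memory efficient, especially on large datasets.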
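To make the role of the first and second derivatives concrete, here is a small sketch of the split gain as it is defined in the XGBoost paper, evaluated for the logistic loss. The function names are hypothetical; `reg_lambda` and `gamma` stand for the L2 regularization and minimum-gain parameters, and the toy labels are made up for illustration.

```python
import math


def logistic_grad_hess(label, raw_score):
    """First and second derivative of the log loss with respect to the raw score."""
    prob = 1.0 / (1.0 + math.exp(-raw_score))
    return prob - label, prob * (1.0 - prob)


def leaf_score(grads, hessians, reg_lambda):
    """Structure score G^2 / (H + lambda) of a leaf."""
    g, h = sum(grads), sum(hessians)
    return g * g / (h + reg_lambda)


def split_gain(left, right, reg_lambda=1.0, gamma=0.0):
    """Gain of splitting a node into `left` and `right` lists of (grad, hess) pairs."""
    gl, hl = zip(*left)
    gr, hr = zip(*right)
    parent = leaf_score(gl + gr, hl + hr, reg_lambda)
    children = leaf_score(gl, hl, reg_lambda) + leaf_score(gr, hr, reg_lambda)
    return 0.5 * (children - parent) - gamma


# Four samples, all starting from a raw score of 0 (probability 0.5).
stats = [logistic_grad_hess(label, 0.0) for label in (0, 0, 1, 1)]

clean_gain = split_gain(stats[:2], stats[2:])  # separates the two classes
mixed_gain = split_gain([stats[0], stats[2]], [stats[1], stats[3]])  # mixes them
print(clean_gain, mixed_gain)
```

A split that separates the classes receives a clearly positive gain, while a split that leaves both children with the same class mix receives a gain of zero, so the tree builder prefers the former.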

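Training a simple XGBoost classifier

Let's first see how a simple XGBoost classifier can be trained. We'll use the breast_cancer dataset included in the sklearn dataset collection. This is a binary classification dataset. Given 30 different input features, our task is to learn to identify subjects with breast cancer and those without.

Here is the full code to train a simple XGBoost model: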

SMOKE_TEST = False

import sklearn.datasets
import sklearn.metrics
from sklearn.model_selection import train_test_split
import xgboost as xgb


def train_breast_cancer(config):
    # Load dataset
    data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)
    # Split into train and test set
    train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)
    # Build input matrices for XGBoost
    train_set = xgb.DMatrix(train_x, label=train_y)
    test_set = xgb.DMatrix(test_x, label=test_y)
    # Train the classifier
    results = {}
    bst = xgb.train(
        config,
        train_set,
        evals=[(test_set, "eval")],
        evals_result=results,
        verbose_eval=False,
    )
    return results


results = train_breast_cancer(
    {"objective": "binary:logistic", "eval_metric": ["logloss", "error"]}
)
accuracy = 1.0 - results["eval"]["error"][-1]
print(f"Accuracy: {accuracy:.4f}")

Accuracy: 0.9650

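As you can see, the code is quite simple. First, the dataset is loaded and split into a train and a test set. Then an XGBoost model is trained with xgb.train(). XGBoost automatically evaluates the metrics we specified on the test set. In our case, it calculates the log loss (logloss) and the prediction error (error), which is the fraction of misclassified examples. To calculate the accuracy, we just subtract the error from 1.0. Even in this simple example, most runs result in a good accuracy of over 0.90.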
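For intuition, the two metrics can be written out by hand. The following is a plain-Python sketch, not XGBoost's own implementation; it assumes that a predicted probability above 0.5 counts as the positive class, and the labels and probabilities are made up.

```python
import math


def logloss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of the true labels."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def error_rate(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction differs from the label."""
    wrong = sum(1 for y, p in zip(labels, probs) if (p > threshold) != bool(y))
    return wrong / len(labels)


labels = [1, 0, 1, 1]
probs = [0.9, 0.2, 0.4, 0.8]

err = error_rate(labels, probs)
loss = logloss(labels, probs)
accuracy = 1.0 - err
print(f"error={err:.2f} accuracy={accuracy:.2f} logloss={loss:.4f}")
```

Only the third example is misclassified, so the error is one quarter and the accuracy three quarters. The log loss additionally penalizes how confident the wrong prediction was, which is why it is often a better signal for tuning than the error alone.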

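You may have noticed the config parameter we pass to the XGBoost algorithm. This is a dictionary in which you can specify parameters for the XGBoost algorithm. In this simple example, the only parameters we pass are objective and eval_metric. The value binary:logistic tells XGBoost that we aim to train a logistic regression model for a binary classification task. You can find an overview of all valid objectives in the XGBoost documentation.

Scaling XGBoost training with Ray Train

In Distributed Training and Inference with XGBoost and LightGBM on Ray, we cover how to scale out the training of a single XGBoost model with Ray Train. The rest of this tutorial focuses on how to optimize the hyperparameters of an XGBoost model using Ray Tune.

XGBoost hyperparameters

Even with the default settings, XGBoost was able to get a good accuracy on the breast cancer dataset. However, as with many machine learning algorithms, there are many knobs to tune that might lead to even better performance. Let's explore some of them below.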

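Maximum tree depth

Recall that XGBoost internally uses many decision tree models to come up with predictions. When training a decision tree, we need to tell the algorithm how large the tree may grow. This parameter is called the tree depth.

Figure: Decision tree depth. In this image, the left tree has a depth of 2 and the right tree a depth of 3. Note that level \(d\) of a tree adds \(2^{(d-1)}\) splits, where \(d\) is the depth of that level.

Tree depth is a property that relates to model complexity. If you only allow short trees, the models are likely not very precise: they underfit the data. If you allow very large trees, individual models are likely to overfit the data. In practice, a value between 2 and 6 is often a good starting point for this parameter.

XGBoost's default value is 6.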
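The growth in the figure caption can be checked with a short calculation. This sketch assumes a complete binary tree, which is the largest tree a given depth allows; the function names are hypothetical.

```python
def splits_at_level(level):
    """Number of split nodes added by a given level of a complete binary tree."""
    return 2 ** (level - 1)


def tree_size(depth):
    """Total number of splits and leaves of a complete binary tree of this depth."""
    splits = sum(splits_at_level(level) for level in range(1, depth + 1))
    leaves = 2 ** depth
    return splits, leaves


for depth in (2, 3, 6):
    splits, leaves = tree_size(depth)
    print(f"depth={depth}: up to {splits} splits and {leaves} leaves")
```

Each extra level doubles the number of leaves, which is why even a moderate increase in depth makes a single tree much more expressive and more prone to overfitting.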

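Minimum child weight

When a decision tree creates new leaves, it splits the data remaining at a node into two groups. If one of these groups contains only a few samples, it often doesn't make sense to split it further. One reason is that the model is harder to train when there are fewer samples.

Figure: Minimum child weight. In this example, we start with 100 examples. At the first node, they are split into 4 and 96 samples, respectively. In the next step, our model might find that it doesn't make sense to split the 4 examples further. It thus only continues to add leaves on the right side.

The parameter the model uses to decide whether splitting a node makes sense is called the minimum child weight. In the case of linear regression, this is simply the minimum number of samples required in each child node. For other objectives, this value is determined using the weights of the examples (the sum of their Hessians), hence the name.

The larger the value, the more constrained the trees are and the shallower they become. This parameter thus also affects model complexity. For noisy or small datasets, smaller values are preferred. Values can range between 0 and infinity and depend on the sample size. For the breast cancer dataset with its 569 examples, values between 0 and 10 should be sensible.

XGBoost's default value is 1.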
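The check behind this parameter can be sketched as follows. The sketch assumes the logistic objective, where the weight (Hessian) of an example with predicted probability p is p(1 - p); for squared error every example has weight 1, so the sum is just the sample count. The function names are hypothetical, and the 4/96 split mirrors the figure above.

```python
def hessian_sum(probabilities):
    """Sum of logistic-loss Hessians p * (1 - p) over the samples in a child."""
    return sum(p * (1.0 - p) for p in probabilities)


def split_allowed(left_probs, right_probs, min_child_weight):
    """A split is kept only if both children carry enough total weight."""
    return (
        hessian_sum(left_probs) >= min_child_weight
        and hessian_sum(right_probs) >= min_child_weight
    )


# 100 samples split into 4 and 96, all still predicted with probability 0.5.
left, right = [0.5] * 4, [0.5] * 96
print(hessian_sum(left), hessian_sum(right))

print(split_allowed(left, right, min_child_weight=1))
print(split_allowed(left, right, min_child_weight=2))

# Confidently predicted samples carry very little weight.
confident_left = [0.99] * 4
print(split_allowed(confident_left, right, min_child_weight=1))
```

Note that the weight of a child shrinks as the model becomes more confident about its samples, so the same threshold becomes stricter in later boosting rounds.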

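Subsample size

Each decision tree we add is trained on a subsample of the full training dataset. The sampling probabilities are weighted according to the XGBoost algorithm, but we can decide which fraction of the samples each decision tree is trained on.

Setting this value to 0.7 means that we randomly sample 70% of the training dataset before each training iteration. Lower values lead to more diverse trees and higher values to more similar trees. Lower values help to prevent overfitting.

XGBoost's default value is 1.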
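As a rough illustration of row subsampling, the sketch below draws a fresh 70% subset of row indices for each boosting round. It is a simplification under stated assumptions: it fixes the subset size exactly and samples uniformly without replacement, and XGBoost's internal sampling may differ in detail. The helper `subsample_rows` is hypothetical.

```python
import random


def subsample_rows(num_rows, subsample, rng):
    """Pick a random subset of row indices for one boosting round."""
    k = max(1, round(num_rows * subsample))
    return sorted(rng.sample(range(num_rows), k))


rng = random.Random(0)  # seeded only to make the sketch reproducible
round_one = subsample_rows(10, 0.7, rng)
round_two = subsample_rows(10, 0.7, rng)
print(round_one)
print(round_two)
```

Because each round sees a different subset, consecutive trees are fitted to slightly different data, which is the source of the additional diversity mentioned above.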

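Learning rate / Eta

Recall that XGBoost trains many decision trees sequentially, and that later trees are more likely to be trained on data that earlier trees misclassified. In effect, this means that earlier trees make decisions for easy samples (those that are easy to classify) and later trees make decisions for harder samples. It is then reasonable to assume that later trees are less accurate than earlier ones.

To account for this, XGBoost uses a parameter called Eta, which is sometimes also called the learning rate. Don't confuse it with the learning rate of gradient descent! The original paper on stochastic gradient boosting introduces this parameter like so:

\[ F_m(x) = F_{m-1}(x) + \eta \cdot \gamma_{lm} \mathbf{1}(x \in R_{lm}) \]

This is just a complicated way of saying that when we train a new decision tree, represented by \(\gamma_{lm} \mathbf{1}(x \in R_{lm})\), we want to dampen its effect on the previous prediction \(F_{m-1}(x)\) by a factor \(\eta\).

Typical values for this parameter are between 0.01 and 0.3.

XGBoost's default value is 0.3.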

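Number of boosting rounds

Lastly, we can decide how many boosting rounds to perform, which determines how many decision trees we train in the end. When we subsample heavily or use a small learning rate, it can make sense to increase the number of boosting rounds.

The default for num_boost_round in xgb.train() is 10.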
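The interplay between the learning rate and the number of rounds can be estimated with a back-of-the-envelope model. The sketch below makes a strong, idealized assumption: every new tree fits the current residual perfectly, so each round removes a fraction eta of the remaining error. Real trees do not do this, so the numbers only indicate orders of magnitude. The function names are hypothetical.

```python
import math


def remaining_residual(eta, num_rounds):
    """Fraction of the initial error left after `num_rounds` idealized rounds."""
    return (1.0 - eta) ** num_rounds


def rounds_needed(eta, target):
    """Smallest number of idealized rounds that brings the residual below `target`."""
    return math.ceil(math.log(target) / math.log(1.0 - eta))


for eta in (0.3, 0.01, 0.001):
    left_over = remaining_residual(eta, num_rounds=10)
    print(f"eta={eta}: {left_over:.3f} of the error left after 10 rounds")

print(rounds_needed(0.3, target=0.05))
print(rounds_needed(0.01, target=0.05))
```

Under this model, ten rounds are plenty for an eta of 0.3 but leave almost all of the error in place for an eta of 0.001, which is why a small learning rate usually needs to be paired with many more boosting rounds.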

Putting it together

Let's see what this looks like in code! We only need to adjust our config dictionary:

config = {
    "objective": "binary:logistic",
    "eval_metric": ["logloss", "error"],
    "max_depth": 2,
    "min_child_weight": 0,
    "subsample": 0.8,
    "eta": 0.2,
}
results = train_breast_cancer(config)
accuracy = 1.0 - results["eval"]["error"][-1]
print(f"Accuracy: {accuracy:.4f}")

Accuracy: 0.9231

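The rest stays the same. Note that we do not adjust num_boost_round here. The result should also show a high accuracy of over 90%.

Tuning the configuration parameters

XGBoost's default parameters already lead to a good accuracy, and even our guesses from the last section result in accuracies well above 90%. However, our guesses were just that: guesses. Often we do not know which combination of parameters actually leads to the best results on a machine learning task.

Unfortunately, there are a great many combinations of hyperparameters we could try. Should we combine max_depth=3 with subsample=0.8, or with subsample=0.9? What about the other parameters?

This is where hyperparameter tuning comes into play. With a tuning library such as Ray Tune, we can try out many combinations of hyperparameters. Using sophisticated search strategies, these parameters can be selected so that they are likely to lead to good results, avoiding an expensive exhaustive search. Trials that perform poorly can also be stopped early to reduce wasted computational resources. Lastly, Ray Tune can train these runs in parallel, which greatly increases search speed.

Let's start with a basic example of how to use Tune for this. We only need to make a few changes to our code block: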

import sklearn.datasets
import sklearn.metrics
from sklearn.model_selection import train_test_split
import xgboost as xgb

from ray import tune


def train_breast_cancer(config):
    # Load dataset
    data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)
    # Split into train and test set
    train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)
    # Build input matrices for XGBoost
    train_set = xgb.DMatrix(train_x, label=train_y)
    test_set = xgb.DMatrix(test_x, label=test_y)
    # Train the classifier
    results = {}
    xgb.train(
        config,
        train_set,
        evals=[(test_set, "eval")],
        evals_result=results,
        verbose_eval=False,
    )
    # Return prediction accuracy
    accuracy = 1.0 - results["eval"]["error"][-1]
    tune.report({"mean_accuracy": accuracy, "done": True})


config = {
    "objective": "binary:logistic",
    "eval_metric": ["logloss", "error"],
    "max_depth": tune.randint(1, 9),
    "min_child_weight": tune.choice([1, 2, 3]),
    "subsample": tune.uniform(0.5, 1.0),
    "eta": tune.loguniform(1e-4, 1e-1),
}
tuner = tune.Tuner(
    train_breast_cancer,
    tune_config=tune.TuneConfig(num_samples=10),
    param_space=config,
)
results = tuner.fit()


Tune Status

Current time	2025-02-11 16:13:34
Running for	00:00:01.87
Memory	22.5/36.0 GiB

System Info

Using FIFO scheduling algorithm.
Logical resource usage: 1.0/12 CPUs, 0/0 GPUs

Trial Status

Trial name	status	loc	eta	max_depth	min_child_weight	subsample	acc	iter	total time (s)
train_breast_cancer_31c9f_00000	TERMINATED	127.0.0.1:89735	0.0434196	8	1	0.530351	0.909091	1	0.0114911
train_breast_cancer_31c9f_00001	TERMINATED	127.0.0.1:89734	0.0115669	6	2	0.996519	0.615385	1	0.01138
train_breast_cancer_31c9f_00002	TERMINATED	127.0.0.1:89740	0.00124339	7	3	0.536078	0.629371	1	0.0096581
train_breast_cancer_31c9f_00003	TERMINATED	127.0.0.1:89742	0.000400434	6	3	0.90014	0.601399	1	0.0103199
train_breast_cancer_31c9f_00004	TERMINATED	127.0.0.1:89738	0.0121308	6	3	0.843156	0.629371	1	0.00843
train_breast_cancer_31c9f_00005	TERMINATED	127.0.0.1:89733	0.0344144	2	3	0.513071	0.895105	1	0.00800109
train_breast_cancer_31c9f_00006	TERMINATED	127.0.0.1:89737	0.0530037	7	2	0.920801	0.965035	1	0.0117419
train_breast_cancer_31c9f_00007	TERMINATED	127.0.0.1:89741	0.000230442	3	3	0.946852	0.608392	1	0.00917387
train_breast_cancer_31c9f_00008	TERMINATED	127.0.0.1:89739	0.00166323	4	1	0.588879	0.636364	1	0.011095
train_breast_cancer_31c9f_00009	TERMINATED	127.0.0.1:89736	0.0753618	3	3	0.55103	0.909091	1	0.00776482

2025-02-11 16:13:34,649	INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-13-31' in 0.0057s.
2025-02-11 16:13:34,652	INFO tune.py:1041 -- Total run time: 1.88 seconds (1.86 seconds for the tuning loop).

(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000000)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000001)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000002)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000003)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000004)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000005)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000006)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000007)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000008)
(train_breast_cancer pid=90413) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-17-11/train_breast_cancer_b412c_00000_0_eta=0.0200,max_depth=4,min_child_weight=2,subsample=0.7395_2025-02-11_16-17-11/checkpoint_000009)

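As you can see, the changes to the actual training function are minimal. Instead of returning the evaluation results, we now report the accuracy to Tune using tune.report(). Our config dictionary changed only slightly. Instead of passing hard-coded parameters, we tell Tune to choose values from a range of valid options. There are a number of options here, all of which are explained in the Tune documentation.

In short, here is what they do:

tune.randint(min, max) chooses a random integer value between min and max. Note that max is exclusive, so it will not be sampled.
tune.choice([a, b, c]) chooses one of the items of the list at random. Each item has the same chance of being sampled.
tune.uniform(min, max) samples a floating-point number between min and max. Note that max is exclusive here, too.
tune.loguniform(min, max, base=10) samples a floating-point number between min and max, but applies a logarithmic transformation to these boundaries first. This makes it easy to sample values from different orders of magnitude.

The num_samples=10 option we pass to TuneConfig() means that we sample 10 different hyperparameter configurations from this search space.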
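To see what drawing from this search space amounts to, the following sketch imitates the four samplers with Python's standard `random` module. These are stand-ins for illustration only and not Ray Tune's implementation; the helper `sample_config` is hypothetical.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a search space shaped like the one above."""
    return {
        # Integer in [1, 9), like tune.randint(1, 9): 9 is never drawn.
        "max_depth": rng.randrange(1, 9),
        # One list item with equal probability, like tune.choice([1, 2, 3]).
        "min_child_weight": rng.choice([1, 2, 3]),
        # Float between 0.5 and 1.0, like tune.uniform(0.5, 1.0).
        "subsample": rng.uniform(0.5, 1.0),
        # Uniform in log space, like tune.loguniform(1e-4, 1e-1): each order
        # of magnitude between the bounds is equally likely.
        "eta": math.exp(rng.uniform(math.log(1e-4), math.log(1e-1))),
    }


rng = random.Random(42)  # seeded only to make the sketch reproducible
configs = [sample_config(rng) for _ in range(10)]  # like num_samples=10
for config in configs[:3]:
    print(config)
```

Sampling eta in log space matters here: a plain uniform draw between 0.0001 and 0.1 would almost never produce values below 0.001, although such values span a third of the range of interest.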

The output of a training run might look like this:

 Number of trials: 10/10 (10 TERMINATED)
 +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------+
 | Trial name                      | status     | loc   |         eta |   max_depth |   min_child_weight |   subsample |      acc |   iter |   total time (s) |
 |---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------|
 | train_breast_cancer_b63aa_00000 | TERMINATED |       | 0.000117625 |           2 |                  2 |    0.616347 | 0.916084 |      1 |        0.0306492 |
 | train_breast_cancer_b63aa_00001 | TERMINATED |       | 0.0382954   |           8 |                  2 |    0.581549 | 0.937063 |      1 |        0.0357082 |
 | train_breast_cancer_b63aa_00002 | TERMINATED |       | 0.000217926 |           1 |                  3 |    0.528428 | 0.874126 |      1 |        0.0264609 |
 | train_breast_cancer_b63aa_00003 | TERMINATED |       | 0.000120929 |           8 |                  1 |    0.634508 | 0.958042 |      1 |        0.036406  |
 | train_breast_cancer_b63aa_00004 | TERMINATED |       | 0.00839715  |           5 |                  1 |    0.730624 | 0.958042 |      1 |        0.0389378 |
 | train_breast_cancer_b63aa_00005 | TERMINATED |       | 0.000732948 |           8 |                  2 |    0.915863 | 0.958042 |      1 |        0.0382841 |
 | train_breast_cancer_b63aa_00006 | TERMINATED |       | 0.000856226 |           4 |                  1 |    0.645209 | 0.916084 |      1 |        0.0357089 |
 | train_breast_cancer_b63aa_00007 | TERMINATED |       | 0.00769908  |           7 |                  1 |    0.729443 | 0.909091 |      1 |        0.0390737 |
 | train_breast_cancer_b63aa_00008 | TERMINATED |       | 0.00186339  |           5 |                  3 |    0.595744 | 0.944056 |      1 |        0.0343912 |
 | train_breast_cancer_b63aa_00009 | TERMINATED |       | 0.000950272 |           3 |                  2 |    0.835504 | 0.965035 |      1 |        0.0348201 |
 +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------+

The best configuration we found used eta=0.000950272, max_depth=3, min_child_weight=2, subsample=0.835504 and reached an accuracy of 0.965035.

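Early stopping

Currently, Tune samples 10 different hyperparameter configurations and trains a full XGBoost model on each of them. In our small example, training is very fast. However, when training takes longer, a significant amount of compute is spent on trials that eventually perform badly, for example with a low accuracy. It would be good to identify these trials early and stop them, so that no resources are wasted.

This is where Tune's schedulers come in. A Tune TrialScheduler is responsible for starting and stopping trials. Tune implements a number of different schedulers, each described in the Tune documentation. For our example, we will use the AsyncHyperBandScheduler, which is also available under the name ASHAScheduler.

The basic idea of this scheduler is as follows: we sample a number of hyperparameter configurations. Each configuration is trained for a specific number of iterations. After these iterations, only the best-performing hyperparameter configurations are retained. These are selected according to some loss metric, usually an evaluation loss. This cycle is repeated until we end up with the best configuration.

The ASHAScheduler needs to know three things:

Which metric should be used to identify badly performing trials?
Should this metric be maximized or minimized?
How many iterations does each trial train for?

There are more parameters, which are explained in the documentation.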
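The retain-the-best cycle can be sketched as synchronous successive halving. This is a simplification and an assumption on our part: ASHA makes its decisions asynchronously as results arrive, so a real run will not stop trials in exactly this pattern. The parameter names mirror grace_period, max_t, and reduction_factor, while the loss curves and helper functions are invented for illustration.

```python
def rung_milestones(grace_period, max_t, reduction_factor):
    """Iterations at which trials are compared, e.g. 1, 2, 4, 8 for max_t=10."""
    milestones, t = [], grace_period
    while t <= max_t:
        milestones.append(t)
        t *= reduction_factor
    return milestones


def successive_halving(loss_curves, grace_period, max_t, reduction_factor):
    """Return the iteration at which each trial stops.

    `loss_curves` maps a trial name to a function iteration -> loss.
    """
    alive = list(loss_curves)
    stopped_at = {}
    for milestone in rung_milestones(grace_period, max_t, reduction_factor):
        # Rank the surviving trials by their loss at this milestone (lower is better).
        ranked = sorted(alive, key=lambda name: loss_curves[name](milestone))
        keep = max(1, len(ranked) // reduction_factor)
        for name in ranked[keep:]:
            stopped_at[name] = milestone
        alive = ranked[:keep]
    for name in alive:
        stopped_at[name] = max_t  # survivors train for the full budget
    return stopped_at


def make_curve(eta):
    """Made-up loss curve that falls faster for a larger learning rate."""
    return lambda iteration: 0.693 * (1.0 - eta) ** iteration


etas = [0.0002, 0.0005, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05]
curves = {f"trial_{i}": make_curve(eta) for i, eta in enumerate(etas)}
stopped_at = successive_halving(curves, grace_period=1, max_t=10, reduction_factor=2)
print(rung_milestones(1, 10, 2))
print(stopped_at)
```

With a reduction factor of 2, half of the remaining trials are dropped at each milestone, so most of the compute budget goes to the few configurations that looked best early on.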

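Lastly, we have to report the loss metric to Tune. We do this with a Callback that XGBoost accepts and calls after each evaluation round. Ray Tune comes with two XGBoost callbacks we can use for this. The TuneReportCallback just reports the evaluation metrics back to Tune. The TuneReportCheckpointCallback also saves checkpoints after each evaluation round. We will just use the latter in this example so that we can retrieve the saved model later.

The metrics named in the eval_metric configuration setting are then automatically reported to Tune via the callback, under the names eval-logloss and eval-error. Here, the raw error is reported, not the accuracy. To display the best reached accuracy, we invert it later.

We will also load the best checkpointed model so that we can use it for predictions. The best model is selected with respect to the metric and mode parameters we pass to TuneConfig().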

import sklearn.datasets
import sklearn.metrics
from ray.tune.schedulers import ASHAScheduler
from sklearn.model_selection import train_test_split
import xgboost as xgb

from ray import tune
from ray.tune.integration.xgboost import TuneReportCheckpointCallback


def train_breast_cancer(config: dict):
    # This is a simple training function to be passed into Tune
    # Load dataset
    data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)
    # Split into train and test set
    train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)
    # Build input matrices for XGBoost
    train_set = xgb.DMatrix(train_x, label=train_y)
    test_set = xgb.DMatrix(test_x, label=test_y)
    # Train the classifier, using the Tune callback
    xgb.train(
        config,
        train_set,
        evals=[(test_set, "eval")],
        verbose_eval=False,
        # `TuneReportCheckpointCallback` defines the checkpointing frequency and format.
        callbacks=[TuneReportCheckpointCallback(frequency=1)],
    )


def get_best_model_checkpoint(results):
    best_result = results.get_best_result()

    # `TuneReportCheckpointCallback` provides a helper method to retrieve the
    # model from a checkpoint.
    best_bst = TuneReportCheckpointCallback.get_model(best_result.checkpoint)

    accuracy = 1.0 - best_result.metrics["eval-error"]
    print(f"Best model parameters: {best_result.config}")
    print(f"Best model total accuracy: {accuracy:.4f}")
    return best_bst


def tune_xgboost(smoke_test=False):
    search_space = {
        # You can mix constants with search space objects.
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "max_depth": tune.randint(1, 9),
        "min_child_weight": tune.choice([1, 2, 3]),
        "subsample": tune.uniform(0.5, 1.0),
        "eta": tune.loguniform(1e-4, 1e-1),
    }
    # This will enable aggressive early stopping of bad trials.
    scheduler = ASHAScheduler(
        max_t=10, grace_period=1, reduction_factor=2  # 10 training iterations
    )

    tuner = tune.Tuner(
        train_breast_cancer,
        tune_config=tune.TuneConfig(
            metric="eval-logloss",
            mode="min",
            scheduler=scheduler,
            num_samples=1 if smoke_test else 10,
        ),
        param_space=search_space,
    )
    results = tuner.fit()
    return results


results = tune_xgboost(smoke_test=SMOKE_TEST)

# Load the best model checkpoint.
best_bst = get_best_model_checkpoint(results)

# You could now do further predictions with
# best_bst.predict(...)


Tune Status

Current time	2025-02-11 16:13:35
Running for	00:00:01.05
Memory	22.5/36.0 GiB

System Info

Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: -0.6414526407118444 | Iter 4.000: -0.6439705872452343 | Iter 2.000: -0.6452721030145259 | Iter 1.000: -0.6459394399519567
Logical resource usage: 1.0/12 CPUs, 0/0 GPUs

Trial Status

Trial name	status	loc	eta	max_depth	min_child_weight	subsample	iter	total time (s)	eval-logloss	eval-error
train_breast_cancer_32eb5_00000	TERMINATED	127.0.0.1:89763	0.000830475	5	1	0.675899	10	0.0169384	0.640195	0.342657

2025-02-11 16:13:35,717	INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/Users/rdecal/ray_results/train_breast_cancer_2025-02-11_16-13-34' in 0.0018s.
2025-02-11 16:13:35,719	INFO tune.py:1041 -- Total run time: 1.05 seconds (1.04 seconds for the tuning loop).

Best model parameters: {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 5, 'min_child_weight': 1, 'subsample': 0.675899175238225, 'eta': 0.0008304750981897656}
Best model total accuracy: 0.6573

The output of our run could look like this:

 Number of trials: 10/10 (10 TERMINATED)
 +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------+
 | Trial name                      | status     | loc   |         eta |   max_depth |   min_child_weight |   subsample |   iter |   total time (s) |   eval-logloss |   eval-error |
 |---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------|
 | train_breast_cancer_ba275_00000 | TERMINATED |       | 0.00205087  |           2 |                  1 |    0.898391 |     10 |        0.380619  |       0.678039 |     0.090909 |
 | train_breast_cancer_ba275_00001 | TERMINATED |       | 0.000183834 |           4 |                  3 |    0.924939 |      1 |        0.0228798 |       0.693009 |     0.111888 |
 | train_breast_cancer_ba275_00002 | TERMINATED |       | 0.0242721   |           7 |                  2 |    0.501551 |     10 |        0.376154  |       0.54472  |     0.06993  |
 | train_breast_cancer_ba275_00003 | TERMINATED |       | 0.000449692 |           5 |                  3 |    0.890212 |      1 |        0.0234981 |       0.692811 |     0.090909 |
 | train_breast_cancer_ba275_00004 | TERMINATED |       | 0.000376393 |           7 |                  2 |    0.883609 |      1 |        0.0231569 |       0.692847 |     0.062937 |
 | train_breast_cancer_ba275_00005 | TERMINATED |       | 0.00231942  |           3 |                  3 |    0.877464 |      2 |        0.104867  |       0.689541 |     0.083916 |
 | train_breast_cancer_ba275_00006 | TERMINATED |       | 0.000542326 |           1 |                  2 |    0.578584 |      1 |        0.0213971 |       0.692765 |     0.083916 |
 | train_breast_cancer_ba275_00007 | TERMINATED |       | 0.0016801   |           1 |                  2 |    0.975302 |      1 |        0.02226   |       0.691999 |     0.083916 |
 | train_breast_cancer_ba275_00008 | TERMINATED |       | 0.000595756 |           8 |                  3 |    0.58429  |      1 |        0.0221152 |       0.692657 |     0.06993  |
 | train_breast_cancer_ba275_00009 | TERMINATED |       | 0.000357845 |           8 |                  1 |    0.637776 |      1 |        0.022635  |       0.692859 |     0.090909 |
 +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------+


 Best model parameters: {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 7, 'min_child_weight': 2, 'subsample': 0.5015513240240503, 'eta': 0.024272050872920895}
 Best model total accuracy: 0.9301

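As you can see, most trials were stopped after only a few iterations. Only the two most promising trials ran for the full 10 iterations.

You can also make sure that all available resources are being used as the scheduler terminates trials and frees them up. This can be done with the ResourceChangingScheduler. For an example, see the XGBoost Dynamic Resources Example.

Using fractional GPUs

You can often accelerate your training by using GPUs in addition to CPUs. However, you usually don't have as many GPUs as you have trials to run. For instance, if you run 10 Tune trials in parallel, you usually don't have access to 10 separate GPUs.

Tune supports fractional GPUs. This means that each task is assigned a fraction of the GPU memory for training. For 10 tasks, this could look like this: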

config = {
    "objective": "binary:logistic",
    "eval_metric": ["logloss", "error"],
    "tree_method": "gpu_hist",
    "max_depth": tune.randint(1, 9),
    "min_child_weight": tune.choice([1, 2, 3]),
    "subsample": tune.uniform(0.5, 1.0),
    "eta": tune.loguniform(1e-4, 1e-1),
}

tuner = tune.Tuner(
    tune.with_resources(train_breast_cancer, resources={"cpu": 1, "gpu": 0.1}),
    tune_config=tune.TuneConfig(num_samples=1 if SMOKE_TEST else 10),
    param_space=config,
)
results = tuner.fit()

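Each task thus works with 10% of the available GPU memory. Note that Ray uses this fraction for scheduling only and does not enforce a memory limit, so the trials sharing a GPU must fit into its memory together. You also have to tell XGBoost to use the gpu_hist tree method, so it knows it should use the GPU. In XGBoost 2.0 and later, gpu_hist is deprecated in favor of setting tree_method to hist together with device set to cuda.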
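How many trials end up running side by side follows from simple resource arithmetic. The sketch below is a rough estimate under the assumption that CPUs and GPUs are the only limiting resources; the cluster sizes are hypothetical and the helper is not part of Ray.

```python
import math


def max_concurrent_trials(total_cpus, total_gpus, cpus_per_trial, gpus_per_trial):
    """Number of trials that fit at once, given per-trial resource requests."""
    # A small epsilon guards against floating-point round-off in ratios like 1 / 0.1.
    by_cpu = math.floor(total_cpus / cpus_per_trial + 1e-9)
    by_gpu = math.floor(total_gpus / gpus_per_trial + 1e-9)
    return min(by_cpu, by_gpu)


# A machine with 12 CPUs and 1 GPU; each trial requests 1 CPU and 0.1 GPU.
print(max_concurrent_trials(12, 1, cpus_per_trial=1, gpus_per_trial=0.1))

# With only 4 CPUs, the CPU request becomes the bottleneck.
print(max_concurrent_trials(4, 1, cpus_per_trial=1, gpus_per_trial=0.1))
```

Choosing the GPU fraction is therefore a trade-off: a smaller fraction allows more parallel trials, but each of them has to make do with a smaller share of the device.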

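Conclusion

You should now have a basic understanding of how to train XGBoost models and how to tune the hyperparameters to yield the best results. In our simple example, tuning the parameters didn't have a huge effect on accuracy. But in larger applications, intelligent hyperparameter tuning can make the difference between a model that doesn't seem to learn at all and a model that outperforms all other ones.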
