
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/ensemble/plot_forest_hist_grad_boosting_comparison.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_ensemble_plot_forest_hist_grad_boosting_comparison.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_ensemble_plot_forest_hist_grad_boosting_comparison.py:


===============================================================
Comparing Random Forests and Histogram Gradient Boosting models
===============================================================

In this example we compare the performance of Random Forest (RF) and Histogram
Gradient Boosting (HGBT) models in terms of score and computation time for a
regression dataset, though **all the concepts presented here apply to
classification as well**.

The comparison is made by varying the parameters that control the number of
trees according to each estimator:

- `n_estimators` controls the number of trees in the forest. It's a fixed number.
- `max_iter` is the maximum number of iterations in a gradient boosting
  based model. The number of iterations corresponds to the number of trees for
  regression and binary classification problems. Furthermore, the actual number
  of trees required by the model depends on the stopping criteria.

HGBT uses gradient boosting to iteratively improve the model's performance by
fitting each tree to the negative gradient of the loss function with respect to
the predicted value. RFs, on the other hand, are based on bagging: each tree is
fitted independently on a bootstrap sample and the predictions of the trees are
aggregated, by averaging for regression and by voting for classification.
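
The difference between the two aggregation strategies can be sketched on a toy
regression problem. The snippet below is only an illustration under stated
assumptions: the synthetic data, the names `X_toy` and `y_toy`, the learning
rate and the tree depth are arbitrary choices, and the hand-rolled boosting
loop uses plain decision trees with a squared error loss rather than the
histogram-based implementation.

.. code-block:: Python

    import numpy as np

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.tree import DecisionTreeRegressor

    # Synthetic data used only for this sketch (not the example's dataset).
    X_toy, y_toy = make_regression(
        n_samples=200, n_features=5, noise=10.0, random_state=0
    )

    # Bagging: for regression, the forest prediction is the average of the
    # predictions of its individual trees.
    forest = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_toy, y_toy)
    per_tree = np.stack([tree.predict(X_toy) for tree in forest.estimators_])
    forest_matches_average = np.allclose(forest.predict(X_toy), per_tree.mean(axis=0))

    # Boosting: each new tree is fitted to the current residuals, which are the
    # negative gradient of 0.5 * (y - prediction) ** 2 w.r.t. the prediction.
    learning_rate = 0.1
    prediction = np.full_like(y_toy, y_toy.mean())
    train_mse = [np.mean((y_toy - prediction) ** 2)]
    for _ in range(20):
        residuals = y_toy - prediction
        tree = DecisionTreeRegressor(max_depth=3, random_state=0)
        tree.fit(X_toy, residuals)
        prediction += learning_rate * tree.predict(X_toy)
        train_mse.append(np.mean((y_toy - prediction) ** 2))

    print(f"Forest equals the average of its trees: {forest_matches_average}")
    print(f"Boosting train MSE: {train_mse[0]:.1f} -> {train_mse[-1]:.1f}")

In the forest the trees can be built independently of each other, whereas in
the boosting loop each tree depends on the predictions of all the previous
ones.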

See the :ref:`User Guide <ensemble>` for more information on ensemble models or
see :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py` for an
example showcasing some other features of HGBT models.

.. GENERATED FROM PYTHON SOURCE LINES 29-33

.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause








.. GENERATED FROM PYTHON SOURCE LINES 34-36

Load dataset
------------

.. GENERATED FROM PYTHON SOURCE LINES 36-42

.. code-block:: Python


    from sklearn.datasets import fetch_california_housing

    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
    n_samples, n_features = X.shape








.. GENERATED FROM PYTHON SOURCE LINES 43-48

HGBT uses a histogram-based algorithm on binned feature values that can
efficiently handle large datasets (tens of thousands of samples or more) with
a high number of features (see :ref:`Why_it's_faster`). The scikit-learn
implementation of RF does not use binning and relies on exact splitting, which
can be computationally expensive.
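
To give an intuition of what binning does, here is a minimal NumPy sketch. It
is not the scikit-learn implementation: the quantile-based thresholds, the
synthetic feature and all variable names are assumptions made for
illustration. Only the limit of 255 bins mirrors the default `max_bins` of the
HGBT estimators.

.. code-block:: Python

    import numpy as np

    rng = np.random.default_rng(0)
    feature = rng.lognormal(size=10_000)  # synthetic continuous feature

    # Assumption for this sketch: quantile-based thresholds, at most 255 bins.
    max_bins = 255
    quantiles = np.linspace(0, 1, max_bins + 1)[1:-1]
    thresholds = np.unique(np.quantile(feature, quantiles))

    # Replace each value by the index of its bin; the result fits in one byte.
    binned = np.searchsorted(thresholds, feature, side="right").astype(np.uint8)

    n_distinct = np.unique(feature).size
    n_bins = np.unique(binned).size
    print(f"{n_distinct} distinct values mapped to {n_bins} bins")

With binned data, a split search only needs to consider the bin thresholds as
candidate split points instead of every distinct feature value.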

.. GENERATED FROM PYTHON SOURCE LINES 48-51

.. code-block:: Python


    print(f"The dataset consists of {n_samples} samples and {n_features} features")





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The dataset consists of 20640 samples and 8 features




.. GENERATED FROM PYTHON SOURCE LINES 52-65

Compute score and computation times
-----------------------------------

Notice that many parts of the implementation of
:class:`~sklearn.ensemble.HistGradientBoostingClassifier` and
:class:`~sklearn.ensemble.HistGradientBoostingRegressor` are parallelized by
default.

The implementation of :class:`~sklearn.ensemble.RandomForestRegressor` and
:class:`~sklearn.ensemble.RandomForestClassifier` can also be run on multiple
cores by using the `n_jobs` parameter, here set to match the number of
physical cores on the host machine. See :ref:`parallelism` for more
information.

.. GENERATED FROM PYTHON SOURCE LINES 65-71

.. code-block:: Python


    import joblib

    N_CORES = joblib.cpu_count(only_physical_cores=True)
    print(f"Number of physical cores: {N_CORES}")





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Number of physical cores: 2




.. GENERATED FROM PYTHON SOURCE LINES 72-81

Unlike RF, HGBT models offer an early-stopping option (see
:ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_early_stopping.py`)
to avoid adding new unnecessary trees. Internally, the algorithm uses an
out-of-sample set to compute the generalization performance of the model at
each addition of a tree. Thus, if the generalization performance is not
improving for more than `n_iter_no_change` iterations, it stops adding trees.
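
As a hedged illustration of the early-stopping option, the sketch below fits a
model on synthetic data with `early_stopping=True` and inspects how many
iterations were actually performed. The data and the values chosen for
`validation_fraction` and `n_iter_no_change` are arbitrary assumptions, not
tuned settings.

.. code-block:: Python

    from sklearn.datasets import make_regression
    from sklearn.ensemble import HistGradientBoostingRegressor

    # Synthetic data used only for this sketch.
    X_toy, y_toy = make_regression(
        n_samples=2_000, n_features=8, noise=20.0, random_state=0
    )

    hgbt = HistGradientBoostingRegressor(
        max_iter=500,  # upper bound on the number of boosting iterations
        early_stopping=True,
        validation_fraction=0.1,  # share of the data held out internally
        n_iter_no_change=5,  # patience, in iterations
        random_state=0,
    ).fit(X_toy, y_toy)

    # `n_iter_` is the number of iterations actually performed.
    print(f"Iterations performed: {hgbt.n_iter_} (max_iter={hgbt.max_iter})")
    print(f"Validation scores recorded: {len(hgbt.validation_score_)}")

With early stopping enabled, `max_iter` only acts as an upper bound on the
number of iterations.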

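In the code below, early stopping is disabled (`early_stopping=False`) so that
each model performs exactly `max_iter` boosting iterations. The other
parameters of both models were tuned, but the procedure is not shown here to
keep the example simple.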

.. GENERATED FROM PYTHON SOURCE LINES 81-112

.. code-block:: Python


    import pandas as pd

    from sklearn.ensemble import HistGradientBoostingRegressor, RandomForestRegressor
    from sklearn.model_selection import GridSearchCV, KFold

    models = {
        "Random Forest": RandomForestRegressor(
            min_samples_leaf=5, random_state=0, n_jobs=N_CORES
        ),
        "Hist Gradient Boosting": HistGradientBoostingRegressor(
            max_leaf_nodes=15, random_state=0, early_stopping=False
        ),
    }
    param_grids = {
        "Random Forest": {"n_estimators": [10, 20, 50, 100]},
        "Hist Gradient Boosting": {"max_iter": [10, 20, 50, 100, 300, 500]},
    }
    cv = KFold(n_splits=4, shuffle=True, random_state=0)

    results = []
    for name, model in models.items():
        grid_search = GridSearchCV(
            estimator=model,
            param_grid=param_grids[name],
            return_train_score=True,
            cv=cv,
        ).fit(X, y)
        result = {"model": name, "cv_results": pd.DataFrame(grid_search.cv_results_)}
        results.append(result)








.. GENERATED FROM PYTHON SOURCE LINES 113-127

.. Note::
 Tuning `n_estimators` for RF generally results in a waste of computing
 power. In practice one just needs to ensure that it is large enough so that
 doubling its value does not lead to a significant improvement of the testing
 score.
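
The doubling heuristic described in the note can be sketched as follows. This
is an illustration under assumptions: the synthetic data, the helper
`mean_cv_score`, the `tolerance` threshold and the `max_estimators` cap are
hypothetical and should be adapted to the problem at hand.

.. code-block:: Python

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    # Synthetic data used only for this sketch.
    X_toy, y_toy = make_regression(
        n_samples=500, n_features=8, noise=20.0, random_state=0
    )


    def mean_cv_score(n_estimators):
        """Mean cross-validated R2 score of a forest with `n_estimators` trees."""
        forest = RandomForestRegressor(n_estimators=n_estimators, random_state=0)
        return cross_val_score(forest, X_toy, y_toy, cv=3).mean()


    tolerance = 0.005  # hypothetical threshold on the R2 improvement
    max_estimators = 320  # hypothetical cap on the size of the forest
    n_estimators = 10
    score = mean_cv_score(n_estimators)
    history = [(n_estimators, score)]

    while 2 * n_estimators <= max_estimators:
        doubled_score = mean_cv_score(2 * n_estimators)
        history.append((2 * n_estimators, doubled_score))
        if doubled_score - score < tolerance:
            break  # doubling no longer helps: keep the smaller forest
        n_estimators, score = 2 * n_estimators, doubled_score

    print(f"Selected n_estimators={n_estimators} (mean CV R2={score:.3f})")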

Plot results
------------
We can use `plotly.express.scatter
<https://plotly.com/python-api-reference/generated/plotly.express.scatter.html>`_
to visualize the trade-off between elapsed computing time and mean test score.
Hovering over a given point displays the corresponding parameters.
Error bars correspond to one standard deviation as computed across the
different folds of the cross-validation.
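
The quantities being plotted all come from the `cv_results_` attribute of a
fitted `GridSearchCV`. The sketch below, run on synthetic data with an
arbitrary small grid (both are assumptions for illustration), shows the
columns that carry the timings, the scores and their standard deviations.

.. code-block:: Python

    import pandas as pd

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV

    # Synthetic data used only for this sketch.
    X_toy, y_toy = make_regression(
        n_samples=300, n_features=8, noise=20.0, random_state=0
    )

    search = GridSearchCV(
        RandomForestRegressor(random_state=0),
        param_grid={"n_estimators": [5, 10, 20]},
        cv=3,
    ).fit(X_toy, y_toy)

    # One row per parameter value; means and standard deviations are computed
    # across the cross-validation folds.
    columns = [
        "param_n_estimators",
        "mean_fit_time",
        "std_fit_time",
        "mean_score_time",
        "std_score_time",
        "mean_test_score",
        "std_test_score",
    ]
    summary = pd.DataFrame(search.cv_results_)[columns]
    print(summary)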

.. GENERATED FROM PYTHON SOURCE LINES 127-201

.. code-block:: Python


    import plotly.colors as colors
    import plotly.express as px
    from plotly.subplots import make_subplots

    fig = make_subplots(
        rows=1,
        cols=2,
        shared_yaxes=True,
        subplot_titles=["Train time vs score", "Predict time vs score"],
    )
    model_names = [result["model"] for result in results]
    colors_list = colors.qualitative.Plotly * (
        len(model_names) // len(colors.qualitative.Plotly) + 1
    )

    for idx, result in enumerate(results):
        cv_results = result["cv_results"].round(3)
        model_name = result["model"]
        param_name = next(iter(param_grids[model_name].keys()))
        cv_results[param_name] = cv_results["param_" + param_name]
        cv_results["model"] = model_name

        scatter_fig = px.scatter(
            cv_results,
            x="mean_fit_time",
            y="mean_test_score",
            error_x="std_fit_time",
            error_y="std_test_score",
            hover_data=param_name,
            color="model",
        )
        line_fig = px.line(
            cv_results,
            x="mean_fit_time",
            y="mean_test_score",
        )

        scatter_trace = scatter_fig["data"][0]
        line_trace = line_fig["data"][0]
        scatter_trace.update(marker=dict(color=colors_list[idx]))
        line_trace.update(line=dict(color=colors_list[idx]))
        fig.add_trace(scatter_trace, row=1, col=1)
        fig.add_trace(line_trace, row=1, col=1)

        scatter_fig = px.scatter(
            cv_results,
            x="mean_score_time",
            y="mean_test_score",
            error_x="std_score_time",
            error_y="std_test_score",
            hover_data=param_name,
        )
        line_fig = px.line(
            cv_results,
            x="mean_score_time",
            y="mean_test_score",
        )

        scatter_trace = scatter_fig["data"][0]
        line_trace = line_fig["data"][0]
        scatter_trace.update(marker=dict(color=colors_list[idx]))
        line_trace.update(line=dict(color=colors_list[idx]))
        fig.add_trace(scatter_trace, row=1, col=2)
        fig.add_trace(line_trace, row=1, col=2)

    fig.update_layout(
        xaxis=dict(title="Train time (s) - lower is better"),
        yaxis=dict(title="Test R2 score - higher is better"),
        xaxis2=dict(title="Predict time (s) - lower is better"),
        legend=dict(x=0.72, y=0.05, traceorder="normal", borderwidth=1),
        title=dict(x=0.5, text="Speed-score trade-off of tree-based ensembles"),
    )






Interactive Plotly figure: "Speed-score trade-off of tree-based ensembles".
Left panel: "Train time vs score" (x-axis: "Train time (s) - lower is better").
Right panel: "Predict time vs score" (x-axis: "Predict time (s) - lower is
better"). Shared y-axis: "Test R2 score - higher is better". One curve per
model: Random Forest and Hist Gradient Boosting.

.. GENERATED FROM PYTHON SOURCE LINES 202-227

Both HGBT and RF models improve when increasing the number of trees in the
ensemble. However, the scores reach a plateau where adding new trees just
makes fitting and scoring slower. The RF model reaches this plateau earlier
and never reaches the test score of the largest HGBT model.

Note that the results shown on the above plot can change slightly across runs
and even more significantly when running on other machines: try to run this
example on your own local machine.

Overall, one should often observe that the Histogram-based gradient boosting
models uniformly dominate the Random Forest models in the "test score vs
training speed trade-off" (the HGBT curve should be on the top left of the RF
curve, without ever crossing). The "test score vs prediction speed" trade-off
is less clear-cut, but it is most often favorable to HGBT. It's always a good
idea to check both kinds of model (with hyper-parameter tuning) and compare
their performance on your specific problem to determine which model is the
best fit. Still, **HGBT almost always offers a more favorable speed-accuracy
trade-off than RF**, either with the default hyper-parameters or including the
hyper-parameter tuning cost.

There is one exception to this rule of thumb though: when training a
multiclass classification model with a large number of possible classes, HGBT
internally fits one tree per class at each boosting iteration, while the trees
used by the RF models are naturally multiclass. This should improve the
speed-accuracy trade-off of the RF models in this case.
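
This difference in the number of trees can be checked on a small multiclass
problem. The sketch below relies on synthetic data with five classes and
arbitrary values for `max_iter` and `n_estimators`, all of which are
assumptions made for illustration.

.. code-block:: Python

    from sklearn.datasets import make_classification
    from sklearn.ensemble import HistGradientBoostingClassifier, RandomForestClassifier

    # Synthetic 5-class problem used only for this sketch.
    X_toy, y_toy = make_classification(
        n_samples=500,
        n_features=10,
        n_informative=6,
        n_classes=5,
        random_state=0,
    )

    hgbt = HistGradientBoostingClassifier(
        max_iter=10, early_stopping=False, random_state=0
    ).fit(X_toy, y_toy)
    forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_toy, y_toy)

    # HGBT builds one tree per class at each iteration for multiclass problems.
    n_hgbt_trees = hgbt.n_iter_ * hgbt.n_trees_per_iteration_
    # Each tree of the forest handles all the classes at once.
    n_forest_trees = len(forest.estimators_)

    print(f"HGBT: {hgbt.n_trees_per_iteration_} trees per iteration, "
          f"{n_hgbt_trees} trees in total")
    print(f"RF: {n_forest_trees} trees for {forest.n_classes_} classes")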


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 43.474 seconds)


.. _sphx_glr_download_auto_examples_ensemble_plot_forest_hist_grad_boosting_comparison.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/1.8.X?urlpath=lab/tree/notebooks/auto_examples/ensemble/plot_forest_hist_grad_boosting_comparison.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/ensemble/plot_forest_hist_grad_boosting_comparison.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_forest_hist_grad_boosting_comparison.ipynb <plot_forest_hist_grad_boosting_comparison.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_forest_hist_grad_boosting_comparison.py <plot_forest_hist_grad_boosting_comparison.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_forest_hist_grad_boosting_comparison.zip <plot_forest_hist_grad_boosting_comparison.zip>`


.. include:: plot_forest_hist_grad_boosting_comparison.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
