
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/neural_networks/plot_mnist_filters.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_neural_networks_plot_mnist_filters.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_neural_networks_plot_mnist_filters.py:


=====================================
Visualization of MLP weights on MNIST
=====================================

Inspecting the learned coefficients of a neural network can sometimes provide
insight into its learning behavior. For example, if the weights look
unstructured, some hidden units may not have been used at all; if very large
coefficients exist, the regularization may have been too weak or the learning
rate too high.

This example shows how to plot some of the first-layer weights of an
MLPClassifier trained on the MNIST dataset.

The input data consists of 28x28 pixel images of handwritten digits, giving
784 features per sample. The first-layer weight matrix therefore has the shape
(784, hidden_layer_sizes[0]), so a single column of the weight matrix can be
visualized as a 28x28 pixel image.
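
As a minimal sketch of this reshaping, the snippet below uses a random
stand-in for the weight matrix (an assumption for illustration; in this
example the real matrix is ``mlp.coefs_[0]``). It assumes the 784 features are
stored row by row, as the ``reshape(28, 28)`` call in this example implies.

.. code-block:: Python

    import numpy as np

    # Hypothetical stand-in for mlp.coefs_[0]: 784 input pixels, 40 hidden units.
    rng = np.random.default_rng(0)
    first_layer = rng.normal(size=(784, 40))

    # Each column holds the incoming weights of one hidden unit.
    unit_weights = first_layer[:, 0]

    # Map the 784 weights back onto the 28x28 pixel grid.
    unit_image = unit_weights.reshape(28, 28)

    # With a row-major reshape, pixel (row, col) is feature index row * 28 + col.
    assert unit_image[3, 5] == unit_weights[3 * 28 + 5]
    print(unit_image.shape)  # (28, 28)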

To make the example run faster, we use very few hidden units and train for
only a few iterations. Training longer would result in weights with a much
smoother spatial appearance. Because training stops before convergence,
fitting raises a ConvergenceWarning, which the code below catches and ignores.
This is intentional: resource constraints on the Continuous Integration
infrastructure that regularly builds this documentation call for a short run.
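
To check that a short run really did stop at the iteration limit, the warning
can be recorded instead of ignored. The sketch below is not part of this
example: it uses a small synthetic dataset (hypothetical data, not MNIST) and
the names ``X_demo``, ``y_demo`` and ``clf`` are chosen for illustration only.

.. code-block:: Python

    import warnings

    import numpy as np

    from sklearn.exceptions import ConvergenceWarning
    from sklearn.neural_network import MLPClassifier

    # Small synthetic problem so the sketch runs quickly and offline.
    rng = np.random.RandomState(0)
    X_demo = rng.rand(200, 20)
    y_demo = (X_demo[:, 0] > 0.5).astype(int)

    # Two iterations are far too few for the optimizer to converge.
    clf = MLPClassifier(
        hidden_layer_sizes=(5,), max_iter=2, solver="sgd", random_state=0
    )

    # Record warnings rather than printing or discarding them.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        clf.fit(X_demo, y_demo)

    hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    print("ConvergenceWarning raised:", hit_limit)
    print("Iterations run:", clf.n_iter_)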

.. GENERATED FROM PYTHON SOURCE LINES 26-75



.. image-sg:: /auto_examples/neural_networks/images/sphx_glr_plot_mnist_filters_001.png
   :alt: plot mnist filters
   :srcset: /auto_examples/neural_networks/images/sphx_glr_plot_mnist_filters_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Iteration 1, loss = 0.44139186
    Iteration 2, loss = 0.19174891
    Iteration 3, loss = 0.13983521
    Iteration 4, loss = 0.11378556
    Iteration 5, loss = 0.09443967
    Iteration 6, loss = 0.07846529
    Iteration 7, loss = 0.06506307
    Iteration 8, loss = 0.05534985
    Training set score: 0.986429
    Test set score: 0.953061






|

.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause

    import warnings

    import matplotlib.pyplot as plt

    from sklearn.datasets import fetch_openml
    from sklearn.exceptions import ConvergenceWarning
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Load data from https://www.openml.org/d/554
    X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
    X = X / 255.0

    # Split data into train partition and test partition; test_size=0.7 holds
    # out 70% of the samples for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.7)

    mlp = MLPClassifier(
        hidden_layer_sizes=(40,),
        max_iter=8,
        alpha=1e-4,
        solver="sgd",
        verbose=10,
        random_state=1,
        learning_rate_init=0.2,
    )

    # this example won't converge because of resource usage constraints on
    # our Continuous Integration infrastructure, so we catch the warning and
    # ignore it here
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=ConvergenceWarning, module="sklearn")
        mlp.fit(X_train, y_train)

    print("Training set score: %f" % mlp.score(X_train, y_train))
    print("Test set score: %f" % mlp.score(X_test, y_test))

    fig, axes = plt.subplots(4, 4)
    # use the global min / max so that all weights are shown on the same scale;
    # the range is halved below, so the largest weights are clipped
    vmin, vmax = mlp.coefs_[0].min(), mlp.coefs_[0].max()
    for coef, ax in zip(mlp.coefs_[0].T, axes.ravel()):
        ax.matshow(coef.reshape(28, 28), cmap=plt.cm.gray, vmin=0.5 * vmin, vmax=0.5 * vmax)
        ax.set_xticks(())
        ax.set_yticks(())

    plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 10.212 seconds)


.. _sphx_glr_download_auto_examples_neural_networks_plot_mnist_filters.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/1.8.X?urlpath=lab/tree/notebooks/auto_examples/neural_networks/plot_mnist_filters.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/neural_networks/plot_mnist_filters.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_mnist_filters.ipynb <plot_mnist_filters.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_mnist_filters.py <plot_mnist_filters.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_mnist_filters.zip <plot_mnist_filters.zip>`


.. include:: plot_mnist_filters.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
