====================
Data Files Support
====================

The distutils have traditionally allowed installation of "data files", which
are placed in a platform-specific location.  However, the most common use case
for data files distributed with a package is for use *by* the package, usually
by including the data files **inside the package directory**.

Setuptools offers three ways to specify this most common type of data files to
be included in your packages [#datafiles]_.
First, you can simply use the ``include_package_data`` keyword, e.g.::

    from setuptools import setup, find_packages
    setup(
        ...
        include_package_data=True
    )

This tells setuptools to install any data files it finds in your packages.
The data files must be specified via the |MANIFEST.in|_ file.
(They can also be tracked by a revision control system, using an appropriate
plugin such as :pypi:`setuptools-scm` or :pypi:`setuptools-svn`.
See the section below on :ref:`Adding Support for Revision
Control Systems` for information on how to write such plugins.)

If you want finer-grained control over what files are included (for example,
if you have documentation files in your package directories and want to exclude
them from installation), then you can also use the ``package_data`` keyword,
e.g.::

    from setuptools import setup, find_packages
    setup(
        ...
        package_data={
            # If any package contains *.txt or *.rst files, include them:
            "": ["*.txt", "*.rst"],
            # And include any *.msg files found in the "hello" package, too:
            "hello": ["*.msg"],
        }
    )

The ``package_data`` argument is a dictionary that maps from package names to
lists of glob patterns.  The globs may include subdirectory names, if the data
files are contained in a subdirectory of the package.  For example, if the
package tree looks like this::

    setup.py
    src/
        mypkg/
            __init__.py
            mypkg.txt
            data/
                somefile.dat
                otherdata.dat

The setuptools setup file might look like this::

    from setuptools import setup, find_packages
    setup(
        ...
        packages=find_packages("src"),  # include all packages under src
        package_dir={"": "src"},   # tell distutils packages are under src

        package_data={
            # If any package contains *.txt files, include them:
            "": ["*.txt"],
            # And include any *.dat files found in the "data" subdirectory
            # of the "mypkg" package, also:
            "mypkg": ["data/*.dat"],
        }
    )

Notice that if you list patterns in ``package_data`` under the empty string,
these patterns are used to find files in every package, even ones that also
have their own patterns listed.  Thus, in the above example, the ``mypkg.txt``
file gets included even though it's not listed in the patterns for ``mypkg``.

Also notice that if you use paths, you *must* use a forward slash (``/``) as
the path separator, even if you are on Windows.  Setuptools automatically
converts slashes to appropriate platform-specific separators at build time.

If data files are contained in a subdirectory of a package that isn't a package
itself (no ``__init__.py``), then the subdirectory names (or ``*``) are required
in the ``package_data`` argument (as shown above with ``"data/*.dat"``).
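
To make the pattern rules above concrete, the following is a minimal sketch that recreates the example tree in a temporary directory and resolves the patterns with ``pathlib``. The helper ``match_package_data`` is hypothetical and only approximates how the globs behave; it is not how setuptools itself collects files.

```python
import tempfile
from pathlib import Path


def match_package_data(package_dir, patterns):
    """Return files under package_dir matched by patterns, as relative POSIX paths.

    Hypothetical helper for illustration only; not part of setuptools.
    """
    package_dir = Path(package_dir)
    matched = set()
    for pattern in patterns:
        for path in package_dir.glob(pattern):
            if path.is_file():
                matched.add(path.relative_to(package_dir).as_posix())
    return sorted(matched)


# Recreate the example tree from above in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "src", "mypkg")
    (pkg / "data").mkdir(parents=True)
    for name in ("__init__.py", "mypkg.txt", "data/somefile.dat", "data/otherdata.dat"):
        (pkg / name).touch()

    # Patterns listed under "" apply to every package, so they are combined
    # with the patterns listed under "mypkg".
    package_data = {"": ["*.txt"], "mypkg": ["data/*.dat"]}
    selected = match_package_data(pkg, package_data[""] + package_data["mypkg"])

    # Without the subdirectory name, "*.dat" does not reach into "data/".
    top_level_only = match_package_data(pkg, ["*.dat"])

print(selected)
print(top_level_only)
```

In this sketch ``mypkg.txt`` is picked up through the ``""`` entry, while the ``.dat`` files are only found because the pattern names the ``data`` subdirectory.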

When building an ``sdist``, the data files are also drawn from the
``package_name.egg-info/SOURCES.txt`` file, so if you change the
``package_data`` list in ``setup.py``, make sure to remove that file before
running ``setup.py`` again.

.. note::
   If using the ``include_package_data`` argument, files specified by
   ``package_data`` will *not* be automatically added to the manifest unless
   they are listed in the |MANIFEST.in|_ file or by a plugin like
   :pypi:`setuptools-scm` or :pypi:`setuptools-svn`.

.. https://docs.python.org/3/distutils/setupscript.html#installing-package-data

Sometimes, the ``include_package_data`` or ``package_data`` options alone
aren't sufficient to precisely define what files you want included.  For
example, you may want to include package README files in your revision control
system and source distributions, but exclude them from being installed.  For
this, setuptools also offers an ``exclude_package_data`` option, which allows
you to do things like this::

    from setuptools import setup, find_packages
    setup(
        ...
        packages=find_packages("src"),  # include all packages under src
        package_dir={"": "src"},   # tell distutils packages are under src

        include_package_data=True,    # include everything in source control

        # ...but exclude README.txt from all packages
        exclude_package_data={"": ["README.txt"]},
    )

The ``exclude_package_data`` option is a dictionary mapping package names to
lists of wildcard patterns, just like the ``package_data`` option.  And, just
as with that option, a key of ``""`` will apply the given pattern(s) to all
packages.  However, any files that match these patterns will be *excluded*
from installation, even if they were listed in ``package_data`` or were
included as a result of using ``include_package_data``.
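
The precedence described above (exclusion wins over both inclusion options) can be sketched as a simple filter. This is an illustration under assumptions: the helper ``apply_excludes`` and the file list are hypothetical, and the standard library ``fnmatch`` module stands in for whatever matching setuptools performs internally.

```python
from fnmatch import fnmatch


def apply_excludes(files, exclude_patterns):
    """Drop every file whose relative path matches one of the exclude patterns.

    Hypothetical helper for illustration only; not part of setuptools.
    """
    return [
        name
        for name in files
        if not any(fnmatch(name, pattern) for pattern in exclude_patterns)
    ]


# Hypothetical files already selected for one package by the inclusion options.
included = ["README.txt", "mypkg.txt", "data/somefile.dat"]
exclude_package_data = {"": ["README.txt"]}

installed = apply_excludes(included, exclude_package_data[""])
print(installed)
```

Here ``README.txt`` is dropped even though an inclusion option selected it, which mirrors the rule that excluded files are never installed.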

In summary, the three options allow you to:

``include_package_data``
    Accept all data files and directories matched by |MANIFEST.in|_ or added by
    a :ref:`plugin <Adding Support for Revision Control Systems>`.

``package_data``
    Specify additional patterns to match files that may or may
    not be matched by |MANIFEST.in|_ or added by
    a :ref:`plugin <Adding Support for Revision Control Systems>`.

``exclude_package_data``
    Specify patterns for data files and directories that should *not* be
    included when a package is installed, even if they would otherwise have
    been included due to the use of the preceding options.

NOTE: Due to the way the distutils build process works, a data file that you
include in your project and then stop including may be "orphaned" in your
project's build directories, requiring you to run ``setup.py clean --all`` to
fully remove it.  This may also be important for your users and contributors
if they track intermediate revisions of your project using Subversion; be sure
to let them know when you make changes that remove files from inclusion so they
can run ``setup.py clean --all``.


.. _Accessing Data Files at Runtime:

Accessing Data Files at Runtime
-------------------------------

Typically, existing programs manipulate a package's ``__file__`` attribute in
order to find the location of data files.  However, this manipulation isn't
compatible with PEP 302-based import hooks, including importing from zip files
and Python Eggs.  If you are using data files, it is strongly recommended that
you use :mod:`importlib.resources` to access them.
:mod:`importlib.resources` was added in Python 3.7, and the latest version of
the library is also available via the :pypi:`importlib-resources` backport.
See :doc:`importlib-resources:using` for detailed instructions [#importlib]_.
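
As a minimal sketch of this recommendation, the example below builds a throwaway package in a temporary directory and reads one of its data files through ``importlib.resources.files`` (available in the standard library since Python 3.9). The package name ``mypkg_demo`` and the file contents are assumptions made up for the demonstration; in a real project the package would already be installed.

```python
import importlib
import sys
import tempfile
from importlib.resources import files  # standard library, Python 3.9+
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a hypothetical package with a data file in a non-package subdirectory.
    pkg_dir = Path(tmp, "mypkg_demo")
    (pkg_dir / "data").mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("", encoding="utf-8")
    (pkg_dir / "data" / "somefile.dat").write_text(
        "hello from package data\n", encoding="utf-8"
    )

    # Make the temporary package importable for the duration of the demo.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Locate the resource relative to the package, not via __file__.
        resource = files("mypkg_demo") / "data" / "somefile.dat"
        content = resource.read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("mypkg_demo", None)

print(content)
```

The lookup goes through the import system rather than ``__file__``, so the same code is intended to keep working when the package is not stored as plain files on disk.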

.. tip:: Files inside the package directory should be *read-only* to avoid a
   series of common problems (e.g. when multiple users share a common Python
   installation, when the package is loaded from a zip file, or when multiple
   instances of a Python application run in parallel).

   If your Python package needs to write to a file for shared data or configuration,
   you can use standard platform/OS-specific system directories, such as
   ``~/.config/$appname`` or ``/usr/share/$appname/$version`` (Linux specific) [#system-dirs]_.
   A common approach is to add a read-only template file to the package
   directory that is then copied to the correct system directory if no
   pre-existing file is found.
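
The template approach from the tip can be sketched as follows. Everything here is an assumption for illustration: the helper ``ensure_user_config``, the file names, and the temporary directories that stand in for the packaged template and for the per-user directory (which a real project might obtain from a library such as ``platformdirs``).

```python
import shutil
import tempfile
from pathlib import Path


def ensure_user_config(template, config_dir, filename="settings.ini"):
    """Copy the template into config_dir unless a file already exists there.

    Hypothetical helper for illustration only.
    """
    target = Path(config_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # copyfile copies contents only, so the copy stays writable even if
        # the packaged template is read-only.
        shutil.copyfile(template, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a read-only template shipped inside the package.
    template = Path(tmp, "pkg", "settings.ini.template")
    template.parent.mkdir()
    template.write_text("[app]\ncolor = blue\n", encoding="utf-8")

    # Stand-in for a per-user configuration directory.
    config_dir = Path(tmp, "config", "myapp")

    first = ensure_user_config(template, config_dir)
    first.write_text("[app]\ncolor = red\n", encoding="utf-8")  # the user edits the copy
    second = ensure_user_config(template, config_dir)  # existing file is left alone
    final_text = second.read_text(encoding="utf-8")
    same_target = first == second

print(final_text)
```

The second call returns the existing file untouched, so user changes survive while the file inside the package is never written to.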


Non-Package Data Files
----------------------

Historically, ``setuptools`` by way of ``easy_install`` would encapsulate data
files from the distribution into the egg (see `the old docs
<https://github.com/pypa/setuptools/blob/52aacd5b276fedd6849c3a648a0014f5da563e93/docs/setuptools.txt#L970-L1001>`_). As eggs are deprecated and pip-based installs
fall back to the platform-specific location for installing data files, there is
no supported facility to reliably retrieve these resources.

Instead, the PyPA recommends that any data files you wish to be accessible at
run time be included **inside the package**.


----

.. [#datafiles] ``setuptools`` considers a *package data file* to be any
   non-Python file **inside the package directory** (i.e., one that co-exists
   in the same location as the regular ``.py`` files being distributed).

.. [#system-dirs] These locations can be discovered with the help of
   third-party libraries such as :pypi:`platformdirs`.

.. [#importlib] Recent versions of :mod:`importlib.resources` available in
   Python's standard library should be API compatible with
   :pypi:`importlib-resources`. However, this might vary depending on which
   version of Python is installed.


.. |MANIFEST.in| replace:: ``MANIFEST.in``
.. _MANIFEST.in: https://packaging.python.org/en/latest/guides/using-manifest-in/