==================== Data Files Support ==================== The distutils have traditionally allowed installation of "data files", which are placed in a platform-specific location. However, the most common use case for data files distributed with a package is for use *by* the package, usually by including the data files **inside the package directory**. Setuptools offers three ways to specify this most common type of data files to be included in your package's [#datafiles]_. First, you can simply use the ``include_package_data`` keyword, e.g.:: from setuptools import setup, find_packages setup( ... include_package_data=True ) This tells setuptools to install any data files it finds in your packages. The data files must be specified via the |MANIFEST.in|_ file. (They can also be tracked by a revision control system, using an appropriate plugin such as :pypi:`setuptools-scm` or :pypi:`setuptools-svn`. See the section below on :ref:`Adding Support for Revision Control Systems` for information on how to write such plugins.) If you want finer-grained control over what files are included (for example, if you have documentation files in your package directories and want to exclude them from installation), then you can also use the ``package_data`` keyword, e.g.:: from setuptools import setup, find_packages setup( ... package_data={ # If any package contains *.txt or *.rst files, include them: "": ["*.txt", "*.rst"], # And include any *.msg files found in the "hello" package, too: "hello": ["*.msg"], } ) The ``package_data`` argument is a dictionary that maps from package names to lists of glob patterns. The globs may include subdirectory names, if the data files are contained in a subdirectory of the package. For example, if the package tree looks like this:: setup.py src/ mypkg/ __init__.py mypkg.txt data/ somefile.dat otherdata.dat The setuptools setup file might look like this:: from setuptools import setup, find_packages setup( ... packages=find_packages("src"), # include all packages under src package_dir={"": "src"}, # tell distutils packages are under src package_data={ # If any package contains *.txt files, include them: "": ["*.txt"], # And include any *.dat files found in the "data" subdirectory # of the "mypkg" package, also: "mypkg": ["data/*.dat"], } ) Notice that if you list patterns in ``package_data`` under the empty string, these patterns are used to find files in every package, even ones that also have their own patterns listed. Thus, in the above example, the ``mypkg.txt`` file gets included even though it's not listed in the patterns for ``mypkg``. Also notice that if you use paths, you *must* use a forward slash (``/``) as the path separator, even if you are on Windows. Setuptools automatically converts slashes to appropriate platform-specific separators at build time. If datafiles are contained in a subdirectory of a package that isn't a package itself (no ``__init__.py``), then the subdirectory names (or ``*``) are required in the ``package_data`` argument (as shown above with ``"data/*.dat"``). When building an ``sdist``, the datafiles are also drawn from the ``package_name.egg-info/SOURCES.txt`` file, so make sure that this is removed if the ``setup.py`` ``package_data`` list is updated before calling ``setup.py``. .. note:: If using the ``include_package_data`` argument, files specified by ``package_data`` will *not* be automatically added to the manifest unless they are listed in the |MANIFEST.in|_ file or by a plugin like :pypi:`setuptools-scm` or :pypi:`setuptools-svn`. .. https://docs.python.org/3/distutils/setupscript.html#installing-package-data Sometimes, the ``include_package_data`` or ``package_data`` options alone aren't sufficient to precisely define what files you want included. For example, you may want to include package README files in your revision control system and source distributions, but exclude them from being installed. So, setuptools offers an ``exclude_package_data`` option as well, that allows you to do things like this:: from setuptools import setup, find_packages setup( ... packages=find_packages("src"), # include all packages under src package_dir={"": "src"}, # tell distutils packages are under src include_package_data=True, # include everything in source control # ...but exclude README.txt from all packages exclude_package_data={"": ["README.txt"]}, ) The ``exclude_package_data`` option is a dictionary mapping package names to lists of wildcard patterns, just like the ``package_data`` option. And, just as with that option, a key of ``""`` will apply the given pattern(s) to all packages. However, any files that match these patterns will be *excluded* from installation, even if they were listed in ``package_data`` or were included as a result of using ``include_package_data``. In summary, the three options allow you to: ``include_package_data`` Accept all data files and directories matched by |MANIFEST.in|_ or added by a :ref:`plugin `. ``package_data`` Specify additional patterns to match files that may or may not be matched by |MANIFEST.in|_ or added by a :ref:`plugin `. ``exclude_package_data`` Specify patterns for data files and directories that should *not* be included when a package is installed, even if they would otherwise have been included due to the use of the preceding options. NOTE: Due to the way the distutils build process works, a data file that you include in your project and then stop including may be "orphaned" in your project's build directories, requiring you to run ``setup.py clean --all`` to fully remove them. This may also be important for your users and contributors if they track intermediate revisions of your project using Subversion; be sure to let them know when you make changes that remove files from inclusion so they can run ``setup.py clean --all``. .. _Accessing Data Files at Runtime: Accessing Data Files at Runtime ------------------------------- Typically, existing programs manipulate a package's ``__file__`` attribute in order to find the location of data files. However, this manipulation isn't compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs. It is strongly recommended that, if you are using data files, you should use :mod:`importlib.resources` to access them. :mod:`importlib.resources` was added to Python 3.7 and the latest version of the library is also available via the :pypi:`importlib-resources` backport. See :doc:`importlib-resources:using` for detailed instructions [#importlib]_. .. tip:: Files inside the package directory should be *read-only* to avoid a series of common problems (e.g. when multiple users share a common Python installation, when the package is loaded from a zip file, or when multiple instances of a Python application run in parallel). If your Python package needs to write to a file for shared data or configuration, you can use standard platform/OS-specific system directories, such as ``~/.local/config/$appname`` or ``/usr/share/$appname/$version`` (Linux specific) [#system-dirs]_. A common approach is to add a read-only template file to the package directory that is then copied to the correct system directory if no pre-existing file is found. Non-Package Data Files ---------------------- Historically, ``setuptools`` by way of ``easy_install`` would encapsulate data files from the distribution into the egg (see `the old docs `_). As eggs are deprecated and pip-based installs fall back to the platform-specific location for installing data files, there is no supported facility to reliably retrieve these resources. Instead, the PyPA recommends that any data files you wish to be accessible at run time be included **inside the package**. ---- .. [#datafiles] ``setuptools`` consider a *package data file* any non-Python file **inside the package directory** (i.e., that co-exists in the same location as the regular ``.py`` files being distributed). .. [#system-dirs] These locations can be discovered with the help of third-party libraries such as :pypi:`platformdirs`. .. [#importlib] Recent versions of :mod:`importlib.resources` available in Pythons' standard library should be API compatible with :pypi:`importlib-metadata`. However this might vary depending on which version of Python is installed. .. |MANIFEST.in| replace:: ``MANIFEST.in`` .. _MANIFEST.in: https://packaging.python.org/en/latest/guides/using-manifest-in/