1.. _`package_discovery`:
2
========================================
Package Discovery and Namespace Packages
========================================
6
.. note::
    A full specification of the keywords supplied to ``setup.cfg`` or
    ``setup.py`` can be found at :doc:`keywords reference <keywords>`.
10
.. note::
    The examples provided here only demonstrate the functionality
    introduced. More metadata and option arguments need to be supplied
    if you want to replicate them on your system. If you are completely
    new to setuptools, the :doc:`quickstart section <quickstart>` is a good
    place to start.
17
``Setuptools`` provides powerful tools to handle package discovery, including
support for namespace packages.
20
Normally, you would specify the packages to be included manually in the following manner:
22
23.. tab:: setup.cfg
24
25    .. code-block:: ini
26
27        [options]
28        #...
29        packages =
30            mypkg1
31            mypkg2
32
33.. tab:: setup.py
34
35    .. code-block:: python
36
37        setup(
38            # ...
39            packages=['mypkg1', 'mypkg2']
40        )
41
42.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_
43
44    .. code-block:: toml
45
46        # ...
47        [tool.setuptools]
48        packages = ["mypkg1", "mypkg2"]
49        # ...
50
51
52If your packages are not in the root of the repository you also need to
53configure ``package_dir``:
54
55.. tab:: setup.cfg
56
57    .. code-block:: ini
58
59        [options]
60        # ...
61        package_dir =
62            = src
63            # directory containing all the packages (e.g.  src/mypkg1, src/mypkg2)
64        # OR
65        package_dir =
66            mypkg1 = lib1
67            # mypkg1.mod corresponds to lib1/mod.py
68            # mypkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
69            mypkg2 = lib2
70            # mypkg2.mod corresponds to lib2/mod.py
71            mypkg2.subpkg = lib3
72            # mypkg2.subpkg.mod corresponds to lib3/mod.py
73
74.. tab:: setup.py
75
76    .. code-block:: python
77
78        setup(
79            # ...
80            package_dir = {"": "src"}
81            # directory containing all the packages (e.g.  src/mypkg1, src/mypkg2)
82        )
83
84        # OR
85
        setup(
            # ...
            package_dir = {
                "mypkg1": "lib1",   # mypkg1.mod corresponds to lib1/mod.py
                                    # mypkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
                "mypkg2": "lib2",   # mypkg2.mod corresponds to lib2/mod.py
                "mypkg2.subpkg": "lib3",  # mypkg2.subpkg.mod corresponds to lib3/mod.py
                # ...
            }
        )
95
96.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_
97
98    .. code-block:: toml
99
100        [tool.setuptools]
101        # ...
102        package-dir = {"" = "src"}
103            # directory containing all the packages (e.g.  src/mypkg1, src/mypkg2)
104
105        # OR
106
107        [tool.setuptools.package-dir]
108        mypkg1 = "lib1"
109            # mypkg1.mod corresponds to lib1/mod.py
110            # mypkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
111        mypkg2 = "lib2"
112            # mypkg2.mod corresponds to lib2/mod.py
113        "mypkg2.subpkg" = "lib3"
114            # mypkg2.subpkg.mod corresponds to lib3/mod.py
115        # ...
116
This can get tiresome really quickly. To speed things up, you can rely on
setuptools' automatic discovery, or use the provided tools, as explained in
the following sections.
120
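To make the ``package_dir`` mappings shown above more concrete, here is a
minimal sketch of how a dotted package name could be resolved to a directory.
The helper ``package_to_dir`` is hypothetical (it is not part of the
setuptools API) and only approximates the lookup: the longest matching dotted
prefix wins, and the empty key ``""`` acts as the fallback root.

```python
from pathlib import PurePosixPath


def package_to_dir(package: str, package_dir: dict) -> str:
    """Return the directory holding ``package`` under a ``package_dir`` mapping.

    Illustrative only: this is an assumption about the lookup order,
    not setuptools' actual implementation.
    """
    parts = package.split(".")
    tail = []
    # Try the longest dotted prefix first: "a.b.c", then "a.b", then "a".
    while parts:
        prefix = ".".join(parts)
        if prefix in package_dir:
            return str(PurePosixPath(package_dir[prefix], *tail))
        tail.insert(0, parts.pop())
    # No explicit entry matched: fall back to the root entry (key ""), if any.
    return str(PurePosixPath(package_dir.get("", ""), *tail))


mapping = {"mypkg1": "lib1", "mypkg2": "lib2", "mypkg2.subpkg": "lib3"}
print(package_to_dir("mypkg1.subpkg", mapping))  # lib1/subpkg
print(package_to_dir("mypkg2.subpkg", mapping))  # lib3
print(package_to_dir("mypkg1", {"": "src"}))     # src/mypkg1
```

The results agree with the comments in the configuration examples:
``mypkg1.subpkg`` lives in ``lib1/subpkg``, while ``mypkg2.subpkg`` is
redirected to ``lib3``.
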
121
122.. _auto-discovery:
123
124Automatic discovery
125===================
126
127.. warning:: Automatic discovery is an **experimental** feature and might change
128   (or be completely removed) in the future.
129   See :ref:`custom-discovery` for a stable way of configuring ``setuptools``.
130
By default, ``setuptools`` will consider two popular project layouts, each one with
its own set of advantages and disadvantages [#layout1]_ [#layout2]_, as
discussed in the following sections.
134
135Setuptools will automatically scan your project directory looking for these
136layouts and try to guess the correct values for the :ref:`packages <declarative
137config>` and :doc:`py_modules </references/keywords>` configuration.
138
139.. important::
140   Automatic discovery will **only** be enabled if you **don't** provide any
141   configuration for ``packages`` and ``py_modules``.
142   If at least one of them is explicitly set, automatic discovery will not take place.
143
   **Note**: specifying ``ext_modules`` might also prevent auto-discovery from
   taking place, unless you opt into :doc:`pyproject_config` (which will
   disable the backward compatible behaviour).
147
148.. _src-layout:
149
150src-layout
151----------
152The project should contain a ``src`` directory under the project root and
153all modules and packages meant for distribution are placed inside this
154directory::
155
156    project_root_directory
157    ├── pyproject.toml
158    ├── setup.cfg  # or setup.py
159    ├── ...
160    └── src/
161        └── mypkg/
162            ├── __init__.py
163            ├── ...
164            └── mymodule.py
165
166This layout is very handy when you wish to use automatic discovery,
167since you don't have to worry about other Python files or folders in your
project root being distributed by mistake. In some circumstances it can
also be less error-prone for testing or when using :pep:`420`-style packages.
On the other hand, you cannot rely on the implicit ``PYTHONPATH=.`` to fire
171up the Python REPL and play with your package (you will need an
172`editable install`_ to be able to do that).
173
174.. _flat-layout:
175
176flat-layout
177-----------
178*(also known as "adhoc")*
179
180The package folder(s) are placed directly under the project root::
181
182    project_root_directory
183    ├── pyproject.toml
184    ├── setup.cfg  # or setup.py
185    ├── ...
186    └── mypkg/
187        ├── __init__.py
188        ├── ...
189        └── mymodule.py
190
This layout is very practical for using the REPL, but in some situations
it can be more error-prone (e.g. during tests or if you have a bunch
of folders or Python files hanging around your project root).
194
195To avoid confusion, file and folder names that are used by popular tools (or
196that correspond to well-known conventions, such as distributing documentation
197alongside the project code) are automatically filtered out in the case of
198*flat-layout*:
199
200.. autoattribute:: setuptools.discovery.FlatLayoutPackageFinder.DEFAULT_EXCLUDE
201
202.. autoattribute:: setuptools.discovery.FlatLayoutModuleFinder.DEFAULT_EXCLUDE
203
204.. warning::
205   If you are using auto-discovery with *flat-layout*, ``setuptools`` will
206   refuse to create :term:`distribution archives <Distribution Package>` with
207   multiple top-level packages or modules.
208
209   This is done to prevent common errors such as accidentally publishing code
210   not meant for distribution (e.g. maintenance-related scripts).
211
212   Users that purposefully want to create multi-package distributions are
213   advised to use :ref:`custom-discovery` or the ``src-layout``.
214
215There is also a handy variation of the *flat-layout* for utilities/libraries
216that can be implemented with a single Python file:
217
218single-module distribution
219^^^^^^^^^^^^^^^^^^^^^^^^^^
220
221A standalone module is placed directly under the project root, instead of
222inside a package folder::
223
224    project_root_directory
225    ├── pyproject.toml
226    ├── setup.cfg  # or setup.py
227    ├── ...
228    └── single_file_lib.py
229
230
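As a rough illustration of the three layouts described in this section, the
sketch below guesses which one a directory follows. The function
``guess_layout`` is hypothetical and deliberately simplistic: it ignores the
exclusion lists mentioned above and every other check that setuptools
performs, so it should be read as a way to visualise the layouts, not as a
description of the real discovery code.

```python
import tempfile
from pathlib import Path
from typing import Optional


def guess_layout(root: Path) -> Optional[str]:
    """Very rough guess at the project layout found under ``root``."""
    if (root / "src").is_dir():
        return "src-layout"
    # Any top-level folder holding an __init__.py counts as a package here.
    if any((d / "__init__.py").is_file() for d in root.iterdir() if d.is_dir()):
        return "flat-layout"
    # Otherwise look for a lone top-level module (setup.py is skipped).
    modules = [p for p in root.glob("*.py") if p.name != "setup.py"]
    if len(modules) == 1:
        return "single-module"
    return None


with tempfile.TemporaryDirectory() as tmp:
    src_root = Path(tmp, "with_src")
    (src_root / "src" / "mypkg").mkdir(parents=True)

    flat_root = Path(tmp, "flat")
    (flat_root / "mypkg").mkdir(parents=True)
    (flat_root / "mypkg" / "__init__.py").touch()

    single_root = Path(tmp, "single")
    single_root.mkdir()
    (single_root / "single_file_lib.py").touch()

    results = [guess_layout(r) for r in (src_root, flat_root, single_root)]

print(results)  # ['src-layout', 'flat-layout', 'single-module']
```
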
231.. _custom-discovery:
232
233Custom discovery
234================
235
236If the automatic discovery does not work for you
237(e.g., you want to *include* in the distribution top-level packages with
238reserved names such as ``tasks``, ``example`` or ``docs``, or you want to
239*exclude* nested packages that would be otherwise included), you can use
240the provided tools for package discovery:
241
242.. tab:: setup.cfg
243
244    .. code-block:: ini
245
246        [options]
247        packages = find:
        # OR
249        packages = find_namespace:
250
251.. tab:: setup.py
252
253    .. code-block:: python
254
255        from setuptools import find_packages
256        # or
257        from setuptools import find_namespace_packages
258
259.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_
260
261    .. code-block:: toml
262
263        # ...
264        [tool.setuptools.packages]
265        find = {}  # Scanning implicit namespaces is active by default
266        # OR
        find = {namespaces = false}  # Disable implicit namespaces
268
269
270Finding simple packages
271-----------------------
Let's start with the first tool. ``find:`` (``find_packages()``) takes a source
directory and two lists of package name patterns to exclude and include, and
then returns a list of ``str`` representing the packages it could find. To use
it, consider the following directory::
276
277    mypkg
278    ├── setup.cfg  # and/or setup.py, pyproject.toml
279    └── src
280        ├── pkg1
281        │   └── __init__.py
282        ├── pkg2
283        │   └── __init__.py
        ├── additional
285        │   └── __init__.py
286        └── pkg
287            └── namespace
288                └── __init__.py
289
To have setuptools automatically include the packages found
in ``src`` whose names start with ``pkg``, and not ``additional``:
292
293.. tab:: setup.cfg
294
295    .. code-block:: ini
296
297        [options]
298        packages = find:
299        package_dir =
300            =src
301
302        [options.packages.find]
303        where = src
304        include = pkg*
305        exclude = additional
306
307    .. note::
308        ``pkg`` does not contain an ``__init__.py`` file, therefore
309        ``pkg.namespace`` is ignored by ``find:`` (see ``find_namespace:`` below).
310
311.. tab:: setup.py
312
313    .. code-block:: python
314
315        setup(
316            # ...
317            packages=find_packages(
318                where='src',
319                include=['pkg*'],
320                exclude=['additional'],
321            ),
322            package_dir={"": "src"}
323            # ...
324        )
325
326
327    .. note::
328        ``pkg`` does not contain an ``__init__.py`` file, therefore
329        ``pkg.namespace`` is ignored by ``find_packages()``
330        (see ``find_namespace_packages()`` below).
331
332.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_
333
334    .. code-block:: toml
335
336        [tool.setuptools.packages.find]
337        where = ["src"]
338        include = ["pkg*"]
339        exclude = ["additional"]
340        namespaces = false
341
342    .. note::
343        When using ``tool.setuptools.packages.find`` in ``pyproject.toml``,
344        setuptools will consider :pep:`implicit namespaces <420>` by default when
345        scanning your project directory.
        To prevent ``pkg.namespace`` from being added to your package list
        you can set ``namespaces = false``. This will prevent any folder
        without an ``__init__.py`` file from being scanned.
349
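The behaviour described in the tabs above can be approximated with the
standard library alone. The sketch below is a simplified stand-in, not the
setuptools implementation: ``find_packages_sketch`` and its ``namespaces``
parameter are hypothetical names, and the example tree from this section is
recreated in a temporary directory so the snippet can run anywhere.

```python
import os
import tempfile
from fnmatch import fnmatchcase


def find_packages_sketch(where, include=("*",), exclude=(), namespaces=False):
    """Simplified stand-in for package discovery (not setuptools' real code)."""
    found = []
    for dirpath, dirnames, _files in os.walk(where):
        keep = []
        for name in sorted(dirnames):
            full = os.path.join(dirpath, name)
            has_init = os.path.isfile(os.path.join(full, "__init__.py"))
            # Regular discovery only follows folders that contain __init__.py.
            if "." in name or not (namespaces or has_init):
                continue
            package = os.path.relpath(full, where).replace(os.sep, ".")
            included = any(fnmatchcase(package, pat) for pat in include)
            excluded = any(fnmatchcase(package, pat) for pat in exclude)
            if included and not excluded:
                found.append(package)
            keep.append(name)  # keep scanning for subpackages either way
        dirnames[:] = keep
    return sorted(found)


# Recreate the example tree from this section in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    for pkg in ("pkg1", "pkg2", "additional", os.path.join("pkg", "namespace")):
        os.makedirs(os.path.join(src, pkg))
        open(os.path.join(src, pkg, "__init__.py"), "w").close()

    simple = find_packages_sketch(src, include=["pkg*"], exclude=["additional"])
    with_ns = find_packages_sketch(
        src, include=["pkg*"], exclude=["additional"], namespaces=True
    )

print(simple)   # ['pkg1', 'pkg2']
print(with_ns)  # ['pkg', 'pkg.namespace', 'pkg1', 'pkg2']
```

With ``namespaces=False`` the folder ``pkg`` is skipped because it has no
``__init__.py``, so ``pkg.namespace`` is never reached. With
``namespaces=True`` both ``pkg`` and ``pkg.namespace`` are reported.
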
350.. important::
351   ``include`` and ``exclude`` accept strings representing :mod:`glob` patterns.
   These patterns should match the **full** name of the Python module (as if it
   were written in an ``import`` statement).

   For example, the pattern ``util`` will match
   ``util/__init__.py`` but not ``util/files/__init__.py``.

   Whether the parent package is matched by the pattern does not determine
   whether the submodule is included in or excluded from the distribution.
   You will need to explicitly add a wildcard (e.g. ``util*``)
   if you want the pattern to also match submodules.
362
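The pattern rules can be tried out directly with :mod:`fnmatch`. This is only
a sketch that assumes matching is done against the dotted package name with
``fnmatch``-style semantics; setuptools' own matcher may differ in detail, and
the package names below are made up for the example.

```python
from fnmatch import fnmatchcase

names = ["util", "util.files", "utilities"]

exact = [n for n in names if fnmatchcase(n, "util")]
wildcard = [n for n in names if fnmatchcase(n, "util*")]
children = [n for n in names if fnmatchcase(n, "util.*")]

print(exact)     # ['util']
print(wildcard)  # ['util', 'util.files', 'utilities']
print(children)  # ['util.files']
```

Note that ``util*`` also matches the unrelated name ``utilities``. A pair of
patterns such as ``util`` and ``util.*`` is more selective when only one
package and its subpackages are meant.
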
363.. _Namespace Packages:
364
365Finding namespace packages
366--------------------------
``setuptools`` provides ``find_namespace:`` (``find_namespace_packages()``),
which behaves similarly to ``find:`` but works with namespace packages.
369
370Before diving in, it is important to have a good understanding of what
371:pep:`namespace packages <420>` are. Here is a quick recap.
372
Suppose you have two packages organized as follows:
374
375.. code-block:: bash
376
377    /Users/Desktop/timmins/foo/__init__.py
378    /Library/timmins/bar/__init__.py
379
380If both ``Desktop`` and ``Library`` are on your ``PYTHONPATH``, then a
381namespace package called ``timmins`` will be created automatically for you when
382you invoke the import mechanism, allowing you to accomplish the following:
383
384.. code-block:: pycon
385
386    >>> import timmins.foo
387    >>> import timmins.bar
388
as if there were only one ``timmins`` on your system. The two packages can then
390be distributed separately and installed individually without affecting the
391other one.
392
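This behaviour can be reproduced with the standard library alone. The sketch
below creates throwaway ``Desktop`` and ``Library`` folders inside a temporary
directory (stand-ins for the paths above) and extends ``sys.path`` at runtime,
which is assumed here to be equivalent to listing them in ``PYTHONPATH``. The
``NAME`` attribute is invented purely to tell the two portions apart.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Two unrelated locations, each holding one portion of "timmins".
    for location, portion in (("Desktop", "foo"), ("Library", "bar")):
        pkg_dir = os.path.join(tmp, location, "timmins", portion)
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write(f"NAME = {portion!r}\n")
        # Same effect as adding the location to PYTHONPATH.
        sys.path.insert(0, os.path.join(tmp, location))

    importlib.invalidate_caches()
    import timmins.foo
    import timmins.bar

    # The namespace package collects one path entry per location.
    portions = len(list(timmins.__path__))

print(timmins.foo.NAME, timmins.bar.NAME, portions)  # foo bar 2
```

Neither ``timmins`` folder contains an ``__init__.py``, yet both
subpackages are importable under the same top-level name.
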
393Now, suppose you decide to package the ``foo`` part for distribution and start
394by creating a project directory organized as follows::
395
396   foo
397   ├── setup.cfg  # and/or setup.py, pyproject.toml
398   └── src
399       └── timmins
400           └── foo
401               └── __init__.py
402
If you want ``timmins.foo`` to be automatically included in the
404distribution, then you will need to specify:
405
406.. tab:: setup.cfg
407
408    .. code-block:: ini
409
410        [options]
411        package_dir =
412            =src
413        packages = find_namespace:
414
415        [options.packages.find]
416        where = src
417
    ``find:`` won't work because ``timmins`` doesn't contain ``__init__.py``
    directly; instead, you have to use ``find_namespace:``.

    You can think of ``find_namespace:`` as identical to ``find:`` except that it
    counts a directory as a package even if it doesn't contain an ``__init__.py``
    file directly.
424
425.. tab:: setup.py
426
427    .. code-block:: python
428
429        setup(
430            # ...
431            packages=find_namespace_packages(where='src'),
432            package_dir={"": "src"}
433            # ...
434        )
435
    When you use ``find_packages()``, all directories without an
    ``__init__.py`` file will be ignored.
438    On the other hand, ``find_namespace_packages()`` will scan all
439    directories.
440
441.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_
442
443    .. code-block:: toml
444
445        [tool.setuptools.packages.find]
446        where = ["src"]
447
448    When using ``tool.setuptools.packages.find`` in ``pyproject.toml``,
449    setuptools will consider :pep:`implicit namespaces <420>` by default when
450    scanning your project directory.
451
452After installing the package distribution, ``timmins.foo`` would become
453available to your interpreter.
454
455.. warning::
   Please keep in mind that ``find_namespace:`` (setup.cfg),
457   ``find_namespace_packages()`` (setup.py) and ``find`` (pyproject.toml) will
458   scan **all** folders that you have in your project directory if you use a
459   :ref:`flat-layout`.
460
461   If used naïvely, this might result in unwanted files being added to your
462   final wheel. For example, with a project directory organized as follows::
463
464       foo
465       ├── docs
466       │   └── conf.py
467       ├── timmins
468       │   └── foo
469       │       └── __init__.py
470       └── tests
471           └── tests_foo
472               └── __init__.py
473
   end users will end up installing not only ``timmins.foo``, but also
   ``docs`` and ``tests.tests_foo``.

   A simple way to fix this is to adopt the aforementioned :ref:`src-layout`,
   or to configure ``include`` and/or ``exclude`` accordingly.
480
481.. tip::
   After :ref:`building your package <building>`, you can check whether all
   the files are correct (nothing missing or extra) by running the following
   commands:
485
486   .. code-block:: bash
487
488      tar tf dist/*.tar.gz
489      unzip -l dist/*.whl
490
   This requires ``tar`` and ``unzip`` to be installed in your OS.
492   On Windows you can also use a GUI program such as 7zip_.
493
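If neither command is available, the same check can be sketched in Python
with the standard :mod:`tarfile` and :mod:`zipfile` modules. The helper
``list_archive`` and the demo wheel name below are hypothetical; in a real
project you would call the helper on the files found in ``dist/``.

```python
import tarfile
import tempfile
import zipfile
from pathlib import Path


def list_archive(path):
    """Return the member names of a ``.whl``/``.zip`` or tar archive."""
    path = Path(path)
    if path.suffix in {".whl", ".zip"}:  # wheels are plain zip files
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    with tarfile.open(path) as archive:  # compression is detected automatically
        return archive.getnames()


# Demo on a throwaway archive that mimics a (very incomplete) wheel.
with tempfile.TemporaryDirectory() as tmp:
    wheel = Path(tmp) / "demo-0.1-py3-none-any.whl"
    with zipfile.ZipFile(wheel, "w") as archive:
        archive.writestr("mypkg/__init__.py", "")
    members = list_archive(wheel)

print(members)  # ['mypkg/__init__.py']
```
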
494
495Legacy Namespace Packages
496=========================
The ease with which you can create namespace packages above is thanks
to `PEP 420 <https://www.python.org/dev/peps/pep-0420/>`_. It used to be more
cumbersome to accomplish the same result. Historically, there were two methods
to create namespace packages. One is the ``pkg_resources`` style supported by
``setuptools`` and the other is the ``pkgutil`` style offered by the
``pkgutil`` module in Python. Both are now considered deprecated despite the
fact they still linger in many existing packages. These two differ in many
subtle yet significant aspects and you can find out more in the `Python packaging
user guide <https://packaging.python.org/guides/packaging-namespace-packages/>`_.
506
507
``pkg_resources`` style namespace package
-----------------------------------------
510This is the method ``setuptools`` directly supports. Starting with the same
511layout, there are two pieces you need to add to it. First, an ``__init__.py``
512file directly under your namespace package directory that contains the
513following:
514
515.. code-block:: python
516
517    __import__("pkg_resources").declare_namespace(__name__)
518
519And the ``namespace_packages`` keyword in your ``setup.cfg`` or ``setup.py``:
520
521.. tab:: setup.cfg
522
523    .. code-block:: ini
524
525        [options]
526        namespace_packages = timmins
527
528.. tab:: setup.py
529
530    .. code-block:: python
531
532        setup(
533            # ...
534            namespace_packages=['timmins']
535        )
536
And your directory should look like this:
538
539.. code-block:: bash
540
541   foo
542   ├── setup.cfg  # and/or setup.py, pyproject.toml
543   └── src
544       └── timmins
545           ├── __init__.py
546           └── foo
547               └── __init__.py
548
549Repeat the same for other packages and you can achieve the same result as
550the previous section.
551
552``pkgutil`` style namespace package
553-----------------------------------
This method is almost identical to the ``pkg_resources`` style, except that the
555``namespace_packages`` declaration is omitted and the ``__init__.py``
556file contains the following:
557
558.. code-block:: python
559
560    __path__ = __import__('pkgutil').extend_path(__path__, __name__)
561
The project layout remains the same, and the rest of ``setup.cfg`` is unchanged.
563
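To see the ``pkgutil`` style at work without building anything, the sketch
below writes two small source trees into a temporary directory and puts both
on ``sys.path``. The project folder names ``proj_foo`` and ``proj_bar`` and
the ``NAME`` attribute are invented for the example; only the ``__init__.py``
line comes from the text above.

```python
import importlib
import os
import sys
import tempfile

INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"

with tempfile.TemporaryDirectory() as tmp:
    for project, portion in (("proj_foo", "foo"), ("proj_bar", "bar")):
        namespace_dir = os.path.join(tmp, project, "src", "timmins")
        os.makedirs(os.path.join(namespace_dir, portion))
        # Every distribution ships the same namespace __init__.py.
        with open(os.path.join(namespace_dir, "__init__.py"), "w") as f:
            f.write(INIT)
        with open(os.path.join(namespace_dir, portion, "__init__.py"), "w") as f:
            f.write(f"NAME = {portion!r}\n")
        sys.path.append(os.path.join(tmp, project, "src"))

    importlib.invalidate_caches()
    import timmins.foo
    import timmins.bar

    names = (timmins.foo.NAME, timmins.bar.NAME)

print(names)  # ('foo', 'bar')
```

Unlike the :pep:`420` approach, each project here carries its own copy of
``timmins/__init__.py``, and ``extend_path`` merges the matching directories
found on ``sys.path`` into a single ``__path__``.
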
564
565----
566
567
568.. [#experimental]
569   Support for specifying package metadata and build configuration options via
570   ``pyproject.toml`` is experimental and might change (or be completely
571   removed) in the future. See :doc:`/userguide/pyproject_config`.
572.. [#layout1] https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
573.. [#layout2] https://blog.ionelmc.ro/2017/09/25/rehashing-the-src-layout/
574
575.. _editable install: https://pip.pypa.io/en/stable/cli/pip_install/#editable-installs
576.. _7zip: https://www.7-zip.org
577