Python: Packaging and Build

Python is renowned for its readability and simplicity, but as your projects grow, maintaining code organization and managing dependencies can become complex.

In this blog post, we’ll explore best practices for organizing Python code, managing dependencies, and packaging your projects efficiently using Poetry.

Python Code Organization

Proper code organization keeps a codebase clean and easy to maintain. Here are some guidelines to consider:

Directory Structure

A well-structured directory layout can make your project more accessible and comprehensible. A common Python project structure might look like this:

my_project/
 - my_package/
   - module1/
     - file1.py
     - ...
     - test/
       - file1_test.py
       - ...
   - module2/
     - file1.py
     - file2.py
     - ...
     - test/
       - file1_test.py
       - ...
 - README.md
 - pyproject.toml
 - LICENSE
  • my_package: This directory is your top-level Python package. Organize your code logically into subpackages and modules inside it.
  • my_package/module1: A subpackage of my_package that groups related files (file1.py and so on).
  • my_package/module1/test: This directory holds the unit tests for the files in module1. For every module in your package, create a corresponding test folder and test files.
  • README.md: Write a clear and informative README file to provide an overview of your project, usage instructions, and any other relevant information.
  • pyproject.toml: Project configuration file.
  • LICENSE: Add a relevant license.
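
As a rough sketch, the layout above can be scaffolded with a few lines of Python. The `__init__.py` files are an assumption added here (the tree above does not list them) so that my_package and its subfolders import as regular packages, and the `scaffold` helper name is hypothetical.

```python
from pathlib import Path
import tempfile

# Files of the example layout, relative to the project root.
# The __init__.py entries are an assumption; the tree above does not list them.
LAYOUT = [
    "my_package/__init__.py",
    "my_package/module1/__init__.py",
    "my_package/module1/file1.py",
    "my_package/module1/test/file1_test.py",
    "my_package/module2/__init__.py",
    "my_package/module2/file1.py",
    "my_package/module2/file2.py",
    "my_package/module2/test/file1_test.py",
    "README.md",
    "pyproject.toml",
    "LICENSE",
]


def scaffold(root: Path) -> list[Path]:
    """Create empty placeholder files for the example layout under root."""
    created = []
    for relative in LAYOUT:
        path = root / relative
        # Create intermediate folders such as my_package/module1/test.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
        created.append(path)
    return created


if __name__ == "__main__":
    # A temporary directory keeps the demo from touching real projects.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(Path(tmp) / "my_project"):
            print(path.relative_to(tmp))
```

In practice, poetry new (covered later in this post) generates a similar skeleton for you, so a script like this is mostly useful for seeing the structure at a glance.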

Modularization

Split your code into modules, each focusing on a specific functionality. Avoid writing long monolithic files. For example, if you’re building a web application, separate your code into modules for routing, authentication, database operations, and so on. If you are building an AI application, separate your code into modules for training, inference, deployment, modeling, and so on.

Docstrings and Comments

Use docstrings to document your functions, classes, and modules. This makes your code more understandable for others (and your future self). Additionally, use comments sparingly but effectively to explain complex logic or important decisions.
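
To make this concrete, here is a small sketch of a documented function. The `moving_average` function is a hypothetical example, and the Args/Returns/Raises layout is one common docstring convention chosen here as an assumption, not the only valid style.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Return the simple moving average of values.

    Args:
        values: Numbers to average, in order.
        window: Number of consecutive items per average; must be >= 1.

    Returns:
        One average per full window, so the result has
        len(values) - window + 1 items (empty if values is too short).

    Raises:
        ValueError: If window is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # A running sum avoids re-adding the whole window at every step.
    averages = []
    running = sum(values[:window])
    for end in range(window, len(values) + 1):
        averages.append(running / window)
        if end < len(values):
            # Slide the window: add the entering value, drop the leaving one.
            running += values[end] - values[end - window]
    return averages
```

The docstring says what the function does and what it promises, while the two comments explain why the loop is written the way it is.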

Virtual Environments

Use virtual environments to isolate your project’s dependencies from the system-wide Python installation. Poetry, by default, creates and manages a virtual environment for each project, making this process seamless.

Conda for virtual environments

You can also use conda to create and manage a virtual environment for your projects.

conda create --name myenv python=3.10

The above command creates a virtual environment named myenv with Python 3.10 installed.
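
When several environments exist side by side, it helps to check which one the interpreter is actually running in. The sketch below uses only the standard library; `describe_environment` is a hypothetical helper name, and the conda check assumes the environment was activated in the usual way, which sets the CONDA_DEFAULT_ENV variable.

```python
import os
import sys


def describe_environment() -> dict:
    """Report which Python environment the current interpreter belongs to."""
    return {
        # Path of the interpreter that is running this script.
        "executable": sys.executable,
        # Root directory of the active environment.
        "prefix": sys.prefix,
        # In venv-style environments sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # conda sets CONDA_DEFAULT_ENV when an environment is activated.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```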

Poetry

Now that we’ve covered code organization, let’s delve into dependency management and packaging your Python project with Poetry.

Whether you are a seasoned developer or just starting your coding journey, setting up and managing Python projects can be a daunting task. Fortunately, there are tools available to streamline this process, and one of the most popular ones is Poetry.

What is Poetry

Poetry is a dependency management and packaging tool for Python that aims to simplify and standardize project setup and distribution. It was created to address some of the pain points associated with Python’s traditional methods for managing dependencies, such as pip and requirements.txt files. Poetry provides a more modern and user-friendly approach to Python project management.

Getting Started with Poetry

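To get started with Poetry, you’ll need to install it first. You can do this using pip, although Poetry’s documentation recommends an isolated installation (for example with pipx) so that Poetry’s own dependencies stay separate from your project’s: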

pip install poetry

Once Poetry is installed, you can create a new Python project or convert an existing one to use Poetry for dependency management.

Creating a New Python Project

To create a new Python project with Poetry, navigate to the directory where you want to create your project and run:

poetry new my_project

This command will generate a new project structure with a basic Python package and a pyproject.toml file. The pyproject.toml file is where you’ll define your project’s dependencies and other configuration settings.

Migrating dependency manager

If you’re transitioning from another dependency manager like pip and requirements.txt, Poetry can help. You can convert an existing project to use Poetry with the poetry init command:

poetry init

Follow the prompts to generate a pyproject.toml file and add your dependencies.

Adding Dependencies

With Poetry, adding dependencies to your project is a breeze. You can use the add command to specify the package you want to add:

poetry add <package-name>==<version>

# for example
poetry add numpy==1.25.0

Poetry will automatically manage the versions of the packages you add. The above command does the following:

  1. Install the package in your environment if it doesn’t exist
  2. Add and update the pyproject.toml file accordingly.
  3. Update the poetry.lock file

This helps ensure that your project always uses compatible packages.
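
After adding a package, a quick way to confirm what actually landed in the environment is to ask the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper, and numpy is used only because it appears in the example above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "numpy" mirrors the example above; any distribution name works.
    print("numpy:", installed_version("numpy"))
```

Run it inside the project environment (for example with poetry run python) so that it inspects the same packages Poetry installed.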

Using dependency groups

Poetry provides a way to organize your dependencies by groups. For instance, you might have dependencies only needed to test your project or build the documentation.

To declare a new dependency group, use a tool.poetry.group.<group> section in pyproject.toml, where <group> is the name of your dependency group (for instance, test), and list its packages under tool.poetry.group.<group>.dependencies.

Poetry writes this section for you when you add a dependency to that group:

poetry add <package-name>==<version> -G <group>

# for example
poetry add pytest -G test

Managing Virtual Environments

Poetry also simplifies virtual environment management. When you create a new project or add dependencies, Poetry automatically creates a virtual environment for the project if the current session doesn’t have one. If an environment is already active, such as a conda environment, Poetry uses that one instead. This isolates your project’s dependencies from the global Python environment, preventing conflicts and ensuring version compatibility.

The following is not applicable when you use a conda environment.

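To activate the Poetry-created virtual environment for your project, you can use poetry shell. In Poetry 2.x this command is provided by the poetry-plugin-shell plugin, and poetry env activate is the built-in alternative: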

poetry shell

Lockfile and Dependency Resolution

One of Poetry’s standout features is its use of a lockfile (poetry.lock) to lock the exact versions of your project’s dependencies. This ensures that everyone working on your project uses the same package versions, eliminating the “it works on my machine” problem. To generate or update the lockfile, use the following command:

poetry lock

Dependency Constraints

In more complex projects, you might need to specify constraints on your dependencies. Poetry allows you to set constraints on package versions in the pyproject.toml file. For example:

[tool.poetry.dependencies]

package-name = { version = "<version constraint>", python = ">=3.8,<4.0" }

This tells Poetry that package-name, at a version matching the given constraint, applies only to Python versions from 3.8 up to, but not including, 4.0.
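
The range ">=3.8,<4.0" is a half-open interval: the lower bound is included and the upper bound is not. A minimal sketch of that check in plain Python, using the hypothetical helper `python_in_range` and comparing only major and minor version numbers:

```python
import sys


def python_in_range(version: tuple, lower: tuple = (3, 8), upper: tuple = (4, 0)) -> bool:
    """Return True if lower <= version < upper, mirroring ">=3.8,<4.0"."""
    # Only (major, minor) is compared, so (3, 8, 1) counts as 3.8.
    return lower <= tuple(version[:2]) < upper


if __name__ == "__main__":
    print("current interpreter in range:", python_in_range(sys.version_info))
```

Poetry's own resolver handles full version specifiers; this sketch only illustrates how the two bounds behave.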

Custom Dependency Sources

By default, Poetry fetches packages from the Python Package Index (PyPI). However, you might need to use packages from private repositories or custom sources. Poetry supports this through the pyproject.toml file. You can specify custom sources like this:

[[tool.poetry.source]]

name = "my-repo"
url = "https://my-private-repo.com/pypi/simple"

Now, when you declare dependencies, you can specify the source:

[tool.poetry.dependencies]

my-package = { version = "*", source = "my-repo" }

Dependency Overrides

In some cases, you may need to use a modified or patched version of a dependency. Poetry allows you to override specific package versions with local or Git-based paths:

[tool.poetry.dependencies]

requests = { path = "/path/to/custom/requests" }

This is incredibly useful when working on projects that depend on third-party packages with your own modifications.

Installing Dependencies

After you’ve defined your dependencies in the pyproject.toml file, you can use the poetry install command to install them:

poetry install

Poetry will create a virtual environment for your project if it doesn’t already exist and install the specified dependencies into that environment.

To install an optional dependency group as well, pass it with --with:

poetry install --with test

Poetry Plugins

Poetry’s plugin system is a powerful feature that can extend its functionality to meet your project’s unique requirements. Plugins can automate tasks, add custom commands, and even integrate with other tools. To install a plugin, use poetry self add:

poetry self add my-plugin

You can then configure and use the plugin according to its documentation.

Target Environment Management

Poetry allows you to specify environment markers in your pyproject.toml. These markers define dependencies based on the target environment. For example, you can specify dependencies that are only required for specific Python versions or operating systems:

[tool.poetry.dependencies]

# for example, installing the `requests` package only on POSIX systems
requests = { version = "*", python = ">=3.8,<=3.11", markers = "os_name == 'posix'" }

This fine-grained control helps ensure that your project remains compatible with various environments.
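
Environment markers are evaluated against properties of the interpreter that performs the installation. The sketch below prints a few of those values for the current machine, following the definitions in PEP 508; `marker_environment` is a hypothetical helper name and covers only a subset of the available markers.

```python
import os
import platform
import sys


def marker_environment() -> dict[str, str]:
    """Collect a few of the values that environment markers compare against."""
    return {
        "os_name": os.name,                    # e.g. "posix" or "nt"
        "sys_platform": sys.platform,          # e.g. "linux", "darwin", "win32"
        "platform_system": platform.system(),  # e.g. "Linux", "Darwin", "Windows"
        # python_version is defined as the first two version components.
        "python_version": ".".join(platform.python_version_tuple()[:2]),
    }


if __name__ == "__main__":
    for name, value in marker_environment().items():
        print(f"{name} = {value!r}")
```

A dependency restricted to POSIX systems is installed only where os_name is "posix", which is the case on Linux and macOS but not on Windows.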

Python Executables

If your project is not just a library but also has Python applications or scripts, Poetry provides an excellent way to specify executable scripts. In the pyproject.toml file, you can define entry points:

[tool.poetry.scripts]

my-script = "my_package:main_function"

After running poetry install, you can execute your script using poetry run:

poetry run my-script
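
The entry point string has the form "module:function". As a minimal sketch, assuming the function lives in my_package/__init__.py, the target of the entry point above could look like this (the greeting logic is purely illustrative):

```python
# Hypothetical contents of my_package/__init__.py for the entry point above.
import sys
from typing import Optional, Sequence


def main_function(argv: Optional[Sequence[str]] = None) -> int:
    """Entry point referenced as "my_package:main_function"."""
    # The installed script calls this function without arguments,
    # so command-line arguments are read from sys.argv by default.
    args = list(sys.argv[1:] if argv is None else argv)
    name = args[0] if args else "world"
    print(f"Hello, {name}!")
    # The generated console script passes the return value to sys.exit().
    return 0
```

Accepting an optional argv argument is a design choice that makes the function easy to call from tests without touching sys.argv.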

Generating Dependency Graphs

Understanding your project’s dependency graph is crucial for managing complex projects. Poetry can generate a visual representation of your project’s dependencies:

poetry show --tree

This command will display a tree-like structure of your project’s dependencies, helping you identify potential conflicts or issues.

Building and Packaging

Poetry provides commands for building and packaging your Python project. The poetry build command creates a source distribution (sdist) and a wheel in the dist directory:

poetry build
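
To check what a build produced, you can list the contents of the dist directory. This is a small sketch with a hypothetical helper, `list_artifacts`; it assumes wheels end in .whl and source distributions in .tar.gz.

```python
from pathlib import Path


def list_artifacts(dist_dir: Path) -> dict[str, list[str]]:
    """Group the files in a dist directory into wheels and source archives."""
    return {
        "wheels": sorted(p.name for p in dist_dir.glob("*.whl")),
        "sdists": sorted(p.name for p in dist_dir.glob("*.tar.gz")),
    }


if __name__ == "__main__":
    # Run from the project root after `poetry build`.
    print(list_artifacts(Path("dist")))
```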

The build is configured in the pyproject.toml file. For instance, the build-system table declares which build backend is used and which packages it requires:

[build-system]

requires = ["poetry-core>=1.0.0"]

build-backend = "poetry.core.masonry.api"

Publishing to PyPI

Once your project is built and ready for distribution, you can publish it to the Python Package Index (PyPI) using Poetry. Do this only if you want the package to be publicly available; otherwise, publish to a private registry. First, create a PyPI account, generate an API token, and configure it in Poetry:

poetry config pypi-token.pypi <your-token>

Then, you can publish your package:

poetry publish --build

This command will build your project (if not already built) and publish it to PyPI.

Example

An example project using the best practices discussed in this post can be found on GitHub in the n3011/mypackage repository (Demo for python package).

You can download this package to your local system using Git:

git clone https://github.com/n3011/mypackage.git

cd mypackage

Then experiment with the sample files, for example by adding dependencies.
