Best friends for building a Python package
I recently published a Python package. At least for me, deciding to publish a project makes me more careful and thorough with the code. I immediately switch sides and start questioning and criticizing every piece of it. I have to say this improves the code a lot. For example, in machine learning projects, I start thinking about how flexibly my structure can be adapted to other data and models. Publishing your code is definitely something you want to try!
“If I have seen further it is by standing on the shoulders of giants.”
–Isaac Newton
There are already fantastic tools and pipelines that help you build a robust, flexible, clean and beautiful project. In this post, I would like to introduce those friends of mine to you.
venv
Python environments can drive one crazy. There are lots of nice tools, but personally I prefer venv, which is part of Python's standard library. First you create a virtual environment with
$ python3.9 -m venv .venv
$ source .venv/bin/activate
This creates a .venv folder to store your environment. You can gitignore it by adding .venv/ to .gitignore. Now you can start managing your packages.
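The same environment can also be created from Python itself, which is handy in setup scripts. The following is a minimal sketch using the standard library's venv module; the helper name create_env is my own, and with_pip=False is passed in the demo only to keep it fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return its config file path.

    create_env is a hypothetical helper; venv.create is the programmatic
    counterpart of running `python -m venv <dir>`.
    """
    venv.create(env_dir, with_pip=with_pip)
    # Every virtual environment contains a pyvenv.cfg file at its root.
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    # A throwaway directory is used so the sketch does not touch a real project.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / ".venv", with_pip=False)
        print(cfg.read_text())
```

For day-to-day work the two shell commands above are all you need; the sketch only shows that nothing beyond the standard library is involved.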
pip + pip-tools
Pip is the standard package installer that ships with Python, and it helps you manage your dependencies. My practice is to use it together with pip-tools.
$ python -m pip install --upgrade pip setuptools pip-tools
First you need a .in file specifying the packages you need, e.g.
# requirements/train.in
torch
seaborn
...
Based on this file, pip-tools resolves the full list of packages you need, including the dependencies of your dependencies, each pinned to an exact version. You can write them to a .txt file.
$ python -m piptools compile requirements/train.in --output-file requirements/train.txt
Now you have all the packages you need in the .txt file, e.g.
#
# This file is autogenerated by pip-compile with Python 3.9
# by the following command:
#
# pip-compile --output-file=requirements/train.txt requirements/train.in
#
absl-py==2.1.0
# via ml-collections
annotated-types==0.6.0
# via pydantic
anyio==4.3.0
...
At the end, we just need to install these packages with pip:
$ python -m pip install -r requirements/train.txt
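The point of the compiled .txt file is that every package is pinned with ==. As a rough sketch of what that means, the snippet below splits requirement lines into pinned and unpinned entries. The function parse_pins is hypothetical and deliberately simplified: it is not how pip-tools works internally, and it ignores hashes and URL requirements.

```python
from typing import Dict, List, Tuple


def parse_pins(text: str) -> Tuple[Dict[str, str], List[str]]:
    """Split requirement lines into pinned {name: version} and unpinned specs.

    Simplified on purpose: comments, blank lines, and option lines
    (those starting with "-") are skipped.
    """
    pinned: Dict[str, str] = {}
    unpinned: List[str] = []
    for raw in text.splitlines():
        # Drop comments such as "# via pydantic".
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        # Drop environment markers, which follow a semicolon.
        spec = line.split(";", 1)[0].strip()
        if "==" in spec:
            name, version = spec.split("==", 1)
            pinned[name.strip().lower()] = version.strip()
        else:
            unpinned.append(spec)
    return pinned, unpinned


if __name__ == "__main__":
    compiled = """
    absl-py==2.1.0
        # via ml-collections
    annotated-types==0.6.0
        # via pydantic
    anyio==4.3.0
    """
    pins, loose = parse_pins(compiled)
    print(pins)   # all three packages are pinned
    print(loose)  # nothing is left unpinned
```

Running the same function on the .in file would put torch and seaborn in the unpinned list, which is exactly the difference between the two files.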
If you also want to install your own package into the environment in editable mode, run
$ python -m pip install --editable .
Just keep in mind that this requires you to set up the package first.
Check the following blog posts, which explain why to stick with the native tools:
- Why not tell people to “simply” use pyenv, poetry, pipx or anaconda
- Back to basics with pip and venv
- Boring Python: dependency management
folder tree
A good folder tree is also very useful for managing your project. Here is how I organize my stuff.
projectname/
├── .venv/
├── dist/
│   ├── projectname-1.0.0-py3-none-any.whl
│   └── projectname-1.0.0.tar.gz
├── notebooks/
│   └── notebook.ipynb
├── scripts/
│   └── script.py
├── data/
│   └── data.npy
├── src/
│   └── packagename/
│       ├── __init__.py
│       └── subpackage/
│           ├── __init__.py
│           └── subpackage.py
├── tests/
│   └── test_subpackage.py
├── docs/
│   └── Makefile
├── .gitignore
├── Makefile
├── Dockerfile
├── requirements/
│   ├── dev-requirements.in
│   ├── dev-requirements.txt
│   ├── requirements.in
│   └── requirements.txt
├── pyproject.toml
├── MANIFEST.in
├── tox.ini
├── LICENSE
└── README.md
You can probably find out what they are for just by googling their names. Anyway, I will explain the files later when we meet them.
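If you start projects often, the skeleton of this tree can be generated with a few lines of pathlib. This is only a sketch under my own assumptions: the scaffold function and the DIRS and FILES lists are hypothetical, and the files are created empty for you to fill in.

```python
import tempfile
from pathlib import Path
from typing import List

# Directories and top-level files taken from the tree above (a subset).
DIRS = ["notebooks", "scripts", "data", "tests", "docs", "requirements"]
FILES = [".gitignore", "Makefile", "pyproject.toml", "README.md", "LICENSE"]


def scaffold(root: Path, package: str) -> List[str]:
    """Create an empty src-layout project under root and list what exists."""
    for name in DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    # The importable package lives under src/, as in the tree above.
    pkg = root / "src" / package
    pkg.mkdir(parents=True, exist_ok=True)
    (pkg / "__init__.py").touch()
    for name in FILES:
        # touch() leaves the content of an already existing file untouched.
        (root / name).touch()
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(Path(tmp) / "projectname", "packagename"):
            print(path)
```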
VS Code, pep8, black, flake8, isort, pylance
Now we have the environment and can start coding! Here are some tools that help you write better code. I use VS Code with the following extensions installed.
- pylance helps you write code faster
- pep8 and flake8 lint your Python code
- black and isort format your Python code
You can easily install them from the extension marketplace of VS Code.
pytest
After coding, you should always test your code before publishing. Pytest facilitates this. Writing tests is very easy if you follow the official tutorial, and after that you only need to run
$ python -m pytest
which outputs clean and clear test results:
============================= test session starts ==============================
platform darwin -- Python 3.9.13, pytest-8.2.2, pluggy-1.5.0
rootdir: /Users/hous/Github/NeuralHedge
configfile: pyproject.toml
plugins: anyio-4.4.0
collected 4 items
tests/test_data.py .. [ 50%]
tests/test_nn.py . [ 75%]
tests/test_utils.py . [100%]
=============================== warnings summary ===============================
tests/test_nn.py::test_network
/Users/hous/Github/NeuralHedge/.venv/lib/python3.9/site-packages/torch/nn/modules/lazy.py:181: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment.
warnings.warn('Lazy modules are a new feature under heavy development '
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================= 4 passed, 1 warning in 3.68s =========================
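For reference, a test file can be very small. The sketch below is not taken from my package: min_max_scale is a hypothetical function defined inline so the example is self-contained, whereas in a real project you would import it from your package. Pytest collects every function whose name starts with test_ and reports a failure when an assert does not hold.

```python
# tests/test_scaling.py (hypothetical file name)


def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # A constant input has no spread, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scale_bounds():
    # The smallest value maps to 0.0 and the largest to 1.0.
    assert min_max_scale([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_scale_constant_input():
    # The edge case must not divide by zero.
    assert min_max_scale([3.0, 3.0]) == [0.0, 0.0]
```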
setuptools
Now you are ready to publish your code. I follow these guides:
- Packaging Python Projects
- User guide: setuptools
- How to Publish an Open-Source Python Package to PyPI
Always check the latest tutorial! pyproject.toml is the new standardized format to describe project metadata declaratively, introduced with PEP 621, but many projects are still using the setup.py approach.
tox
With tox, you can even test your code in different environments. Simply write a configuration of tests and environments.
# tox.ini
[tox]
env_list = py38, py39
[testenv]
deps = pytest
commands = pytest tests
Then tox does everything for you
$ tox
.pkg: _optional_hooks> python /Users/hous/Github/NeuralHedge/.venv/lib/python3.9/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_sdist> python /Users/hous/Github/NeuralHedge/.venv/lib/python3.9/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /Users/hous/Github/NeuralHedge/.venv/lib/python3.9/site-packages/pyproject_api/_backend.py True setuptools.build_meta
py38: install_package> python -I -m pip install --force-reinstall --no-deps /Users/hous/Github/NeuralHedge/.tox/.tmp/package/4/neuralhedge-0.1.0.tar.gz
py38: OK ✔ in 4.98 seconds
py39: install_package> python -I -m pip install --force-reinstall --no-deps /Users/hous/Github/NeuralHedge/.tox/.tmp/package/5/neuralhedge-0.1.0.tar.gz
py38: OK (4.98 seconds)
py39: OK (2.94 seconds)
congratulations :) (8.03 seconds)
Reference
This blog is greatly inspired by