12 Foolproof Steps to Create a Python Package [Tutorial]


Written by - Deepak Prasad

What are Python Packages?

In the vast ecosystem of software development, Python packages serve as fundamental building blocks that encapsulate reusable code into modular components. A Python package is essentially a way to distribute one or more modules so that they can be easily installed, managed, and used. These packages can be anything from simple libraries that carry out specific tasks to complex frameworks that enable high-level functionalities. Whether you're dealing with data analysis, web development, or machine learning, you're likely using several Python packages to streamline your work.

Packaging your code makes it easy to manage, update, and share. Whether you're building a small utility library or a comprehensive framework, a package can be installed by others, uploaded to the Python Package Index (PyPI), or distributed privately as a proprietary product.

 

Step 1: Prerequisites to Create a Python Package

Before you can create a Python package, there are some fundamental prerequisites you'll need to have in place.

1. Install Python

Your first step is to make sure you have Python installed on your system. If you don't already have it, download it from the official Python website. To check if Python is installed, open your command line (cmd on Windows, Terminal on macOS or Linux) and type the following command (on Windows, the interpreter is usually named python rather than python3):

python3 --version

You should see the Python version displayed. If not, it means Python isn't installed, and you'll need to download and install it.

For Windows:

  • Visit the official Python website and download the installer that suits your system (32-bit or 64-bit).
  • Run the downloaded .exe file.
  • Make sure to check the box that says "Add Python to PATH" before clicking on "Install Now." This will make it easier to run Python from the command line.
  • Once the installer finishes, Python should be installed on your system.

For macOS:

  • Go to the official Python website and download the macOS installer.
  • Open the .pkg file you downloaded to start the installation process.
  • Follow the on-screen instructions to complete the installation.

For Linux:

Python is often pre-installed on Linux distributions. If not, you can install it using package managers.

For Debian-based systems like Ubuntu:

sudo apt update
sudo apt install python3

For Red Hat-based systems like Fedora:

sudo dnf install python3

2. Install pip3

pip is Python's package installer, and you'll need it to install the packages your project depends on from the Python Package Index (PyPI). If you've installed Python 3 from the official website or using a package manager, pip3 is likely already installed.

To check if pip3 is installed:

pip3 --version

If pip3 is not installed, you can install it using the following command:

sudo apt update                  # For Ubuntu and other Debian-based distributions
sudo apt install python3-pip     # For Ubuntu and other Debian-based distributions

On Windows, the installer from the official Python website usually includes pip by default.

 

Step 2: Setting Up Your Development Environment

Creating a virtual environment is an essential first step when you create a Python package. A virtual environment isolates your package and its dependencies, ensuring that you have a clean slate on which to develop. In this section, we will go through the steps required to set up a virtual environment using Python's built-in venv module.

Here is how to set up a virtual environment for your DataValidator project:

Open your command line and navigate to the folder where you want to create your DataValidator project. If you haven't created a directory for your DataValidator project, do it now:

mkdir DataValidator
cd DataValidator

Now, let's create the virtual environment. The command may differ depending on your operating system.

# On Windows
python -m venv venv

# On macOS and Linux
python3 -m venv venv

This will create a new folder named venv inside your DataValidator project folder, containing the virtual environment.

Before you can start installing packages or running Python code, you need to activate the virtual environment.

# On Windows:
.\venv\Scripts\Activate

# On macOS and Linux:
source venv/bin/activate

Once activated, your command line should show the name of the activated environment, in this case, venv.

(venv) deepak@deepak-VirtualBox:~/DataValidator$ 

You can verify that you're using the Python interpreter from within the virtual environment by checking its location:

which python3  # On macOS and Linux
where python   # On Windows

It should point to the Python executable inside the venv folder.

/home/deepak/DataValidator/venv/bin/python3
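
The same check can also be made from inside Python. The sketch below is a minimal illustration, assuming an environment created with the venv module; the helper name in_virtualenv is hypothetical and not part of the DataValidator package.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

If the second line prints False after activation, the shell is probably still picking up the system interpreter.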

Now your environment is ready for creating a Python package.

 

Step 3: Planning Your Package

When planning a Python package, a key decision is whether it will be a single module or consist of multiple modules. A single module is generally easier to manage but can become cluttered as the package grows, while multiple modules are better organized at the cost of some added complexity.

For our DataValidator package, we decide to go with multiple modules to separate different validation functionalities, such as:

  • string_validator.py: For validating string data
  • email_validator.py: For validating email addresses
  • phone_validator.py: For validating phone numbers

 

Step 4: Creating Your Package Directory Structure

A well-organized directory structure is essential for creating a Python package that is both maintainable and scalable. Group related files together and give files and folders names that reflect their purpose.

You can set up your DataValidator package directory like this:

DataValidator/
├── data_validator/
│   ├── __init__.py
│   ├── string_validator.py
│   ├── email_validator.py
│   └── phone_validator.py
├── tests/
│   ├── __init__.py
│   └── test_validators.py
├── setup.py
└── README.md

Here:

  • The data_validator folder contains the actual Python modules.
  • The tests folder will contain all your unit tests.
  • setup.py will help in packaging your Python code.
  • README.md documents your package.
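
If you prefer to script the layout rather than create each file by hand, something like the following could work. It is only a sketch: the create_skeleton helper and the SKELETON list are illustrative names, and the files it creates are empty placeholders that you fill in during the later steps.

```python
import tempfile
from pathlib import Path

# Files from the layout above, relative to the project root.
SKELETON = [
    "data_validator/__init__.py",
    "data_validator/string_validator.py",
    "data_validator/email_validator.py",
    "data_validator/phone_validator.py",
    "tests/__init__.py",
    "tests/test_validators.py",
    "setup.py",
    "README.md",
]


def create_skeleton(root):
    """Create empty placeholder files for the layout under root."""
    root = Path(root)
    created = []
    for relative in SKELETON:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)  # does not overwrite existing content
        created.append(target)
    return created


# Demonstrate in a throwaway directory so nothing is written to your project.
with tempfile.TemporaryDirectory() as tmp:
    for path in create_skeleton(tmp):
        print(path.relative_to(tmp))
```

To apply it to the real project, you would call create_skeleton(".") from inside the DataValidator folder.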

__init__.py and Its Significance

In Python packages, an __init__.py file signifies that a directory should be considered a Python package or sub-package. It can be empty or include initialization code to run when the package or sub-package is imported.

In our DataValidator package, the __init__.py file inside the data_validator directory can be used to initialize the package and make it easier to import modules.

For example, your data_validator/__init__.py file could look like:

from .string_validator import is_string_empty, has_special_characters
from .email_validator import validate_email
from .phone_validator import validate_phone

This allows users to perform imports like:

from data_validator import validate_email, validate_phone, is_string_empty

 

Step 5: Writing the Code

In this section, we write the three validator modules for the DataValidator package. Each function carries a docstring that describes its parameters and return value, which is a good habit for any code you intend to share.

1. string_validator.py

This module validates strings based on length or content.

def is_string_empty(s):
    """
    Checks if the given string is empty.

    Parameters:
        s (str): String to check.

    Returns:
        bool: True if empty, False otherwise.
    """
    return len(s.strip()) == 0

def has_special_characters(s):
    """
    Checks if the given string has special characters.

    Parameters:
        s (str): String to check.

    Returns:
        bool: True if has special characters, False otherwise.
    """
    return any(not c.isalnum() for c in s)
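
To see how the two helpers above behave on a few inputs, here is a small self-contained sketch. The functions are repeated so the snippet runs on its own, and the sample strings are arbitrary choices for illustration.

```python
def is_string_empty(s):
    """Same logic as is_string_empty in string_validator.py."""
    return len(s.strip()) == 0


def has_special_characters(s):
    """Same logic as has_special_characters in string_validator.py."""
    return any(not c.isalnum() for c in s)


samples = ["", "   ", "abc123", "hello world", "user_name"]
for text in samples:
    print(repr(text), is_string_empty(text), has_special_characters(text))
```

Note that spaces and underscores count as special characters here, because str.isalnum() returns False for them. Whether that is the behavior you want depends on your use case.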

2. email_validator.py

This module validates email addresses.

def validate_email(email):
    """
    Validates if the input is an email.

    Parameters:
        email (str): The email address to validate.

    Returns:
        bool: True if valid email, False otherwise.
    """
    return "@" in email and "." in email
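
The check above is deliberately simple, so it also accepts strings such as ".@". If you want something a little stricter, one possible approach is a regular expression. The sketch below is an assumption for illustration only: validate_email_strict and its pattern are hypothetical names, not part of the package as written, and the pattern is far from a full implementation of the email address standards.

```python
import re

# Hypothetical stricter pattern: exactly one "@", no whitespace, and a dot
# after the "@" with text on both sides of it.
_EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_email_strict(email):
    """Return True if email loosely resembles local@domain.tld."""
    return _EMAIL_PATTERN.fullmatch(email) is not None


for candidate in ["test@example.com", "testexample.com", ".@", "a@b@c.d"]:
    print(repr(candidate), validate_email_strict(candidate))
```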

3. phone_validator.py

This module validates phone numbers based on a simple rule: they must be numeric and 10 digits long.

def validate_phone(phone):
    """
    Validates if the input is a phone number.

    Parameters:
        phone (str): The phone number to validate.

    Returns:
        bool: True if valid phone number, False otherwise.
    """
    return phone.isdigit() and len(phone) == 10
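
Because the rule above rejects formatted input such as "123-456-7890", callers may want to strip separators first. The following sketch shows one way to do that; normalize_phone is a hypothetical helper, not part of the package as written, and the set of characters it removes is an assumption.

```python
def normalize_phone(phone):
    """Remove spaces, dashes, dots, and parentheses from a phone string."""
    return "".join(ch for ch in phone if ch not in " -.()")


def validate_phone(phone):
    """Same rule as validate_phone in phone_validator.py."""
    return phone.isdigit() and len(phone) == 10


raw = "(123) 456-7890"
print(validate_phone(raw))                   # False: separators present
print(validate_phone(normalize_phone(raw)))  # True: ten digits remain
```

Keeping normalization separate from validation leaves validate_phone strict, so the unit tests in Step 7 still hold.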

 

Step 6: Dependency Management

One of the crucial aspects of creating a Python package is managing dependencies. Dependencies are the third-party libraries or modules that your package relies on to function correctly. In this step, we will discuss how to manage these dependencies for your DataValidator package effectively.

 

Using requirements.txt or Pipfile - Which one to use?

Two popular ways to manage dependencies when you create a Python package are a requirements.txt file and a Pipfile.

  • requirements.txt: This is a plain text file that lists all of the third-party packages your project relies on.
  • Pipfile: This file is used by Pipenv and provides a higher-level dependency management system with more features than requirements.txt.

Let's assume that our DataValidator package will require the following third-party packages:

  1. validators for advanced validation.
  2. pytest for running tests.
  3. requests for some future features that will require HTTP requests.

1. Using requirements.txt

In this example, create a requirements.txt file in the root directory of your project. Populate this file as follows:

validators==0.18.2
pytest==6.2.5
requests==2.26.0

This explicitly defines which versions of the dependencies your package relies on.

End-users can install all the dependencies at once using:

pip3 install -r requirements.txt

2. Using Pipfile

If you're using Pipenv, a Pipfile will be generated when you install packages using pipenv install.

  1. Open your terminal and navigate to your project directory.
  2. Run the following commands:
pip3 install pipenv
pipenv install validators
pipenv install pytest
pipenv install requests

A Pipfile will be generated in your project directory, and it will look something like:

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
validators = "==0.18.2"
pytest = "==6.2.5"
requests = "==2.26.0"

[dev-packages]

[requires]
python_version = "3.9"

3. Managing Third-party Libraries

Sometimes, you may need to use third-party libraries that have their own dependencies. It's essential to ensure that these nested dependencies don't conflict with each other.

Suppose the validators package you are using requires a specific version of another package called six. You would need to ensure that if your package also relies on six, both dependencies require the same version. If not, you'll have to decide which version is essential for your package and possibly look for alternatives to conflicting packages.

You can check the dependencies of a third-party library by running:

pip3 show validators

The Requires field in the output lists the packages that validators depends on directly, allowing you to manage them effectively.
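
The same information can be read from Python using the standard library's importlib.metadata module (available from Python 3.8). This is a minimal sketch: the declared_requirements helper is a hypothetical name, and the output depends on which distributions are installed in your environment.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares, or [] if unknown."""
    try:
        requirements = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return []  # distribution is not installed in this environment
    # requires() returns None when the distribution declares no dependencies
    return list(requirements or [])


for requirement in declared_requirements("validators"):
    print(requirement)
```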

 

Step 7: Writing Unit Tests using unittest

Python’s built-in unittest framework is commonly used for writing robust tests.

Let's assume we're writing unit tests for our DataValidator package. Create a test_validators.py file in the tests directory of your package, and add the following example code:

import unittest
from data_validator import validate_email, validate_phone, is_string_empty

class TestDataValidator(unittest.TestCase):

    def test_email_validator(self):
        self.assertTrue(validate_email("test@example.com"))
        self.assertFalse(validate_email("testexample.com"))

    def test_phone_validator(self):
        self.assertTrue(validate_phone("1234567890"))
        self.assertFalse(validate_phone("123-456-7890"))
        
    def test_string_empty(self):
        self.assertTrue(is_string_empty("   "))
        self.assertFalse(is_string_empty("Not empty"))

if __name__ == "__main__":
    unittest.main()

In this test suite, we have three tests:

  1. test_email_validator: Validates the validate_email function from our package.
  2. test_phone_validator: Validates the validate_phone function from our package.
  3. test_string_empty: Validates the is_string_empty function from our package.

How to Run the Tests

Navigate to the root directory of your project and run:

python3 -m unittest tests/test_validators.py
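
As the number of cases grows, a table-driven test can keep the suite compact. The sketch below uses unittest's subTest so each failing input is reported separately. It is self-contained for illustration: validate_phone is a stand-in copy of the package function, and the class and run_suite names are assumptions rather than part of the tutorial's test file.

```python
import unittest


def validate_phone(phone):
    """Stand-in copy of the phone_validator rule so the sketch runs on its own."""
    return phone.isdigit() and len(phone) == 10


class TestPhoneTable(unittest.TestCase):
    # (input, expected result) pairs
    CASES = [
        ("1234567890", True),
        ("123-456-7890", False),
        ("12345", False),
        ("abcdefghij", False),
    ]

    def test_validate_phone(self):
        for phone, expected in self.CASES:
            # subTest reports each failing case separately
            with self.subTest(phone=phone):
                self.assertEqual(validate_phone(phone), expected)


def run_suite():
    """Load and run the test case programmatically, returning the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPhoneTable)
    return unittest.TextTestRunner(verbosity=2).run(suite)
```

In a real project you would import validate_phone from data_validator instead of redefining it, and let python3 -m unittest discover the test class.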

 

Step 8: Create Documentation (README)

The README is often the first file people interact with when encountering your package. It provides an overview of what your package does, how to install it, and basic usage examples, so it is important to include one when you create a Python package.

# DataValidator

## Overview

DataValidator is a Python package that provides various validation utilities for strings, emails, and phone numbers.

## Installation

To install DataValidator, run the following command:

```bash
pip install DataValidator
```

## Usage

Here are some quick examples:

```python
from data_validator import validate_email, validate_phone

print(validate_email("test@example.com"))  # True
print(validate_phone("1234567890"))  # True
```

## Contributing

Feel free to open an issue or submit a pull request if you find a bug or have a feature request.

## License

MIT

 

Step 9: Packaging Your Code

Python offers several ways to package code, and the setuptools library is one of the most commonly used. We'll explain how to create a setup.py file and generate distribution packages.

The setup.py file is a build script for setuptools. It provides metadata about your package and contains instructions for packaging, distributing, and installing modules.

Here's a simple setup.py file for our DataValidator package:

from setuptools import setup, find_packages

setup(
    name='DataValidator',
    version='0.1',
    packages=find_packages(),
    install_requires=[
        'validators==0.18.2',
        'pytest==6.2.5',
        'requests==2.26.0',
    ],
    author='Your Name',
    author_email='your.email@example.com',
    description='A utility package for validating data',
)

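This setup.py file specifies the package name, version, dependencies, and other information. When you run this script, setuptools will use this information to create a distributable package. Note that pytest is listed here only to mirror requirements.txt; because it is needed to run the tests rather than to use the package, a published package would normally leave it out of install_requires.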

Once you have a setup.py file, you can generate different types of distribution packages:

  • Source Distribution (sdist)
  • Built Distribution (bdist_wheel)

To generate these, first, make sure you have setuptools and wheel installed:

pip3 install setuptools wheel

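Then, run the following commands. Recent setuptools releases deprecate calling setup.py directly in favor of the build tool (python3 -m build), but these commands still show the two distribution types clearly: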

python3 setup.py sdist
python3 setup.py bdist_wheel

This will generate a dist directory in your project root containing your distribution packages, for example (exact file names can vary slightly with the setuptools version):

  • DataValidator-0.1.tar.gz (sdist)
  • DataValidator-0.1-py3-none-any.whl (bdist_wheel)

 

install_requires vs requirements.txt - Are they related?

The install_requires parameter in the setup.py file specifies the dependencies that need to be installed for your package to work correctly. When a user installs your package using pip, these dependencies are automatically installed if they are not already present on the user's system.

The list of dependencies in install_requires often mirrors the list of packages in a requirements.txt file, which is another common way to manage project dependencies. The requirements.txt file is usually used in development environments and can be installed with pip install -r requirements.txt.

  • Common Source: Both install_requires in setup.py and requirements.txt typically list the same dependencies, though requirements.txt might include additional packages useful for development but not needed to run the package.
  • Version Pinning: In both places, you can specify the version of the dependencies. For example, if your code depends on version 0.18.2 of a package called validators, you would include 'validators==0.18.2' in both install_requires and requirements.txt.
  • Flexibility: install_requires allows for more flexible version specification using operators like >=, <=, etc., useful when packaging libraries. requirements.txt can do this as well, but it's often used to pin dependencies to specific versions for application development.
  • Automatic Installation: Dependencies listed in install_requires are automatically installed when someone installs your package. In contrast, requirements.txt needs to be manually installed using pip install -r requirements.txt.

Let's say your requirements.txt file looks like this:

validators==0.18.2
pytest==6.2.5
requests==2.26.0

And your setup.py contains:

install_requires=[
    'validators==0.18.2',
    'pytest==6.2.5',
    'requests==2.26.0',
],

When a user installs your DataValidator package, the packages listed in install_requires will be installed automatically. The requirements.txt can be used by developers working on your DataValidator package to set up a virtual environment with the necessary dependencies.

So while both serve to specify dependencies, they are used in slightly different contexts. install_requires is for users of your package, and requirements.txt is for developers working on it.

 

Step 10: Publishing Your Package

After you create your package, the next step is to publish it. The Python Package Index (PyPI) is the go-to repository for Python packages, and we'll be using it as our example.

Uploading Your Package to PyPI

Before you can upload your package, you need to have an account on PyPI.

  • Navigate to the PyPI website.
  • Click on "Register" and follow the steps to create your account.

To upload your package to PyPI, you'll first need to install a tool called twine.

pip3 install twine

Once twine is installed, navigate to your project directory (where the dist directory resides) and execute the following:

twine upload dist/*

Example for DataValidator

Let's say your dist directory contains:

  • DataValidator-0.1.tar.gz (sdist)
  • DataValidator-0.1-py3-none-any.whl (bdist_wheel)

After running twine upload dist/*, these files will be uploaded to PyPI, making your DataValidator package publicly available.

Versioning Your Package

It's essential to have a versioning strategy for your package, especially if you plan on making future updates. Semantic versioning is commonly used in Python packages. The version number is usually specified in the setup.py file.

setup(
    name='DataValidator',
    version='0.1',
    ...
)

When you update your package, remember to also update the version number in setup.py to reflect the changes according to your versioning strategy (e.g., major, minor, or patch updates).

NOTE: If you make a small, backward-compatible change, you could change the version number from '0.1' to '0.1.1'. For breaking changes, you could update it to '0.2' or '1.0' based on the impact.
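
The bump rules in the note can be expressed as a small helper. This is an illustrative sketch only: bump_version is a hypothetical function, it assumes purely numeric dotted versions, and it pads short versions such as '0.1' to three components.

```python
def bump_version(version, part):
    """Return version with the given part ('major', 'minor', 'patch') incremented."""
    numbers = [int(piece) for piece in version.split(".")]
    # Pad short versions such as '0.1' to three components: 0.1 -> 0.1.0
    numbers += [0] * (3 - len(numbers))
    major, minor, patch = numbers[:3]
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1", "patch"))  # 0.1.1
print(bump_version("0.1", "minor"))  # 0.2.0
print(bump_version("0.1", "major"))  # 1.0.0
```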

 

Step 11: Local Testing and Verification

You can install the package locally either directly from the source code or by using the distribution files you created earlier (sdist or bdist_wheel).

Direct Installation from Source Code

Navigate to the project directory and run:

pip3 install .

Installation Using Distribution Files

If you've created a dist directory that contains your package files (DataValidator-0.1.tar.gz, DataValidator-0.1-py3-none-any.whl), you can install from them as follows:

pip3 install dist/DataValidator-0.1-py3-none-any.whl

or

pip3 install dist/DataValidator-0.1.tar.gz

After installing your package locally, you should verify its functionality to ensure it behaves as expected.

Import the Package:

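Open a Python interpreter and try importing your package. Note that the import name is data_validator (the name of the package folder), not the distribution name DataValidator used with pip.

import data_validator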

Run Some Basic Tests

Use the functions and classes in your package to make sure they are working as intended.

from data_validator import validate_email, validate_phone

print(validate_email("test@example.com"))  # Should return True
print(validate_phone("1234567890"))       # Should return True

Check Dependencies

Ensure that all dependencies listed in install_requires or requirements.txt are installed correctly. You can list installed packages with pip freeze.
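
As an alternative to scanning the pip freeze output by eye, the installed versions can be looked up from Python with importlib.metadata (Python 3.8 or later). This is a minimal sketch; installed_version is a hypothetical helper name, and the list of distributions simply repeats the ones used in this tutorial.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ["DataValidator", "validators", "requests"]:
    print(name, "->", installed_version(name) or "not installed")
```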

Uninstall and Reinstall

It may also be useful to uninstall the package and reinstall it to make sure the installation process is seamless.

pip3 uninstall DataValidator
pip3 install .

Check README and Documentation

Ensure that your README file and any other documentation you may have are included in the package and are accessible to the users.

 

Step 12: Some Advanced Topics

Creating Executable Scripts

Python packages often come with executable scripts that users can run from the command line. These scripts can simplify complex tasks or serve as utilities related to your package.

Suppose you want to add a script that validates an email and a phone number from the command line. You would:

Create a new Python script named validate_data.py inside the data_validator package directory.

from data_validator import validate_email, validate_phone
import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--email', help='Email to validate')
    parser.add_argument('--phone', help='Phone number to validate')
    args = parser.parse_args()

    if args.email:
        print("Email validation:", validate_email(args.email))
    
    if args.phone:
        print("Phone validation:", validate_phone(args.phone))

if __name__ == "__main__":
    main()

In your setup.py, add an entry_points section.

setup(
    ...
    entry_points={
        'console_scripts': ['validate-data=data_validator.validate_data:main'],
    },
    ...
)

After installing your package, users can run validate-data --email test@example.com --phone 1234567890 directly from the command line.
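
One design choice worth considering is letting main accept an argument list and return an exit status, which makes the command easy to test without touching sys.argv. The sketch below is a variation on the script above, not the tutorial's exact code: the validators are stand-in copies so the snippet runs on its own, and the return-code convention (0 for success, 1 for any failed check) is an assumption.

```python
import argparse


def validate_email(email):
    """Stand-in for data_validator.validate_email."""
    return "@" in email and "." in email


def validate_phone(phone):
    """Stand-in for data_validator.validate_phone."""
    return phone.isdigit() and len(phone) == 10


def main(argv=None):
    """Validate the given values; return 0 if all pass, 1 otherwise."""
    parser = argparse.ArgumentParser(prog="validate-data")
    parser.add_argument("--email", help="Email to validate")
    parser.add_argument("--phone", help="Phone number to validate")
    # argv=None makes argparse fall back to sys.argv[1:]
    args = parser.parse_args(argv)

    results = []
    if args.email:
        results.append(validate_email(args.email))
        print("Email validation:", results[-1])
    if args.phone:
        results.append(validate_phone(args.phone))
        print("Phone validation:", results[-1])
    return 0 if all(results) else 1
```

The wrapper script generated for a console_scripts entry point passes the return value of main to sys.exit, so the status becomes the command's exit code. In a test you can call main(["--email", "test@example.com"]) directly.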

Multi-Python Version Support

Supporting multiple Python versions can broaden your package's user base. This involves ensuring your code is compatible with the versions you want to support and specifying this information in your package metadata.

Testing: Use tools like tox to test your package against different Python versions.

pip3 install tox

Create a tox.ini file with the Python versions you want to support.

[tox]
envlist = py36, py37, py38, py39

[testenv]
deps = pytest
commands = pytest

Specify Versions in setup.py: In your setup.py file, specify which Python versions your package is compatible with.

setup(
    ...
    classifiers=[
        ...
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        ...
    ],
    python_requires='>=3.6, <4',
    ...
)

The python_requires setting ensures that pip will only install your package on compatible versions of Python. The classifiers are informational and help users find your package on PyPI.

 

Summary and Conclusion

Creating a Python package from scratch might seem daunting at first, but hopefully this guide has demystified the process for you. From the basic prerequisites to advanced topics, we've covered the main elements of Python package creation: setting up a development environment, coding, testing, documenting, and finally publishing the package. Each of these steps contributes to making your package robust, user-friendly, and easily distributable.

Whether you're a beginner looking to create your first Python package or an experienced developer aiming to refine your packaging skills, these guidelines provide a detailed roadmap. By following these steps, you should be well-equipped to build a Python package that not only meets your needs but can also serve the broader Python community.

 

Additional Resources

  • Python Packaging Authority (PyPA)
    The PyPA provides an extensive set of resources and tools for Python packaging, making it the authoritative source for Python package creators.
  • Python Packaging User Guide
    This guide provides comprehensive information on package creation, distribution, installation, and more.
  • Python Package Index (PyPI)
    PyPI is the primary repository for Python packages, and its website also offers helpful guides and FAQs on Python packaging.
  • setuptools Documentation
    The setuptools library is widely used for Python packaging, and its official documentation offers in-depth tutorials and API guides.
  • tox Documentation
    If you're interested in supporting multiple Python versions, tox is an invaluable tool. The documentation explains how to configure and run tests for various Python versions.
  • unittest Documentation
    If you're new to testing, Python’s unittest module is a good place to start, offering a wide variety of built-in ways to test your code.

 

Deepak Prasad

He is the founder of GoLinuxCloud and brings over a decade of expertise in Linux, Python, Go, Laravel, DevOps, Kubernetes, Git, Shell scripting, OpenShift, AWS, Networking, and Security. With extensive experience, he excels in various domains, from development to DevOps, Networking, and Security, ensuring robust and efficient solutions for diverse projects. You can reach out to him on his LinkedIn profile or join on Facebook page.
