Tech Blog: Packaging a Python project with setuptools to create a command-line tool for merging PDFs
Posted by Abhishek R. S. on 2024-07-03
The blog is about...
Packaging a Python project with setuptools to create a command-line tool to merge multiple PDFs into a single PDF file.
This project can be found in the merge_pdfs GitHub repo.
1.1) Using click for parsing command-line args
Click is a user-friendly package for parsing command-line args.
It provides decorators that are very easy to use, and it has sophisticated features that make it worth using.
This project uses one feature in particular: accepting multiple PDF files, provided via the command line as inputs, so that they can be combined into a single PDF file.
@click.option() can be used for a named argument. It can be made to accept multiple values by setting the flag of the same name, multiple=True, so that the option can be repeated on the command line.
It is that simple. This option is needed because one may need to combine more than 2 PDF files.
Otherwise, one would have to run the tool multiple times, which I feel is simply a waste of time.
In its current implementation, one can use a single command to merge as many PDF files as they need.
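As a minimal sketch of the repeated option described above (not the actual code from the repo): the parameter name pdf_files and the echo-only body are assumptions made for illustration, and only the --file-pdf flag name comes from the usage shown later in this post.

```python
import click


@click.command()
@click.option(
    "--file-pdf",
    "pdf_files",  # hypothetical parameter name that receives the collected values
    multiple=True,  # allows the flag to be repeated on the command line
    required=True,
    help="Path to an input PDF; repeat the flag once per file.",
)
def main(pdf_files):
    """List the PDF paths that would be merged, in the order given."""
    # With multiple=True, click passes a tuple holding every value supplied.
    for index, path in enumerate(pdf_files, start=1):
        click.echo(f"{index}: {path}")


if __name__ == "__main__":
    main()
```

Since click collects the repeated values into a tuple, the merging logic can simply iterate over it, regardless of how many files were passed.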
1.2) Packaging the tool with setuptools
Using setuptools to package the project as a command-line tool is straightforward.
This can be done by creating a setup.py file in the project.
In the script, the setup() function is called with the relevant arguments, and that does the job.
Most of the arguments are self-explanatory, like the name of the tool, author, version, dependencies, etc.
Two important things to note are the following:
- find_packages() can be used for recursively finding the packages in the project
- The entry_points needs to be mentioned with the following convention
[console_scripts]
name_of_the_cli_tool = module_name.script_name:main_function_name
1.3) Installing the package as a command-line tool
After cloning the project repository,
the following command can be used to install the package as a command-line tool.
python setup.py develop
1.4) Using the command-line tool after installing
After installing, the command-line tool can be used in the following way:
merge-pdfs --file-pdf --file-pdf --file-pdf .....
A quick thing to note is that any number of PDF files can be given as input in a single command.
Multiple PDF files can be merged into a single PDF file with a single command, which makes this a pretty useful tool, I would say.
Main takeaway
These are some of the new things that I learned in this project. The first is the usage of the click package to parse
command-line args. The decorator-style usage that it provides is much more user-friendly than argparse.
The second is packaging a Python project and converting it to a command-line executable tool using setuptools.
Since this is a relatively simple tool, I did not face many difficulties during this project.
Next Steps
- Package some more useful tools for document and data processing.
- Package some AI tools for processing text docs, like using pre-trained HuggingFace transformers to summarize a document, etc.