Why a Package Manager is vital to DevOps
A Package Manager or Package-Management System ("PM / PMS") is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs and software applications for a computer's operating system in a consistent manner.
HubBucket Registry is a Package Manager Pipeline for:
- GitHub Package Registry - https://help.github.com/en/articles/about-github-package-registry
- NPM - https://www.npmjs.com/
- RubyGems - https://rubygems.org/
- Docker Hub - https://hub.docker.com/
Package Managers ("PM") are used to automate the process of installing, upgrading, configuring, and removing programs.
There are many package managers today for Unix/Linux-based systems. By the mid-2010s, package managers had made their way to Windows as well.
Package managers are also used for installing and managing modules for languages such as Python, Ruby, etc.
A package is simply an archive that contains binaries of software, configuration files, and information about dependencies.
The general workflow starts with the user requesting a package using the Package Manager ("PM") available on the system. The PM then finds the requested package at a known location and downloads it.
The PM then installs the package and advises on any manual steps that it finds necessary.
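The workflow above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real package manager is implemented: the REPOSITORY dictionary, the package names, and the install function are all hypothetical, and downloading and installing are simulated by a dictionary lookup.

```python
# Toy model of the request -> find -> download -> install workflow.
# REPOSITORY, the package names, and install() are hypothetical.

REPOSITORY = {
    "hello": {"version": "2.10", "notes": None},
    "webserver": {
        "version": "1.4",
        "notes": "Edit the config file before first start.",
    },
}

installed = {}  # package name -> installed version


def install(name):
    """Find a package, simulate download and install, return any manual steps."""
    package = REPOSITORY.get(name)  # find the package at a known location
    if package is None:
        raise LookupError(f"package not found: {name}")
    installed[name] = package["version"]  # stand-in for download + install
    return package["notes"]  # manual steps to advise, or None
```

Calling install("webserver") records version 1.4 as installed and returns the note about editing the config file, which mirrors the "advises on any manual steps" part of the workflow.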
Why is the package manager required?
Unix began its journey by being a programmer's OS. This means that every time a new program was written it had to be compiled, linked and run.
Over time, Unix gained the ability to use libraries ("shared objects"), ELF executables, etc. To make building more complicated software easier, make was developed. Source code was shipped with a Makefile (the file used by make). But building was still laborious, as the developer or the maintainer had to take care of the dependencies.
Instead of running make commands every time on every machine with the same configuration, the idea arose of a package manager that ships the executable together with its dependencies to other machines. The earliest PMs evolved from this idea.
Today's Linux distributions contain thousands of packages. This has come about due to modular design, code reuse, and collaborative code creation. However, there's a trade-off between code reuse and incompatible dependencies. Package managers tame this complexity by streamlining the process.
What are the basic functions of a Package Manager ("PM")?
The basic functions of the PM are the following:
- Working with file archivers to extract package archives
- Ensuring the integrity and authenticity of the package by verifying their digital certificates and checksums
- Looking up, downloading, installing or updating existing software from a software repository or app store
- Grouping packages by function to reduce user confusion
- Managing dependencies to ensure a package is installed with all packages it requires, thus avoiding dependency hell
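As a rough illustration of the integrity check in the list above, the following Python sketch compares the SHA-256 digest of a downloaded archive against a published value. The function name verify_checksum and the sample bytes are assumptions made for this example. A checksum alone only detects corruption or tampering with the file; authenticity additionally requires a digital signature over the published checksum, which is not shown here.

```python
import hashlib


def verify_checksum(archive_bytes, expected_sha256):
    """Return True if the archive's SHA-256 digest matches the expected hex value."""
    actual = hashlib.sha256(archive_bytes).hexdigest()
    return actual == expected_sha256.lower()


# Usage: in practice the expected value would come from repository metadata.
archive = b"example package contents"
published = hashlib.sha256(archive).hexdigest()

print(verify_checksum(archive, published))         # True
print(verify_checksum(archive + b"!", published))  # False
```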
The user interface of a PM can be a command line, a graphical interface, or both. Often users can search for packages by name or category. Some PMs even show user reviews or ratings of packages. Batch installation is also possible with a PM. Some may support "safe upgrading" (retain existing versions) or "holding" (lock a package to a specific version).
What exactly is Dependency Hell?
In Windows, the equivalent term could be DLL Hell. When a package depends on another package as a prerequisite, it will either fail to install or not work as expected if the latter is missing or incorrectly set up. A developer then attempts to install the dependency, which in turn may depend on yet more packages. This can quickly become unmanageable if the developer tries to install all these dependencies manually. It can also happen that a dependency is installed, but in an older, incompatible version.
Package managers solve this problem by resolving dependencies. Because every package comes with metadata, the PM knows what the dependencies are and which versions of them ought to be used.
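One way to picture dependency resolution is as a topological sort over the dependency graph: every package is installed only after the packages it requires. The sketch below uses Python's standard graphlib module (Python 3.9 or later). The DEPENDENCIES table and its package names are hypothetical, and the sketch ignores version constraints, which real resolvers must also satisfy.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata: each package maps to the packages it requires.
DEPENDENCIES = {
    "app": {"libweb", "libjson"},
    "libweb": {"libssl"},
    "libjson": set(),
    "libssl": set(),
}


def install_order(target, dependencies):
    """Return the target and its transitive dependencies, dependencies first."""
    needed = {}
    stack = [target]
    while stack:  # walk the graph to collect everything the target needs
        name = stack.pop()
        if name in needed:
            continue
        if name not in dependencies:
            raise LookupError(f"unknown package: {name}")
        needed[name] = dependencies[name]
        stack.extend(dependencies[name])
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(needed).static_order())
```

For install_order("app", DEPENDENCIES), libssl always appears before libweb and app comes last; the relative order of packages that do not depend on each other is not fixed.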
Where do packages get downloaded from?
Packages get downloaded from software repositories, often simply called repos. Alternative terms include sources and feeds. These repos are available online at well-defined locations and they serve as a central distribution point for packages.
For performance and redundancy, these repos may be mirrored by many other locations worldwide. As an example, Cygwin uses mirror sites. Local repos may also mirror remote official repos for saving bandwidth or tighter privacy. Ubuntu's apt-mirror provides this feature.
While most developers will use these repos to download packages, advanced developers can also contribute or upload packages to be hosted at these repos. All repos publish the process that developers need to follow to upload packages. Official repos have a strict review and approval process. Community-managed repos may have a more relaxed process. In all cases, repositories are meant to be malware-free.
How would a package manager know the location of the repository?
Every package manager has associated configuration files that point to repository locations. For example, in Ubuntu, /etc/apt/sources.list contains the locations of repositories. This includes the official repos, but users can also edit this file to get packages from other repos. Likewise, configuration for Fedora and CentOS distributions is at /etc/yum.conf for YUM and /etc/dnf/dnf.conf for DNF. For Arch Linux, it is at /etc/pacman.conf when pacman is used.
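To make the idea of a repository configuration file concrete, here is a minimal Python sketch that parses one entry in the traditional one-line format used in /etc/apt/sources.list: a type (deb or deb-src), optional bracketed options, a URI, a suite, and zero or more components. The function name parse_source_line and the example.org URL are assumptions for illustration. The sketch is simplified and does not cover other formats that APT may also accept.

```python
def parse_source_line(line):
    """Parse a one-line-style APT source entry; return None for blanks/comments."""
    line = line.split("#", 1)[0].strip()  # text after '#' is a comment
    if not line:
        return None
    fields = line.split()
    entry_type = fields.pop(0)  # "deb" or "deb-src"
    options = []
    if fields and fields[0].startswith("["):
        # Options are enclosed in square brackets, e.g. [arch=amd64]
        while fields:
            token = fields.pop(0)
            options.append(token.strip("[]"))
            if token.endswith("]"):
                break
        options = [opt for opt in options if opt]
    if len(fields) < 2:
        raise ValueError(f"malformed source entry: {line!r}")
    uri, suite, *components = fields
    return {
        "type": entry_type,
        "options": options,
        "uri": uri,
        "suite": suite,
        "components": components,
    }


# Hypothetical entry; example.org is a placeholder, not a real repository.
entry = parse_source_line("deb [arch=amd64] http://example.org/ubuntu stable main universe")
print(entry["uri"], entry["suite"], entry["components"])
```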
When adding third-party repos to a package manager, users must take care to check that those repos can be trusted. This is important so that you don't end up with malware infecting your system. In fact, this is one of the problems solved by trusted repos. Instead of downloading software from a third-party website, downloading it via the package manager from a trusted repo is a more secure practice.
What's in a package?
A package includes the software concerned, which may be an application or a shared library. If it's a development package, it will include source files (such as header files) needed to build software that depends on a library. Packages are meant for specific distributions, and therefore installation paths, desktop integration and startup scripts are tailored to the target distribution. Package formats could include *.tgz (for source code archives), *.deb (for Debian) or *.rpm (for Red Hat).
Packages include metadata as well. This will include the summary, description, list of files, version, authorship, targeted architecture, file checksums, licensing, and dependent packages. This metadata is essential for the package manager to do its job correctly.
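The metadata fields listed above could be modelled as a simple record. The following Python sketch is illustrative only: the PackageMetadata class, its field names, and the missing_dependencies helper are hypothetical and do not correspond to the actual layout of a .deb or .rpm package.

```python
from dataclasses import dataclass, field


@dataclass
class PackageMetadata:
    """Hypothetical record holding the kind of metadata a package carries."""
    name: str
    version: str
    summary: str
    architecture: str
    license: str
    depends: list = field(default_factory=list)  # names of required packages
    files: dict = field(default_factory=dict)    # file path -> checksum


def missing_dependencies(meta, installed_names):
    """Return the required packages that are not yet installed, in declared order."""
    return [dep for dep in meta.depends if dep not in installed_names]


meta = PackageMetadata(
    name="app",
    version="1.0",
    summary="Example application",
    architecture="amd64",
    license="MIT",
    depends=["libssl", "libjson"],
)
print(missing_dependencies(meta, {"libssl"}))  # ['libjson']
```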
Examples of Software Language Package Managers:
Modern languages are delivered as a core part that comes with the default installation plus a wide range of optional packages that can be installed when necessary. Those that manage these add-ons are called language package managers. Within the scope of a project or application, the term dependency manager is used. The term package manager is used at system/language level whereas dependency manager is used at the project level. For example, in PHP, PEAR can be called a package manager while Composer is a dependency manager.
Let's say you're working on a Python project. It may depend on many other Python packages for correct execution. Moreover, another Python project will have its own dependencies. A dependency manager helps developers manage these dependencies and share their project settings with others in a consistent manner.
Here are some examples, in the format of "language: (manager, repository)":
- Python: (pip, PyPI)
- PHP: (Composer, Packagist)
- Ruby: (RubyGems, RubyGems), (Bundler, RubyGems)
- Web frontend: (Bower, Bower), (Yarn, (Yarn, NPM))
- Java: (Maven, Maven), (Gradle, (Ivy, Maven, etc.))
If there are options, how should I choose a package manager?
Here are a few things to consider:
- Ease of use: A graphical interface can be great for beginners. For a command-line interface, commands have to be intuitive.
- Features: Look for more than just managing packages: list available/installed packages, search, filter, remote/local installs, wildcard support, source/binary package support, etc.
- Customization: Check if it supports customization, such as an interactive mode that lets the user decide on the next step during installation.
- Speed: The faster the better; this may depend on how well caching is done.
- Ease of development: For package developers, the workflow should be easy including a simple upload to a repository.