createrepo 0.9.9 uses a library called rpmUtils, which is provided by yum. This library contains code that parses version strings from an RPM package. That code has a bug that makes rpmUtils output incorrect version and release strings, which in turn makes createrepo generate incorrect metadata for several packages.
CentOS 6 and CentOS 7 use createrepo to generate metadata for the official YUM package repository, so the official metadata has incorrect version and release strings for several packages.
RPM file format
RPM is a binary file format containing a few sections of metadata followed by a compressed CPIO archive. The RPM header contains an index structure which consists of 16-byte entries. An index entry contains a tag value, tag type, data offset, and a count.
Some useful tag values include name, version, release, architecture, and so on.
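As a rough illustration of the index layout described above, the sketch below unpacks one 16-byte index entry with Python's struct module. It assumes the four fields are big-endian 32-bit integers stored in the order tag, type, offset, count. The IndexEntry name, the parse_index_entry helper, and the sample bytes are made up for illustration and are not read from a real package.

```python
import struct
from collections import namedtuple

# Hypothetical container for one header index entry (name is illustrative).
IndexEntry = namedtuple("IndexEntry", "tag type offset count")

# Assumption: four big-endian 32-bit integers, 16 bytes in total.
ENTRY_FORMAT = ">IIII"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)


def parse_index_entry(raw):
    """Unpack a single 16-byte index entry into its four fields."""
    if len(raw) != ENTRY_SIZE:
        raise ValueError("an index entry must be exactly %d bytes" % ENTRY_SIZE)
    return IndexEntry(*struct.unpack(ENTRY_FORMAT, raw))


# Hand-built sample bytes with arbitrary illustrative values.
sample = struct.pack(ENTRY_FORMAT, 1000, 6, 2, 1)
entry = parse_index_entry(sample)
print(entry)
```

A real reader would first locate the header and learn how many entries it holds before looping over them; that part is left out here.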
The librpm library can be used to work with RPM files and allows programs to read and extract RPM file information, among other operations. The rpm user program works by using librpm.
For example, it’s possible to dump all known tags on the command line by running: rpm --querytags
RPM EVR strings
RPM versions are fully expressed as strings of the form: epoch:version-release. This is where the acronym EVR comes from: Epoch, Version, Release. Throughout the librpm source, these strings are referred to as EVR.
The epoch is a special version number that can be set by the person writing the RPM packaging information if the software does not have a version number scheme that librpm can parse. In most cases, this value is not set by the packaging author.
Official algorithm for parsing EVRs
The official algorithm for parsing EVR strings can be found in librpm. The exact file and line number will vary depending on the version of the source, but version 18.104.22.168 (available here: http://www.rpm.org/wiki/Releases/22.214.171.124) contains a file called ./lib/rpmds.c which has a function named parseEVR at line 949.
The important take-away from this simple algorithm is that the release string begins with the character following the last hyphen in the EVR string (found by using the C function strrchr).
For example, the string: 7.4.160-1 is parsed by librpm into a version of 7.4.160 and a release of 1.
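The last-hyphen rule can be sketched in a few lines of Python. This is not a port of parseEVR, only an illustration of the behaviour described above: the function name parse_evr_last_hyphen is hypothetical, and the epoch handling assumes the epoch is a leading run of digits followed by a colon.

```python
def parse_evr_last_hyphen(evr):
    """Split an EVR string into (epoch, version, release).

    Sketch only: the release starts after the LAST hyphen, which is
    what a strrchr(evr, '-') lookup finds in C.
    Assumption: the epoch is a leading run of digits followed by ':'.
    """
    epoch = None
    rest = evr
    head, colon, tail = evr.partition(":")
    if colon and head.isdigit():
        epoch, rest = head, tail

    # rpartition splits on the last occurrence of the separator.
    version, hyphen, release = rest.rpartition("-")
    if not hyphen:
        # No hyphen at all: the whole string is the version.
        return epoch, rest, None
    return epoch, version, release


print(parse_evr_last_hyphen("7.4.160-1"))
print(parse_evr_last_hyphen("2:1.0-alpha-2"))
```

Using rpartition keeps any earlier hyphens inside the version, which is the property that matters for the rest of this post.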
I assume the source provided by librpm contains the official parsing algorithm, since librpm is used to power rpm and the other command line tools that actually install, remove, and deal with rpm files once they are placed on the target system.
Buggy code in createrepo, yum, and rpmUtils
You can clone the yum source from git by following the instructions here: http://yum.baseurl.org/.
yum contains a directory named rpmUtils which is installed as a Python library package when yum is installed. In the rpmUtils library, a function named stringToVersion around line 391 attempts to parse an EVR string and output the version, release, and epoch.
Tragically, this code attempts to locate the start of the release string by using a Python function called find (described here: https://docs.python.org/2/library/string.html#string.find). find returns the position of the first occurrence of its argument, so the resulting release string starts with the first character following the first hyphen.
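To make the difference concrete, here is a minimal sketch of a first-hyphen split. It mimics the behaviour described above and is not copied from rpmUtils; the function name split_on_first_hyphen is hypothetical and epoch handling is left out for brevity.

```python
def split_on_first_hyphen(evr):
    """Return (version, release), splitting at the FIRST hyphen.

    Illustrative only: this mirrors the described behaviour and is
    not the code shipped in rpmUtils.
    """
    i = evr.find("-")  # index of the first hyphen, or -1 if there is none
    if i == -1:
        return evr, ""
    return evr[:i], evr[i + 1:]


# With a single hyphen the result matches a last-hyphen split.
print(split_on_first_hyphen("7.4.160-1"))
# With two hyphens everything after the first one lands in the release.
print(split_on_first_hyphen("1.0-alpha-2"))
```

The two approaches only diverge when the string contains more than one hyphen, which is why most packages are unaffected.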
createrepo uses rpmUtils for generating metadata, but it also duplicates this code when dealing with deltarpms.
You can clone the createrepo source from git by following the instructions here: http://createrepo.baseurl.org/.
createrepo 0.9.9 contains code to parse EVR strings in the file ./createrepo/deltarpms.py around line 70 in a method named _stringToVersion.
The result of the yum and createrepo algorithms for most packages is equivalent to the result of the librpm algorithm.
However, there are a few packages for which createrepo will generate different version and release strings than librpm will generate. Since rpm and many other command line tools are built upon librpm, this can lead to subtle bugs in version comparison which can affect upgrades, downgrades, and package installation.
createrepo generates metadata describing what each package “requires” so that it can be installed and what it “provides” after it has been installed. When doing so, it includes the version and release of each requirement.
An example of this bug can be seen with the package maven-repository-builder-1.0-0.5.alpha2.el7.noarch.rpm. createrepo generates a “provide” entry for maven-repository-builder with the EVR string “1.0-alpha-2”, which was split (incorrectly) on the first hyphen, giving a version of 1.0 and a release of alpha-2.
The correct metadata would split this string on the last hyphen, as librpm does, giving a version of 1.0-alpha and a release of 2.
This is an unfortunate bug because now when yum or other tools attempt to satisfy the dependency graph, the tools will be using different understandings of the version and release strings for maven-repository-builder and may be unable to find a path through the graph even though one exists.
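One way to look for affected entries is to run both splitting rules over a set of EVR strings and report the ones where they disagree. The sketch below does that for a few sample strings; the helper names split_first, split_last, and disagreeing are hypothetical, the strings are stand-ins rather than real repository data, and epochs are ignored.

```python
def split_first(evr):
    """First-hyphen rule: the behaviour attributed to rpmUtils above."""
    version, _, release = evr.partition("-")
    return version, release


def split_last(evr):
    """Last-hyphen rule: the behaviour attributed to librpm above."""
    version, hyphen, release = evr.rpartition("-")
    return (version, release) if hyphen else (evr, "")


def disagreeing(evrs):
    """Return the EVR strings that the two rules split differently."""
    return [e for e in evrs if split_first(e) != split_last(e)]


samples = ["7.4.160-1", "1.0-alpha-2", "1.0"]
for evr in disagreeing(samples):
    print(evr, split_first(evr), split_last(evr))
```

Only strings with more than one hyphen are reported, so a check like this could be pointed at the version strings in a repository's metadata to estimate how many entries are exposed.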
According to my tests, there are 12 affected packages in the official CentOS 7 repository.
You should use createrepo from the tag createrepo_0_4_10 in the git source tree (SHA e9ab4444d67cd79533441e8d9b65488f423661a2), as this version has an EVR parsing algorithm which is identical to librpm. This version of createrepo does not use rpmUtils and does not suffer from this bug.
Hopefully, both yum and createrepo will be modified to fix this bug and reduce code duplication in the future.
Alternatively, you can avoid tracking this (and other) bugs in packaging tools by uploading your RPMs and other packages to packagecloud.io.