TL;DR
This post briefly goes over the contents of a gem package, the RubyGem Gem::Specification, Gem::Indexer and Gem::Server classes, a breakdown of the different RubyGem index files (specs.4.8, prerelease_specs.4.8, and latest_specs.4.8) and how they are used when determining dependencies and installing gem packages onto a system.
What’s in a gem?
A gem is just a tar archive containing the gem’s files and metadata. The two entries we are looking for are metadata.gz and data.tar.gz.
Unpack a gem archive like this:
$ tar -vxf coolgem-1.6.4.gem
x metadata.gz
x data.tar.gz
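The same listing can be produced from Ruby. The following is a minimal sketch, assuming the Gem::Package::TarWriter and Gem::Package::TarReader classes that ship with RubyGems. The build_fake_gem and gem_entries helpers are hypothetical names, and the archive is built in memory with placeholder contents so the example runs without a real .gem file.

```ruby
require 'rubygems/package'
require 'stringio'

# Build a tiny in-memory tar archive shaped like a .gem file.
# The entry bodies are placeholders, not real gzip data.
def build_fake_gem
  io = StringIO.new(String.new)
  Gem::Package::TarWriter.new(io) do |tar|
    { 'metadata.gz' => 'fake metadata', 'data.tar.gz' => 'fake payload' }.each do |name, body|
      tar.add_file_simple(name, 0o444, body.bytesize) { |entry| entry.write(body) }
    end
  end
  io.rewind
  io
end

# Return the names of the entries in a .gem archive (a plain tar file).
def gem_entries(io)
  names = []
  Gem::Package::TarReader.new(io).each { |entry| names << entry.full_name }
  names
end

puts gem_entries(build_fake_gem)
```

With a real package, something like File.open('coolgem-1.6.4.gem', 'rb') { |f| gem_entries(f) } should return the same two entry names, possibly alongside checksum or signature entries.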
Let’s take a look at these data.tar.gz and metadata.gz files.
data.tar.gz
The data.tar.gz archive contains the data payload for the gem package. The data folder that gets extracted contains the executable code, required files, and anything else that was listed in the .gemspec file.
uncompressed data folder
├── LICENSE
├── README
├── Rakefile
├── bin
│   └── coolgem
├── lib
│   └── coolgem
│       └── test
│           ├── gem
│           │   └── version.rb
│           └── gem.rb
└── coolgem.gemspec
metadata.gz
The metadata.gz file is just a gzipped file containing a YAML representation of the gem package, as defined in a gem’s .gemspec file. The uncompressed YAML metadata contains the gem package information and includes version information, dependencies, and lists all the required paths and files related to the data payload.
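To make the metadata.gz format concrete, here is a hedged sketch of a round trip: a specification is dumped to YAML and gzipped (standing in for a real metadata.gz entry), then gunzipped and parsed back. It assumes Zlib.gzip and Zlib.gunzip from Ruby's standard library and Gem::Specification.from_yaml from RubyGems. The spec values are taken from the example gem in this post.

```ruby
require 'rubygems'
require 'zlib'

# Stand-in for the metadata.gz entry of a real gem:
# a gzipped YAML dump of a Gem::Specification.
spec = Gem::Specification.new do |s|
  s.name    = 'coolgem'
  s.version = '1.6.4'
  s.summary = 'Do cool stuff with things'
  s.authors = ['An Person']
  s.add_development_dependency 'rake', '10.4.2'
end
metadata_gz = Zlib.gzip(spec.to_yaml)

# Reverse the process: gunzip, then parse the YAML back into a specification.
yaml   = Zlib.gunzip(metadata_gz)
loaded = Gem::Specification.from_yaml(yaml)

puts loaded.full_name
puts loaded.dependencies.map(&:name).inspect
```

For a real package, the gzipped bytes would come from the metadata.gz entry of the .gem archive rather than from Zlib.gzip.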
The RubyGem Gem::Specification Class
The Gem::Specification class contains information (including runtime and development dependency information) for a specific gem. It’s typically defined in a .gemspec file, and it’s used by the gem build command when creating a gem package. This is an example of a Gem::Specification used within a .gemspec file:
Gem::Specification.new do |s|
  s.name          = 'coolgem'
  s.version       = '1.6.4'
  s.licenses      = ['MIT']
  s.summary       = "Do cool stuff with things"
  s.description   = "This does cool stuff with some things"
  s.authors       = ["An Person"]
  s.email         = 'an.person@packagecloud.io'
  s.files         = Dir.glob("{bin,lib}/**/*") + %w(LICENSE README Rakefile)
  # Executable names are relative to bindir, which defaults to "bin".
  s.executables   = ['coolgem']
  s.require_paths = ["lib", "bin"]
  s.homepage      = 'https://packagecloud.io'
  s.add_development_dependency "rake", "10.4.2"
end
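A specification can also be built and inspected directly in Ruby, which is a quick way to see how runtime and development dependencies are stored. This is a small sketch. The rack runtime dependency is hypothetical and added only to show both dependency types.

```ruby
require 'rubygems'

spec = Gem::Specification.new do |s|
  s.name    = 'coolgem'
  s.version = '1.6.4'
  s.summary = 'Do cool stuff with things'
  # Hypothetical runtime dependency, added only for illustration.
  s.add_runtime_dependency 'rack', '~> 1.6'
  s.add_development_dependency 'rake', '10.4.2'
end

puts spec.full_name

# Each dependency records its type, name, and version requirement.
spec.dependencies.each do |dep|
  puts "#{dep.type}: #{dep.name} (#{dep.requirement})"
end
```

The runtime_dependencies and development_dependencies methods return the same dependencies filtered by type.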
The RubyGem Gem::Indexer Class
The Gem::Indexer class is used to build the gem repository index. On initialization, Gem::Indexer takes a directory as its first argument and an optional options hash as its second. The directory is a path that contains a gems sub-directory, which holds (or will hold) all the .gem files to be indexed. The options hash is used to set a value (build_modern: false, or build_legacy on older versions) for indices targeted at versions of RubyGems prior to 1.2.
> Gem::Indexer.new(directory, { build_modern: true })
Building the index files
The generate_index method can be called on an instance of Gem::Indexer to build the necessary indices used by the RubyGems API.
> Gem::Indexer.new('/path/to/repo', { build_modern: true }).generate_index
Generating Marshal quick index gemspecs for 2 gems
.
Complete
Generated Marshal quick index gemspecs: 0.001s
Generating specs index
Generated specs index: 0.000s
Generating latest specs index
Generated latest specs index: 0.000s
Generating prerelease specs index
Generated prerelease specs index: 0.000s
Compressing indicies
Compressed indicies: 0.001s
=> ["specs.4.8", "specs.4.8.gz", "latest_specs.4.8", "latest_specs.4.8.gz", "prerelease_specs.4.8", "prerelease_specs.4.8.gz"]
RubyGem Index Files
The RubyGem index files are Marshal’d and gzipped arrays, and the 4.8 in the filename refers to the current version of Ruby’s Marshal format. Read more about Ruby’s Marshal format and Marshal’ing Library.
specs.4.8
specs.4.8.gz
latest_specs.4.8
latest_specs.4.8.gz
prerelease_specs.4.8
prerelease_specs.4.8.gz
specs.4.8.gz
The specs.4.8.gz file is a Marshal’d and gzipped array that contains smaller arrays that hold the name, version and platform for each non-prerelease gem that has been indexed.
> Marshal.load(Gem.gunzip(File.read("specs.4.8.gz")))
=> [["coolgem", #<Gem::Version "1.6.4">, "ruby"]]
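To show both directions of this format without needing a repository on disk, the sketch below builds an index in memory and reads it back. It uses Zlib.gzip and Zlib.gunzip from the standard library in place of the Gem.gunzip helper, and the tuples are hypothetical sample data.

```ruby
require 'rubygems'
require 'zlib'

# Hypothetical index contents: one [name, version, platform] tuple per gem.
tuples = [
  ['coolgem', Gem::Version.new('1.6.4'), 'ruby'],
  ['rack',    Gem::Version.new('1.5.2'), 'ruby']
]

# Writing side: Marshal the array, then gzip it (the shape of specs.4.8.gz).
specs_gz = Zlib.gzip(Marshal.dump(tuples))

# Reading side: gunzip, then unmarshal.
loaded = Marshal.load(Zlib.gunzip(specs_gz))
loaded.each { |name, version, platform| puts "#{name} #{version} (#{platform})" }

# The "4.8" in the filename matches Ruby's Marshal version constants.
marshal_version = "#{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"
puts marshal_version
```

Note that Marshal.load should only be used on index data from a source you trust.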
prerelease_specs.4.8.gz
Similarly, the prerelease_specs.4.8.gz file is a Marshal’d and gzipped array that contains smaller arrays holding the name, version and platform for each prerelease gem that has been indexed.
> Marshal.load(Gem.gunzip(File.read("prerelease_specs.4.8.gz")))
=> [["rack", #<Gem::Version "1.6.0.beta2">, "ruby"]]
latest_specs.4.8.gz
The latest_specs.4.8.gz file is also a Marshal’d and gzipped array containing smaller arrays holding the name, version and platform for only the latest non-prerelease gems that have been indexed. This index file is only useful when you are certain you want to install the latest version of a gem.
> Marshal.load(Gem.gunzip(File.read("latest_specs.4.8.gz")))
=> [["coolgem", #<Gem::Version "1.6.4">, "ruby"]]
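The relationship between the three indices can be sketched in a few lines of Ruby. This is a simplified illustration of the descriptions above, not the Gem::Indexer implementation, and the gem versions listed are hypothetical sample data.

```ruby
require 'rubygems'

# Hypothetical set of indexed gems as [name, version, platform] tuples.
all = [
  ['coolgem', Gem::Version.new('1.6.3'), 'ruby'],
  ['coolgem', Gem::Version.new('1.6.4'), 'ruby'],
  ['rack',    Gem::Version.new('1.5.2'), 'ruby'],
  ['rack',    Gem::Version.new('1.6.0.beta2'), 'ruby']
]

# prerelease_specs holds only prerelease versions; specs holds everything else.
prerelease, specs = all.partition { |_, version, _| version.prerelease? }

# latest_specs holds the highest release version per name and platform.
latest = specs.group_by { |name, _, platform| [name, platform] }
              .map { |_, group| group.max_by { |_, version, _| version } }

puts "specs:      #{specs.map { |n, v, _| "#{n}-#{v}" }.join(', ')}"
puts "prerelease: #{prerelease.map { |n, v, _| "#{n}-#{v}" }.join(', ')}"
puts "latest:     #{latest.map { |n, v, _| "#{n}-#{v}" }.join(', ')}"
```

Gem::Version#prerelease? treats any version containing a letter, such as 1.6.0.beta2, as a prerelease.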
Gem dependencies and RubyGem gemspec files
The dependency information for an indexed gem can be found inside the gem’s gemspec file. When Gem::Indexer generates the index for a set of gems, it iterates over each gem in the gems directory and parses the gemspecs from the .gem files. These parsed gemspecs are then written to a directory named quick/Marshal.4.8/ as individual Marshal’d and zlib-deflated gemspec.rz files (note that .rz files are deflated rather than gzipped, which is why they are read with Gem.inflate instead of Gem.gunzip). The quick directory can be found inside the directory that was passed to Gem::Indexer on initialization. The Marshal.4.8 directory inside it is named after Ruby’s current Marshal format version (4.8) and contains the gemspec.rz files:
quick
└── Marshal.4.8
    ├── coolgem-1.6.4.gemspec.rz
    └── rack-1.6.0.beta2.gemspec.rz
Taking a look at the prerelease rack gem, we can see the dependency information:
> spec = Marshal.load(Gem.inflate(File.read('quick/Marshal.4.8/rack-1.6.0.beta2.gemspec.rz')))
> spec.dependencies
[<Gem::Dependency type=:development name="bacon" requirements=">= 0">, <Gem::Dependency type=:development name="rake" requirements=">= 0">]
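The following sketch reproduces this round trip in memory, so it can be run without an indexed repository. It assumes Zlib::Deflate.deflate and Zlib::Inflate.inflate from the standard library as stand-ins for the Gem.deflate and Gem.inflate helpers. The spec is a placeholder that borrows only the name, version, and dependency names from the rack example.

```ruby
require 'rubygems'
require 'zlib'

# Placeholder spec standing in for the real rack-1.6.0.beta2 gemspec.
spec = Gem::Specification.new do |s|
  s.name    = 'rack'
  s.version = '1.6.0.beta2'
  s.summary = 'placeholder summary'
  s.authors = ['An Person']
  s.add_development_dependency 'bacon'
  s.add_development_dependency 'rake'
end

# Writing side: Marshal the spec and deflate it, as a .gemspec.rz file is stored.
gemspec_rz = Zlib::Deflate.deflate(Marshal.dump(spec))

# Reading side: inflate, then unmarshal back into a Gem::Specification.
loaded = Marshal.load(Zlib::Inflate.inflate(gemspec_rz))
loaded.dependencies.each do |dep|
  puts "#{dep.type} #{dep.name} #{dep.requirement}"
end
```

Only a subset of the specification's fields survives this Marshal format, but the dependency list is among them, which is what installation needs.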
The Gem::Server Class
The Gem::Server class provides a way for users to consume gem packages via gem install. A running Gem::Server instance listens on a given port and allows users to download the different index files, gemspec files, rdoc documentation, and installable gem packages on a set of routes.
gem_server = Gem::Server.new Gem.dir, 8089, false
gem_server.run
Routes
From the stdlib docs on Gem::Server:
/ - Browsing of gem spec files for installed gems
/specs.4.8.gz - specs name/version/platform index
/latest_specs.4.8.gz - latest specs name/version/platform index
/quick/ - Individual gemspecs
/gems - Direct access to download the installable gems
/rdoc?q= - Search for installed rdoc documentation
Gem Indices and Installing Gem Packages
These index files (latest_specs.4.8, specs.4.8, prerelease_specs.4.8) are requested when the gem install command is used to install a gem package. The following examples show the different specs files that are used:
when installing a gem with a specific version: gem install rails -v=4.0.0
https://packagecloud.io/computology/test-gems/specs.4.8.gz
200 OK
when installing a gem without specific version: gem install rails
https://packagecloud.io/computology/test-gems/latest_specs.4.8.gz
200 OK
and when installing a gem using the --pre flag to specify a prerelease version: gem install rails --pre
https://packagecloud.io/computology/test-gems/prerelease_specs.4.8.gz
200 OK
Once the required specs file is resolved, the gemspec file for the gem to be installed is downloaded:
https://packagecloud.io/computology/test-gems/quick/Marshal.4.8/coolgem-1.6.4.gemspec.rz
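The request pattern shown above can be summarized in code. The helpers below are hypothetical and are not part of RubyGems. They only mirror which index file this post observed for each kind of install and how the gemspec URL is put together, assuming a gem built for the default ruby platform.

```ruby
# Hypothetical helpers that mirror the requests shown above; they are not
# part of RubyGems.
SOURCE = 'https://packagecloud.io/computology/test-gems'

# Pick the index file to request for a given kind of install.
def index_url(version: nil, prerelease: false)
  file = if prerelease
           'prerelease_specs.4.8.gz'
         elsif version
           'specs.4.8.gz'
         else
           'latest_specs.4.8.gz'
         end
  "#{SOURCE}/#{file}"
end

# Location of the Marshal'd, deflated gemspec for one gem version.
def gemspec_url(name, version)
  "#{SOURCE}/quick/Marshal.4.8/#{name}-#{version}.gemspec.rz"
end

puts index_url(version: '4.0.0')     # gem install rails -v=4.0.0
puts index_url                       # gem install rails
puts index_url(prerelease: true)     # gem install rails --pre
puts gemspec_url('coolgem', '1.6.4')
```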
Resolving gem dependencies
Once the gemspec file is downloaded for a gem, its dependencies can now be resolved. This process of walking through the specs and gemspec files will need to happen for each dependency until all required dependencies have been installed.
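The walk described above can be illustrated with a toy resolver. This is a deliberately naive sketch and not the resolver RubyGems or Bundler actually use: it works on a hypothetical in-memory repository (REPO), picks the highest version satisfying each requirement, and does not handle conflicting requirements. The make_spec and resolve helpers, and the rack dependency of coolgem, are assumptions made for the example.

```ruby
require 'rubygems'

# Build a spec with optional runtime dependencies (name => requirement).
def make_spec(name, version, deps = {})
  Gem::Specification.new do |s|
    s.name    = name
    s.version = version
    deps.each { |dep_name, req| s.add_runtime_dependency(dep_name, req) }
  end
end

# Hypothetical repository: gem name => specs available for that gem.
REPO = {
  'coolgem' => [make_spec('coolgem', '1.6.4', 'rack' => '~> 1.5')],
  'rack'    => [make_spec('rack', '1.5.2'),
                make_spec('rack', '1.6.4'),
                make_spec('rack', '2.0.0')]
}

# Recursively pick one spec per gem, following runtime dependencies.
# A gem that is already resolved is not revisited, so conflicts go unnoticed.
def resolve(name, requirement = Gem::Requirement.default, resolved = {})
  return resolved if resolved.key?(name)

  candidates = REPO.fetch(name, []).select { |s| requirement.satisfied_by?(s.version) }
  spec = candidates.max_by(&:version)
  raise "no version of #{name} satisfies #{requirement}" unless spec

  resolved[name] = spec
  spec.runtime_dependencies.each { |dep| resolve(dep.name, dep.requirement, resolved) }
  resolved
end

puts resolve('coolgem').values.map(&:full_name)
```

Here the ~> 1.5 requirement excludes rack 2.0.0, so the highest matching version, 1.6.4, is selected.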
Downloading and Unpacking a gem package
The actual .gem packages are found in the indexed gems directory. The RubyGems API expects direct access to download the installable gems via the /gems path of your source:
https://packagecloud.io/computology/test-gems/gems/coolgem-1.6.4.gem
200 OK
After all the dependencies are installed, the gem command will unpack and install the gem contents to your system:
$ sudo gem install -V coolgem
...
GET https://packagecloud.io/computology/test-gems/latest_specs.4.8.gz
200 OK
GET https://packagecloud.io/computology/test-gems/quick/Marshal.4.8/coolgem-1.6.4.gemspec.rz
200 OK
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/Gemfile
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/LICENSE.txt
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/README.md
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/Rakefile
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/bin/coolgem
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/lib/coolgem/test/gem.rb
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/lib/coolgem/test/gem/version.rb
/Users/person/.rvm/rubies/ruby-2.2.3-p451/lib/ruby/gems/2.2.3/gems/coolgem-1.6.4/coolgem.gemspec
/Users/person/.rvm/rubies/ruby-2.2.3-p451/bin/coolgem
Successfully installed coolgem-1.6.4
1 gem installed
Conclusion
When building your own gem server, or trying to understand how the gem command installs packages and resolves dependencies, it’s helpful to understand what the index and gemspec files are, and how they are used by the gem command. While knowing how these files and indices are used isn’t a requirement, it can definitely help when debugging why gem or Bundler isn’t finding a particular gem package or when a gem dependency breaks your app. Happy packaging.