Bundle URIs
{{< details >}}
- Status: Experiment
{{< /details >}}
{{< history >}}
- Introduced in GitLab 17.0 with a flag named gitaly_bundle_uri. Disabled by default.
{{< /history >}}
{{< alert type="flag" >}}
On GitLab Self-Managed, by default this feature is not available.
To make it available, an administrator can enable the feature flag named gitaly_bundle_uri.
On GitLab.com and GitLab Dedicated, this feature is not available. This feature is not ready for production use.
{{< /alert >}}
Gitaly supports Git bundle URIs. Bundle URIs are locations where Git can download one or more bundles to bootstrap the object database before fetching the remaining objects from a remote. Bundle URIs are built in to the Git protocol.
Using Bundle URIs can:
- Speed up clones and fetches for users with a poor network connection to the GitLab server. The bundles can be stored on a CDN, making them available around the world.
- Reduce the load on servers that run CI/CD jobs. If CI/CD jobs can pre-load bundles from somewhere else, the remaining work to incrementally fetch missing objects and references creates a lot less load on the server.
Prerequisites
- The Git configuration transfer.bundleURI must be enabled on Git clients.
- GitLab Runner 16.6 or later.
- In CI/CD pipeline configuration, the default Git strategy set to git clone.
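On a Git client, the first prerequisite could be met as follows. This is a minimal sketch that assumes you want the setting for every repository of the current user (`--global`). To limit it to a single repository, run the same command without `--global` from inside that repository.

```shell
# Enable bundle URI support for all repositories of the current user.
git config --global transfer.bundleURI true

# Print the stored value to confirm the setting.
git config --global --get transfer.bundleURI
```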
Server configuration
You must configure where the bundles are stored. Gitaly supports the following storage services:
- Google Cloud Storage
- AWS S3 (or compatible)
- Azure Blob Storage
- Local file storage (not recommended)
Configure Azure Blob storage
How you configure Azure Blob storage for Bundle URI depends on the type of
installation you have. For self-compiled installations, you must set the
AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY environment variables outside of GitLab.
{{< tabs >}}
{{< tab title="Linux package (Omnibus)" >}}
Edit /etc/gitlab/gitlab.rb and configure the bundle_uri.go_cloud_url:
gitaly['env'] = {
  'AZURE_STORAGE_ACCOUNT' => 'azure_storage_account',
  'AZURE_STORAGE_KEY' => 'azure_storage_key' # or 'AZURE_STORAGE_SAS_TOKEN'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 'azblob://<bucket>'
  }
}
{{< /tab >}}
{{< tab title="Self-compiled (source)" >}}
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "azblob://<bucket>"
{{< /tab >}}
{{< /tabs >}}
Configure Google Cloud storage
Google Cloud storage (GCP) authenticates using Application Default Credentials. Set up Application Default Credentials on each Gitaly server using either:
- The gcloud auth application-default login command.
- The GOOGLE_APPLICATION_CREDENTIALS environment variable. For self-compiled installations, set the environment variable outside of GitLab.
For more information, see Application Default Credentials.
The destination bucket is configured using the go_cloud_url option.
{{< tabs >}}
{{< tab title="Linux package (Omnibus)" >}}
Edit /etc/gitlab/gitlab.rb and configure the go_cloud_url:
gitaly['env'] = {
  'GOOGLE_APPLICATION_CREDENTIALS' => '/path/to/service.json'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 'gs://<bucket>'
  }
}
{{< /tab >}}
{{< tab title="Self-compiled (source)" >}}
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "gs://<bucket>"
{{< /tab >}}
{{< /tabs >}}
Configure S3 storage
To configure S3 storage authentication:
- If you authenticate with the AWS CLI, you can use the default AWS session.
- Otherwise, you can use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. For self-compiled installations, set the environment variables outside of GitLab.
For more information, see AWS Session documentation.
The destination bucket and region are configured using the go_cloud_url option.
{{< tabs >}}
{{< tab title="Linux package (Omnibus)" >}}
Edit /etc/gitlab/gitlab.rb and configure the go_cloud_url:
gitaly['env'] = {
  'AWS_ACCESS_KEY_ID' => 'aws_access_key_id',
  'AWS_SECRET_ACCESS_KEY' => 'aws_secret_access_key'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 's3://<bucket>?region=us-west-1'
  }
}
{{< /tab >}}
{{< tab title="Self-compiled (source)" >}}
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "s3://<bucket>?region=us-west-1"
{{< /tab >}}
{{< /tabs >}}
Configure S3-compatible servers
S3-compatible servers such as MinIO are configured similarly to S3 with the
addition of the endpoint parameter.
The following parameters are supported:
- region: The AWS region.
- endpoint: The endpoint URL.
- disableSSL: A value of true disables SSL.
- s3ForcePathStyle: A value of true forces path-style addressing.
{{< tabs >}}
{{< tab title="Linux package (Omnibus)" >}}
Edit /etc/gitlab/gitlab.rb and configure the go_cloud_url:
gitaly['env'] = {
  'AWS_ACCESS_KEY_ID' => 'minio_access_key_id',
  'AWS_SECRET_ACCESS_KEY' => 'minio_secret_access_key'
}
gitaly['configuration'] = {
  bundle_uri: {
    go_cloud_url: 's3://<bucket>?region=minio&endpoint=my.minio.local:8080&disableSSL=true&s3ForcePathStyle=true'
  }
}
{{< /tab >}}
{{< tab title="Self-compiled (source)" >}}
Edit /home/git/gitaly/config.toml and configure go_cloud_url:
[bundle_uri]
go_cloud_url = "s3://<bucket>?region=minio&endpoint=my.minio.local:8080&disableSSL=true&s3ForcePathStyle=true"
{{< /tab >}}
{{< /tabs >}}
Generating bundles
After Gitaly is configured, it can generate bundles. Bundle generation is a manual process. To generate a bundle for Bundle URI, run:
sudo -u git -- /opt/gitlab/embedded/bin/gitaly bundle-uri \
--config=<config-file> \
--storage=<storage-name> \
--repository=<relative-path>
This command generates the bundle and stores it on the configured storage service. Gitaly does not automatically refresh the generated bundle. When you want to generate a more recent version of a bundle, you must run the command again.
You can schedule this command with a tool like cron(8).
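When several repositories need fresh bundles, the command can be wrapped in a small helper that a scheduled job calls. The following is a sketch, not a supported tool. The function name generate_bundles, the DRY_RUN switch, and the default values for the configuration path and storage name are all assumptions; replace them with the values of your installation.

```shell
#!/bin/sh
# Sketch of a wrapper that regenerates bundles for several repositories.
# Every default below is an assumption; override it through the environment.
GITALY_BIN="${GITALY_BIN:-/opt/gitlab/embedded/bin/gitaly}"
GITALY_CONFIG="${GITALY_CONFIG:-/var/opt/gitlab/gitaly/config.toml}"
GITALY_STORAGE="${GITALY_STORAGE:-default}"

# Reads one repository relative path per line from standard input.
# Blank lines and lines starting with "#" are skipped.
# With DRY_RUN=1 the commands are printed instead of executed.
generate_bundles() {
  while IFS= read -r repo; do
    case "$repo" in
      ''|'#'*) continue ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$GITALY_BIN bundle-uri --config=$GITALY_CONFIG --storage=$GITALY_STORAGE --repository=$repo"
    else
      # </dev/null keeps the command from consuming the list on stdin.
      sudo -u git -- "$GITALY_BIN" bundle-uri \
        --config="$GITALY_CONFIG" \
        --storage="$GITALY_STORAGE" \
        --repository="$repo" </dev/null
    fi
  done
}
```

A script that defines this function and ends with a call such as `generate_bundles < repos.txt`, where repos.txt is a hypothetical file listing relative paths, could then be run from a nightly cron job. Running it once with DRY_RUN=1 shows the commands without generating anything.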
Bundle URI example
In the following example, we demonstrate the difference between cloning
gitlab.com/gitlab-org/gitlab.git with and without using bundle URI.
$ git -c transfer.bundleURI=false clone https://gitlab.com/gitlab-org/gitlab.git
Cloning into 'gitlab'...
remote: Enumerating objects: 5271177, done.
remote: Total 5271177 (delta 0), reused 0 (delta 0), pack-reused 5271177
Receiving objects: 100% (5271177/5271177), 1.93 GiB | 32.93 MiB/s, done.
Resolving deltas: 100% (4140349/4140349), done.
Updating files: 100% (71304/71304), done.
$ git -c transfer.bundleURI=true clone https://gitlab.com/gitlab-org/gitlab.git
Cloning into 'gitlab'...
remote: Enumerating objects: 1322255, done.
remote: Counting objects: 100% (611708/611708), done.
remote: Total 1322255 (delta 611708), reused 611708 (delta 611708), pack-reused 710547
Receiving objects: 100% (1322255/1322255), 539.66 MiB | 22.98 MiB/s, done.
Resolving deltas: 100% (1026890/1026890), completed with 223946 local objects.
Checking objects: 100% (8388608/8388608), done.
Checking connectivity: 1381139, done.
Updating files: 100% (71304/71304), done.
In the above example:
- When not using a Bundle URI, there were 5,271,177 objects received from the GitLab server.
- When using a Bundle URI, there were 1,322,255 objects received from the GitLab server.
This reduction means GitLab needs to pack together fewer objects (in the above example, roughly a quarter of the number of objects) because the client first downloaded the bundle from the storage server.
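The "roughly a quarter" figure can be checked from the two object counts in the transcripts. The variable names in this short sketch are illustrative only.

```shell
# Object counts taken from the two clone transcripts above.
without_bundle=5271177
with_bundle=1322255

# Share of objects still served by GitLab when a bundle URI is used.
ratio=$(awk -v a="$with_bundle" -v b="$without_bundle" \
  'BEGIN { printf "%.1f", 100 * a / b }')

echo "Objects fetched from GitLab with a bundle URI: ${ratio}% of the full clone"
```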
Securing bundles
The bundles are made accessible to the client using signed URLs. A signed URL is a URL that provides limited permissions and time to make a request. To see if your storage service supports signed URLs, see the documentation of your storage service.