Commit 62b42a0

object-storage: add new 'sourcegraph' bucket
1 parent d2e61aa commit 62b42a0

1 file changed

+74
-15
lines changed

docs/self-hosted/external-services/object-storage.mdx

Lines changed: 74 additions & 15 deletions
@@ -4,21 +4,70 @@ By default, Sourcegraph will use a `sourcegraph/blobstore` server bundled with t

 You can alternatively configure your instance to instead store this data in an S3 or GCS bucket. Doing so may decrease your hosting costs as persistent volumes are often more expensive than the same storage space in an object store service.

+## `sourcegraph` bucket
+
+<Callout type="warning">
+
+Self-hosted Sourcegraph instances using S3 or GCS object storage should now provision
+an additional bucket named `sourcegraph`. Sourcegraph currently reports a warning
+when this bucket is not present, and it will become required for new features in
+a future release.
+</Callout>
+
+The `sourcegraph` bucket is intended to be the single bucket for new Sourcegraph features. Instead of creating one bucket per feature, new features store objects under namespaced key prefixes within this bucket.
+
+Existing buckets for code graph indexes and search jobs remain in use. This change ensures future features can be enabled without requiring a new bucket for each feature.
+
+### Using S3 for the `sourcegraph` bucket
+
+Set the following environment variables to target an S3 bucket for shared Sourcegraph uploads.
+
+- `SOURCEGRAPH_UPLOAD_BACKEND=S3`
+- `SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph` (default)
+- `SOURCEGRAPH_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
+- `SOURCEGRAPH_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
+- `SOURCEGRAPH_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
+- `SOURCEGRAPH_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
+- `SOURCEGRAPH_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use EC2 metadata API over static credentials)
+- `SOURCEGRAPH_UPLOAD_AWS_USE_PATH_STYLE=false` (optional)
+- `SOURCEGRAPH_UPLOAD_AWS_REGION=us-east-1` (default)
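As a rough sketch of how the S3 variables listed above might be exported in a shell-based deployment: the variable names are taken from the list, while `MY_AWS_KEY_ID` and `MY_AWS_SECRET` are hypothetical names standing in for credentials injected by a secret manager. Which containers need these variables is not stated in this change, so that part is left out.

```shell
# Sketch only: values are placeholders, not a recommended configuration.
export SOURCEGRAPH_UPLOAD_BACKEND=S3
export SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph
export SOURCEGRAPH_UPLOAD_AWS_REGION=us-east-1

# Derive the endpoint from the region so the two cannot drift apart.
export SOURCEGRAPH_UPLOAD_AWS_ENDPOINT="https://s3.${SOURCEGRAPH_UPLOAD_AWS_REGION}.amazonaws.com"

# Assumption: the credentials already exist in the environment under the
# hypothetical names MY_AWS_KEY_ID / MY_AWS_SECRET; they are never hard-coded here.
export SOURCEGRAPH_UPLOAD_AWS_ACCESS_KEY_ID="${MY_AWS_KEY_ID:-}"
export SOURCEGRAPH_UPLOAD_AWS_SECRET_ACCESS_KEY="${MY_AWS_SECRET:-}"
```

Building the endpoint from the region variable is a design choice of this sketch, not something the commit prescribes.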
+
+### Using GCS for the `sourcegraph` bucket
+
+Set the following environment variables to target a GCS bucket for shared Sourcegraph uploads.
+
+- `SOURCEGRAPH_UPLOAD_BACKEND=GCS`
+- `SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph` (default)
+- `SOURCEGRAPH_UPLOAD_GCP_PROJECT_ID=<my project id>`
+- `SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>`
+- `SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>`
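A comparable sketch for GCS, using the variable names from the list above. It assumes that only one of the two credentials variables is needed, mirroring how the other buckets on this page describe the service account key (a file path or the file contents). The project ID and the key path `/etc/sourcegraph/gcs-key.json` are hypothetical placeholders.

```shell
# Sketch only: values are placeholders, not a recommended configuration.
export SOURCEGRAPH_UPLOAD_BACKEND=GCS
export SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph
export SOURCEGRAPH_UPLOAD_GCP_PROJECT_ID="my-project-id"

# Assumption: the service account key is volume-mounted at this hypothetical path,
# so the *_FILE_CONTENT variant is left unset.
export SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE="/etc/sourcegraph/gcs-key.json"
```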
+
+### Automatically provision the `sourcegraph` bucket
+
+If you would like to allow your Sourcegraph instance to manage the target bucket configuration, set the following environment variable:
+
+<Callout type="note">
+
+This requires additional bucket-management permissions from your configured storage vendor (AWS or GCP).
+</Callout>
+
+- `SOURCEGRAPH_UPLOAD_MANAGE_BUCKET=true`
+
 ## Code Graph Indexes

 To target a managed object storage service for storing [code graph index uploads](../../code-navigation/precise-code-navigation), you will need to set a handful of environment variables for configuration and authentication to the target service.

 - If you are running a `sourcegraph/server` deployment, set the environment variables on the server container
 - If you are running via Docker-compose or Kubernetes, set the environment variables on the `frontend`, `worker`, and `precise-code-intel-worker` containers

-### Using S3
+### Using S3 for the Code Graph Indexes bucket

 To target an S3 bucket you've already provisioned, set the following environment variables. Authentication can be done through [an access and secret key pair](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) (and optional session token), or via the EC2 metadata API.

 <Callout type="warning">
-{' '}
+
 Never commit AWS access keys in Git. You should consider using a secret handling
-service offered by your cloud provider.{' '}
+service offered by your cloud provider.
 </Callout>

 - `PRECISE_CODE_INTEL_UPLOAD_BACKEND=S3`
@@ -31,21 +80,21 @@ To target an S3 bucket you've already provisioned, set the following environment
 - `PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=us-east-1` (default)

 <Callout type="note">
-{' '}
+
 If a non-default region is supplied, ensure that the subdomain of the endpoint
-URL (_the `AWS_ENDPOINT` value_) matches the target region.{' '}
+URL (_the `AWS_ENDPOINT` value_) matches the target region.
 </Callout>
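The region note above can be checked mechanically before starting the containers. The following is a minimal sketch, assuming endpoints follow the `https://s3.<region>.amazonaws.com` shape used in the examples on this page. The helper `endpoint_matches_region` is hypothetical, and `PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT` is assumed by analogy with the other variable names, since it does not appear in the visible part of this diff.

```shell
# endpoint_matches_region ENDPOINT REGION
# Returns 0 when ENDPOINT contains the s3.<REGION>.amazonaws.com host, 1 otherwise.
endpoint_matches_region() {
  case "$1" in
    *"://s3.$2.amazonaws.com"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fall back to the documented defaults when the variables are unset.
region="${PRECISE_CODE_INTEL_UPLOAD_AWS_REGION:-us-east-1}"
endpoint="${PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT:-https://s3.us-east-1.amazonaws.com}"

if ! endpoint_matches_region "$endpoint" "$region"; then
  echo "warning: endpoint $endpoint does not match region $region" >&2
fi
```

Custom or S3-compatible endpoints will not fit this pattern, so treat a warning as a prompt to double-check rather than a hard failure.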

 <Callout type="tip">
-{' '}
+
 You don't need to set the `PRECISE_CODE_INTEL_UPLOAD_AWS_ACCESS_KEY_ID` environment
 variable when using `PRECISE_CODE_INTEL_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true`
 because role credentials will be automatically resolved. Attach the IAM role
 to the EC2 instances hosting the `frontend`, `worker`, and `precise-code-intel-worker`
-containers in a multi-node environment.{' '}
+containers in a multi-node environment.
 </Callout>

-### Using GCS
+### Using GCS for the Code Graph Indexes bucket

 To target a GCS bucket you've already provisioned, set the following environment variables. Authentication is done through a service account key, supplied as either a path to a volume-mounted file, or the contents read in as an environment variable payload.

@@ -55,10 +104,15 @@ To target a GCS bucket you've already provisioned, set the following environment
 - `PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>`
 - `PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>`

-### Provisioning buckets
+### Automatically provision the Code Graph Indexes bucket

 If you would like to allow your Sourcegraph instance to control the creation and lifecycle configuration management of the target buckets, set the following environment variables:

+<Callout type="note">
+
+This requires additional bucket-management permissions from your configured storage vendor (AWS or GCP).
+</Callout>
+
 - `PRECISE_CODE_INTEL_UPLOAD_MANAGE_BUCKET=true`
 - `PRECISE_CODE_INTEL_UPLOAD_TTL=168h` (default)

@@ -69,12 +123,12 @@ To target a third party managed object storage service for storing [search job r
 - If you are running a `sourcegraph/server` deployment, set the environment variables on the server container
 - If you are running via Docker-compose or Kubernetes, set the environment variables on the `frontend` and `worker` containers

-### Using S3
+### Using S3 for the Search Job Results bucket

 Set the following environment variables to target an S3 bucket you've already provisioned. Authentication can be done through [an access and secret key pair](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) (and optionally through session token) or via the EC2 metadata API.

 <Callout type="warning">
-{' '}
+
 Never commit AWS access keys in Git. You should consider using a secret handling
 service offered by your cloud provider.
 </Callout>
@@ -89,19 +143,19 @@ Set the following environment variables to target an S3 bucket you've already pr
 - `SEARCH_JOBS_UPLOAD_AWS_REGION=us-east-1` (default)

 <Callout type="note">
-{' '}
+
 If a non-default region is supplied, ensure that the subdomain of the endpoint
 URL (the `AWS_ENDPOINT` value) matches the target region.
 </Callout>

 <Callout type="tip">
-{' '}
+
 You don't need to set the `SEARCH_JOBS_UPLOAD_AWS_ACCESS_KEY_ID` environment
 variable when using `SEARCH_JOBS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` because
 role credentials will be automatically resolved.
 </Callout>

-### Using GCS
+### Using GCS for the Search Job Results bucket

 Set the following environment variables to target a GCS bucket you've already provisioned. Authentication is done through a service account key, either as a path to a volume-mounted file or the contents read in as an environment variable payload.

@@ -111,8 +165,13 @@ Set the following environment variables to target a GCS bucket you've already pr
 - `SEARCH_JOBS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>`
 - `SEARCH_JOBS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>`

-### Provisioning buckets
+### Automatically provision the Search Job Results bucket

 If you would like to allow your Sourcegraph instance to control the creation and lifecycle configuration management of the target buckets, set the following environment variables:

+<Callout type="note">
+
+This requires additional bucket-management permissions from your configured storage vendor (AWS or GCP).
+</Callout>
+
 - `SEARCH_JOBS_UPLOAD_MANAGE_BUCKET=true`
