GCP (GKE) Self-Hosted
This guide covers self-hosted DeepIntShield on GCP using the scripts shipped in deployment/gcp/. For pulling the managed Enterprise image from DeepIntShield’s Artifact Registry, see Enterprise → GCP.
What you get
- GKE Standard cluster (zonal or regional)
- Cloud SQL for PostgreSQL (private IP, ENTERPRISE edition by default)
- Memorystore for Redis (private service access)
- Artifact Registry for your container images
- All three DeepIntShield components: deepintshield-server, deepintshield-guard, deepintshield-models
- Zero-downtime rolling updates and an optional CI/CD pipeline triggered by pushes to a production branch
Prerequisites
- gcloud, kubectl, helm, docker on your PATH
- A GCP project with billing enabled, owner/editor on the project, and service APIs enabled (the bootstrap script handles this)
- Run gcloud components update if you see enum-choice errors (e.g. on Memorystore --tier)
Minimum quotas
| Quota | Required for starter zonal | Required for regional HA |
|---|---|---|
| CPUS_ALL_REGIONS | 8 (3 × e2-standard-2) | 36 (3 zones × 3 × e2-standard-4) |
| SSD_TOTAL_GB | 0 (use pd-standard) or 300 (pd-balanced) | 900 |
| IN_USE_ADDRESSES | 1 static IP | 1 static IP |
Default project quotas are 12 vCPUs / 250 GB SSD, enough for a zonal starter setup on pd-standard disks (pd-balanced needs 300 GB, which exceeds the default). Request a bump before deploying regionally.
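Before bootstrapping, it can help to compare these requirements with what the project actually has. The following is a minimal sketch, assuming PROJECT_ID is set and gcloud is authenticated; has_headroom and cpu_quota are hypothetical helper names and are not part of the shipped scripts.

```shell
# Succeeds when (limit - usage) covers the vCPUs a deployment needs.
# awk does the comparison because quota values may be printed as decimals.
has_headroom() {
  awk -v limit="$1" -v usage="$2" -v need="$3" \
    'BEGIN { exit !((limit - usage) >= need) }'
}

# Prints limit and usage for the project-wide CPUS_ALL_REGIONS quota.
# Needs an authenticated gcloud, so it is defined here but not called.
cpu_quota() {
  gcloud compute project-info describe --project "$PROJECT_ID" \
    --flatten='quotas[]' \
    --filter='quotas.metric=CPUS_ALL_REGIONS' \
    --format='value(quotas.limit,quotas.usage)'
}

# Worked with the numbers from this section: a 12 vCPU default quota
# covers the starter requirement (8) but not the regional one (36).
has_headroom 12 0 8  && echo "starter: fits"
has_headroom 12 0 36 || echo "regional: request a quota bump"
```

SSD_TOTAL_GB is tracked per region, so the equivalent lookup for it would likely go through gcloud compute regions describe "$REGION" with the same flatten and filter flags.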
Configuration via .env
All scripts auto-load deployment/gcp/.env if present. Copy and edit:

    cp deployment/gcp/env.example deployment/gcp/.env

Minimum required fields:

    PROJECT_ID=your-gcp-project-id
    REGION=us-central1
    ZONE=us-central1-a
    CLUSTER_NAME=deepintshield-gke
    CLOUDSQL_PASSWORD=<strong-password>
    DEEPINTSHIELD_GUARD_SHARED_SECRET=<random-256-bit-hex>
    # Provider keys (OpenAI, Anthropic, etc.) are NOT set here. Operators add
    # them in-app after login via the Keys Management UI.

Starter (fits default 12 vCPU quota)

    GKE_ZONAL=1
    GKE_NUM_NODES=3
    GKE_MACHINE_TYPE=e2-standard-2
    GKE_DISK_SIZE=50
    GKE_DISK_TYPE=pd-standard
    CLOUDSQL_EDITION=ENTERPRISE
    CLOUDSQL_TIER=db-custom-1-3840
    REDIS_TIER=basic
    REDIS_SIZE_GB=1

Production (regional, HA)

    GKE_ZONAL=0
    GKE_NUM_NODES=3
    GKE_MACHINE_TYPE=e2-standard-4
    GKE_DISK_SIZE=100
    GKE_DISK_TYPE=pd-balanced
    CLOUDSQL_EDITION=ENTERPRISE
    CLOUDSQL_TIER=db-custom-2-8192
    REDIS_TIER=standard_ha
    REDIS_SIZE_GB=5

Override the env-file path with DEEPINTSHIELD_ENV_FILE=/path/to/other.env. Explicit exports (e.g. IMAGE_TAG=v2 ./quick_deploy.sh) always win.
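Two details of this configuration are easy to get wrong: producing a 256-bit hex value for DEEPINTSHIELD_GUARD_SHARED_SECRET, and the rule that values already in the environment beat the .env file. The sketch below illustrates both against a throwaway file. gen_hex_secret and load_env_file are hypothetical helpers written for this example; the shipped scripts may load the file differently.

```shell
# 256 bits = 32 random bytes = 64 hex characters.
gen_hex_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  fi
}

# Export KEY=VALUE pairs from a file, skipping blank lines, comments,
# and any key that is already present in the environment.
load_env_file() {
  [ -f "$1" ] || return 0
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;
      *=*) ;;
      *) continue ;;
    esac
    key=${line%%=*}
    value=${line#*=}
    printenv "$key" >/dev/null 2>&1 || export "$key=$value"
  done < "$1"
}

# Throwaway env file with demo values only.
demo_env=$(mktemp)
{
  echo "# demo values only"
  echo "DEEPINTSHIELD_GUARD_SHARED_SECRET=$(gen_hex_secret)"
  echo "IMAGE_TAG=v1"
} > "$demo_env"

export IMAGE_TAG=v2            # explicit export made before loading
load_env_file "$demo_env"
rm -f "$demo_env"

echo "IMAGE_TAG=$IMAGE_TAG"    # the exported v2 survives, not the file's v1
echo "secret length: ${#DEEPINTSHIELD_GUARD_SHARED_SECRET}"
```

Skipping keys that are already set, instead of sourcing the file directly, is one simple way to get the "explicit exports always win" behavior without ordering tricks.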
First-time deployment
    # 1. Bootstrap infra (VPC, GKE, Cloud SQL, Memorystore, Artifact Registry)
    ./deployment/gcp/scripts/bootstrap_gcp.sh

    # 2. Build and push images
    IMAGE_TAG=v0.1 ./deployment/gcp/scripts/build_and_push.sh

    # 3. Deploy to GKE (guard + models via kubectl, server via Helm)
    IMAGE_TAG=v0.1 ./deployment/gcp/scripts/deploy_gke.sh

    # 4. Verify health
    ./deployment/gcp/scripts/smoke_test.sh

bootstrap_gcp.sh is idempotent: re-run it safely after any failure. It skips resources that already exist.
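Idempotent bootstrap scripts usually follow a check-then-create pattern. The sketch below shows that general pattern only; ensure_resource is a hypothetical helper, and nothing here is taken from bootstrap_gcp.sh itself.

```shell
# ensure_resource <label> <exists-command> <create-command>
# Runs the create command only when the existence check fails.
ensure_resource() {
  if sh -c "$2" >/dev/null 2>&1; then
    echo "skip: $1 already exists"
  else
    echo "create: $1"
    sh -c "$3"
  fi
}

# Possible wiring for the static IP (commented out: needs gcloud and a project).
# ensure_resource "static IP" \
#   "gcloud compute addresses describe \"$STATIC_IP_NAME\" --global" \
#   "gcloud compute addresses create \"$STATIC_IP_NAME\" --global"

# Stand-in commands so the control flow is visible without touching GCP.
ensure_resource "demo (present)" "true"  "echo created"
ensure_resource "demo (missing)" "false" "echo created"
```

Because every step checks first, a run that failed halfway can be repeated from the top without duplicating or clobbering what was already provisioned.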
At the end of the bootstrap, you’ll get the static IP. Create a DNS A record that points your hostname at it (default hostname app.deepintshield.com, override via PUBLIC_HOSTNAME). Managed certs are issued once DNS resolves.
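Since certificate issuance waits on DNS, it is worth confirming that the A record matches the reserved address before debugging anything else. A minimal sketch, assuming STATIC_IP_NAME holds the name of the global address (the variable used in the Teardown section) and that dig is installed; ip_in_records and check_dns are hypothetical helper names.

```shell
# Succeeds when the expected IP ($1) is one of the resolved addresses
# ($2, one per line, as printed by `dig +short`).
ip_in_records() {
  printf '%s\n' "$2" | grep -Fxq -- "$1"
}

# Compares the reserved global address with what DNS currently returns.
# Needs gcloud and dig, so it is defined here but not called.
check_dns() {
  host="${PUBLIC_HOSTNAME:-app.deepintshield.com}"
  expected=$(gcloud compute addresses describe "$STATIC_IP_NAME" \
    --global --format='value(address)')
  resolved=$(dig +short A "$host")
  if ip_in_records "$expected" "$resolved"; then
    echo "$host resolves to $expected"
  else
    echo "$host does not resolve to $expected yet" >&2
    return 1
  fi
}
```

Run check_dns after creating the record; a non-zero exit means DNS has not caught up, so managed certs will not have been issued yet.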
Subsequent deployments
    # Bump IMAGE_TAG in .env, then:
    ./deployment/gcp/scripts/quick_deploy.sh

quick_deploy.sh performs zero-downtime rolling updates:
- SERVICES=guard,server: deploy only specific components
- SKIP_BUILD=1: redeploy from already-pushed images
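After a partial deploy it can be useful to wait on exactly the components that were updated. The sketch below assumes each component runs as a Deployment named after it (deepintshield-server, deepintshield-guard, deepintshield-models) in the current kubectl namespace, and that an empty SERVICES means all three. Neither assumption is confirmed by this guide, and both helper names are hypothetical.

```shell
# Expand a SERVICES value such as "guard,server" into assumed Deployment
# names, one per line. An empty value falls back to all three components.
services_to_deployments() {
  printf '%s' "${1:-server,guard,models}" | tr ',' '\n' | sed 's/^/deepintshield-/'
}

# Block until each selected Deployment finishes rolling out.
# Needs kubectl access to the cluster, so it is defined here but not called.
wait_for_rollouts() {
  for d in $(services_to_deployments "$1"); do
    kubectl rollout status "deployment/$d" --timeout=300s || return 1
  done
}

services_to_deployments "guard,server"
```

A typical combination would be SERVICES=guard,server SKIP_BUILD=1 ./deployment/gcp/scripts/quick_deploy.sh followed by wait_for_rollouts "guard,server".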
Optional: GitHub Actions CI/CD
Ship to GCP on every push to a production branch. See CD Production Pipeline for the full setup (secrets, service-account roles, Workload Identity federation, rollback).
Pipeline at a glance:
- PR from develop → production
- Merge to production
- Workflow auto-tags vYYYY.MM.DD-<sha7>, builds images, deploys to GKE, smoke-tests
- Auto-rollback on failure via helm rollback + kubectl rollout undo
- Incident issue auto-opened if the deploy fails
Required repo variables include GCP_PROJECT_ID, GCP_REGION, GCP_ZONE, GKE_ZONAL, GKE_CLUSTER_NAME. The workflow picks --zone vs --region for GKE auth based on GKE_ZONAL.
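The vYYYY.MM.DD-<sha7> tag from the pipeline summary can be reproduced locally, for example to find the image that a given commit produced. This is a sketch of the format only: release_tag is a hypothetical helper, and the use of UTC for the date is an assumption about the workflow.

```shell
# release_tag <commit-sha> [date]  ->  vYYYY.MM.DD-<sha7>
# The date defaults to today in UTC (an assumption, see above).
release_tag() {
  day="${2:-$(date -u +%Y.%m.%d)}"
  sha7=$(printf '%s' "$1" | cut -c1-7)
  printf 'v%s-%s\n' "$day" "$sha7"
}

release_tag 0123456789abcdef0123456789abcdef01234567 2025.01.31
# prints: v2025.01.31-0123456
```

Inside a checkout, release_tag "$(git rev-parse HEAD)" yields the tag for the current commit.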
Zonal vs regional clusters
If you bootstrapped zonal (GKE_ZONAL=1), all scripts automatically use --zone $ZONE instead of --region $REGION when calling gcloud container clusters .... If you run a gcloud command manually against a zonal cluster, remember to pass --zone; the cluster is not reachable via --region.
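For manual commands, the same switch can be wrapped in a small helper so the flag always matches the cluster type. A minimal sketch, assuming the variables from .env are loaded and that an unset GKE_ZONAL means regional; gke_location_flag and get_credentials are hypothetical names, not functions from the shipped scripts.

```shell
# Print the location flag that matches the cluster type.
gke_location_flag() {
  if [ "${GKE_ZONAL:-0}" = "1" ]; then
    printf '%s %s\n' --zone "$ZONE"
  else
    printf '%s %s\n' --region "$REGION"
  fi
}

# Fetch kubectl credentials with the matching flag.
# Needs gcloud, so it is defined here but not called.
get_credentials() {
  # Left unquoted on purpose so the flag and its value become two arguments.
  gcloud container clusters get-credentials "$CLUSTER_NAME" \
    $(gke_location_flag) --project "$PROJECT_ID"
}
```

With GKE_ZONAL=1 and ZONE=us-central1-a, gke_location_flag prints --zone us-central1-a; with GKE_ZONAL=0 and REGION=us-central1 it prints --region us-central1.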
Troubleshooting
- CPUS_ALL_REGIONS quota exceeded: flip to zonal (GKE_ZONAL=1) or request a quota bump. The default regional setup asks for 36 vCPUs.
- SSD_TOTAL_GB quota exceeded: set GKE_DISK_TYPE=pd-standard (separate quota) and/or lower GKE_DISK_SIZE.
- (gcloud.sql.instances.create) Invalid Tier for ENTERPRISE_PLUS: explicitly set CLOUDSQL_EDITION=ENTERPRISE. Enterprise Plus requires predefined db-perf-optimized-* tiers.
- argument --tier: Invalid choice: 'standard-ha' (Memorystore): upgrade gcloud (gcloud components update) or set REDIS_TIER=basic.
- get-credentials Not found in us-central1: the cluster is zonal. Pass --zone us-central1-a, or make sure GKE_ZONAL=1 is in your .env.
- Cloud SQL creation stalls on “Waiting for VPC peering”: the private-service-access range wasn’t reserved. Re-run bootstrap_gcp.sh; it will reconcile.
Teardown
    # For a regional cluster (GKE_ZONAL=0), pass --region "$REGION" instead of --zone.
    gcloud container clusters delete "$CLUSTER_NAME" --zone "$ZONE" --quiet
    gcloud sql instances delete "$CLOUDSQL_INSTANCE" --quiet
    gcloud redis instances delete "$REDIS_INSTANCE" --region "$REGION" --quiet
    gcloud compute addresses delete "$STATIC_IP_NAME" --global --quiet
    gcloud artifacts repositories delete "$AR_REPOSITORY" --location "$REGION" --quiet
    gcloud compute networks subnets delete "$SUBNET_NAME" --region "$REGION" --quiet
    gcloud compute networks delete "$VPC_NAME" --quiet

Next steps
- Configure DeepIntShield settings for your use case
- Set up observability for monitoring
- Enable clustering for high availability