DeepIntShield Clustering delivers production-ready high availability through a peer-to-peer network architecture with automatic service discovery. The clustering system uses gossip protocols to maintain consistent state across nodes while providing seamless scaling, automatic failover, and zero-downtime deployments.
Modern AI gateway deployments require robust infrastructure to handle production workloads:
| Challenge | Impact | Clustering Solution |
|---|---|---|
| Single Point of Failure | Complete service outage if gateway fails | Distributed architecture with automatic failover |
| Traffic Spikes | Performance degradation under high load | Dynamic load distribution across multiple nodes |
| Provider Rate Limits | Request throttling and service interruption | Distributed rate limit tracking across cluster |
| Regional Latency | Poor user experience in distant regions | Geographic distribution with local processing |
| Maintenance Windows | Service downtime during updates | Rolling updates with zero-downtime deployment |
| Capacity Planning | Over/under-provisioning resources | Elastic scaling based on real-time demand |

| Feature | Description |
|---|---|
| Automatic Service Discovery | 6 discovery methods for any infrastructure (K8s, Consul, etcd, DNS, UDP, mDNS) |
| Peer-to-Peer Architecture | No single point of failure with equal node participation |
| Gossip-Based State Sync | Real-time synchronization of traffic patterns and limits |
| Automatic Failover | Seamless traffic redistribution when nodes fail |
| Zero-Downtime Updates | Rolling deployments without service interruption |
DeepIntShield clustering uses a peer-to-peer (P2P) network where all nodes are equal participants. Each node:
The gossip protocol ensures all nodes maintain consistent views of:
Convergence: All nodes converge to the same state within seconds with eventual consistency guarantees.
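To make the convergence claim concrete, here is a minimal push-gossip simulation. It is a sketch under stated assumptions, not DeepIntShield's actual protocol: the state layout (`key -> (version, value)`), the "highest version wins" merge rule, and the function names are all hypothetical.

```python
import random


def gossip_round(states, rng):
    """One push round: each node sends its view to one random peer.

    The receiver keeps whichever entry has the higher version number.
    """
    n = len(states)
    for sender in range(n):
        receiver = rng.choice([i for i in range(n) if i != sender])
        for key, (version, value) in states[sender].items():
            current = states[receiver].get(key)
            if current is None or current[0] < version:
                states[receiver][key] = (version, value)


def rounds_to_converge(states, seed=0, max_rounds=1000):
    """Run gossip rounds until every node holds an identical view."""
    rng = random.Random(seed)
    for round_no in range(1, max_rounds + 1):
        gossip_round(states, rng)
        if all(state == states[0] for state in states):
            return round_no
    raise RuntimeError("cluster did not converge")


# Five nodes; only node 0 starts with the (hypothetical) counter entry.
cluster = [{} for _ in range(5)]
cluster[0]["rate_limit:provider-a"] = (1, 42)
print("converged after", rounds_to_converge(cluster), "rounds")
```

Because each informed node informs another peer every round, the number of informed nodes grows quickly, which is why gossip-style synchronization typically settles in a handful of rounds even as the cluster grows.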
| Cluster Size | Fault Tolerance | Use Case |
|---|---|---|
| 3 nodes | 1 node failure | Small production deployments |
| 5 nodes | 2 node failures | Medium production deployments |
| 7+ nodes | 3+ node failures | Large enterprise deployments |
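The figures in the table are consistent with a simple majority rule: a cluster of `n` nodes keeps a majority alive as long as no more than `(n - 1) // 2` nodes fail. The sketch below assumes that rule; the source does not state the formula, and `tolerated_failures` is a hypothetical helper name.

```python
def tolerated_failures(cluster_size: int) -> int:
    """Failures a cluster can absorb while a majority of nodes stays up.

    Assumes a majority rule, which matches the sizing table above.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return (cluster_size - 1) // 2


for size in (3, 5, 7):
    print(size, "nodes ->", tolerated_failures(size), "failure(s) tolerated")
```

Under this rule, even-sized clusters add no extra tolerance over the next smaller odd size (4 nodes still tolerate only 1 failure), which is one reason odd cluster sizes are the usual recommendation.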
The new clustering configuration uses a cluster_config object with integrated service discovery:
```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "kubernetes",
      "service_name": "deepintshield-cluster",
      // Discovery-specific configuration here
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

All discovery methods support these common fields:
| Field | Type | Required | Description |
|---|---|---|---|
| `enabled` | boolean | Yes | Enable/disable discovery |
| `type` | string | Yes | Discovery type: `kubernetes`, `consul`, `etcd`, `dns`, `udp`, `mdns` |
| `service_name` | string | Yes | Service name for discovery |
| `bind_port` | integer | No | Port for cluster communication (default: `10101`) |
| `dial_timeout` | duration | No | Discovery timeout (default: `10s`) |
| `allowed_address_space` | array | No | CIDR ranges to filter discovered nodes (e.g., `["10.0.0.0/8"]`) |

| Field | Description | Default |
|---|---|---|
| `port` | Gossip protocol port | `10101` |
| `timeout_seconds` | Health check timeout | `10` |
| `success_threshold` | Successful checks to mark healthy | `3` |
| `failure_threshold` | Failed checks to mark unhealthy | `3` |
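The two thresholds can be read as a small state machine. The sketch below assumes the checks must be consecutive, which the table does not spell out, and the `PeerHealth` class is a hypothetical illustration rather than DeepIntShield's implementation.

```python
from dataclasses import dataclass


@dataclass
class PeerHealth:
    """Tracks one peer's health from a stream of check results."""

    success_threshold: int = 3
    failure_threshold: int = 3
    healthy: bool = True
    _successes: int = 0
    _failures: int = 0

    def record(self, check_passed: bool) -> bool:
        """Record one health check and return the resulting health state."""
        if check_passed:
            self._successes += 1
            self._failures = 0  # a success breaks any failure streak
            if not self.healthy and self._successes >= self.success_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0  # a failure breaks any success streak
            if self.healthy and self._failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


peer = PeerHealth()
for result in (False, False, False, True, True, True):
    print(result, "->", "healthy" if peer.record(result) else "unhealthy")
```

Requiring several checks in a row before flipping state trades detection speed for stability: higher thresholds reduce flapping on a lossy network but delay failover.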
DeepIntShield supports 6 service discovery methods to fit any infrastructure. Choose based on your deployment environment:
| Method | Description |
|---|---|
| Kubernetes | Native K8s pod discovery via label selectors |
| Consul | HashiCorp Consul service mesh integration |
| etcd | etcd-based distributed discovery |
| DNS | Traditional DNS SRV record discovery |
| UDP Broadcast | Local network broadcast discovery |
| mDNS | Multicast DNS for local development |
Best for: Kubernetes deployments with StatefulSets or Deployments
Kubernetes discovery uses the K8s API to automatically discover pods based on label selectors. This is the most common method for cloud-native deployments.
```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "kubernetes",
      "service_name": "deepintshield-cluster",
      "k8s_namespace": "default",
      "k8s_label_selector": "app=deepintshield"
    },
    "gossip": {
      "port": 10101
    }
  }
}
```

| Parameter | Required | Description | Example |
|---|---|---|---|
| `k8s_namespace` | No | Kubernetes namespace to search | `"default"`, `"production"` |
| `k8s_label_selector` | Yes | Label selector for pod discovery | `"app=deepintshield"`, `"app=deepintshield,env=prod"` |
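To illustrate how an equality-based selector such as `"app=deepintshield,env=prod"` narrows the set of pods, here is a small sketch. It handles only comma-separated `key=value` terms (not set-based selectors or the `==`/`!=` operators), and both helper names are hypothetical; the real matching is done by the Kubernetes API server.

```python
from __future__ import annotations


def parse_selector(selector: str) -> dict[str, str]:
    """Split "k1=v1,k2=v2" into a dict of required labels."""
    required = {}
    for term in selector.split(","):
        key, sep, value = term.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"unsupported selector term: {term!r}")
        required[key] = value
    return required


def matches(selector: str, pod_labels: dict[str, str]) -> bool:
    """A pod matches when it carries every required label with the same value."""
    required = parse_selector(selector)
    return all(pod_labels.get(key) == value for key, value in required.items())


pods = {
    "gateway-0": {"app": "deepintshield", "env": "prod"},
    "gateway-1": {"app": "deepintshield", "env": "staging"},
    "postgres-0": {"app": "postgres", "env": "prod"},
}
selected = [name for name, labels in pods.items()
            if matches("app=deepintshield,env=prod", labels)]
print(selected)
```

Every term must match, so adding terms can only shrink the discovered set. If a cluster shows fewer members than expected, an overly specific selector is a likely cause.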
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: deepintshield
  namespace: default
spec:
  serviceName: deepintshield-cluster
  replicas: 3
  selector:
    matchLabels:
      app: deepintshield
  template:
    metadata:
      labels:
        app: deepintshield
    spec:
      serviceAccountName: deepintshield
      containers:
      - name: deepintshield
        image: <enterprise_repo_base_url>/deepintshield:latest
        ports:
        - containerPort: 8080
          name: http
        - containerPort: 10101
          name: gossip
        volumeMounts:
        - name: config
          mountPath: /etc/deepintshield
      volumes:
      - name: config
        configMap:
          name: deepintshield-config
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deepintshield
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deepintshield-pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deepintshield-pod-reader
  namespace: default
subjects:
- kind: ServiceAccount
  name: deepintshield
  namespace: default
roleRef:
  kind: Role
  name: deepintshield-pod-reader
  apiGroup: rbac.authorization.k8s.io
```

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepintshield
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: deepintshield
  template:
    metadata:
      labels:
        app: deepintshield
    spec:
      serviceAccountName: deepintshield
      containers:
      - name: deepintshield
        image: <enterprise_repo_base_url>/deepintshield:latest
        ports:
        - containerPort: 8080
          name: http
        - containerPort: 10101
          name: gossip
        volumeMounts:
        - name: config
          mountPath: /etc/deepintshield
      volumes:
      - name: config
        configMap:
          name: deepintshield-config
---
apiVersion: v1
kind: Service
metadata:
  name: deepintshield-cluster
  namespace: default
spec:
  clusterIP: None
  selector:
    app: deepintshield
  ports:
  - port: 10101
    name: gossip
```

Symptoms: Cluster shows only 1 member, pods running in isolation
Solutions:
Symptoms: “error getting kubernetes config” or “forbidden” errors
Solutions:
- Confirm the ServiceAccount has `get`, `list`, and `watch` permissions on pods

Symptoms: Nodes discovered but marked as “suspect” or “dead”
Solutions:
- Increase `timeout_seconds` in the gossip config if the network is slow
- Check pod status with `kubectl get pods`

Best for: Consul service mesh environments, multi-datacenter deployments
Consul discovery integrates with HashiCorp Consul for service registration and discovery. Ideal for environments already using Consul for service mesh or service discovery.
```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "consul",
      "service_name": "deepintshield-cluster",
      "consul_address": "consul.service.consul:8500"
    },
    "gossip": {
      "port": 10101
    }
  }
}
```

| Parameter | Required | Description | Example |
|---|---|---|---|
| `consul_address` | No | Consul agent address | `"localhost:8500"`, `"consul.service.consul:8500"` (default: `localhost:8500`) |
```yaml
version: '3.8'

services:
  consul:
    image: hashicorp/consul:latest
    command: agent -dev -client=0.0.0.0
    ports:
      - "8500:8500"
    networks:
      - deepintshield-net

  deepintshield-1:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config-node1.json:/etc/deepintshield/config.json
    ports:
      - "8080:8080"
    depends_on:
      - consul
    networks:
      - deepintshield-net

  deepintshield-2:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config-node2.json:/etc/deepintshield/config.json
    ports:
      - "8081:8080"
    depends_on:
      - consul
    networks:
      - deepintshield-net

  deepintshield-3:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config-node3.json:/etc/deepintshield/config.json
    ports:
      - "8082:8080"
    depends_on:
      - consul
    networks:
      - deepintshield-net

networks:
  deepintshield-net:
    driver: bridge
```

Symptoms: “failed to register service with Consul” errors
Solutions:
Symptoms: Consul UI shows services but nodes don’t join cluster
Solutions:
- Verify `service_name` matches across all nodes

Symptoms: Services show as critical in Consul UI
Solutions:
Best for: etcd-based distributed systems, existing etcd infrastructure
etcd discovery uses etcd’s distributed key-value store for service registration and discovery. Perfect for environments already using etcd or requiring strong consistency.
```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "etcd",
      "service_name": "deepintshield-cluster",
      "etcd_endpoints": [
        "http://etcd-1:2379",
        "http://etcd-2:2379",
        "http://etcd-3:2379"
      ],
      "dial_timeout": "10s"
    },
    "gossip": {
      "port": 10101
    }
  }
}
```

| Parameter | Required | Description | Example |
|---|---|---|---|
| `etcd_endpoints` | Yes | Array of etcd endpoint URLs | `["http://localhost:2379"]`, `["https://etcd1:2379", "https://etcd2:2379"]` |
| `dial_timeout` | No | Connection timeout | `"10s"` (default), `"30s"` |
```yaml
version: '3.8'

services:
  etcd:
    image: quay.io/coreos/etcd:latest
    command:
      - etcd
      - --advertise-client-urls=http://etcd:2379
      - --listen-client-urls=http://0.0.0.0:2379
      - --listen-peer-urls=http://0.0.0.0:2380
      - --initial-cluster=etcd=http://etcd:2380
      - --initial-advertise-peer-urls=http://etcd:2380
    ports:
      - "2379:2379"
      - "2380:2380"
    networks:
      - deepintshield-net

  deepintshield-1:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8080:8080"
    depends_on:
      - etcd
    networks:
      - deepintshield-net

  deepintshield-2:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8081:8080"
    depends_on:
      - etcd
    networks:
      - deepintshield-net

  deepintshield-3:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8082:8080"
    depends_on:
      - etcd
    networks:
      - deepintshield-net

networks:
  deepintshield-net:
    driver: bridge
```

Symptoms: “etcd client error” on startup
Solutions:
- Increase `dial_timeout` if the network is slow

Symptoms: “failed to register with etcd” errors
Solutions:
Symptoms: Nodes repeatedly registering/deregistering
Solutions:
Best for: Traditional infrastructure, static node addresses, cloud DNS services
DNS discovery uses standard DNS resolution to discover cluster nodes. Works with any DNS server and is ideal for static deployments or cloud environments with DNS integration.
```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "dns",
      "service_name": "deepintshield-cluster",
      "dns_names": [
        "deepintshield-cluster.local",
        "deepintshield-nodes.internal.company.com"
      ],
      "bind_port": 10101
    },
    "gossip": {
      "port": 10101
    }
  }
}
```

| Parameter | Required | Description | Example |
|---|---|---|---|
| `dns_names` | Yes | Array of DNS names to resolve | `["deepintshield.local"]`, `["node1.local", "node2.local", "node3.local"]` |
| `bind_port` | No | Port appended to discovered IPs | `10101` (default) |
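The table describes `bind_port` as the port appended to each discovered IP. A rough sketch of that resolve-then-append step follows. It is an illustration under assumptions, not DeepIntShield's code: `resolve_peers` and its `resolver` parameter are hypothetical, and only IPv4 A records are considered because `socket.gethostbyname_ex` is IPv4-only.

```python
import socket


def resolve_peers(dns_names, bind_port=10101, resolver=socket.gethostbyname_ex):
    """Resolve each name and return unique "ip:port" peer addresses.

    `resolver` must behave like socket.gethostbyname_ex, returning
    (hostname, aliases, ip_addresses). Names that fail to resolve are skipped.
    """
    peers = []
    for name in dns_names:
        try:
            _, _, addresses = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
        for ip in addresses:
            peer = f"{ip}:{bind_port}"
            if peer not in peers:
                peers.append(peer)
    return peers


if __name__ == "__main__":
    # Uses the real system resolver; output depends on your DNS setup.
    print(resolve_peers(["deepintshield-cluster.local"]))
```

The resolver is injectable so the logic can be exercised without a live DNS server. If resolution succeeds but the returned list does not match the ports your nodes actually listen on, `bind_port` is the first setting to check.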
```bash
# Create A records for each node
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890ABC \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "deepintshield-cluster.internal.company.com",
        "Type": "A",
        "TTL": 60,
        "ResourceRecords": [
          {"Value": "10.0.1.10"},
          {"Value": "10.0.1.11"},
          {"Value": "10.0.1.12"}
        ]
      }
    }]
  }'
```

```yaml
apiVersion: v1
kind: Service
metadata:
  name: deepintshield-cluster
  namespace: default
spec:
  clusterIP: None  # Headless service
  selector:
    app: deepintshield
  ports:
  - port: 10101
    name: gossip
---
# DNS will resolve deepintshield-cluster.default.svc.cluster.local
# to all pod IPs matching the selector
```

```bash
address=/deepintshield-cluster.local/192.168.1.10
address=/deepintshield-cluster.local/192.168.1.11
address=/deepintshield-cluster.local/192.168.1.12

# Or use /etc/hosts on each node
echo "192.168.1.10 node1.deepintshield.local" >> /etc/hosts
echo "192.168.1.11 node2.deepintshield.local" >> /etc/hosts
echo "192.168.1.12 node3.deepintshield.local" >> /etc/hosts
```

Symptoms: “dns lookup error” in logs, no nodes discovered
Solutions:
- Test resolution with `nslookup deepintshield-cluster.local`
- Check that `/etc/resolv.conf` has the correct nameserver

Symptoms: DNS resolves but cluster has 0 members
Solutions:
- Verify `bind_port` matches the actual gossip port on nodes
- Use `dig` or `nslookup` to verify the DNS response format

Symptoms: IPs discovered but gossip connection fails
Solutions:
- Test connectivity with `telnet <ip> 10101`

Best for: Local network deployments, on-premise infrastructure, development clusters
UDP broadcast discovery automatically finds nodes on the same local network using broadcast packets. No external dependencies required.
Set `allowed_address_space` for security.

```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "udp",
      "service_name": "deepintshield-cluster",
      "udp_broadcast_port": 9999,
      "allowed_address_space": [
        "192.168.1.0/24",
        "10.0.0.0/8"
      ],
      "dial_timeout": "10s"
    },
    "gossip": {
      "port": 10101
    }
  }
}
```

| Parameter | Required | Description | Example |
|---|---|---|---|
| `udp_broadcast_port` | Yes | Port for broadcast discovery | `9999`, `8888` |
| `allowed_address_space` | Yes | CIDR ranges to limit discovery scope | `["192.168.1.0/24"]`, `["10.0.0.0/8", "172.16.0.0/12"]` |
| `dial_timeout` | No | Time to wait for responses | `"10s"` (default) |
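The effect of `allowed_address_space` can be pictured as a CIDR membership filter over the addresses that answer a broadcast. The sketch below uses Python's standard `ipaddress` module; `filter_allowed` is a hypothetical helper that illustrates the idea, not the product's implementation.

```python
import ipaddress


def filter_allowed(candidates, allowed_address_space):
    """Keep only the candidate IPs that fall inside one of the CIDR ranges."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_address_space]
    kept = []
    for candidate in candidates:
        address = ipaddress.ip_address(candidate)
        if any(address in network for network in networks):
            kept.append(candidate)
    return kept


responders = ["192.168.1.20", "172.17.0.2", "10.4.5.6"]
allowed = ["192.168.1.0/24", "10.0.0.0/8"]
print(filter_allowed(responders, allowed))
```

Note that `172.17.0.2`, an address from Docker's usual default bridge range, is dropped here. When containers sit on a bridge network, the configured ranges need to cover the container subnet rather than the host's LAN.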
```yaml
version: '3.8'

services:
  deepintshield-1:
    image: <enterprise_repo_base_url>/deepintshield:latest
    network_mode: bridge
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8080:8080"
      - "9999:9999/udp"
      - "10101:10101"

  deepintshield-2:
    image: <enterprise_repo_base_url>/deepintshield:latest
    network_mode: bridge
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8081:8080"
      - "9999:9999/udp"
      - "10101:10101"

  deepintshield-3:
    image: <enterprise_repo_base_url>/deepintshield:latest
    network_mode: bridge
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8082:8080"
      - "9999:9999/udp"
      - "10101:10101"
```

Symptoms: Discovery runs but finds 0 nodes
Solutions:
- Verify `allowed_address_space` includes node IP addresses
- Capture broadcast traffic with `tcpdump -i any -n udp port 9999`

Symptoms: “not in allowed address space” warnings
Solutions:
- Verify the CIDR notation (e.g., `192.168.1.0/24`)
- Ensure `allowed_address_space` covers all node IPs
- Check node addresses with `ip addr` or `ifconfig`

Symptoms: “permission denied” or “address already in use”
Solutions:
- Use `netstat -tulpn | grep 9999` to check port usage
- Change `udp_broadcast_port` to a different value

Best for: Local development, testing, zero-configuration setups
mDNS (Multicast DNS) provides zero-configuration service discovery on local networks. Perfect for development and testing without requiring any infrastructure setup.
```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "mdns",
      "service_name": "deepintshield",
      "mdns_service": "_bifrost._tcp",
      "dial_timeout": "10s"
    },
    "gossip": {
      "port": 10101
    }
  }
}
```

| Parameter | Required | Description | Example |
|---|---|---|---|
| `mdns_service` | No | mDNS service type | `"_bifrost._tcp"` (default), `"_myapp._tcp"` |
| `dial_timeout` | No | Time to wait for mDNS responses | `"10s"` (default) |
```bash
# Start first node
docker run -p 8080:8080 -p 10101:10101 \
  -v $(pwd)/config-mdns.json:/etc/deepintshield/config.json \
  <enterprise_repo_base_url>/deepintshield:latest

# Start second node (discovers first automatically)
docker run -p 8081:8080 -p 10102:10101 \
  -v $(pwd)/config-mdns.json:/etc/deepintshield/config.json \
  <enterprise_repo_base_url>/deepintshield:latest

# Start third node (discovers both automatically)
docker run -p 8082:8080 -p 10103:10101 \
  -v $(pwd)/config-mdns.json:/etc/deepintshield/config.json \
  <enterprise_repo_base_url>/deepintshield:latest
```

Symptoms: Nodes don’t discover each other via mDNS
Solutions:
- Test with `avahi-browse -a` (Linux) or `dns-sd -B` (macOS)
- Increase `dial_timeout` if discovery is slow

Symptoms: “skipping invalid host address” warnings
Solutions:
Symptoms: Nodes discover then disconnect repeatedly
Solutions:
Complete Docker Compose example using Consul discovery with a shared PostgreSQL config store:
```yaml
version: '3.8'

services:
  postgres:
    image: postgres:14
    environment:
      POSTGRES_DB: deepintshield
      POSTGRES_USER: deepintshield
      POSTGRES_PASSWORD: deepintshield_password
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - deepintshield-net

  consul:
    image: hashicorp/consul:latest
    command: agent -dev -client=0.0.0.0
    ports:
      - "8500:8500"
    networks:
      - deepintshield-net

  deepintshield-1:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8080:8080"
    depends_on:
      - postgres
      - consul
    networks:
      - deepintshield-net

  deepintshield-2:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8081:8080"
    depends_on:
      - postgres
      - consul
    networks:
      - deepintshield-net

  deepintshield-3:
    image: <enterprise_repo_base_url>/deepintshield:latest
    environment:
      - DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json
    volumes:
      - ./config.json:/etc/deepintshield/config.json
    ports:
      - "8082:8080"
    depends_on:
      - postgres
      - consul
    networks:
      - deepintshield-net

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - deepintshield-1
      - deepintshield-2
      - deepintshield-3
    networks:
      - deepintshield-net

volumes:
  postgres_data:

networks:
  deepintshield-net:
    driver: bridge
```

nginx.conf for load balancing:
```nginx
events {
    worker_connections 1024;
}

http {
    upstream bifrost_cluster {
        least_conn;
        server deepintshield-1:8080 max_fails=3 fail_timeout=30s;
        server deepintshield-2:8080 max_fails=3 fail_timeout=30s;
        server deepintshield-3:8080 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://bifrost_cluster;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }

        location /health {
            access_log off;
            return 200 "healthy\n";
            add_header Content-Type text/plain;
        }
    }
}
```

Production-ready Kubernetes deployment with StatefulSet:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: deepintshield-config
  namespace: deepintshield
data:
  config.json: |
    {
      "cluster_config": {
        "enabled": true,
        "discovery": {
          "enabled": true,
          "type": "kubernetes",
          "service_name": "deepintshield-cluster",
          "k8s_namespace": "deepintshield",
          "k8s_label_selector": "app=deepintshield,component=gateway"
        },
        "gossip": {
          "port": 10101,
          "config": {
            "timeout_seconds": 10,
            "success_threshold": 3,
            "failure_threshold": 3
          }
        }
      },
      "config_store": {
        "enabled": true,
        "type": "postgres",
        "config": {
          "host": "postgres.deepintshield.svc.cluster.local",
          "port": "5432",
          "user": "deepintshield",
          "password": "changeme",
          "db_name": "deepintshield",
          "ssl_mode": "require"
        }
      }
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deepintshield
  namespace: deepintshield
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deepintshield-pod-reader
  namespace: deepintshield
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deepintshield-pod-reader
  namespace: deepintshield
subjects:
- kind: ServiceAccount
  name: deepintshield
  namespace: deepintshield
roleRef:
  kind: Role
  name: deepintshield-pod-reader
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: deepintshield
  namespace: deepintshield
spec:
  serviceName: deepintshield-cluster
  replicas: 3
  selector:
    matchLabels:
      app: deepintshield
      component: gateway
  template:
    metadata:
      labels:
        app: deepintshield
        component: gateway
    spec:
      serviceAccountName: deepintshield
      containers:
      - name: deepintshield
        image: <enterprise_repo_base_url>/deepintshield:latest
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 10101
          name: gossip
          protocol: TCP
        env:
        - name: DEEPINTSHIELD_CONFIG
          value: /etc/deepintshield/config.json
        volumeMounts:
        - name: config
          mountPath: /etc/deepintshield
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "2000m"
            memory: "2Gi"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
      volumes:
      - name: config
        configMap:
          name: deepintshield-config
---
apiVersion: v1
kind: Service
metadata:
  name: deepintshield-cluster
  namespace: deepintshield
spec:
  clusterIP: None
  selector:
    app: deepintshield
    component: gateway
  ports:
  - port: 10101
    name: gossip
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: deepintshield
  namespace: deepintshield
spec:
  type: LoadBalancer
  selector:
    app: deepintshield
    component: gateway
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: deepintshield-pdb
  namespace: deepintshield
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: deepintshield
      component: gateway
```

For bare metal or VM deployments using systemd:
Step 1: Install DeepIntShield on each node
```bash
# Download DeepIntShield Enterprise binary
curl -O https://releases.getmaxim.ai/deepintshield-enterprise/latest/deepintshield-enterprise-linux-amd64
chmod +x deepintshield-enterprise-linux-amd64
sudo mv deepintshield-enterprise-linux-amd64 /usr/local/bin/deepintshield-enterprise
```

Step 2: Create configuration file

```bash
sudo mkdir -p /etc/deepintshield
# Use tee so the write itself runs as root ("sudo cat > file" redirects as the calling user)
sudo tee /etc/deepintshield/config.json > /dev/null <<EOF
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "dns",
      "service_name": "deepintshield-cluster",
      "dns_names": ["deepintshield-cluster.internal.company.com"]
    },
    "gossip": {
      "port": 10101
    }
  },
  "config_store": {
    "enabled": true,
    "type": "postgres",
    "config": {
      "host": "postgres.internal.company.com",
      "port": "5432",
      "user": "deepintshield",
      "password": "secure_password",
      "db_name": "deepintshield",
      "ssl_mode": "require"
    }
  }
}
EOF
```

Step 3: Create systemd service

```bash
sudo tee /etc/systemd/system/deepintshield.service > /dev/null <<EOF
[Unit]
Description=DeepIntShield Enterprise API Gateway
After=network.target

[Service]
Type=simple
User=deepintshield
Group=deepintshield
Environment="DEEPINTSHIELD_CONFIG=/etc/deepintshield/config.json"
ExecStart=/usr/local/bin/deepintshield-enterprise
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/deepintshield

[Install]
WantedBy=multi-user.target
EOF
```

Step 4: Setup DNS records

```bash
# Add A records for deepintshield-cluster.internal.company.com
# pointing to all node IPs:
# 10.0.1.10 (node1)
# 10.0.1.11 (node2)
# 10.0.1.12 (node3)
```

Step 5: Start and enable service

```bash
sudo useradd -r -s /bin/false deepintshield
sudo mkdir -p /var/lib/deepintshield
sudo chown deepintshield:deepintshield /var/lib/deepintshield
sudo systemctl daemon-reload
sudo systemctl enable deepintshield
sudo systemctl start deepintshield
sudo systemctl status deepintshield
```

Step 6: Verify cluster formation

```bash
# Check logs on each node
sudo journalctl -u deepintshield -f

# Look for messages like:
# "successfully joined X peers on startup"
# "cluster health: HEALTHY"
```

Symptoms: Each node thinks it’s the only member
Common Causes & Solutions:
- Verify `discovery.enabled: true` and `discovery.type` is set
- Verify all nodes use the same `service_name`

Symptoms: Nodes divided into separate clusters
Common Causes & Solutions:
Symptoms: Memory grows over time, especially in large clusters
Common Causes & Solutions:
Symptoms: Nodes repeatedly join and leave cluster
Common Causes & Solutions:
- Increase `timeout_seconds` in the gossip config

Symptoms: Broadcast queue errors, messages not propagating
Common Causes & Solutions:
Key log messages to look for:
```
✅ Successful cluster formation:
- "successfully joined X peers on startup"
- "cluster health: HEALTHY"
- "discovered X nodes"

⚠️ Warning signs:
- "no new nodes discovered"
- "failed to join cluster"
- "cluster health: NOT HEALTHY"
- "node marked as suspect"

❌ Errors:
- "discovery failed"
- "failed to broadcast"
- "timeout waiting for response"
```

Monitor cluster health via HTTP endpoints:

```bash
# Check if node is healthy
curl http://localhost:8080/health

# Get cluster status (if exposed)
curl http://localhost:8080/cluster/status

# Expected response shows all cluster members
{
  "local_node": "deepintshield-remote-10101-...",
  "members": 3,
  "healthy_members": 3,
  "cluster_health": "HEALTHY"
}
```

This clustering implementation ensures DeepIntShield can handle enterprise-scale deployments with high availability, automatic service discovery, and intelligent traffic distribution across any infrastructure.