How to Secure Cloud SQL Connections with Auth Proxy
A comprehensive tutorial on implementing Cloud SQL Auth Proxy with IAM-based authentication to secure database connections in Google Cloud environments.
Securing database connections is a critical requirement for data engineers working with Google Cloud. In this tutorial, you'll learn how to secure Cloud SQL connections with Auth Proxy and IAM authentication, a topic that frequently appears on the Professional Data Engineer certification exam. By the end of this guide, you'll have implemented a production-ready connection method that encrypts traffic, hides database IPs, and uses IAM for access control.
The Cloud SQL Auth Proxy provides a secure intermediary between your applications and Cloud SQL instances. Rather than exposing your database directly to network connections, the Auth Proxy handles authentication through IAM service accounts, encrypts all traffic automatically, and prevents direct access attempts by concealing the database IP address. This approach is essential knowledge for anyone preparing for the Professional Data Engineer exam, where securing data infrastructure is a core competency.
What You'll Accomplish
This tutorial will guide you through setting up Cloud SQL Auth Proxy to connect a containerized application to a Cloud SQL database. You'll configure IAM permissions, deploy the Auth Proxy as a sidecar container, and verify that connections are properly secured. The result is a connection pattern you can apply to any application running on Google Cloud, whether on Google Kubernetes Engine, Cloud Run, Compute Engine, or other services.
Prerequisites and Requirements
Before starting this implementation, ensure you have a Google Cloud project with billing enabled, the gcloud CLI installed and configured on your local machine, and the Project Editor or Owner role (or the narrower Cloud SQL Admin and Project IAM Admin roles). You'll also need a Cloud SQL instance already created. This tutorial uses PostgreSQL, but the Auth Proxy works with MySQL and SQL Server as well. If deploying on GKE, you should have basic familiarity with Kubernetes concepts. Plan for approximately 45 minutes to complete all steps.
You can verify your gcloud configuration with this command:
gcloud config list
gcloud auth list
How Cloud SQL Auth Proxy Works
The Cloud SQL Auth Proxy serves as a secure connector between your application and your Cloud SQL database. When your application needs to query the database, it sends requests to the Auth Proxy running locally (typically on localhost:5432 for PostgreSQL or localhost:3306 for MySQL). The Auth Proxy then authenticates using a service account with the cloudsql.instances.connect permission, establishes an encrypted connection to the Cloud SQL instance, and forwards the database traffic.
This architecture provides three key security benefits. First, your application never needs direct network access to the database, eliminating the need to configure firewall rules or whitelist IP addresses. Second, all traffic between the Auth Proxy and Cloud SQL travels through an encrypted channel, protecting sensitive data in transit. Third, IAM policies control which service accounts can connect, giving you centralized access management through Google Cloud's identity platform.
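To make the localhost-only pattern concrete, here is a minimal sketch (the class and field names are illustrative, not part of any Google library) of what an application's database settings look like behind the proxy: the app configures only a loopback address and leaves reaching Cloud SQL entirely to the proxy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatabaseConfig:
    """Connection settings for an app sitting behind the Auth Proxy.

    The host is always loopback -- the proxy, not the app, knows the
    real Cloud SQL endpoint, so no firewall rules or IPs appear here.
    """
    user: str
    name: str
    host: str = "127.0.0.1"   # the Auth Proxy listens locally
    port: int = 5432          # default PostgreSQL port

    def dsn(self) -> str:
        # libpq-style DSN; the password is supplied separately at runtime
        return f"host={self.host} port={self.port} dbname={self.name} user={self.user}"

config = DatabaseConfig(user="app_user", name="transactions")
print(config.dsn())
```

Because the host and port never change per environment, the same configuration works unmodified on GKE, Cloud Run, or Compute Engine.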
Step 1: Create a Service Account for Database Access
The first step is creating a dedicated service account that your application will use to authenticate to Cloud SQL. This follows the principle of least privilege by granting only the permissions necessary for database connectivity.
Create the service account with this command:
gcloud iam service-accounts create cloudsql-proxy-user \
--display-name="Cloud SQL Proxy Service Account" \
--description="Service account for application database access"
This creates a service account named cloudsql-proxy-user in your Google Cloud project. You'll reference this account in subsequent steps when configuring permissions and deploying the Auth Proxy.
Step 2: Grant Required IAM Permissions
The service account needs the cloudsql.instances.connect permission to establish connections through the Auth Proxy. Google Cloud provides a predefined role called Cloud SQL Client that includes this permission along with other necessary capabilities.
Assign the role to your service account:
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member="serviceAccount:cloudsql-proxy-user@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/cloudsql.client"
Replace YOUR_PROJECT_ID with your actual project ID. This command grants the service account permission to connect to any Cloud SQL instance in your project. For production environments where you want tighter control, you can grant permissions at the instance level instead of the project level.
Verify the binding was created successfully:
gcloud projects get-iam-policy YOUR_PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:cloudsql-proxy-user@YOUR_PROJECT_ID.iam.gserviceaccount.com"
You should see the roles/cloudsql.client role listed in the output, confirming that the service account has the necessary permissions.
Step 3: Obtain Your Cloud SQL Instance Connection Name
The Auth Proxy needs to know which Cloud SQL instance to connect to. Each instance has a unique connection name in the format PROJECT_ID:REGION:INSTANCE_NAME.
Retrieve your instance connection name:
gcloud sql instances describe YOUR_INSTANCE_NAME \
--format="value(connectionName)"
Save this connection name as you'll need it when configuring the Auth Proxy. For example, if your project ID is data-platform-prod, your region is us-central1, and your instance name is postgres-main, the connection name would be data-platform-prod:us-central1:postgres-main.
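Because a malformed connection name is one of the most common deployment mistakes (see the troubleshooting section), it can be worth validating the string before it reaches a manifest. The helper below is a convenience sketch, not part of the proxy itself; the regex approximates the project-ID naming rules.

```python
import re

# PROJECT_ID:REGION:INSTANCE_NAME -- project IDs are 6-30 chars,
# lowercase letters, digits, and hyphens, starting with a letter.
_CONN_NAME = re.compile(
    r"^(?P<project>[a-z][a-z0-9-]{4,28}[a-z0-9])"
    r":(?P<region>[a-z0-9-]+)"
    r":(?P<instance>[a-z][a-z0-9-]*)$"
)

def parse_connection_name(name: str) -> dict:
    """Split a connection name into its parts, raising on malformed input."""
    match = _CONN_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid connection name: {name!r}")
    return match.groupdict()

parts = parse_connection_name("data-platform-prod:us-central1:postgres-main")
print(parts)
```

Running this in a pre-deploy check catches typos like a missing region segment before the proxy fails at startup.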
Step 4: Deploy Cloud SQL Auth Proxy on Google Kubernetes Engine
For applications running on GKE, the recommended pattern is deploying the Auth Proxy as a sidecar container alongside your application container. This places the proxy in the same pod, allowing your application to connect via localhost.
Here's a complete Kubernetes deployment manifest that includes both an application container and the Cloud SQL Auth Proxy sidecar:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-processor
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-processor
  template:
    metadata:
      labels:
        app: payment-processor
    spec:
      serviceAccountName: cloudsql-proxy-user
      containers:
      - name: payment-app
        image: gcr.io/YOUR_PROJECT_ID/payment-processor:latest
        env:
        - name: DB_HOST
          value: "127.0.0.1"
        - name: DB_PORT
          value: "5432"
        - name: DB_NAME
          value: "transactions"
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
      - name: cloud-sql-proxy
        image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:latest
        args:
        - "--structured-logs"
        - "--port=5432"
        - "YOUR_PROJECT_ID:us-central1:postgres-main"
        securityContext:
          runAsNonRoot: true
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
This configuration demonstrates several important patterns. The application container connects to 127.0.0.1:5432, which is where the Auth Proxy listens. The proxy container uses the latest official image from Google Cloud and specifies the instance connection name as a command argument. Both containers share the same network namespace within the pod, making localhost connectivity possible.
The serviceAccountName field tells GKE to use the service account you created earlier. For this to work, you need to enable Workload Identity and bind the Kubernetes service account to the Google Cloud service account:
kubectl create serviceaccount cloudsql-proxy-user
gcloud iam service-accounts add-iam-policy-binding \
cloudsql-proxy-user@YOUR_PROJECT_ID.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:YOUR_PROJECT_ID.svc.id.goog[default/cloudsql-proxy-user]"
kubectl annotate serviceaccount cloudsql-proxy-user \
iam.gke.io/gcp-service-account=cloudsql-proxy-user@YOUR_PROJECT_ID.iam.gserviceaccount.com
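The Workload Identity member string in the binding above is easy to mistype, and a wrong namespace or service account name fails silently at bind time. A small helper (illustrative only; it just composes the string format shown above) can keep automation scripts consistent:

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Build the IAM member string that binds a Kubernetes service account
    to a Google service account via Workload Identity."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"

member = workload_identity_member("YOUR_PROJECT_ID", "default", "cloudsql-proxy-user")
print(member)
```

Note that the namespace inside the brackets must match where the Kubernetes service account actually lives; deploying the pod in a different namespace breaks the binding.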
Step 5: Configure Connection Pooling and Timeouts
For production deployments, you should configure connection pooling parameters to handle database connections efficiently. The Auth Proxy itself doesn't pool connections, so your application needs to implement pooling.
Here's an example Python configuration using SQLAlchemy that works well with the Auth Proxy:
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool
import os

# Connection details are injected through the environment (see the
# deployment manifest above); the host is the local Auth Proxy.
db_user = os.environ['DB_USER']
db_password = os.environ['DB_PASSWORD']
db_name = os.environ['DB_NAME']
db_host = os.environ['DB_HOST']
db_port = os.environ['DB_PORT']

connection_string = f"postgresql://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}"

engine = create_engine(
    connection_string,
    poolclass=QueuePool,
    pool_size=5,          # steady-state connections held open
    max_overflow=10,      # extra connections allowed under load
    pool_timeout=30,      # seconds to wait for a free connection
    pool_recycle=1800,    # recycle connections older than 30 minutes
    pool_pre_ping=True    # test each connection before handing it out
)

def get_connection():
    return engine.connect()
The pool_pre_ping option is particularly important when using the Auth Proxy. It tests each connection before using it, which helps detect when the proxy restarts or when connections become stale.
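Even with pre-ping enabled, a query can still fail mid-flight if the proxy restarts between the ping and the query itself. A small retry helper covers that window. This is a sketch: the back-off values are illustrative, and in a real application the retryable exception tuple would name the driver's transient errors (for example sqlalchemy.exc.OperationalError) rather than the built-in ConnectionError used here for demonstration.

```python
import time

def run_with_retry(operation, retries=3, base_delay=0.5, retryable=(ConnectionError,)):
    """Run `operation`, retrying transient connection errors with
    exponential back-off (base_delay doubling per attempt)."""
    for attempt in range(retries):
        try:
            return operation()
        except retryable:
            if attempt == retries - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Demo with a stand-in operation that fails once, then succeeds.
calls = {"n": 0}
def flaky_query():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("proxy restarting")
    return "ok"

result = run_with_retry(flaky_query, base_delay=0.01)
print(result)
```

Keep the retry count small: if the proxy is down for more than a few seconds, failing fast and letting the orchestrator restart the pod is usually better than queueing work.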
Step 6: Verify Secure Connectivity
After deploying your application with the Auth Proxy, verify that connections are working correctly and using the secure path.
Check the Auth Proxy logs to confirm successful connections:
kubectl logs -l app=payment-processor -c cloud-sql-proxy --tail=50
You should see log entries indicating successful authentication and connection establishment. Look for messages about the Auth Proxy version, the instance it's connecting to, and successful connection attempts.
Test database connectivity from your application pod:
kubectl exec -it deployment/payment-processor -c payment-app -- \
psql -h 127.0.0.1 -U YOUR_DB_USER -d transactions -c "SELECT version();"
A successful query result confirms that your application can reach the database through the Auth Proxy. The connection travels through the encrypted tunnel without requiring any firewall rules or IP whitelisting.
Real-World Application Scenarios
The Cloud SQL Auth Proxy pattern applies across many industries and use cases where secure database access is required.
A telehealth platform processing patient data uses the Auth Proxy to connect their appointment scheduling service on Cloud Run to a Cloud SQL PostgreSQL database storing medical records. The Auth Proxy ensures that protected health information travels through encrypted channels and that only authorized services can access patient data. The IAM-based authentication integrates with their broader security model, where each microservice has a dedicated service account with specific permissions. This architecture helped them achieve HIPAA compliance by eliminating direct database access and providing audit trails through Cloud Logging.
A mobile game studio running player matchmaking services on GKE uses the Auth Proxy to connect to a Cloud SQL MySQL database that stores player profiles and game state. During peak hours when player counts spike, they scale their GKE deployment from 10 to 100 pods. Each pod includes an Auth Proxy sidecar, and the Auth Proxy handles connection lifecycle automatically. The game studio doesn't need to manage database credentials in their application code since the Auth Proxy uses service account authentication. This simplifies their deployment pipeline and reduces the risk of credential leaks.
A solar farm monitoring system collects telemetry from thousands of panels and stores time-series data in Cloud SQL. Their data ingestion pipeline runs on Compute Engine instances across multiple regions. Each instance runs the Auth Proxy as a systemd service, writing to the primary instance and serving dashboards from nearby Cloud SQL read replicas for low-latency reads. The Auth Proxy automatically handles authentication using the Compute Engine default service account, which has been granted the Cloud SQL Client role. This setup allows them to add new monitoring locations quickly without reconfiguring database access for each site.
Running Cloud SQL Auth Proxy on Compute Engine
For applications running on Compute Engine instances, you can install the Auth Proxy as a standalone binary and run it as a system service.
Download and install the Auth Proxy binary on a Compute Engine instance. The --structured-logs and --port flags used throughout this tutorial belong to version 2 of the proxy, so download a v2 release (substitute the latest version number from the project's releases page):
wget https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.14.0/cloud-sql-proxy.linux.amd64 -O cloud-sql-proxy
chmod +x cloud-sql-proxy
sudo mv cloud-sql-proxy /usr/local/bin/
Create a dedicated system user for the proxy, since the unit file below runs as cloudsqlproxy:
sudo useradd --system --shell /usr/sbin/nologin cloudsqlproxy
Now create a systemd service file at /etc/systemd/system/cloud-sql-proxy.service:
[Unit]
Description=Cloud SQL Auth Proxy
After=network.target

[Service]
Type=simple
User=cloudsqlproxy
ExecStart=/usr/local/bin/cloud-sql-proxy \
  --structured-logs \
  --port=5432 \
  YOUR_PROJECT_ID:us-central1:postgres-main
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
Start and enable the service:
sudo systemctl daemon-reload
sudo systemctl start cloud-sql-proxy
sudo systemctl enable cloud-sql-proxy
sudo systemctl status cloud-sql-proxy
The Auth Proxy will now start automatically when the instance boots and restart if it crashes. Your applications on the Compute Engine instance can connect to the database at localhost:5432.
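Because systemd starts the proxy asynchronously, an application booting on the same instance can race it and fail its first connection attempt. A small readiness probe can block until the proxy's local port accepts connections. This is a sketch; the default host and port are assumptions matching the unit file above, and the demo stands up a throwaway local listener in place of the real proxy so the example is self-contained.

```python
import socket
import time

def wait_for_proxy(host="127.0.0.1", port=5432, timeout=30.0, interval=0.2):
    """Poll until a TCP connection to the proxy's local port succeeds,
    returning False if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Demo against a throwaway local listener standing in for the proxy.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))       # OS-assigned free port
listener.listen(1)
demo_port = listener.getsockname()[1]
ready = wait_for_proxy(port=demo_port, timeout=5.0)
listener.close()
print(ready)
```

Calling a check like this at application startup (or in a systemd ExecStartPre for the app's own unit) removes the race without hard-coding sleep delays.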
Common Issues and Troubleshooting
Several common issues can prevent the Auth Proxy from connecting successfully to Cloud SQL.
If you see an error message about missing permissions, verify that your service account has the cloudsql.instances.connect permission. The error typically looks like "failed to connect: ensure that the account has the required permissions." Double-check your IAM bindings and confirm that you're using the correct service account in your deployment configuration.
Connection timeout errors often indicate that the instance connection name is incorrect or that the Cloud SQL instance is in a different project than expected. Verify the connection name format matches PROJECT_ID:REGION:INSTANCE_NAME exactly. You can list all available instances with:
gcloud sql instances list --format="table(name,region,connectionName)"
If the Auth Proxy starts successfully but your application cannot connect, check that you're connecting to the correct port. The Auth Proxy defaults to port 5432 for PostgreSQL and 3306 for MySQL, but you can override this with the --port flag. Make sure your application connection string uses 127.0.0.1 or localhost as the host.
For Workload Identity issues on GKE, verify that the Kubernetes service account is properly annotated and that the Google Cloud service account has the iam.workloadIdentityUser role for the Kubernetes service account. You can test Workload Identity with:
kubectl run -it --rm workload-identity-test \
  --image=google/cloud-sdk:slim \
  --overrides='{"spec": {"serviceAccountName": "cloudsql-proxy-user"}}' \
  --command -- gcloud auth list
The output should show your Google Cloud service account, confirming that Workload Identity is configured correctly. (The pod's service account is set with --overrides here because recent kubectl releases removed the older --serviceaccount flag.)
Monitoring and Logging
Google Cloud provides comprehensive monitoring for Cloud SQL connections through the Auth Proxy. The Auth Proxy writes structured logs to stdout, which GKE and Cloud Run automatically forward to Cloud Logging.
View Auth Proxy logs in Cloud Logging:
gcloud logging read "resource.type=k8s_container AND \
resource.labels.container_name=cloud-sql-proxy" \
--limit=50 \
--format=json
You can set up log-based metrics to track connection attempts, failures, and authentication errors. This helps you detect issues before they impact users.
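Structured logs are newline-delimited JSON, which also makes ad-hoc analysis of exported logs straightforward. The helper below is a sketch for such analysis; the sample lines are fabricated for illustration and only assume the standard severity field of structured logging, not verbatim proxy output.

```python
import json

def count_errors(log_lines):
    """Count structured-log entries whose severity is ERROR."""
    errors = 0
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON lines defensively
        if entry.get("severity") == "ERROR":
            errors += 1
    return errors

# Illustrative sample lines (shape assumed, not verbatim proxy output).
sample = [
    '{"severity": "INFO", "message": "ready for new connections"}',
    '{"severity": "ERROR", "message": "failed to connect"}',
    'not json',
]
print(count_errors(sample))
```

The same severity field is what a log-based metric would filter on in Cloud Logging, so a quick local script and the production alert share one definition of "error."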
Cloud SQL also provides connection metrics that show active connections, connection attempts, and failed connection attempts. You can view these in the Cloud Console under your Cloud SQL instance monitoring tab, or query them programmatically using the Cloud Monitoring API.
Security Best Practices
When using the Cloud SQL Auth Proxy in production, follow these security recommendations to maintain a strong security posture.
Use dedicated service accounts for each application or workload. Avoid sharing service accounts across multiple services, as this makes it difficult to audit access and increases the blast radius if credentials are compromised. Each service account should have only the permissions it needs.
Store database credentials (username and password) in Secret Manager rather than environment variables or configuration files. While the Auth Proxy handles authentication to Cloud SQL, your application still needs database credentials to authenticate to PostgreSQL or MySQL itself. Retrieve secrets at runtime using the Secret Manager API.
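One practical wrinkle with runtime retrieval is that calling Secret Manager on every request adds latency and API cost. A thin TTL cache around whatever fetch function you use keeps lookups fast while still picking up rotations. In this sketch the fetcher is injected as a plain callable so the example stays self-contained; in production it would wrap the Secret Manager client's access call.

```python
import time

class CachedSecret:
    """Cache a secret value for `ttl` seconds before re-fetching.

    `fetch` is any zero-argument callable returning the secret; in a
    real service it would call the Secret Manager API.
    """
    def __init__(self, fetch, ttl=300.0):
        self._fetch = fetch
        self._ttl = ttl
        self._value = None
        self._expires = 0.0

    def get(self):
        now = time.monotonic()
        if self._value is None or now >= self._expires:
            self._value = self._fetch()     # refresh from the source
            self._expires = now + self._ttl
        return self._value

# Demo with a stand-in fetcher that records how often it is called.
fetch_count = {"n": 0}
def fake_fetch():
    fetch_count["n"] += 1
    return "s3cret"

secret = CachedSecret(fake_fetch, ttl=60.0)
first, second = secret.get(), secret.get()
print(first, fetch_count["n"])
```

Pick a TTL shorter than your rotation window so a rotated database password propagates without a restart.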
Enable Cloud Audit Logs for Cloud SQL to track all connection attempts and administrative actions. This provides a complete audit trail for compliance requirements and security investigations. You can filter audit logs to see which service accounts connected to which databases at specific times.
Regularly rotate your database passwords and service account keys if you're using key-based authentication. The Auth Proxy supports automatic credential rotation when using Workload Identity or the Compute Engine default service account, which is preferable to downloading and managing service account keys manually.
Use Private IP for your Cloud SQL instances when possible. While the Auth Proxy encrypts traffic and hides the database IP, using Private IP adds another layer of defense by ensuring the database is not accessible from the public internet at all.
Integration with Other GCP Services
The Cloud SQL Auth Proxy integrates well with other Google Cloud services to support complete data pipelines and application architectures.
Cloud Run applications can use the Auth Proxy through Cloud Run's built-in Cloud SQL connector, which automatically deploys and configures the proxy as a sidecar. Simply add a Cloud SQL connection annotation to your Cloud Run service, and the platform handles the rest. This works particularly well for serverless data APIs that need to query Cloud SQL on demand.
Dataflow pipelines can connect to Cloud SQL using the Auth Proxy running on the worker instances. This pattern is useful for enriching streaming data with reference data stored in Cloud SQL or for writing aggregated results back to a relational database. Configure the Auth Proxy on your custom Dataflow worker images to enable secure database access.
Cloud Functions can connect to Cloud SQL using VPC connectors combined with Private IP, or by running the Auth Proxy within the function execution environment. For Python and Node.js functions, you can include the Auth Proxy in a custom container runtime. This allows event-driven functions to query or update Cloud SQL in response to Pub/Sub messages, Cloud Storage events, or HTTP requests.
BigQuery can load data from Cloud SQL using federated queries or scheduled transfers. While these don't use the Auth Proxy directly, they benefit from the same IAM-based access controls. You can build complete data warehousing workflows where operational data lands in Cloud SQL and analytical queries run in BigQuery.
Cost Optimization
Using the Cloud SQL Auth Proxy has minimal cost impact, but you should be aware of a few considerations when planning your budget.
The Auth Proxy itself is free software with no licensing costs. However, running it consumes compute resources. In a GKE sidecar pattern, each pod runs its own Auth Proxy instance, which requires memory (typically 256Mi) and CPU (typically 100m). For large deployments with hundreds of pods, these resource requests add up and affect your GKE cluster size.
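The overhead is easy to quantify. Using the sidecar requests from the manifest earlier (256Mi memory and 100m CPU per pod, both tunable assumptions), total proxy reservation scales linearly with replica count:

```python
def sidecar_overhead(pods, mem_mi_per_pod=256, cpu_millicores_per_pod=100):
    """Total memory (Mi) and CPU (millicores) reserved by proxy sidecars
    across a deployment of `pods` replicas."""
    return pods * mem_mi_per_pod, pods * cpu_millicores_per_pod

mem_mi, cpu_m = sidecar_overhead(100)
# 100 pods reserve 25600 Mi (~25 GiB) of memory and 10000m (10 cores)
# for sidecars alone -- enough to influence cluster sizing.
print(f"{mem_mi} Mi memory, {cpu_m} m CPU")
```

Running this arithmetic against your actual replica counts makes the sidecar-versus-shared-proxy trade-off discussed below a concrete capacity decision rather than a guess.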
One optimization is using a shared Auth Proxy deployment instead of sidecars. Deploy the Auth Proxy as a separate deployment with a Kubernetes service, and have your application pods connect to it over the cluster network. This reduces the total number of Auth Proxy instances but adds network hops and creates a potential bottleneck.
Cloud SQL charges for network egress when data leaves Google Cloud, but traffic within the same region is free. Ensure your application and Cloud SQL instance are in the same region to avoid egress charges. The Auth Proxy connection itself doesn't add extra egress costs since it's just a secure tunnel.
Next Steps and Advanced Configuration
Once you have basic Auth Proxy connectivity working, you can explore more advanced configurations and features.
The Auth Proxy supports connecting to multiple Cloud SQL instances simultaneously by specifying multiple connection names as arguments. This is useful for applications that need to access different databases for different purposes, such as separating read and write operations across primary and replica instances.
For high-availability setups, configure your application to connect to Cloud SQL regional instances rather than zonal instances. Regional instances automatically failover to a different zone if the primary zone becomes unavailable. The Auth Proxy handles reconnection automatically during failover events.
You can enable automatic IAM database authentication, which allows you to authenticate to the database itself using IAM credentials instead of passwords. This is supported for Cloud SQL for PostgreSQL and for MySQL 8.0 (not SQL Server), and eliminates the need to manage database passwords entirely. Your application authenticates through the Auth Proxy using IAM (enabled with the proxy's --auto-iam-authn flag), and the database validates the IAM credential directly.
The Cloud SQL Admin API allows you to automate instance creation, configuration, and management. Combine the API with Infrastructure as Code tools like Terraform to provision Cloud SQL instances and configure Auth Proxy access programmatically. This supports repeatable deployments across development, staging, and production environments.
Summary
You've now implemented secure Cloud SQL connectivity using the Cloud SQL Auth Proxy with IAM-based authentication. You created a dedicated service account with the necessary permissions, deployed the Auth Proxy as a sidecar container on GKE, and verified that your application can connect to Cloud SQL through an encrypted tunnel. This pattern eliminates the need for IP whitelisting, automatically encrypts database traffic, and centralizes access control through Google Cloud IAM.
These skills are essential for the Professional Data Engineer certification exam, where you'll need to demonstrate understanding of secure data access patterns, IAM integration, and GCP service connectivity. The Auth Proxy represents a best practice for production database access that balances security, operational simplicity, and performance. For comprehensive preparation covering Cloud SQL, security, and all other Professional Data Engineer exam topics, check out the Professional Data Engineer course.