Using GCP Ops Agent to view Squid Logs


The VMs were deployed via Terraform using instance templates, managed instance groups, and an internal TCP/UDP load balancer with a forwarding rule for port 3128. Debian 11 (Bullseye) was selected as the OS because it has a low memory footprint while still offering a pre-packaged version of Squid 4.

The first problem was that the older Stackdriver agent isn’t compatible with Debian 11, so I had to install the newer Ops Agent. I added these lines to my startup script, pulling the install script directly from a bucket so the VMs don’t need Internet access:

gsutil cp gs://public-j5-org/ /tmp/
bash /tmp/ --also-install

After re-deploying the VMs, I ssh’d in and verified the Ops Agent was installed and running:

sudo systemctl status google-cloud-ops-agent"*"

google-cloud-ops-agent-opentelemetry-collector.service - Google Cloud Ops Agent - Metrics Agent
     Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-opentelemetry-collector.service; static)
     Active: active (running) since Fri 2023-02-10 22:18:17 UTC; 18min ago
    Process: 4317 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=otel -in /etc/google-cloud-ops-agent/config.yaml -logs ${LOGS_DIRECTORY} (code=exited, status=0/>
   Main PID: 4350 (otelopscol)
      Tasks: 7 (limit: 1989)
     Memory: 45.7M
        CPU: 1.160s

After waiting a couple of minutes, I still didn’t see any data, so I downloaded and ran the diagnostic script:

gsutil cp gs://public-j5-org/ /tmp/
bash /tmp/

This was confusing: the script itself didn’t show any errors, but the actual log was written to disk in a sub-directory of /var/tmp/google-agents/, and the agent-info.txt file there did indicate a problem:

API Check - Result: FAIL, Error code: LogApiPermissionErr, Failure:
 Service account is missing the roles/logging.logWriter role., Solution: Add the roles/logging.logWriter role to the Google Cloud service account., Res
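Since the script's console output hides the failure, it can help to search the dumped files directly. Here is a minimal sketch, assuming GNU grep (the default on Debian) and that failed checks always contain the string "Result: FAIL" as in the output above. The function name find_agent_failures is my own, and because the sub-directory name varies, it searches recursively:

```shell
# Print every failed check recorded in any agent-info.txt under the given
# directory (defaults to the location the diagnostic script writes to).
find_agent_failures() {
  dir="${1:-/var/tmp/google-agents}"
  # -r: recurse into sub-directories, -h: omit file names from the output,
  # --include: only look at files named agent-info.txt (GNU grep option)
  grep -rh --include='agent-info.txt' 'Result: FAIL' "$dir"
}
```

Running find_agent_failures with no arguments on the VM should print lines like the API Check failure above; it prints nothing and returns non-zero when no check failed.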

And this made sense, because for the Ops Agent to function, the service account must be granted these two IAM roles:

  • Monitoring > Monitoring Metric Writer.
  • Logging > Logs Writer.

Here’s a Terraform snippet that will do that:

# Add required IAM permissions for Ops Agents
locals {
  roles = ["logging.logWriter", "monitoring.metricWriter"]
}
# One IAM binding per role; skipped entirely if no service account is set
resource "google_project_iam_member" "default" {
  for_each = var.service_account_email != null ? { for i, v in local.roles : i => v } : {}
  project  = var.project_id
  member   = "serviceAccount:${var.service_account_email}"
  role     = "roles/${each.value}"
}
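For a quick one-off test outside Terraform, the same two roles could also be granted with gcloud. This is only a sketch: the function name grant_ops_agent_roles and the DRY_RUN switch are my own additions, and the project ID and service account email are placeholders you supply. Keep in mind that bindings added this way are not tracked in Terraform state.

```shell
# Grant the two roles the Ops Agent needs to a service account.
# Usage: grant_ops_agent_roles PROJECT_ID SERVICE_ACCOUNT_EMAIL
# Set DRY_RUN=1 to print the gcloud commands instead of running them.
grant_ops_agent_roles() {
  project_id="$1"
  sa_email="$2"
  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"   # prefix each command with echo so it is only printed
  fi
  for role in roles/logging.logWriter roles/monitoring.metricWriter; do
    $run gcloud projects add-iam-policy-binding "$project_id" \
      --member="serviceAccount:${sa_email}" \
      --role="$role"
  done
}
```

With DRY_RUN=1 set, the function prints the two gcloud commands so they can be reviewed before anything is changed.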

Within a few minutes of adding these, data started showing up in the graphs.