Disabling IPv6 in Debian

Oddly, I never had this problem until today:

Cannot initiate the connection to debian.map.fastly.net:443 (2a04:4e42::644). - Cannot initiate the connection to deb.debian.org:443 (2a04:4e42::644). - connect (101: Network is unreachable)
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

Adding these lines to /etc/sysctl.conf will fix it:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
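
To apply the change without a reboot, reloading sysctl should pick the new settings up:

sudo sysctl -p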


Streaming Squid Logs to GCP Logging / StackDriver

I’m still using Squid instead of SWP (Secure Web Proxy) in GCP as a forward proxy because…well…it’s much cheaper. The only real shortcoming/limitation has been around logging and reporting: I don’t have a third-party logging setup like Splunk or an ELK stack, so it basically came down to tail -f on raw logfiles (though I did at least push them to a centralized bucket via a 1-minute cron job).

Sending 3rd party application logs to GCP StackDriver is a relatively simple process; I just couldn’t find a specific example for Squid.


If not already done, install the Ops Agent:

cd /tmp
wget https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash ./add-google-cloud-ops-agent-repo.sh --also-install 
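
To confirm the agent actually installed and started, a quick status check doesn’t hurt:

sudo systemctl status google-cloud-ops-agent --no-pager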

And make sure the Service Account for the VM has these roles (a gcloud sketch for granting them follows the list):

  • logging.logWriter
  • monitoring.metricWriter
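
If you’d rather grant those from the CLI, here’s a sketch; the project ID and service account name below are placeholders:

PROJECT_ID=my-project
SA=squid-proxy@${PROJECT_ID}.iam.gserviceaccount.com

gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member="serviceAccount:${SA}" --role="roles/logging.logWriter"
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member="serviceAccount:${SA}" --role="roles/monitoring.metricWriter"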

Next, I configured Squid to log in JSON format. This allows searches based on specific log fields, like client IP address or URL, which is very useful. This was a two-liner in squid.conf:

# Define Syntax for JSON Logging
logformat json { "client_ip": "%>a", "timestamp": "%{%FT%T%z}tg", "method": "%rm", "url": "%ru", "http_version": "HTTP/%rv", "response_code": %>Hs, "bytes": %<st, "user_agent": "%{User-Agent}>h", "status_code": "%Ss", "hier": "%Sh"}

# Log using JSON format
access_log /var/log/squid/access_json.log json

# Optional - disable /var/log/squid/access.log
access_log daemon:/dev/null

I chose a name ending in .log because the Debian package automatically rotates all /var/log/squid/*.log files every day at 00:00:00 (per /etc/logrotate.d/squid). I needed to ensure log rotation happened regularly so the disk didn’t fill up.
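
Before relying on the new log, it’s worth validating the config and reloading Squid; on Debian that’s roughly:

sudo squid -k parse
sudo systemctl reload squid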


To actually start sending the JSON logs to StackDriver, add the following lines to /etc/google-cloud-ops-agent/config.yaml:

logging:
  processors:
    squid_json:
      type: parse_json
  receivers:
    squid_cache:
      type: files
      include_paths: [/var/log/squid/cache.log]
    squid:
      type: files
      include_paths: [/var/log/squid/access_json.log]
  service:
    pipelines:
      squid:
        receivers: [squid_cache]
      squid_proxy:
        receivers: [squid]
        processors: [squid_json]

This will also send the /var/log/squid/cache.log file, just not in JSON format. That file only contains startup/shutdown messages and errors, so a plain text format showing just the message body was fine.

Restart the agent:

systemctl restart google-cloud-ops-agent

And the logs are now searchable in the Cloud Logging console.
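
For example, a quick search on one of the JSON fields from the command line might look like this; the project ID and client IP are placeholders, and with the config above the Ops Agent should write these entries under a log named after the receiver (squid):

gcloud logging read 'logName="projects/my-project/logs/squid" AND jsonPayload.client_ip="10.1.2.3"' --limit=10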

Permission errors on Cloud Build & Cloud Run as a custom service account

I moved from a Project Owner account to a custom service account for deployments. You’d think in the year 2025 this would be simple, but it turns out there are roles required outside of just Cloud Build & Cloud Run to actually make this work.

Here’s a run-down of the errors and the roles that fixed them.


Error #1: Missing ‘Storage Admin’ Role

After authenticating as the service account, I ran gcloud builds submit and got this:

ERROR: (gcloud.builds.submit) The user is forbidden from accessing the bucket [myproject-123456_cloudbuild]. Please check your organization's policy or if the user has the "serviceusage.services.use" permission. Giving the user Owner, Editor, or Viewer roles may also fix this issue. Alternatively, use the --no-source option and access your source code via a different method.

make: *** [cloud-build] Error 1

This didn’t make much sense, as the account already had the “Service Usage Consumer” and “Storage Object Admin” roles. Granting ‘Editor’ did fix it, so I did some searching and found this StackOverflow post:

https://stackoverflow.com/questions/74301031/cloud-build-the-user-is-forbidden-from-accessing-the-bucket


Adding the “Storage Admin” role (not to be confused with “Storage Object Admin”) did fix it.
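
If you prefer the CLI over the console for the grant, it looks roughly like this (project and service account names taken from the example output above):

gcloud projects add-iam-policy-binding myproject-123456 \
  --member="serviceAccount:network-deployer@myproject-123456.iam.gserviceaccount.com" \
  --role="roles/storage.admin"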


Error #2: Missing ‘Viewer’ Role

After fixing the above, I got this new error:

ERROR: (gcloud.builds.submit) 
The build is running, and logs are being written to the default logs bucket.
This tool can only stream logs if you are Viewer/Owner of the project and, if applicable, allowed by your VPC-SC security policy.

The default logs bucket is always outside any VPC-SC security perimeter.
If you want your logs saved inside your VPC-SC perimeter, use your own bucket.
See https://cloud.google.com/build/docs/securing-builds/store-manage-build-logs.

I found this post, which states the error can be fixed by adding the Viewer role, and that did work.


Error #3: Missing ‘Cloud Build Service Account’ Role

The next attempt got further, failing on the source upload:

Creating temporary archive of 34 file(s) totalling 73.6 KiB before compression.
Uploading tarball of [.] to [gs://myproject-123456_cloudbuild/source/xxxxx.tgz]
ERROR: (gcloud.builds.submit) PERMISSION_DENIED: The caller does not have permission. This command is authenticated as network-deployer@myproject-123456.iam.gserviceaccount.com which is the active account specified by the [core/account] property
make: *** [cloud-build] Error 1

I had to figure this one out on my own; adding the “Cloud Build Service Account” role did the trick.


Error #4: Missing ‘Service Account User’ Role

Finally moving on to Cloud Run, I had already added the “Cloud Run Builder” and “Cloud Run Developer” roles. Yet I still got this error:

X Deploying...
  . Creating Revision...
  . Routing traffic...
  . Setting IAM Policy...
Deployment failed
ERROR: (gcloud.run.deploy) PERMISSION_DENIED: Permission 'iam.serviceaccounts.actAs' denied on service account 123456789-compute@developer.gserviceaccount.com (or it may not exist). This command is authenticated as network-deployer@myproject-123456.iam.gserviceaccount.com which is the active account specified by the [core/account] property.
make: *** [cloud-run] Error 1

This one was at least more straightforward: the “Service Account User” role was required to deploy as the default compute service account.


In summary, these were the roles I added in order to use Cloud Build / Cloud Run (a gcloud sketch for granting them follows the list):

  • Cloud Build Logging Service Agent
  • Cloud Build Service Account
  • Cloud Run Builder
  • Cloud Run Developer
  • Service Account User
  • Service Usage Consumer
  • Storage Admin
  • Storage Object Admin
  • Viewer
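
If you’re scripting the grants, here’s a hedged sketch with the role IDs I believe map to most of those display names; the project and service account are placeholders, and “Cloud Build Logging Service Agent” is omitted because its role ID should be confirmed first:

PROJECT_ID=myproject-123456
SA=network-deployer@${PROJECT_ID}.iam.gserviceaccount.com

# Verify the display name -> role ID mappings with:
#   gcloud iam roles list --format="table(name,title)"
for ROLE in \
    roles/cloudbuild.builds.builder \
    roles/run.builder \
    roles/run.developer \
    roles/iam.serviceAccountUser \
    roles/serviceusage.serviceUsageConsumer \
    roles/storage.admin \
    roles/storage.objectAdmin \
    roles/viewer; do
  gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
    --member="serviceAccount:${SA}" --role="${ROLE}"
done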

Connecting to non-local Ubiquiti Unifi Controller

I redid my Ubiquiti setup at home, running the Controller inside a Docker container on a Synology NAS. The problem is that the NAS is internet-exposed and sits on a DMZ network, whereas the Ubiquiti switches & APs are on the LAN.

I first tried to address this by using the firewall to forward ports (tcp/8080, etc.) from the LAN to the Synology IP, but had no luck. Creating a DNS entry for ‘unifi.<MY_INTERNAL_DOMAIN>’ only worked half the time.

Then I found it’s possible to explicitly set the IP of the controller on each device. Here’s how:

1) If the device’s SSH username & password are not known, perform a factory reset.

2) After the device comes back online, find its IP address, then SSH to the device with the default username and password of ubnt/ubnt.

3) At the command prompt, set the inform URL using the IP address or FQDN of the controller:

set-inform http://192.0.2.23:8080/inform

Within a few seconds, the device should show up as ready to be adopted.
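
To double-check from the device side, the UniFi shell has an info command that reports the inform status; a rough session sketch (192.0.2.50 is a placeholder device IP, and the exact output varies by firmware):

ssh ubnt@192.0.2.50
set-inform http://192.0.2.23:8080/inform
info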

Thanks to this post for reminding me of the command.

Migrating Cloud Build from GCR to Artifact Registry

After enabling Artifact Registry, run this command to configure Docker authentication for the registry host (us-docker.pkg.dev in my case):

gcloud auth configure-docker us-docker.pkg.dev

After creating a repo in Artifact Registry, use this command to build and upload to it:

gcloud builds submit --tag $(HOST)/$(PROJECT_ID)/$(REPO)/$(SERVICE):latest .

Here’s my Makefile:

HOST := us-docker.pkg.dev
REGION := us-central1
PROJECT_ID := my-project
REPO := my-repo
SERVICE := my-service

include Makefile.env

all: gcp-setup cloud-build cloud-run-deploy

gcp-setup:
	gcloud config set project $(PROJECT_ID)

cloud-build:
	gcloud auth configure-docker $(HOST)
	gcloud builds submit --tag $(HOST)/$(PROJECT_ID)/$(REPO)/$(SERVICE):latest .


cloud-run-deploy:
	gcloud config set run/region $(REGION)
	gcloud run deploy $(SERVICE) --image $(HOST)/$(PROJECT_ID)/$(REPO)/$(SERVICE):latest
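
To sanity-check that images are actually landing in Artifact Registry (and not GCR), listing the repo works; using the placeholder names from the Makefile:

gcloud artifacts docker images list us-docker.pkg.dev/my-project/my-repo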

Migrating GCP subnets from INTERNAL_HTTPS_LOAD_BALANCER to REGIONAL_MANAGED_PROXY

First, add the new subnet with purpose = “REGIONAL_MANAGED_PROXY” and role = “BACKUP”. A typical Terraform input might look like this:

{
    name        = "old-proxy-only-subnet"
    description = null
    ip_range    = "100.64.1.0/24"
    region      = "us-central1"
    purpose     = "INTERNAL_HTTPS_LOAD_BALANCER"
    role        = "ACTIVE"
},
{
    name        = "new-proxy-only-subnet"
    description = null
    region      = "us-central1"
    ip_range    = "100.64.2.0/24"
    purpose     = "REGIONAL_MANAGED_PROXY"
    role        = "BACKUP"
},

After the subnet has been created, switch the role to “ACTIVE”

{
    name        = "new-proxy-only-subnet"
    purpose     = "REGIONAL_MANAGED_PROXY"
    role        = "ACTIVE"
},

Google will automatically change the old subnet’s role to “BACKUP”. It will also change its state from “READY” to “DRAINING”. To match the role change, update the input:

{
    name      = "old-proxy-only-subnet"
    purpose   = "INTERNAL_HTTPS_LOAD_BALANCER"
    role      = "BACKUP"
},

After 5 minutes, the draining should finish. You may either leave the old subnet as-is, or simply delete it.
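
You can watch the swap from the CLI; a quick check using the subnet names above:

gcloud compute networks subnets describe old-proxy-only-subnet \
  --region=us-central1 --format="value(purpose,role,state)"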


To replace a REGIONAL_MANAGED_PROXY subnet, follow this process:

  1. Add the new subnet with a unique name and IP range, with role = “BACKUP”
  2. Change the new subnet’s role from “BACKUP” to “ACTIVE”. Google will change the old subnet’s role to “BACKUP”
  3. The old subnet can be deleted after waiting at least 5 minutes for existing sessions to drain

Remember of course to update firewall rules if the IP range has changed! Google does not automatically create firewall rules for you.

CPU types and performance benchmarks for Synology NASes

The cool thing about higher-end Synology NASes is that they can run Docker containers and virtual machines directly on the NAS, thanks to their x86 CPUs. Don’t expect anything great in terms of performance, but it’s definitely adequate for basic software development and small office needs.

I found the CPU specs and benchmarks a bit tough to track down, so here they are in a simple table:

+---------+-----------------------+------------+-------------+-----+
| Model   |  CPU Model & Clock    |  CPU Mark  |  Geekbench  | TDP |
+---------+-----------------------+------------+-------------+-----+
| DS218+  | Celeron J3355 2.0 GHz |    1197    |   269/427   | 10W |
| DS224+  | Celeron J4125 2.0 GHz |    2962    |   347/963   | 10W |
| DS723+  | Ryzen R1600 2.6 GHz   |    3365    |     N/A     | 25W |
+---------+-----------------------+------------+-------------+-----+

Fixing ‘Invalid access config name’ error on CheckPoint

One of the many stupid things about the CheckPoint CloudGuard IaaS appliances in GCP is that CheckPoint never took into account scenarios where multiple clusters exist within the same project and/or the same network. This results in a naming conflict for the static routes & access configs, and the default behavior is for different clusters to “steal” routes and IP addresses from each other.

To fix this, the first step is to give each cluster a unique name. This can be done fairly easily by setting CHKP_TAG in the Python script $FWDIR/scripts/gcp_had.py:

CHKP_TAG = 'cluster-1'

This variable influences the route and access config names. But that still won’t be enough, because their deployment script hard-codes the access config name, so failover still won’t work. You’ll see this in $FWDIR/log/gcp_had.elg during a failover event:

2024-03-28 23:09:44,259-GCP-CP-HA-ERROR- Operation deleteAccessConfig for https://www.googleapis.com/compute/v1/projects/project-1234/zones/us-west2-b/instances/checkpoint-member-b error OrderedDict([('errors', [OrderedDict([('code', 'INVALID_USAGE'), ('message', 'Invalid access config name `checkpoint-access-config` as the access config name in instance is `x-chkp-access-config`.')])])])

To fix this, the existing access config names must be manually deleted on both members:

gcloud compute instances delete-access-config checkpoint-member-a --zone=us-west2-a --access-config-name="x-chkp-access-config"

gcloud compute instances delete-access-config checkpoint-member-b --zone=us-west2-b --access-config-name="x-chkp-access-config"

Then perform a rolling reboot of both members, and failover should work now.
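
To confirm each member ended up with the expected access config name, something like this should do it (zones and instance names as in the example above):

gcloud compute instances describe checkpoint-member-a --zone=us-west2-a \
  --format="value(networkInterfaces[0].accessConfigs[0].name)"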

Getting an Access Token from a Service Account Key

This one took me a while to figure out, but if you want to get a Google access token from a Service Account key (JSON file), do this:

 
import json
import google.auth.transport.requests
import google.oauth2.service_account
import requests


KEY_FILE = "../private/my-project-key.json"
SCOPES = ['https://www.googleapis.com/auth/cloud-platform']


# Parse the key file to get its project ID
try:
    with open(KEY_FILE, 'r') as f:
        _ = json.load(f)
    project_id = _.get('project_id')
except Exception as e:
    quit(e)

# Build credentials from the key file, then refresh them to obtain an access token
try:
    credentials = google.oauth2.service_account.Credentials.from_service_account_file(KEY_FILE, scopes=SCOPES)
    _ = google.auth.transport.requests.Request()
    credentials.refresh(_)
    access_token = credentials.token
except Exception as e:
    quit(e)

This token can then be used for REST API calls by inserting it into the Authorization header, like this:

instances = []

try:
    url = f"https://compute.googleapis.com/compute/v1/projects/{project_id}/aggregated/instances"

    headers = {'Authorization': f"Bearer {access_token}"}
    response = requests.get(url, headers=headers)
    _ = response.json().get('items')
    for k, v in _.items():
        instances.extend(v.get('instances', []))
except Exception as e:
    quit(e)

print([instance.get('name') for instance in instances])
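
As a quick cross-check outside Python, gcloud can mint a token from the same key file by activating the service account first:

gcloud auth activate-service-account --key-file=../private/my-project-key.json
gcloud auth print-access-token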