A weird, ugly Error message when running google_ha_test.py

[Expert@cp-member-a:0]# $FWDIR/scripts/google_ha_test.py
GCP HA TESTER: started
GCP HA TESTER: checking access scopes...
GCP HA TESTER: ERROR 

Expecting value: line 1 column 1 (char 0)

I got this message when testing a CheckPoint R81.10 cluster build in a new environment. "Expecting value: line 1 column 1 (char 0)" is just Python's JSON parser complaining about an empty or non-JSON response, which says nothing about the underlying problem. So I wrote a little debug script to try and isolate the issue:

import traceback
import gcp as _gcp  # CheckPoint's helper module, lives in $FWDIR/scripts

# Build the API client and pull the instance metadata
api = _gcp.GCP('IAM', max_time=20)
metadata = api.metadata()[0]

project = metadata['project']['projectId']
zone = metadata['instance']['zone'].split('/')[-1]
name = metadata['instance']['name']

print("Got metadata: project = {}, zone = {}, name = {}\n".format(project, zone, name))
path = "/projects/{}/zones/{}/instances/{}".format(project, zone, name)

# Make the first Compute API call and print the full stack trace if it fails
try:
    head, res = api.rest("GET", path, query=None, body=None, aggregate=False)
except Exception:
    print(traceback.format_exc())

Running the script, I now see an exception when trying to make the initial API call:

[Expert@cp-cluster-member-a:0]# cd $FWDIR/scripts
[Expert@cp-cluster-member-a:0]# python3 ./debug.py

Got metadata: project = myproject, zone = us-central1-b, name = cp-member-a

Traceback (most recent call last):
  File "debug.py", line 18, in <module>
    head, res = api.rest(method,path,query=None,body=None,aggregate=False)
  File "/opt/CPsuite-R81.10/fw1/scripts/gcp.py", line 327, in rest
    max_time=self.max_time, proxy=self.proxy)
  File "/opt/CPsuite-R81.10/fw1/scripts/gcp.py", line 139, in http
    headers['_code']), headers, repr(response))
gcp.HTTPException: Unexpected HTTP code: 403

This at least indicates the connection to the API is OK and it’s some type of permissions issue with the account.
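
Since the tester dies while "checking access scopes", it can help to look at which scopes the instance actually has. Below is a minimal sketch, not taken from CheckPoint's scripts, that asks the GCE metadata server directly. The function names are my own, and which scopes the HA script really needs is an assumption on my part. Keep in mind that access scopes and IAM roles are separate checks, so a 403 can come from either one.

```python
import urllib.request

# GCE metadata endpoint listing the scopes of the instance's service account
SCOPES_URL = ("http://metadata.google.internal/computeMetadata/v1/"
              "instance/service-accounts/default/scopes")


def parse_scopes(text):
    """Split the newline-separated metadata response into a clean list."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_scopes(url=SCOPES_URL, timeout=5):
    """Query the metadata server (only reachable from inside the instance)."""
    req = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_scopes(resp.read().decode("utf-8"))


def has_compute_access(scopes):
    """Assumption: read/write Compute access needs one of these two scopes."""
    wanted = ("/auth/cloud-platform", "/auth/compute")
    return any(scope.endswith(wanted) for scope in scopes)
```

On the gateway, something like print(has_compute_access(fetch_scopes())) would show whether the scopes side looks sane. If it prints True and the API still returns 403, the IAM role on the service account is the next thing to check.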

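The CheckPoints have always been really tough to troubleshoot in this respect, so to keep it simple, I deploy them with the default service account for the project. It’s not explicitly called out, but the HA script evidently needs that account to hold the Editor role, which had been removed in this environment.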

I was able to re-enable Editor permissions for the default service account with this Terraform code:

# Set Project ID via input variable
variable "project_id" {
  description = "GCP Project ID"
  type = string
}
# Get the default service account info for this project
data "google_compute_default_service_account" "default" {
  project = var.project_id
}
# Enable editor role for this service account
resource "google_project_iam_member" "default_service_account_editor" {
  project = var.project_id
  member  = "serviceAccount:${data.google_compute_default_service_account.default.email}"
  role    = "roles/editor"
}

CheckPoint SmartView Monitor shows Permanent Tunnels Down, even though they’re up

Being fairly new to CheckPoint, I hadn’t yet used SmartView Monitor, which is the Windows desktop monitoring application. At first glance it wasn’t very useful. I had terminated several test tunnels to various Cisco, FortiGate, and Palo Alto firewalls, all of which were working fine. But they all showed down in SmartView Monitor. What the heck?

Reason: When it comes to monitoring tunnels, CheckPoint by default uses a proprietary protocol they call “tunnel_test” (udp/18234). In order to properly monitor VPN tunnels to Non-CheckPoint Devices, DPD (dead peer detection) must be used.

Here’s how to enable DPD on an interoperable device:

  1. In the CheckPoint SmartConsole folder (usually C:\Program Files (x86)\CheckPoint\SmartConsole), run GuiDBedit.exe
  2. Under Network Objects folder -> network_objects, look for the interoperable device Object. The class name will be “gateway_plain”
  3. Search for Field name tunnel_keepalive_method and change it to dpd
  4. File -> Save All, exit.
  5. Restart SmartConsole and install policy to the applicable Checkpoint gateways / clusters

After making that change, pushing policy, and restarting SmartView Monitor, the tunnels now show green.

Cisco ISR G2 to CheckPoint R80.30 IKEv1 VPN woes

I had previously set up Cisco router to CheckPoint R80.30 gateway VPNs without issue, but for whatever reason could not even establish phase 1 for this one. CheckPoint R80 VPN communities default to AES-256, SHA-1, Group 2, and a 1-day lifetime, which is easy to match on the Cisco with this config:

crypto keyring mycheckpoint
 local-address GigabitEthernet0/0
 pre-shared-key address 192.0.2.190 key abcdefghij1234567890
!
crypto isakmp policy 100
 encr aes 256
 authentication pre-share
 group 2
 hash sha          ! <--- default value
 lifetime 86400    ! <--- default value
!

After verifying connectivity, doing packet captures, and multiple reboots on both ends, IKE simply would not come up. On the Cisco ISR, debug crypto isakmp wasn’t especially helpful:

Jun 18 11:06:17.085: ISAKMP: (0):purging SA., sa=3246F97C, delme=3246F97C
Jun 18 11:06:17.285: ISAKMP: (0):SA request profile is (NULL)
Jun 18 11:06:17.285: ISAKMP: (0):Created a peer struct for 192.0.2.190, peer port 500
Jun 18 11:06:17.285: ISAKMP: (0):New peer created peer = 0x2CE62C3C peer_handle = 0x80000005
Jun 18 11:06:17.285: ISAKMP: (0):Locking peer struct 0x2CE62C3C, refcount 1 for isakmp_initiator
Jun 18 11:06:17.285: ISAKMP: (0):local port 500, remote port 500
Jun 18 11:06:17.285: ISAKMP: (0):set new node 0 to QM_IDLE
Jun 18 11:06:17.285: ISAKMP: (0):insert sa successfully sa = 2CE620E8
Jun 18 11:06:17.285: ISAKMP: (0):Can not start Aggressive mode, trying Main mode.
Jun 18 11:06:17.285: ISAKMP: (0):found peer pre-shared key matching 192.0.2.190
Jun 18 11:06:17.285: ISAKMP: (0):constructed NAT-T vendor-rfc3947 ID
Jun 18 11:06:17.285: ISAKMP: (0):constructed NAT-T vendor-07 ID
Jun 18 11:06:17.285: ISAKMP: (0):constructed NAT-T vendor-03 ID
Jun 18 11:06:17.285: ISAKMP: (0):constructed NAT-T vendor-02 ID
Jun 18 11:06:17.285: ISAKMP: (0):Input = IKE_MESG_FROM_IPSEC, IKE_SA_REQ_MM
Jun 18 11:06:17.285: ISAKMP: (0):Old State = IKE_READY New State = IKE_I_MM1
Jun 18 11:06:17.285: ISAKMP: (0):beginning Main Mode exchange
Jun 18 11:06:17.285: ISAKMP-PAK: (0):sending packet to 192.0.2.190 my_port 500 peer_port 500 (I) MM_NO_STATE
Jun 18 11:06:17.285: ISAKMP: (0):Sending an IKE IPv4 Packet.
Jun 18 11:06:17.369: ISAKMP-PAK: (0):received packet from 192.0.2.190 dport 500 sport 500 Global (I) MM_NO_STATE
Jun 18 11:06:17.369: ISAKMP-ERROR: (0):Couldn't find node: message_id 2303169274
Jun 18 11:06:17.369: ISAKMP-ERROR: (0):(0): Unknown Input IKE_MESG_FROM_PEER, IKE_INFO_NOTIFY: state = IKE_I_MM1
Jun 18 11:06:17.369: ISAKMP: (0):Input = IKE_MESG_FROM_PEER, IKE_INFO_NOTIFY
Jun 18 11:06:17.369: ISAKMP: (0):Old State = IKE_I_MM1 New State = IKE_I_MM1

The CheckPoint gave a more “useful” error:

Main Mode Failed to match proposal: Transform: AES-256, SHA1, Group 2 (1024 bit); Reason: Wrong value for: Authentication Method

This seemed to imply the CheckPoint was expecting certificate-based authentication rather than PSK. In traditional mode, the gateway is set by default for certificate only. But it’s not clear how this is configured in newer versions.
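
When a peer complains about a single proposal attribute, it helps to lay both sides out field by field. Here is a rough sketch of that comparison. The dictionaries are hand-typed from the settings discussed above, the field names are my own, and the "certificate" value on the CheckPoint side is purely hypothetical, based on my reading of the error message.

```python
ONE_DAY_SECONDS = 24 * 60 * 60  # a 1-day IKE lifetime expressed in seconds


def diff_proposals(local, remote):
    """Return {field: (local_value, remote_value)} for each field that differs."""
    mismatches = {}
    for field in sorted(set(local) | set(remote)):
        if local.get(field) != remote.get(field):
            mismatches[field] = (local.get(field), remote.get(field))
    return mismatches


# Phase 1 settings as configured on the Cisco side
cisco = {
    "encryption": "aes-256",
    "hash": "sha1",
    "dh_group": 2,
    "authentication": "pre-share",
    "lifetime_seconds": ONE_DAY_SECONDS,
}

# Hypothetical gateway-side view: identical except for the authentication method
checkpoint = dict(cisco, authentication="certificate")

print(diff_proposals(cisco, checkpoint))
# prints {'authentication': ('pre-share', 'certificate')}
```

An empty dictionary would mean the two proposals agree, which is what you want to see before blaming anything else.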

After poking around settings for quite a while, I simply deleted the VPN community in CheckPoint SmartConsole and re-created it. The connection then popped up immediately.

¯\_(ツ)_/¯

Reset admin password for CheckPoint IaaS Gateway in GCP or AWS

Someone changed the admin password, but we could still access the gateway via the SSH key. The process for resetting the password, bypassing password history, was quite easy:

Go to expert mode and generate a hashed string for password ‘ABCXYZ1234’

[Expert@checkpoint:0]# cpopenssl passwd -1 ABCXYZ1234
$1$I54N3F1M$lk/zHvFaKRKXkUFoiEamq1

Then go back to regular CLI and apply the hashed password

exit

set user admin password-hash $1$I54N3F1M$lk/zHvFaKRKXkUFoiEamq1

save config

That’s it. Logging in to GAIA as admin / ABCXYZ1234 will then work
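
A pasted hash with even one stray character will silently set a password nobody knows, so a quick format check before applying it is cheap insurance. This is a small sketch of my own, not a CheckPoint tool. It only checks the MD5-crypt layout produced by "passwd -1" ($1$, a salt of up to 8 characters, then a 22-character digest) and cannot tell whether the hash matches any particular password.

```python
import re

# MD5-crypt layout: $1$<salt, 1-8 chars>$<digest, 22 chars>,
# all drawn from the crypt alphabet ./0-9A-Za-z
MD5_CRYPT_RE = re.compile(r"\$1\$[./0-9A-Za-z]{1,8}\$[./0-9A-Za-z]{22}")


def is_md5_crypt(value):
    """True if the whole string has the shape of an MD5-crypt hash."""
    return MD5_CRYPT_RE.fullmatch(value) is not None


print(is_md5_crypt("$1$I54N3F1M$lk/zHvFaKRKXkUFoiEamq1"))      # prints True
print(is_md5_crypt("$1$I54N3F1M$lk/zHvFaKRKXkUFoiEamq1exit"))  # prints False
```

The second call shows the failure mode worth guarding against: extra characters glued onto the end of the digest.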

Using CheckPoint Dynamic Objects to Source NAT flows

By default, the CheckPoint will usually have three dynamic objects that can be referenced in firewall and NAT policy rules

  • LocalGateway – Main interface of the CheckPoint
  • LocalGatewayExternal – External interface of the CheckPoint
  • LocalGatewayInternal – First internal interface of the CheckPoint

In a 3-NIC deployment, you may want to reference the second internal NIC, for example to source NAT traffic bound for the internal servers to the CheckPoint’s internal IP address.

To do this, you must create a custom dynamic object in SmartConsole, then manually create it on each gateway.

On the gateway, first verify the internal IP address:

[Expert@gateway]# ifconfig eth2
eth2      Link encap:Ethernet HWaddr 42:01:0A:D4:80:03 
          inet addr:10.1.2.1 Bcast:10.1.2.255 Mask:255.255.255.0

Create the object:

[Expert@gateway]# dynamic_objects -n LocalGateway-eth2 -r 10.1.2.1 10.1.2.1 -a

Verify it’s been created:

[Expert@gateway]# dynamic_objects -l

object name : LocalGateway
range 0 : 198.51.100.100 198.51.100.100

object name : LocalGatewayExternal
range 0 : 198.51.100.100 198.51.100.100

object name : LocalGatewayInternal
range 0 : 10.1.1.10 10.1.1.10

object name : LocalGateway-eth2
range 0 : 10.1.2.1 10.1.2.1

Source: skI1915 – Configuring Dynamic Objects
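
Because the object has to be created by hand on every gateway, it is easy to miss one. The sketch below parses the text that "dynamic_objects -l" prints, in the format shown above, so the result can be compared across cluster members. The function name is my own and the parser assumes IPv4 ranges laid out exactly like the sample output.

```python
def parse_dynamic_objects(text):
    """Map each object name to a list of (start_ip, end_ip) tuples."""
    objects = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("object name"):
            current = line.split(":", 1)[1].strip()
            objects[current] = []
        elif line.startswith("range") and current is not None:
            start, end = line.split(":", 1)[1].split()
            objects[current].append((start, end))
    return objects


sample = """
object name : LocalGatewayInternal
range 0 : 10.1.1.10 10.1.1.10

object name : LocalGateway-eth2
range 0 : 10.1.2.1 10.1.2.1
"""

parsed = parse_dynamic_objects(sample)
print(parsed["LocalGateway-eth2"])  # prints [('10.1.2.1', '10.1.2.1')]
```

Running the same parse against each member's output and checking that the custom object exists everywhere catches a forgotten gateway before the NAT rule does.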

 

Deploying CheckPoint CloudGuard IaaS High Availability in GCP

A minimum of 3 NICs is required, broken down like so:

  • eth0 – Public / External Interface facing Internet
  • eth1 – Management interface used for Cluster sync.  Can also be used for security management server communication
  • eth2 – First internal interface.  Usually faces internal servers & load balancers.  Can be used for security management server communication

The Deployment launch template has a few fields which aren’t explained very well…

Security Management Server address

A static route to this destination via the management interface will be created at launch time. If the Security Management server is accessed via one of the internal interfaces, use a dummy address here such as 1.2.3.4/32 and add the static routes after launch.

SIC key

This is the password to communicate with the Security Management server. It can be set after launch, but if already known, it can be set here to be pre-configured at launch

Automatically generate an administrator password

This will create a new random ‘admin’ user password to allow access to the WebGUI right after launch, which saves some time, especially in situations where SSH is slow or blocked.

Note – SSH connections always require public key authentication, even with this enabled

Allow download from/upload to Check Point

This will allow the instance to communicate outbound to Checkpoint to check for updates.  It’s enabled by default on most CheckPoint appliances, so I’d recommend enabling this setting

Networking

This is the real catch, and a pretty stupid one.  The form fills out these three subnets:

  • “Cluster External Subnet CIDR” = 10.0.0.0/24
  • “Management external subnet CIDR” = 10.0.1.0/24
  • “1st internal subnet CIDR” = 10.0.2.0/24

If using an existing network, erase the pre-filled value and then select the appropriate networks in the drop-down menus like so:

GCP_Existing_VPCNetworks

Also, make sure all subnets have “Private Google Access” checked
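
When swapping the pre-filled values for existing subnets, it is worth confirming that the three ranges really are distinct. This is a minimal sketch using Python's standard ipaddress module. The labels are my own shorthand for the three form fields, and the CIDRs are simply the form's pre-filled defaults.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs):
    """Return the pairs of labels whose CIDR ranges overlap."""
    nets = {label: ipaddress.ip_network(cidr) for label, cidr in cidrs.items()}
    return [(a, b) for a, b in combinations(nets, 2)
            if nets[a].overlaps(nets[b])]


template_defaults = {
    "external": "10.0.0.0/24",
    "management": "10.0.1.0/24",
    "internal1": "10.0.2.0/24",
}

print(overlapping_pairs(template_defaults))  # prints []
```

An empty list means no two subnets collide. Any pair that does show up needs fixing before launch.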

Post-launch Configuration

After launch, access the gateways via SSH using public key and/or WebGUI to run through initial setup. The first step is to set a new password for the admin user:

set user admin password

set expert-password

Since eth1 rather than eth0 is the management interface, I would recommend setting that accordingly:

set management interface eth1

I would also recommend adding static routes. The deployment will create static routes for RFC 1918 space via the management interface. If these need to be overridden to go via an internal interface, the CLI command is something like this:

set static-route NETWORK/MASK nexthop gateway address NEXTHOP_ADDRESS on
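
With more than a handful of routes, typing these by hand gets error-prone. Here is a small sketch that renders the command from a network and next hop, validating both first. The function name is hypothetical and the output simply follows the command template above.

```python
import ipaddress


def static_route_command(network, nexthop):
    """Render a Gaia static-route command after validating both arguments.

    Raises ValueError for a bad next hop or a network with host bits set
    (for example 10.0.0.1/8 instead of 10.0.0.0/8).
    """
    net = ipaddress.ip_network(network)
    hop = ipaddress.ip_address(nexthop)
    return "set static-route {} nexthop gateway address {} on".format(net, hop)


print(static_route_command("10.0.0.0/8", "10.0.2.1"))
# prints set static-route 10.0.0.0/8 nexthop gateway address 10.0.2.1 on
```

The 10.0.2.1 next hop is only a placeholder for whatever gateway sits on the internal subnet.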

Before importing in to SmartConsole, you can test connectivity by trying to telnet to the security management server’s address on port 18191. Once everything looks good, don’t forget to save the configuration:

save config
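
If telnet isn't handy, the same port test can be sketched in a few lines of Python from expert mode. The function name and the 3-second default timeout are my own choices, and this only proves that a TCP connection can be opened, not that SIC itself will succeed.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

For example, tcp_port_open("192.0.2.10", 18191) with the management server's real address in place of the placeholder should return True once routing and firewall rules are right.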

Cluster Creation

In SmartConsole, create a new ClusterXL. When prompted for the cluster address, enter the primary cluster address. The easy way to find this is to look at the deployment result under Tools -> Deployment manager -> Deployments.

CheckPoint_Deployment_ClusterIPExternalAddress

Then add the individual gateways with the management interface.   Walking through the wizard, you’ll need to define the type of each interface:

  • Set the first (external) interface to private use
  • Set the secondary (management) interface as sync/primary
  • Set subsequent interfaces as private use with monitoring.

Note the wizard tends to list the interfaces backwards: eth2, eth1, eth0

GCP_Clustering

The guide lists a few steps to do within the Gateway Cluster Properties, several of which I disagree with. Instead, I’d suggest the following:

  • Under Network Management, VPN Domain, create a group that lists the internal subnets behind the Checkpoint that will be accessed via site-to-site and remote access VPNs
  • On the eth1 interface, set Topology to Override / This Network / Network defined by routes. This should allow IP spoofing to remain enabled
  • Under NAT, do not check “Hide internal networks behind the Gateway’s external IP” as this will auto-generate a NAT rule that could conflict with site-to-site VPNs. Instead, create manual NAT rules in the policy.
  • Under IPSec VPN, Link Selection, Source IP address Settings, set Manual / IP address of chosen interface

Do a policy install on the new cluster, and a few minutes later, the GCP console should map the primary and secondary external IP addresses to the two instances

CheckPoint_GCP_External_IPAddresses

Failover

Failover is done via API call and takes roughly 15 seconds.

On the external network (front end), the primary and secondary firewalls will each get an external IP address mapped. CheckPoint calls these “primary-cluster-address” and “secondary-cluster-address”. I’d argue “active” and “standby” would be better names, because the addresses will flip during a failover event.

On the internal network (back end), failover is done by modifying the static route to 0.0.0.0/0. The entries will be created on the internal networks when the cluster is formed.

Known Problems

The script $FWDIR/scripts/gcp_ha_test.py is missing

This is simply a mistake in CheckPoint’s documentation.  The correct file name is:

$FWDIR/scripts/google_ha_test.py

Deployment Fails with error code 504, Resource Error, Timeout expired

DeployFailure

Also, while the instances get created and External static IPs allocated, the secondary cluster IP never gets mapped and failover does not work.

Cause: there is a portion of the R80.30 deployment script relating to external IP address mapping that assumes the default service account is enabled, but many Enterprise customers will have default service account disabled as a security best practice.  As of January 2020, the only fix is to enable the default service account, then redo the deployment.

StackDriver is enabled at launch, but never gets logs

Same issue as above. As of January 2020, it depends on the default service account being enabled.

Site-to-Site IPSec VPNs on CheckPoint R80.30

The first step is to create a new object with the public IP address of the other side of the tunnel.  This is fairly well buried in the menus:

R80_30_new_VPN_interop_device

After that, create a new VPN “community” in Objects -> More object types -> VPN Community -> New Meshed VPN and walk through the wizard.

The main gotcha is to watch out for weird default settings. In particular, AES-128 is disabled as an encryption cipher for Phase 1. My guess is since it’s the most popular cipher for Phase 2, they go with the “mix ciphers” strategy. But personally I just like to use AES-128 for everything – it’s simple, fast, and plenty secure.

CheckPoint Dedicated Management Route

A new feature (finally!) in R80.30 is the ability to enable Management Data Plane Separation, in order to have a separate route table for the management interface and all management related functions (policy installation, SSH, SNMP, syslog, GAIA portal, etc).

Let’s assume the interface “Mgmt” has already been set as the management interface with IP address 192.168.1.100 and wants default gateway 192.168.1.1, and “eth5” has been setup as the dedicated sync interface:

set mdps mgmt plane on
set mdps mgmt resource on
set mdps interface Mgmt management on
set mdps interface eth5 sync on
add mdps route 0.0.0.0/0 nexthop 192.168.1.1
save config
reboot

After the box comes up you can verify the management route has been set by going in to expert mode and running the “mplane” command to enter management space:

> expert
[Expert@MyCheckPoint:0]# mplane
Context set to Management Plane
[Expert@MyCheckPoint:1]# netstat -rn
Kernel IP routing table
Destination  Gateway       Genmask         Flags MSS Window irtt Iface
169.254.0.0  0.0.0.0       255.255.255.252 U     0   0      0    eth5
192.168.1.0  0.0.0.0       255.255.255.0   U     0   0      0    Mgmt
0.0.0.0      192.168.1.1   0.0.0.0         UGD   0   0      0    Mgmt
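
To check this on several boxes without eyeballing each table, the default route can be pulled out of the "netstat -rn" text programmatically. This is a sketch of my own that assumes the eight-column Linux layout shown above.

```python
def default_route(netstat_output):
    """Return (gateway, interface) for the 0.0.0.0/0 route, or None."""
    for line in netstat_output.splitlines():
        cols = line.split()
        # Expected columns: Destination Gateway Genmask Flags MSS Window irtt Iface
        if len(cols) == 8 and cols[0] == "0.0.0.0" and cols[2] == "0.0.0.0":
            return cols[1], cols[7]
    return None


sample = """Kernel IP routing table
Destination  Gateway       Genmask         Flags MSS Window irtt Iface
169.254.0.0  0.0.0.0       255.255.255.252 U     0   0      0    eth5
192.168.1.0  0.0.0.0       255.255.255.0   U     0   0      0    Mgmt
0.0.0.0      192.168.1.1   0.0.0.0         UGD   0   0      0    Mgmt
"""

print(default_route(sample))  # prints ('192.168.1.1', 'Mgmt')
```

In the management plane the answer should be the management gateway on the Mgmt interface, as it is here.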

Routes from the main route table relating to management can then be deleted, which makes the data plane route table much cleaner:

[Expert@MyCheckpoint:1]# dplane
Context set to Data Plane

[Expert@MyCheckPoint:0]# netstat -rn
Kernel IP routing table
Destination   Gateway       Genmask         Flags MSS Window irtt Iface
203.0.113.32  0.0.0.0       255.255.255.224 U     0   0      0    bond1.11
192.168.222.0 0.0.0.0       255.255.255.0   U     0   0      0    bond1.22
0.0.0.0       203.0.113.33  0.0.0.0         UGD   0   0      0    bond1.11
192.168.0.0   192.168.222.1 255.255.0.0     UGD   0   0      0    bond1.22

Upgrading Checkpoint Management Server in AWS from R80.20 to R80.30

Unfortunately it is not possible to simply upgrade an existing CheckPoint management server in AWS.  A new one must be built, with the database manually exported from the old instance and imported to the new one.

There is a CheckPoint Knowledge base article, but I found it to have several errors and also be confusing on which version of tools should be used.

Below is the process I used to go from R80.20 to R80.30

Log in to the old R80.20 server. Download and extract the R80.30 tools:

cd /home/admin
tar -zxvf Check_Point_R80.30_Gaia_SecurePlatform_above_R75.40_Migration_tools.tgz

Run the export job to create an archive of the database:

./migrate export --exclude-licenses /tmp/R8020Backup.tgz

Copy this .tgz file to the new R80.30 management server in /tmp
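
Since the import stops all services on the new server, it is worth confirming the archive survived the copy intact before running it. One way is to compute a SHA-256 checksum on both ends and compare. This is a minimal sketch using only the standard library. The function name is my own, and it is an extra precaution rather than part of CheckPoint's documented procedure.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run sha256_of("/tmp/R8020Backup.tgz") on the old and the new server. If the two hex strings differ, copy the file again before importing.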

On the new management server, run the import job:

cd $FWDIR/bin/upgrade_tools
./migrate import /tmp/R8020Backup.tgz 
The import operation will eventually stop all Check Point services (cpstop)
Do you want to continue? (y/n) [n]? y

After a few minutes, the operation will complete and you’ll be prompted to start services again.

Finish by upgrading SmartConsole to R80.30 and connecting to the new R80.30 server. I’ve noticed it to be very slow, but it will eventually connect and all the old gateways and policies will be there.

CheckPoint Initial Configuration via CLI

The default credentials are admin/admin

Verify the management interface

show management interface

Set the management interface with IP address 192.168.1.222/255.255.255.0

set interface Mgmt ipv4-address 192.168.1.222 mask-length 24
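
Gaia wants a mask length rather than a dotted netmask, so 255.255.255.0 becomes 24. If the addressing plan only lists netmasks, the conversion can be sketched with Python's ipaddress module. Both function names here are hypothetical helpers of mine.

```python
import ipaddress


def mask_length(netmask):
    """Convert a dotted netmask such as 255.255.255.0 to a prefix length."""
    return ipaddress.ip_network("0.0.0.0/{}".format(netmask)).prefixlen


def set_interface_command(name, address, netmask):
    """Render the Gaia command for an interface address, validating the IP."""
    ipaddress.ip_address(address)  # raises ValueError if not a valid address
    return "set interface {} ipv4-address {} mask-length {}".format(
        name, address, mask_length(netmask))


print(set_interface_command("Mgmt", "192.168.1.222", "255.255.255.0"))
# prints set interface Mgmt ipv4-address 192.168.1.222 mask-length 24
```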

Verify IP address for management interface

show interface Mgmt ipv4-address

Ping something

ping 192.168.1.1

Set the default route to 192.168.1.1

set static-route default nexthop gateway address 192.168.1.1 priority 1 on

Create internal route for 10.0.0.0/8 via gateway 10.10.10.1

set static-route 10.0.0.0/8 nexthop gateway address 10.10.10.1 on

Verify routing

show route

Set DNS servers

set dns primary 8.8.8.8
set dns secondary 9.9.9.9

Save the configuration

save config

Show all interfaces

show interfaces

Show interfaces with IP addresses configured

show security-gateway monitored-interfaces

Create an 802.3ad (LACP) bonded logical interface with eth1 & eth2 as physical members

add bonding group 1
set bonding group 1 mode 8023AD
set bonding group 1 lacp-rate fast
add bonding group 1 interface eth1
add bonding group 1 interface eth2

Create a VLAN sub-interface on bond1 with 802.1q tag 123

add interface bond1 vlan 123

Check software version

show version all

Get hardware information and serial number

show asset system

Change admin password

set user admin password

Set expert mode password

set expert-password

Check policy Status

fw stat

Clear the current local policy

fw unloadlocal

Check site-to-site VPN status

vpn tu tlist

Reset VPN tunnels (list/delete IKE/IPSec SAs)

vpn tu

Modify license, configure SNMP, reset SIC connection:

cpconfig

Verify number of CPUs

fw ctl multik stat

View CPU to connection distribution table

fw ctl affinity -l -r

Reboot the firewall

reboot