u/DailyDuino

Cloud Run problems since May 12. GCP says it is solved.

Hi guys,
Since May 12, our Cloud Run services and jobs have been failing consistently. Our services call external endpoints fairly frequently, but the call volume hasn't changed. Outbound calls hang until the configured timeout is exhausted and then fail with a "network unreachable" error. This affects both Cloud Run services and jobs.

To rule out issues on the external endpoint side, we tested the same service on a Hetzner instance and an on-prem server, and both worked without any problems. This points to the issue being specific to our Cloud Run environment.
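For anyone who wants to run the same comparison, here is a minimal sketch of the kind of probe that can be run on both Cloud Run and another host. It is not our actual service code. The function names `probe` and `classify` and the outcome labels are made up for illustration, and it only checks a plain TCP connection, which is an assumption about where the failure happens.

```python
import errno
import socket
import time


def classify(exc):
    """Map a connection error to a short label (labels are illustrative)."""
    if isinstance(exc, socket.timeout):
        return "timeout"
    if isinstance(exc, socket.gaierror):
        return "dns_failure"
    if exc.errno == errno.ENETUNREACH:
        return "network_unreachable"
    if exc.errno == errno.ECONNREFUSED:
        return "connection_refused"
    return "other_error"


def probe(host, port=443, timeout=5.0):
    """Try one TCP connection; return (outcome label, elapsed seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "ok"
    except OSError as exc:
        # socket.timeout and socket.gaierror are both OSError subclasses.
        outcome = classify(exc)
    return outcome, round(time.monotonic() - start, 3)
```

Calling something like `probe("api.example.com")` (placeholder hostname) in a loop on each environment and logging the outcome and elapsed time should show whether failures appear only on Cloud Run and whether they line up with the timeout.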

We checked the GCP status page for our project and found a NAT issue that was marked as resolved, but the problem is still ongoing on our end. On top of that, this has caused our billing to spike significantly.

As a temporary workaround, we switched to Serverless VPC Access, which does work, but it's considerably more expensive and still fails occasionally.

Is anyone else facing the same issue?

u/DailyDuino — 1 day ago