Faster and more Reliable ServiceAccount authentication for Google Cloud Platform APIs

2018-08-28

Please note that using a JWT access token with our client libraries requires you to have an actual service account key on hand... try to avoid that if you can. It is arguably acceptable to use keys that are embedded in HSMs: https://github.com/salrashid123/oauth2#usage-tpmtokensource https://github.com/salrashid123/tpm2_evp_sign_decrypt/tree/master

A couple of weeks ago I wanted to understand the AccessTokenCredentials flow that certain Google Cloud APIs support.

It’s described here in the addendum of our developers OAuth documentation as a specific optimization over the “normal” OAuth flows providers maintain. The optimization is pretty dramatic, so I thought I’d write an article describing it and how it’s used... and finally a quick bakeoff to demonstrate its advantage.

First, a quick background on ServiceAccount OAuth flows.

ServiceAccount OAuth2 Flow

A ServiceAccount on GCP can take many shapes, but it normally represents a non-user accessing a system. Think of it as a machine account that requires access to a service. When a system needs to access a GCP service (e.g. Pub/Sub), it needs to acquire an OAuth2 token as described here.

Essentially, a cryptographic private key is issued for a ServiceAccount and that key is used to sign a JSON Web Token (JWT) which includes some claims and information about the token capabilities requested. At that point, the JWT is transmitted to Google, which verifies the claims and the identity of the service account. Once Google verifies the identity, it issues an access_token for the ServiceAccount with the scopes the original JWT requested and returns that token to the client. The client application then has the bearer access_token it needs to make the request to the service (Pub/Sub, in this example).
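To make the extra roundtrip concrete, here is a minimal sketch of the exchange step: the client POSTs its signed JWT as an `assertion` and gets an access_token back in a JSON response. The token endpoint here is a local fake server (`fakeTokenServer`, a hypothetical stand-in that does no signature verification), and the helper names are mine, not part of any Google library. Only the `urn:ietf:params:oauth:grant-type:jwt-bearer` grant type and the `access_token`/`expires_in` response fields are meant to mirror the real flow.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// jwtBearerGrant is the grant type used when exchanging a signed JWT
// for an access_token.
const jwtBearerGrant = "urn:ietf:params:oauth:grant-type:jwt-bearer"

// exchange POSTs the signed JWT to tokenURL and returns the access_token
// from the JSON response. This HTTP call is the extra roundtrip.
func exchange(tokenURL, signedJWT string) (string, error) {
	form := url.Values{
		"grant_type": {jwtBearerGrant},
		"assertion":  {signedJWT},
	}
	resp, err := http.PostForm(tokenURL, form)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("token endpoint returned %s", resp.Status)
	}
	var body struct {
		AccessToken string `json:"access_token"`
		ExpiresIn   int    `json:"expires_in"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return "", err
	}
	return body.AccessToken, nil
}

// fakeTokenServer is a hypothetical stand-in for the real token endpoint.
// It only checks that the form fields are present.
func fakeTokenServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.FormValue("grant_type") != jwtBearerGrant || r.FormValue("assertion") == "" {
			http.Error(w, "invalid_grant", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"access_token":"fake-access-token","expires_in":3600}`)
	}))
}

// demoExchange runs a single exchange against the fake server.
func demoExchange(assertion string) (string, error) {
	srv := fakeTokenServer()
	defer srv.Close()
	return exchange(srv.URL, assertion)
}

func main() {
	tok, err := demoExchange("header.claims.signature")
	if err != nil {
		fmt.Println("exchange failed:", err)
		return
	}
	fmt.Println(tok)
}
```

In real code you would not write this by hand; the client libraries perform the exchange for you. The point is simply that a network call sits between signing the JWT and holding a usable token.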

AccessTokenCredential Flow

Alright, so how can we optimize the flow above if we already have a crypto key we can sign with? How about we create a JWT with a specific audience that is the service we intend it for? That is the optimized flow we’re dealing with in this article, and it is the flow that saves us the roundtrip needed to exchange the JWT for an access_token.

The golang sample here basically reads the private key and uses it to sign a JWT with a specific aud: field that denotes the service it’s intended for.

This flow saves a round trip call but only applies to specific services within Google Cloud. These specific services utilize a different backend system which allows for this abridged flow. For example, the services listed below are the only ones that allow for this:

If you’re interested, the JWT that is signed by the service account uses the aud: field that describes the target service itself:

{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "cc241d179abcbea44d0c69355bab01315a1ea45d"
}
{
  "iss": "access-token-creds@mineral-minutia-820.iam.gserviceaccount.com",
  "aud": "https://pubsub.googleapis.com/google.pubsub.v1.Publisher",
  "exp": 1535333036,
  "iat": 1535329436,
  "sub": "access-token-creds@mineral-minutia-820.iam.gserviceaccount.com"
}
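If you want to see how a token with that shape could be produced, here is a minimal sketch using only the Go standard library. It signs with a throwaway RSA key generated on the spot rather than a real service account key, and `signJWT`, `verifyJWT`, `demoClaims` and the key ID `demo-key-id` are hypothetical names I made up for illustration. In practice `google.JWTAccessTokenSourceFromJSON` does this for you.

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// b64 encodes a JWT segment as unpadded URL-safe base64.
func b64(b []byte) string { return base64.RawURLEncoding.EncodeToString(b) }

// signJWT builds an RS256 JWT whose iss and sub are the service account
// email and whose aud is the target service, valid for one hour.
func signJWT(key *rsa.PrivateKey, keyID, email, aud string, now time.Time) (string, error) {
	header, err := json.Marshal(map[string]string{"alg": "RS256", "typ": "JWT", "kid": keyID})
	if err != nil {
		return "", err
	}
	claims, err := json.Marshal(map[string]interface{}{
		"iss": email, "sub": email, "aud": aud,
		"iat": now.Unix(), "exp": now.Add(time.Hour).Unix(),
	})
	if err != nil {
		return "", err
	}
	signingInput := b64(header) + "." + b64(claims)
	digest := sha256.Sum256([]byte(signingInput))
	sig, err := rsa.SignPKCS1v15(rand.Reader, key, crypto.SHA256, digest[:])
	if err != nil {
		return "", err
	}
	return signingInput + "." + b64(sig), nil
}

// verifyJWT checks the RS256 signature and returns the decoded claims.
func verifyJWT(token string, pub *rsa.PublicKey) (map[string]interface{}, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, fmt.Errorf("expected 3 segments, got %d", len(parts))
	}
	sig, err := base64.RawURLEncoding.DecodeString(parts[2])
	if err != nil {
		return nil, err
	}
	digest := sha256.Sum256([]byte(parts[0] + "." + parts[1]))
	if err := rsa.VerifyPKCS1v15(pub, crypto.SHA256, digest[:], sig); err != nil {
		return nil, err
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, err
	}
	var claims map[string]interface{}
	if err := json.Unmarshal(raw, &claims); err != nil {
		return nil, err
	}
	return claims, nil
}

// demoClaims signs a token with a throwaway key and reads the claims back.
func demoClaims() (map[string]interface{}, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048) // NOT a real service account key
	if err != nil {
		return nil, err
	}
	email := "access-token-creds@mineral-minutia-820.iam.gserviceaccount.com"
	aud := "https://pubsub.googleapis.com/google.pubsub.v1.Publisher"
	tok, err := signJWT(key, "demo-key-id", email, aud, time.Now())
	if err != nil {
		return nil, err
	}
	return verifyJWT(tok, &key.PublicKey)
}

func main() {
	claims, err := demoClaims()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("aud:", claims["aud"])
}
```

Note there is no scope claim here: the aud: value alone tells the receiving service what the token is for.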

Example Implementation

If you want to try this sample out, you first need to create a service account and download its JSON private key. Once you do that, grant that service account the Pub/Sub Viewer IAM role as shown below:

images/pubsub.png

At that point, with the JSON key file downloaded, initialize the client:

package main

import (
	"context"
	"flag"
	"fmt"
	"math"
	"os"
	"time"

	"github.com/golang/glog"
	"golang.org/x/oauth2/google"

	"cloud.google.com/go/pubsub"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

var (
	projectID = flag.String("project", "", "Project ID")
	keyfile   = flag.String("keyfile", "", "Service Account JSON keyfile")
)

func main() {
	flag.Parse()

	if *projectID == "" {
		fmt.Fprintln(os.Stderr, "missing -project flag")
		flag.Usage()
		os.Exit(2)
	}
	if *keyfile == "" {
		fmt.Fprintln(os.Stderr, "missing -keyfile flag")
		flag.Usage()
		os.Exit(2)
	}

	// Audience values for other services can be found in the googleapis repo,
	// in files similar to this one for Pub/Sub:
	// https://github.com/googleapis/googleapis/blob/master/google/pubsub/pubsub.yaml
	aud := "https://pubsub.googleapis.com/google.pubsub.v1.Publisher"

	ctx := context.Background()
	keyBytes, err := os.ReadFile(*keyfile)
	if err != nil {
		// Fatalf logs and exits; continuing without a key would only fail later.
		glog.Fatalf("Unable to read service account key file: %v", err)
	}

	// Start the clock before building the token source so the measurement
	// covers token creation plus the API call.
	start := time.Now()
	tokenSource, err := google.JWTAccessTokenSourceFromJSON(keyBytes, aud)
	if err != nil {
		glog.Fatalf("Error building JWT access token source: %v", err)
	}

	/*
		// Uncomment to print the signed JWT itself.
		jwt, err := tokenSource.Token()
		if err != nil {
			glog.Fatalf("Unable to generate JWT token: %v", err)
		}
		glog.V(3).Infoln(jwt.AccessToken)
	*/

	client, err := pubsub.NewClient(ctx, *projectID, option.WithTokenSource(tokenSource))
	if err != nil {
		glog.Fatalf("Could not create pubsub Client: %v", err)
	}
	defer client.Close()

	// List all topics; this is the API call being timed.
	topics := client.Topics(ctx)
	for {
		topic, err := topics.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			// Stop here: a failing iterator would otherwise loop forever.
			glog.Fatalf("Error listing topics: %v", err)
		}
		glog.V(3).Infoln(topic)
	}
	elapsed := time.Since(start)
	// Print the elapsed time in whole milliseconds.
	fmt.Println(math.Round(elapsed.Seconds() * 1000))
}

The program prints the elapsed time in milliseconds.

I must note: this whole procedure ONLY applies to the initial acquisition of the access_token. In normal use cases, you can reuse an access_token or even the id_token until it expires (normally 3600s). That means the latency described below is, for most use cases, only paid when getting the first token.

The additional benefit (and what I see as the primary one) is that this mechanism is more reliable: it does not require an intermediate exchange for an access_token.
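To illustrate the point about reusing a token until it expires, here is a minimal sketch of a caching token source. The type and function names (`cachingSource`, `countFetches`) and the 10 second safety margin are my own assumptions for illustration; the real client libraries ship their own reuse logic (for example `oauth2.ReuseTokenSource` in golang.org/x/oauth2), so treat this as a model of the idea rather than what those libraries do internally.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// token is a bearer token plus its expiry time.
type token struct {
	value  string
	expiry time.Time
}

// cachingSource hands out the cached token until it is about to expire,
// and only then calls fetch for a new one.
type cachingSource struct {
	mu    sync.Mutex
	fetch func() (token, error) // the expensive step: sign and/or exchange
	now   func() time.Time      // injectable clock, so the sketch is testable
	cur   token
}

// Token returns a valid token, fetching a new one only when needed.
func (c *cachingSource) Token() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Reuse the cached token if it is still valid 10 seconds from now
	// (the margin is an arbitrary assumption).
	if c.cur.value != "" && c.now().Add(10*time.Second).Before(c.cur.expiry) {
		return c.cur.value, nil
	}
	t, err := c.fetch()
	if err != nil {
		return "", err
	}
	c.cur = t
	return t.value, nil
}

// countFetches simulates n calls spaced stepSeconds apart against tokens
// that live 3600 seconds, and reports how many times fetch actually ran.
func countFetches(n, stepSeconds int) int {
	clock := time.Unix(0, 0)
	fetches := 0
	src := &cachingSource{
		now: func() time.Time { return clock },
		fetch: func() (token, error) {
			fetches++
			return token{
				value:  fmt.Sprintf("token-%d", fetches),
				expiry: clock.Add(3600 * time.Second),
			}, nil
		},
	}
	for i := 0; i < n; i++ {
		if _, err := src.Token(); err != nil {
			return -1
		}
		clock = clock.Add(time.Duration(stepSeconds) * time.Second)
	}
	return fetches
}

func main() {
	// 100 calls, one per simulated minute.
	fmt.Println("fetches:", countFetches(100, 60))
}
```

With a simulated clock, 100 calls spread over 100 minutes need only two fetches, which is why the acquisition latency matters mostly for the first call.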

Bakeoff!

The following bakeoff compares the abridged flow against the standard ServiceAccount OAuth flow, with the full cert already loaded in both cases, and reports latency by percentile.

I ran each mechanism 100 times on the same computer separately (and yah, trust me, the workstation where I ran it had lots of compute and very high network bandwidth to GCP endpoints!).

  • ServiceAccount Flow
╔════════════════╦════════════════════╗
║   Percentile   ║   Latency (ms)     ║
╠════════════════╬════════════════════╣
║    50          ║   585              ║
║    90          ║   613              ║
║    95          ║   637              ║
║    99          ║   742              ║
╚════════════════╩════════════════════╝
  • AccessToken Flow
╔════════════════╦════════════════════╗
║   Percentile   ║   Latency (ms)     ║
╠════════════════╬════════════════════╣
║    50          ║   289              ║
║    90          ║   442              ║
║    95          ║   448              ║
║    99          ║   572              ║
╚════════════════╩════════════════════╝

As you can see, at every percentile the lack of the additional roundtrip makes a difference in getting a token and making the same API call: the median drops from 585 ms to 289 ms!
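If you want to reproduce this kind of table from your own runs, here is a minimal sketch of a percentile calculation over the collected latencies. It uses the nearest-rank definition, which is an assumption on my part (other definitions interpolate between samples), and the sample values in `main` are made up for illustration; they are not the measurements above.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of samples. The input slice is left unmodified.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return math.NaN()
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	// Nearest rank: the smallest rank covering at least p percent of samples.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical latencies in milliseconds, one per run.
	latencies := []float64{310, 280, 295, 450, 289, 301, 275, 600, 288, 292}
	for _, p := range []float64{50, 90, 95, 99} {
		fmt.Printf("p%.0f = %.0f ms\n", p, percentile(latencies, p))
	}
}
```

Feeding it the per-run millisecond values printed by the sample program (one value per run) would give the four rows of a table like the ones above.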
