SKUs and Resources

Published March 18, 2021

Reading cloud bills is hard. For illustration, below are a few lines from the bill for a small toy project of mine on GCloud. I don’t want to single Google out here—AWS and Azure bills are also difficult to read. In the following, all details are taken from Google Cloud, but I point to related AWS or Azure features when I think it’s useful.

Some of the trouble with reading the bill comes from how the resources I have (such as virtual machines) translate to the entities that the provider bills me for.

| SKU | SKU ID | Usage | Cost in CHF |
| --- | --- | --- | --- |
| Storage PD Capacity | D973-5D65-BAB2 | 51.24 gibibyte month | 0.77 |
| Storage PD Capacity in Netherlands | AE8C-46C3-4994 | 2.94 gibibyte month | 0.12 |
| N1 Predefined Instance Core running in EMEA | 9431-52B1-2C4F | 2.67 hour | 0.08 |
| N1 Predefined Instance Ram running in EMEA | 39F4-0112-6F39 | 10.01 gibibyte hour | 0.04 |

This only shows the first few lines out of a total of 61.

I’ll go over what the entries mean next, starting with the meaning of “SKU”. Then comes a bit about how labels or tags can make this bill more meaningful, and at the end I’ll use all this information to try and solve a little mystery.

What is a SKU?

SKU stands for Stock Keeping Unit (at least that’s what I found on the Internet). AWS, Azure and Google all use SKUs in their billing systems. I find it easiest to think of a SKU as a billing code—it identifies a category of things that a cloud provider charges me for. These categories are very fine-grained, they are (of course) not consistent across cloud providers, and matching Resources to SKUs is not straightforward.

I used a VM of the N1 machine type. The breakdown shows costs incurred between Mar 1st and Mar 16th. The VM only ran for a few hours, and I got billed by the hour. I use the N1 instance for running minikube because sometimes I need to run unit tests that use it, and bringing up minikube on my personal computer is a pain in the neck. Google has separate SKUs for the CPU and the RAM used by my N1 instance. For machine types that support them, there is another SKU for GPUs.

For the storage costs, there are two disks (the PD Capacity lines; PD stands for Persistent Disk). One of them was a zonal disk, the other a regional disk. I deleted the regional disk soon after creating it. The zonal disk has existed since before Mar 1st and it has a size of 100GB. I am getting billed for 51.24 gibibyte months, which makes sense: using the disk for a full month would be 100 gibibyte months, we’re on the 16th, and the bill is based on a snapshot taken some time earlier in the day, hence 51.24. Indeed, the bill for a full month comes to 100 gibibyte months. Gibibyte is an extra piece of precision here: often, when specifying data volumes, people use Gigabyte as shorthand for “1000 or 1024 Megabytes, not sure and don’t care which”. In billing, the distinction does matter at some point (though not in this example).
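The proration above can be checked with a few lines of Python. This is only a sketch: it assumes that usage accrues evenly over the 31 days of March, which is my simplification and not something the bill states, and the function names are made up for illustration.

```python
# Sanity check of the prorated "gibibyte month" usage on the bill.
# Assumption (mine, not the bill's): usage accrues evenly over the month.

DISK_SIZE_GIB = 100          # provisioned size of the zonal disk
BILLED_GIB_MONTHS = 51.24    # usage shown on the bill
DAYS_IN_MARCH = 31


def implied_days(billed_gib_months, disk_size_gib, days_in_month):
    """Number of days the disk must have existed to produce the billed usage."""
    return billed_gib_months / disk_size_gib * days_in_month


def gib_to_gb(gib):
    """Convert gibibytes (2**30 bytes) to gigabytes (10**9 bytes)."""
    return gib * 2**30 / 10**9


days = implied_days(BILLED_GIB_MONTHS, DISK_SIZE_GIB, DAYS_IN_MARCH)
print(f"{days:.2f} days")              # 15.88 days
print(f"{gib_to_gb(100):.2f} GB")      # 107.37 GB
```

Under that assumption, 51.24 gibibyte months corresponds to a bit less than 16 days, which fits a snapshot taken on the 16th. The second line shows why the gibibyte/gigabyte distinction can matter: 100 gibibytes are about 7% more than 100 gigabytes.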

Not shown but also interesting:

Each of the assets (VM, Disk) I create for my project relates to one or more SKUs that I get billed for. Some assets, like the VM, imply other assets (such as an address, a license, a boot disk). The SKUs reflect the billing structure that Google applies to my assets, but lining up the assets as I see them and the SKUs that say how much I pay for them is harder than it looks.

This is because:

Experience helps in dealing with this, but so do labels (or tags if you’re using AWS or Azure).

Aligning your bill with your asset inventory via labels or tags

The GCloud billing console lets you break down costs by Project, Service, or SKU. Service refers to the Google cloud service (like Compute or Kubernetes Engine; the equivalents for AWS would be EC2 or EKS), and I tend to scan costs service by service. Breaking down by SKU involves a lot of scrolling and clicking; I’d really like a category in between service and SKU here. Google also lets you filter by location (region or zone) and by label. Labels are user-defined key-value pairs that you can attach to resources.

I’ll focus on Google Cloud’s labels here. AWS has cost allocation tags, and Azure has tags. Those are all really cool and useful, but this post got long enough just writing about Google that AWS and Azure will have to wait for now.

The bottom-right corner of the billing console has a drop-down for label keys. That drop-down is populated based on Stackdriver monitoring time series that contain data with that label. This means that after you create a resource with a given label, you have to wait until there is monitoring data for that resource before the label shows up in the drop-down. You can then get the console to show you costs for resources with a particular value for that label. When you do that, you can see that the label applies to “dependent” resources as well. For example, I only labeled a VM, but the cost report for that label value also shows costs for the IP address, network egress, and OS license.

This is nice, but if you want to do grouping by labels and see costs for several different label values at once, I don’t think there’s a way to do that without exporting to BigQuery. AWS have a similar feature where you can export your bill to S3 and then either view it there or import it into e.g. Athena. I haven’t tried this with Azure yet, but they have a pretty comprehensive cost management toolset.
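To make the "group by label" idea concrete, here is a minimal sketch that sums costs per label value over rows shaped roughly like the BigQuery billing export, where each row carries a cost and a list of key/value labels. The rows are simplified and the label key `purpose` and its values are made up for illustration; only the cost figures are taken from the bill excerpt at the top.

```python
from collections import defaultdict
from decimal import Decimal


def cost_by_label(rows, label_key):
    """Sum cost per value of `label_key`; rows without that label go under None."""
    totals = defaultdict(Decimal)
    for row in rows:
        value = None
        for label in row.get("labels", []):
            if label["key"] == label_key:
                value = label["value"]
                break
        # Go through str() so that 0.08 becomes Decimal("0.08"), not a float artifact.
        totals[value] += Decimal(str(row["cost"]))
    return dict(totals)


# Simplified, hand-written rows; the labels are hypothetical.
sample_rows = [
    {"sku": "N1 Predefined Instance Core running in EMEA", "cost": 0.08,
     "labels": [{"key": "purpose", "value": "minikube"}]},
    {"sku": "N1 Predefined Instance Ram running in EMEA", "cost": 0.04,
     "labels": [{"key": "purpose", "value": "minikube"}]},
    {"sku": "Storage PD Capacity", "cost": 0.77, "labels": []},
]

for value, total in cost_by_label(sample_rows, "purpose").items():
    print(value, total)
```

In a real setup the grouping would more likely happen in the SQL query itself; doing it in Python here just keeps the example self-contained.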

As far as I know, labels do not apply to costs retroactively. Thus if you have a VM and label it halfway through the month, I wouldn’t rely on all the costs related to that VM for that month having the label.

SKUs and prices

Once you have a SKU, you can look up how much you’ll get charged. Google have a SKU explorer that shows you the description (N1 Predefined Instance Core running in EMEA), the price (0.034773 USD per hour) and the regions (europe-west1 – this means the description should really be “in Belgium” rather than “in EMEA”, since there are other SKUs for N1 Cores running in the Netherlands for example).

If you want to know more about your SKUs (and who wouldn’t), Google’s billing APIs let you query SKUs programmatically. You do need to put a SKU Service Name into the API request. You can see the service name in the Cost table on the billing console, but not in the Cost breakdown page. Another way is to query the SKU explorer for a SKU (for example one that showed up on your bill). The explorer shows the SKU service name (6F81-5844-456A for Compute Engine in this example) in the blue bar at the top of its results. Compute Engine is one of the largest SKU services, and it contains a lot of the SKUs I care about in my work.
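As a rough sketch of what such a request looks like: the Cloud Billing Catalog API lists the SKUs of a service under a URL that contains the service ID. The snippet below only builds the URL and picks one SKU out of a response page, so it runs without network access. The API key is a placeholder, and the sample page is hand-written and heavily abbreviated; treat the exact field selection as my assumption and check it against the API reference.

```python
from urllib.parse import urlencode

API_ROOT = "https://cloudbilling.googleapis.com/v1"


def skus_url(service_id, api_key, page_token=None):
    """Build the URL for listing the SKUs of one service (e.g. Compute Engine)."""
    params = {"key": api_key}
    if page_token:
        # Results are paginated; pass the token from the previous page.
        params["pageToken"] = page_token
    return f"{API_ROOT}/services/{service_id}/skus?{urlencode(params)}"


def find_sku(page, sku_id):
    """Return the SKU entry with the given ID from one response page, or None."""
    for sku in page.get("skus", []):
        if sku.get("skuId") == sku_id:
            return sku
    return None


# Hand-written, abbreviated stand-in for one page of the JSON response.
sample_page = {
    "skus": [
        {
            "skuId": "9431-52B1-2C4F",
            "description": "N1 Predefined Instance Core running in EMEA",
            "serviceRegions": ["europe-west1"],
        }
    ],
    "nextPageToken": "",
}

print(skus_url("6F81-5844-456A", "YOUR_API_KEY"))
print(find_sku(sample_page, "9431-52B1-2C4F")["description"])
```

Since Compute Engine has a lot of SKUs, a real client would keep requesting pages until `nextPageToken` comes back empty.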

When you request a SKU via the API, you get back a protocol buffer including a field called PricingExpression. This encodes how the price is calculated. It’s not super easy to read, but the documentation is good.
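To give an idea of how the pricing expression can be read: in the JSON form of the response, it contains a list of tiered rates, each with a starting usage amount and a unit price split into whole units and nanos (billionths). The sketch below turns that into a cost for a given usage. It is a simplification under my own assumptions: it ignores unit conversion factors and how usage is aggregated across tiers over time, and the sample expression is hand-written from the price quoted above.

```python
from decimal import Decimal


def money_to_decimal(money):
    """Convert a {units, nanos} price into a Decimal amount."""
    units = Decimal(money.get("units", "0"))
    nanos = Decimal(money.get("nanos", 0)) / Decimal(10**9)
    return units + nanos


def cost_for_usage(pricing_expression, usage):
    """Apply tiered rates: each tier's price covers usage from its start
    up to the start of the next tier."""
    usage = Decimal(str(usage))
    tiers = sorted(pricing_expression["tieredRates"],
                   key=lambda t: t.get("startUsageAmount", 0))
    total = Decimal(0)
    for i, tier in enumerate(tiers):
        start = Decimal(str(tier.get("startUsageAmount", 0)))
        if usage <= start:
            break
        if i + 1 < len(tiers):
            end = Decimal(str(tiers[i + 1].get("startUsageAmount", 0)))
        else:
            end = usage
        total += (min(usage, end) - start) * money_to_decimal(tier["unitPrice"])
    return total


# Hand-written expression for 0.034773 USD per hour, single tier.
n1_core = {
    "usageUnit": "h",
    "tieredRates": [
        {"startUsageAmount": 0,
         "unitPrice": {"currencyCode": "USD", "units": "0", "nanos": 34773000}},
    ],
}

print(cost_for_usage(n1_core, "2.67"))  # cost in USD for 2.67 core hours
```

The result is in USD, so it will not match the CHF amount on my bill exactly; the point is only to show how units, nanos, and tiers fit together.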

Amazon also offers an API for obtaining price information. Theirs is arguably more convenient than Google’s because you don’t even need to be authenticated to use the bulk download. If you request pricing information for their compute service (EC2), you’ll get about a Gigabyte’s worth of JSON or CSV (your choice). I find the entries a little difficult to interpret but this, too, will have to wait for another blog post.

Azure has a comparable API as well.

Ideally, I’d like to obtain SKUs (or price information, actually) based on the assets I’ve got. So I’d put an asset specification into a request and ask an API for the SKUs that apply to this asset. As far as I know, Google does not offer this as an API service (if they do, please let me know!).

A Mystery Solved (Maybe)

Finally, let’s put all this information together and solve a mystery. On my Google Cloud bill, I have three charges for the Ubuntu OS license:

| SKU | SKU ID | Usage | Cost in CHF |
| --- | --- | --- | --- |
| Licensing Fee for Ubuntu 16.04 (Xenial Xerus) (CPU cost) | 2A23-096B-A372 | 1.33 hour | 0.00 |
| Licensing Fee for Ubuntu 16.04 (Xenial Xerus) (CPU cost) | EE6A-FFA3-84D7 | 2.67 hour | 0.00 |
| Licensing Fee for Ubuntu 16.04 (Xenial Xerus) (RAM cost) | 9169-5341-635D | 10.01 gibibyte hour | 0.00 |

If the costs weren’t all zero, I guess I’d be a little worried. Why are there two charges for “CPU cost”? Browsing the SKUs did not get me additional information. Labels helped me verify that these charges are indeed for the minikube VM (the other VM I run uses Debian). I also verified that I get the equivalent three lines of charges for Debian and Fedora.

So next I compared the usage reported: 1.33 vs. 2.67 hours for the CPU costs, and 10.01 gibibyte hours for the RAM cost. The VM had about 80 minutes of uptime, and that’s close enough to 1.33 hours. An n1-standard-2 machine type has two vCPUs and 7.5 GB of RAM. So maybe one of the “CPU cost” charges is really a “VM cost” charge, because the VM ran for 1.33 hours. And the second “CPU cost” charge is per vCPU, hence 2 * 1.33 ≈ 2.67 hours. This matches the RAM charge as well: 7.5 * 1.33 is about 10 gibibyte hours.
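The arithmetic behind this guess is easy to replay. The sketch below uses only numbers from the bill and the roughly 80 minutes of uptime; the last step divides the billed gibibyte hours by the uptime to see what memory size the bill implies. Function names are made up for illustration.

```python
UPTIME_MINUTES = 80            # approximate uptime of the VM
VCPUS = 2                      # vCPU count of the machine type
BILLED_RAM_GIB_HOURS = 10.01   # RAM usage shown on the bill


def uptime_hours(minutes):
    """Uptime in hours, the candidate 'per VM' usage."""
    return minutes / 60


def vcpu_hours(minutes, vcpus):
    """Uptime multiplied by vCPU count, the candidate 'per vCPU' usage."""
    return uptime_hours(minutes) * vcpus


def implied_ram_gib(gib_hours, hours):
    """Memory size that would produce the billed gibibyte hours."""
    return gib_hours / hours


def gib_to_gb(gib):
    """Convert gibibytes (2**30 bytes) to gigabytes (10**9 bytes)."""
    return gib * 2**30 / 10**9


hours = uptime_hours(UPTIME_MINUTES)
print(f"{hours:.2f} VM hours")                                   # 1.33
print(f"{vcpu_hours(UPTIME_MINUTES, VCPUS):.2f} vCPU hours")     # 2.67
print(f"{gib_to_gb(BILLED_RAM_GIB_HOURS):.2f} gigabyte hours")   # 10.75
print(f"{implied_ram_gib(BILLED_RAM_GIB_HOURS, hours):.2f} GiB") # 7.51
```

The first two results line up with the 1.33 and 2.67 hours on the two “CPU cost” lines, which is what makes the per-VM and per-vCPU reading plausible.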

Maybe this means that Google supports billing for licenses based on RAM usage, vCPU usage, VM count, or a combination of those, and they simply created SKUs for all billing modes even though some of them always have a zero price?

However, I could not find all of these SKUs in the explorer. I have a SKU on my bill for March that’s labeled Licensing Fee for Fedora CoreOS Stable (CPU cost) with an id of F9BE-F344-D1B6 but the SKU explorer claims that this SKU does not exist. Maybe there’s some kind of migration or update in progress? Then again, the API data and BigQuery both know about this SKU, so maybe it’s the explorer that is wrong.

On the other hand, SKUs for other OSes can be structured in very different ways. For example, here are the SKUs for RHEL7 licenses:

| SKU ID | Description |
| --- | --- |
| 2894-5229-F604 | Licensing Fee for RedHat Enterprise Linux 7 (GPU cost) |
| 5744-6443-7F8D | Licensing Fee for RedHat Enterprise Linux 7 on VM with 1 to 4 VCPU |
| 9AAA-D841-1CA1 | Licensing Fee for RedHat Enterprise Linux 7 on VM with 6 or more VCPU |
| FF54-E5CA-BAC3 | Licensing Fee for RedHat Enterprise Linux 7 on f1-micro |
| 9AAA-D841-1CA1 | Licensing Fee for RedHat Enterprise Linux 7 on g1-small |
| FF54-E5CA-BAC3 | Licensing Fee for RedHat Enterprise Linux 7 (RAM cost) |

By the way, you can get this information out of BigQuery if you have set up billing exports for your account.
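A query along these lines would produce such a list. This is a sketch only: the table name is a placeholder for wherever your billing export lands, and I am assuming the standard export schema with a nested `sku` record holding `id` and `description`. The function just builds the SQL text; the description prefix is left as a named query parameter (`@prefix`) to be supplied by whatever BigQuery client runs it.

```python
import re


def sku_lookup_query(table):
    """Build a BigQuery SQL query listing distinct SKUs whose description
    starts with the value bound to the @prefix query parameter."""
    # Only accept a plain project.dataset.table name, since it is pasted into the SQL.
    if not re.fullmatch(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+){2}", table):
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT DISTINCT sku.id AS sku_id, sku.description AS description\n"
        f"FROM `{table}`\n"
        "WHERE STARTS_WITH(sku.description, @prefix)\n"
        "ORDER BY description"
    )


# Placeholder table name; replace with your own export table.
query = sku_lookup_query("my-project.billing.gcp_billing_export_v1_XXXXXX")
print(query)
# Bind @prefix to e.g. "Licensing Fee for RedHat Enterprise Linux 7" when running it.
```

Keep in mind that the export only contains SKUs that actually showed up on your own bill, so this lists what you were charged for and not the full catalog.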

This probably shows that there are more ways for Google to implement licenses than just per-VM or per-vCPU, so maybe the Ubuntu etc. licenses were just created with some kind of scripted default values and nobody has bothered to change them because the charges are always zero anyway?

Anyway, I hope this has shed a little light on how to dig into a cloud bill from Google.