
Even Netflix struggles to identify and understand the cost of its AWS estate

If you have trouble keeping track of your various streaming subscriptions, you're gonna love the irony


Keeping track of the amount of cloudy resources an org uses, and the cost of doing so, is notoriously tricky – so tricky, indeed, that even Netflix isn't on top of it.

We know this because on Wednesday US time the vid-streamer blogged about its cloud efficiency measures.

The post – penned by senior analytics engineer "Jennifer H" and Pallavi Phadnis, who describes her role as "Data" – opens by noting Netflix's well-known use of Amazon Web Services (AWS) for its cloud infrastructure needs, and that its engineering teams have self-service tools they can use to provision apps in the cloud.

The pair also reveal that Netflix operates a Platform DSE (Data Science Engineering) team, which helps engineering teams "to understand what resources they're using, how effectively and efficiently they use those resources, and the cost associated with their resource usage."

The Platform DSE team's goal is helping "downstream consumers to make cost conscious decisions using our datasets."

To assist in that goal, it's created two tools:

  1. Foundational Platform Data (FPD), a component that "provides a centralized data layer for all platform data, featuring a consistent data model and standardized data processing methodology."
  2. A Cloud Efficiency Analytics (CEA) tool that is built on top of FPD and "offers an analytics data layer that provides time series efficiency metrics across various business use cases."

FPD consumes data fed from applications such as Apache Spark, which records how long cores are allocated to jobs and the amount of data read. CEA is then sent "inventory, ownership, and usage data and applies the appropriate business logic to produce cost and ownership attribution at various granularities," the post explains.
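For readers who want a feel for what "cost and ownership attribution" can involve, here is a minimal sketch. It is not Netflix's code: the record layout, the team names, the ownership shares, and the flat per-core-hour rate are all assumptions made up for illustration. It only shows the general idea of pricing usage records and splitting the bill among a service's owners.

```python
from collections import defaultdict

# Hypothetical usage records, loosely modeled on the kind of data a Spark
# job might report (how long cores were allocated). Not a real schema.
USAGE = [
    {"job": "etl-daily", "service": "ingest", "core_hours": 8.0},
    {"job": "reco-train", "service": "recs", "core_hours": 4.0},
]

# Hypothetical ownership map: a service can have several owners,
# each carrying a share of its cost. Shares per service sum to 1.0.
OWNERS = {
    "ingest": {"team-a": 0.75, "team-b": 0.25},
    "recs": {"team-c": 1.0},
}

# Assumed flat price per core-hour, in dollars. Real cost heuristics
# would differ per platform, as the post itself points out.
RATE_PER_CORE_HOUR = 0.25


def attribute_costs(usage, owners, rate):
    """Price each usage record and split the cost across its owners."""
    totals = defaultdict(float)
    for record in usage:
        cost = record["core_hours"] * rate
        for team, share in owners[record["service"]].items():
            totals[team] += cost * share
    return dict(totals)


if __name__ == "__main__":
    for team, cost in sorted(attribute_costs(USAGE, OWNERS, RATE_PER_CORE_HOUR).items()):
        print(f"{team}: ${cost:.2f}")
```

Even in this toy version, the awkward parts the post describes are visible: the answer depends entirely on an ownership map and a pricing rule that someone has to keep accurate.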

The datasets Netflix generates are highly complex "due to the breadth and scope of the business infrastructure and platform specific features."

"Services can have multiple owners, cost heuristics are unique to each platform, and the scale of infra data is large," Jennifer H and Pallavi Phadnis wrote, before explaining that Netflix's platforms often have customizations, which means the Platform DSE team always has plenty to do – including regular audits.

"Maintaining data completeness while ensuring correctness becomes challenging due to upstream latency and required transformations to have the data ready for consumption," they explained.

Their work therefore continues, with both FPD and CEA under development and Netflix "striving for nearly complete cost insight coverage in the upcoming year."

It gets better. The post concludes by revealing Netflix's intention to "move towards proactive approaches via predictive analytics and ML for optimizing usage and detecting anomalies in cost."
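The post does not say how Netflix plans to detect anomalies, so the following is only a rough illustration of the general idea, not the company's approach. It swaps the "predictive analytics and ML" the post mentions for a much simpler rolling z-score: a day's spend is flagged if it sits too many standard deviations away from the average of the preceding days. The function name, window size, threshold, and sample figures are all assumptions.

```python
from statistics import mean, stdev


def find_cost_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indexes of days whose cost deviates sharply from the prior window.

    A simple rolling z-score check, used here as a stand-in for whatever
    predictive model a real system might use.
    """
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is notable.
            if daily_costs[i] != mu:
                anomalies.append(i)
            continue
        if abs(daily_costs[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Made-up daily spend figures with one obvious spike on day 7.
    costs = [100, 102, 98, 101, 99, 100, 103, 250, 101]
    print(find_cost_anomalies(costs))
```

A check like this is reactive by nature, since it only fires after the money is spent, which is presumably why the post talks about moving to predictive methods.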

You read that right: Netflix, one of the most famous users of public cloud, isn't in total control of its cloud spend and needs to get better at detecting anomalies.

So you're not alone if you struggle to do so, too. ®
