Huawei releases data detailing serverless secrets

Reveals why your functions start slowly on its cloud and maybe others too


Huawei Cloud has released a huge trove of data describing the performance of its serverless services, in the hope that other hyperscalers will use it to improve their own operations.

The Chinese giant detailed its ops in a recent preprint paper [PDF] that reveals Huawei's YuanRong serverless platform has been deployed for over three years across nearly 20 datacenter regions, and processes 30 billion requests each day.

Each of the regions Huawei operates is divided into four clusters. "Clusters provide virtual and physical separations within a region, improving availability and fault tolerance," the paper states.

Next comes an explanation of how Huawei lets users select the resources allocated to functions: by choosing a "resource limit" that defines CPU-memory configurations, such as "300-128" for a rig that offers 300 millicores and 128 MB of memory. The company keeps "pods" of resources ready to run functions and meet escalating demand.
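The naming convention is compact enough to sketch. The helper below, with a name of our own invention, shows how a "300-128"-style limit decomposes into its CPU and memory parts under the scheme the paper describes:

```python
def parse_resource_limit(limit: str) -> tuple[int, int]:
    """Split a Huawei-style resource limit such as '300-128' into
    (CPU millicores, memory in MB). The function name and signature
    are illustrative, not part of the YuanRong API."""
    cpu, mem = limit.split("-")
    return int(cpu), int(mem)
```

So `parse_resource_limit("300-128")` yields `(300, 128)`: 300 millicores (0.3 of a CPU core) paired with 128 MB of memory.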

An autoscaler determines if additional pods are required to address incoming requests and, when more power is needed, "pods are taken from the appropriate pool, the code of that function is loaded into it, and it is ready to process requests."

As the paper explains, if a container is not ready to run a function, the pod called into action must perform a "cold start" – the serverless equivalent of booting up into a state in which a function can run.

Idle pods are kept warm for one minute – a window Huawei calls the "keep-alive time" – after which a fresh cold start is needed if the function is invoked again.
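The warm/cold decision described above can be sketched in a few lines. This is a minimal illustration of the logic, not YuanRong's implementation: class and method names are hypothetical, and the 60-second keep-alive is the fixed window the paper reports.

```python
KEEP_ALIVE_SECONDS = 60  # Huawei's reported keep-alive window

class Pod:
    """A pre-provisioned pod taken from the resource pool (illustrative)."""
    def __init__(self, func_name: str, now: float):
        self.loaded_function = func_name
        self.last_used = now

class Scheduler:
    """Hypothetical dispatcher showing warm reuse vs. cold start."""
    def __init__(self):
        self.warm: dict[str, Pod] = {}  # function name -> warm pod
        self.cold_starts = 0

    def invoke(self, func_name: str, now: float) -> str:
        pod = self.warm.get(func_name)
        if pod and now - pod.last_used <= KEEP_ALIVE_SECONDS:
            pod.last_used = now  # warm start: code already loaded
            return "warm"
        # No warm pod, or keep-alive expired: take a pod from the
        # pool and load the function's code -- the cold-start penalty.
        self.warm[func_name] = Pod(func_name, now)
        self.cold_starts += 1
        return "cold"
```

Invoking a function at t=0 is cold, a repeat at t=30 reuses the warm pod, and a call at t=120 – beyond the keep-alive window – pays the cold-start cost again.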

All cold starts add "significant latency, degrading application performance," write the paper's eight authors – all of whom are employees at Huawei's Systems Infrastructure Research (SIR) Lab in Edinburgh, Scotland.

Detecting, predicting, and ameliorating cold starts is the focus of the paper, which is based on analysis of data describing 85 billion requests from over 12 million pods, including over 11 million cold starts. The data was gathered over weeks of operation, including one week that featured a Chinese holiday so researchers could capture the impact of usage spikes. That data has been posted to GitHub; Huawei says it includes "detailed component times of cold starts from five regions" and examines "the effect of function characteristics such as resource allocation, runtime language, and trigger type."

Cold starts are a known issue. But Huawei's authors assert that the data they've disclosed matters because previous literature mostly considered "high-level metrics from a single region with little discussion of components and the effect of factors such as runtime language, resource allocation, and trigger type on the number of cold starts and their component times."

Huawei Cloud therefore claims its data is the first release of its type.

The paper essentially concludes that cold starts happen for lots of reasons – among them variability between Huawei Cloud's own datacenters, the complexity of the function, or the languages and runtimes used.

It also concludes that users and operators of serverless platforms mostly feel that multi-region operations are inherently risky – but suggests the latency involved in running functions across multiple datacenters could be less impactful than the time required to wait for a cold start. The paper also suggests possible improvements to pod scheduling, and optimization of keep-alive time, to enhance serverless performance.

The data dump is just the second Huawei's SIR Lab has posted to GitHub. The paper will be presented at the EuroSys 2025 conference in Amsterdam, which kicks off in March. ®

