At Ubicloud, we’re building an open and portable cloud. One of the first cloud services we released was Elastic Compute, and I was reading a related Git commit message the other day. It’s interesting how we used to talk just about CPUs. Now, we talk about CPUs, vCPUs, threads, cores, nodes, dies, sockets, and packages.
https://github.com/ubicloud/ubicloud/commit/d30c157f2f1b2f3d7146824b836da7f4d6aa3973
Given the abundance of terms, we wanted to write a blog post explaining what each of them means. Here’s a 5-minute explanation.
Ubicloud uses Linux KVM for virtualization and the Cloud Hypervisor as our Virtual Machine Monitor. Although these projects are in related spaces, they use different terminology. Here’s how their terms map to each other.
* Linux /proc & lscpu: cpu or thread, core, die, socket, node
* Cloud Hypervisor: vCPU or thread, core, die, package, node
So what Linux calls a socket, the Cloud Hypervisor calls a package. Once you know this, you can then map terms across projects.
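To see these terms on an actual machine, Linux exposes the topology through lscpu and sysfs. A minimal sketch (label names and indentation can vary slightly between lscpu versions):

```
# Summarize threads per core, cores per socket, sockets, and NUMA nodes
lscpu | grep -E 'Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)|NUMA node\(s\)'

# The same topology is exposed per CPU under sysfs, e.g. the physical core id of CPU 0
cat /sys/devices/system/cpu/cpu0/topology/core_id
```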
A socket is the oldest of these concepts. It's a receptacle, a physical connector linking the processor to the computer’s motherboard. Most PCs in the 1990s had just one CPU and so just one CPU socket. If you wanted another CPU, you needed a motherboard with another CPU socket. Two-socket PC motherboards first appeared around 1995. A processor package sits in this socket and contains one or more dies.
A die is a single piece of silicon that can contain any number of cores. A processor die is where the transistors making up the processor reside.
In the mid 2000s, AMD and Intel started taking multiple CPUs (as they were defined back then) and putting them in the same package. What we referred to as a CPU became a physical core. Total core count today is an important metric for performance.
For example, a dual-core processor is a processor package that has two physical cores inside. The two cores can sit on one die or on two separate dies. Early multi-core processors often placed several dies in a single package. Modern designs put the cores on the same die, which brings advantages such as a shared on-die cache.
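As a quick sanity check of the distinction between logical CPUs and physical cores, here is a sketch of how you might compare the two counts on Linux (counting the unique socket/core pairs that lscpu reports):

```
# Logical CPUs (hardware threads) the OS schedules onto
nproc

# Physical cores: count the unique (socket, core) pairs
lscpu -p=SOCKET,CORE | grep -v '^#' | sort -u | wc -l
```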
Finally, we have threads. These are logical processors that run within the same physical core. Intel popularized this notion in 2002 with "Hyper-Threading Technology." Hyper-threading, or more generally simultaneous multithreading (SMT), is a hardware answer to an old problem. Processors are often data-starved, waiting on data from slower memory or storage. When that happens, the OS can context switch to another process, but a context switch itself costs a few thousand CPU cycles. Hyper-threading sidesteps that cost by keeping a second set of architectural registers loaded in the core, so another thread of work is ready to run immediately.
In most x64 server configurations, the ratio of threads to cores is 2:1.
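To check whether SMT is enabled and what the thread-to-core ratio is on a given host, something like the following works; the sysfs smt entry is only present on reasonably recent kernels:

```
# Threads per physical core; 2 means SMT/Hyper-Threading is active
lscpu | grep 'Thread(s) per core'

# Whether the kernel currently has SMT enabled (recent kernels only)
cat /sys/devices/system/cpu/smt/active
```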
Historically, all memory on x86 architectures was equally accessible by all threads. This is known as Uniform Memory Access (UMA): access times are the same no matter which thread performs the operation.
Non-Uniform Memory Access (NUMA) tends to intrude when multiple sockets are involved. In a 2+ socket system, each CPU socket has its own memory that it can directly access. But it must also be able to access memory attached to the other socket, and that of course takes more CPU cycles than accessing local memory. NUMA nodes specify which part of system memory is local to which thread.
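You can inspect the NUMA layout of a machine with numactl (from the numactl package) or directly through sysfs; a sketch:

```
# Show each NUMA node, the CPUs (threads) that belong to it, its memory size,
# and the relative access distances between nodes
numactl --hardware

# The same information lives under sysfs, e.g. node 0's CPU list
cat /sys/devices/system/node/node0/cpulist
```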
You can configure NUMA behavior on your system to get the best possible performance for your workload. For example, you can restrict threads to local memory only, let them use all memory freely, or have them prefer local memory while still allowing remote access. The chosen policy then changes how the Linux scheduler distributes processes among the available threads.
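With numactl you can try these policies per process, without changing any system-wide setting. A sketch, where ./my_workload is a placeholder for your own binary:

```
# Restrict both CPU placement and memory allocation to node 0
numactl --cpunodebind=0 --membind=0 ./my_workload   # my_workload is a placeholder

# Prefer node 0 for allocations, but fall back to other nodes when it fills up
numactl --preferred=0 ./my_workload

# Spread allocations round-robin across all nodes
numactl --interleave=all ./my_workload
```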
Tying this back to the terminology above, many processor packages support one to four NUMA nodes. Most configurations interleave all supported NUMA nodes into one. However, for specialized, NUMA-aware workloads, it is possible to decrease memory latency by declining to interleave memory accesses across cores.
AWS uses vCPUs to indicate VM processing power and defines them in line with the terms explained in this blog post: for most instance types, each vCPU is a single hardware thread of a physical CPU core.
CPU architectures have come a long way in the past twenty years. We now have terms to distinguish between concepts such as threads, cores, dies, sockets, and nodes. Further, different projects use different terminology. Add the public cloud and new CPU architectures such as ARM to the mix, and things become confusing fast.
If you need to quickly remember these terms in the future, hopefully this blog post will help. If any of this sounds interesting to you and you have questions or feedback about Ubicloud, please drop us a line at [email protected]