The newer ARMv7 Cortex-A class cores such Cortex-A7, A15 and A17 come with a virtualization extensions which allow to use KVM (kernel virtual machine). The NXP i.MX 7Dual SoC which I worked with lately includes the ARM Cortex-A7 CPU. I went ahead and tried to bring up KVM on i.MX 7. I was not really familiar with the ARMv7 virtualization architecture, so I had to read up on some concepts. This post summarizes what I learned and gives a big picture of software support.
The Hypervisor mode
To provide hardware support for full CPU virtualization an additional privilege level is required. User-space (PL0) uses the SVC (Supervisor) instruction to switch to kernel-space (PL1, SVC mode). A similar separation between Kernel and hypervisor is required. The ARMv7 architecture with virtualization extension calls this privilege level PL2 or HYP mode.
Linux with KVM for ARM uses this mode to provide CPU virtualization. The CPU needs to be in HYP mode when Linux is booting so KVM can make use of the extension. How KVM uses the HYP mode in detail is explained in this excellent LWN article. After building a kernel with KVM support, I encountered this problem first: By default, the system did boot in SVC mode.
Brought up 2 CPUs
CPU: All CPU(s) started in SVC mode.
kvm : HYP mode not available
Secure and Non-Secure world
ARMv7 privilege levels
To understand how to switch into Hypervisor mode, one needs to understand the whole privilege level architecture first. Notable here is that on ARMv7 CPU’s the HYP mode is only available in non-secure mode, by design. Any hypervisor needs to operate in non-secure mode, there is no virtualization extension in secure mode. Read more »
Given two systems, both with a Cortex-A5 CPU, one clocked at 396MHz without L2 cache and one clocked at 500 MHz with 512kB L2 cache. How big is the impact of the L2 cache? Since the clock frequency is different, a simple CPU time comparison of a given program does not answer the question… I tried to answer this question using perf. perf is often used to profile software, but in this case it also proved to be useful to compare two different hardware implementations.
Most CPU’s nowadays have internal counters which count various events (e.g. executed instructions, cache misses, executed branches and branch misses etc…). Other hardware, e.g. cache controllers, might expose performance counters too, but this article focuses on the hardware counters exposed by the CPU. Read more »
The serial console is a very helpful debugging tool for kernel development. However, when a crash occurs early in the boot process, one is left in the dark. There are two early console implementations available with different merits. This post will throw some lights on them.
The kernel supports earlyprintk since… probably ever. At least 2.6.12, where the new age (git) started. After enabling “Kernel low-level debugging”, “Early printk” under “Kernel hacking” and selecting an appropriate low-level debugging port, you are ready to get early serial console output. Read more »
In order to use the Linaro ARM cross-toolchain on Arch Linux, some 32-Bit Libraries need to be available. Arch Linux supports multiarch too, just enable the packages in your /etc/pacman.conf:
Include = /etc/pacman.d/mirrorlist
Update your packages
:: Synchronizing package databases...
core is up to date
extra is up to date
community is up to date
multilib 104.5 KiB 148K/s 00:01 [##################################] 100%
Unlike in Debian based distribution the GNU C Library package is called lib32-glibc. Additionally needed libraries by GCC (such as libstdc++6) are included in the package lib32-gcc-libs.
pacman -S lib32-glibc lib32-gcc-libs lib32-zlib
The Cortex-M3 has a new assembler instruction SVC to call the supervisor (usually the operating system). The ARM7TDMI used to call this interrupt SWI, but since this interrupt works differently on Cortex-M3, ARM renamed the instruction to make sure people recognize the difference and implement those calls correctly. The machine opcode however is still the same (bits 0-23 are user defined, bits 24-27 are ones).
On the Cortex-M3, other interrupts can interrupt the processor during state saving of the SVC interrupt (late arrival interrupt handling). Those late arriving interrupts most certainly leave the registers corrupted after execution. Therefor we cannot read the parameters form registers r0 to r4 directly as we could on the ARM7TDMI using SWI interrupts. Fortunately, the Cortex-M3 saves all registers used in standard C procedure call specification (ABI) on the stack. So the SVC handler can get the parameters directly from the stack.
Cortex-M3 stack frame
Read more »