Product · April 2026 · 12 min read

Q-Audion GEN-1: designing the first European PQC headset

Designing a headset that performs ML-KEM-1024 key exchange and ML-DSA-87 signing on a wearable battery budget required rethinking almost every subsystem of a conventional voice device. This is an architecture-level look at the trade-offs behind Q-Audion GEN-1.

Why a headset, and why now

Voice remains the most operationally critical and cryptographically neglected enterprise communication channel. Executives, lawyers, M&A teams and government negotiators conduct conversations whose strategic value vastly exceeds the technical protections in place. The dominant solution stack, consumer headsets paired over BLE to a smartphone running a soft-client, is built on assumptions that no security architect would accept for any other tier-1 asset: keys in mutable software, side-channels through the host OS, and a transport layer that will be cryptanalytically broken within the working lifetime of most current executives [1].

Q-Audion GEN-1 is among the first products in a deliberately narrow category: a hardware-rooted, post-quantum secured voice endpoint that does not trust the phone, the laptop or the operating system it talks to. Every cryptographic operation happens inside a tamper-resistant Secure Element. Every byte of plaintext audio exists only inside an isolated DSP domain. The host device is treated as a hostile network, not as a peer.

We started the GEN-1 programme with a single architectural commitment: if the design forced any compromise between cryptographic strength, side-channel resistance and operational ergonomics, ergonomics would lose. The result is a device that is heavier than a consumer headset, draws more current than a true-wireless earbud, and requires a few seconds for first-use pairing. Every one of those numbers is a deliberate consequence of the threat model.

The threat model in one paragraph

Q-Audion GEN-1 must protect the confidentiality and integrity of a voice conversation against an adversary who controls the host device (laptop, phone, headset gateway), can observe and modify radio traffic, can mount physical side-channel attacks against the device for short windows, and may have access to a fault-tolerant quantum computer at some point during the operational life of the data being protected. The adversary is assumed not to have sustained, multi-hour physical custody of the device with laboratory-grade equipment; that is the boundary at which the threat model defers to physical security policy.

Every design choice that follows is a direct translation of one or more clauses of this threat model into hardware, firmware or protocol decisions. There is no other guiding principle.

Two-domain hardware architecture

Internally the device is split into two electrically and logically isolated domains. The Application Domain handles BLE, the user interface, battery management and the radio link to the paired host. It runs a constrained RTOS on a general-purpose Cortex-class MCU sourced from a European secure-silicon vendor. This domain is treated as semi-trusted: it can fail, be reprogrammed or be compromised without endangering the cryptographic state.

The Secure Domain is built around a certified Secure Element with an integrated PQC accelerator and a dedicated audio DSP. This domain holds the long-term identity key, performs all ML-KEM-1024 [1] and ML-DSA-87 [2] operations, and is the only place where decrypted audio samples exist. The two domains communicate over a hardware-gated, fixed-format command channel with no direct memory access from the Application side into the Secure side. There is no shared bus, no DMA window, no debug path that crosses the boundary in production firmware.
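As a rough illustration of what a fixed-format command channel can look like on the receiving side, the sketch below parses a constant-size frame and rejects anything that deviates from the layout. The 64-byte frame size, the header layout, the opcode names and the functions build_command and parse_command are all hypothetical; the actual GEN-1 command set is not described in this article.

```python
import struct
from enum import IntEnum
from typing import Tuple

# Hypothetical layout: every frame is exactly 64 bytes.
FRAME_LEN = 64
HEADER = struct.Struct(">BBH")          # opcode, reserved (must be 0), payload length
MAX_PAYLOAD = FRAME_LEN - HEADER.size   # 60 bytes


class Opcode(IntEnum):
    """Hypothetical whitelist of commands the Secure Domain would accept."""
    GET_CERTIFICATE = 0x01
    START_HANDSHAKE = 0x02
    FORWARD_CIPHERTEXT = 0x03
    GET_STATUS = 0x04


def build_command(opcode: Opcode, payload: bytes) -> bytes:
    """Application side: pack a command into a zero-padded fixed-size frame."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too long")
    frame = HEADER.pack(int(opcode), 0, len(payload)) + payload
    return frame.ljust(FRAME_LEN, b"\x00")


def parse_command(frame: bytes) -> Tuple[Opcode, bytes]:
    """Secure side: accept only well-formed frames, raise ValueError otherwise."""
    if len(frame) != FRAME_LEN:
        raise ValueError("wrong frame length")
    raw_opcode, reserved, length = HEADER.unpack_from(frame)
    if reserved != 0:
        raise ValueError("reserved byte must be zero")
    opcode = Opcode(raw_opcode)          # raises ValueError for unknown opcodes
    if length > MAX_PAYLOAD:
        raise ValueError("declared payload length too large")
    body = frame[HEADER.size:]
    if any(body[length:]):
        raise ValueError("non-zero padding")
    return opcode, body[:length]
```

The point of the pattern is that the Secure Domain never interprets variable-length or self-describing input from the Application Domain: a frame either matches the one permitted shape or is discarded.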

This split is the single most expensive architectural decision in the device. It roughly doubles the silicon cost compared to a single-MCU design, and it makes firmware updates considerably more complex. It also makes the entire class of host-side compromise irrelevant: even with full control of the Application Domain and the paired host, an attacker cannot extract a key, decrypt a stored frame or forge a signature.

The cryptographic core

Identity is established at provisioning time by generating an ML-DSA-87 keypair entirely inside the Secure Element [2]. The private key never leaves the SE and is bound to the device's unique unclonable identifier. The public key is signed by the BCrypto provisioning CA, producing a device certificate that is later presented during pairing. There is no path, in firmware or in factory tooling, to export the private signing key.

Session establishment uses hybrid ML-KEM-1024 with X25519 [7]. When two Q-Audion devices establish a call, each generates an ephemeral keypair from both schemes, exchanges the public keys over an authenticated channel (signed with the long-term ML-DSA-87 identity), and derives a 256-bit session key via HKDF-SHA384 over the concatenation of both shared secrets, with explicit length prefixes and domain separation. The classical and post-quantum legs are independent: compromise of either, in isolation, does not weaken the resulting session key.
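The key-combination step can be sketched with nothing more than HMAC from the Python standard library. This is a minimal sketch under stated assumptions, not the GEN-1 firmware: the domain label, the 2-byte length prefix, the ordering of the two secrets and the context parameter are hypothetical, and the two shared secrets are taken as inputs rather than produced by real ML-KEM-1024 and X25519 operations. HKDF itself follows RFC 5869.

```python
import hashlib
import hmac

HASH = hashlib.sha384
HASH_LEN = HASH().digest_size                 # 48 bytes for SHA-384
DOMAIN_LABEL = b"q-audion/session-key/v1"     # hypothetical domain-separation label


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, IKM); empty salt means all zeros."""
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): chain HMAC blocks until `length` bytes are available."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def length_prefixed(data: bytes) -> bytes:
    """Prefix with a 2-byte big-endian length so concatenation is unambiguous."""
    return len(data).to_bytes(2, "big") + data


def derive_session_key(ss_pq: bytes, ss_classical: bytes, context: bytes) -> bytes:
    """Derive a 256-bit session key from both shared secrets.

    `context` is a hypothetical binding value, for example a hash of the
    exchanged public keys; it is mixed into the HKDF info field.
    """
    ikm = length_prefixed(ss_pq) + length_prefixed(ss_classical)
    prk = hkdf_extract(b"", ikm)
    info = DOMAIN_LABEL + length_prefixed(context)
    return hkdf_expand(prk, info, 32)
```

The length prefixes matter because plain concatenation would let two different pairs of secrets, such as ("ab", "c") and ("a", "bc"), collapse to the same input. Because both secrets feed a single extract step, an attacker who learns only one of them still faces an unknown input to the KDF.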

Audio transport uses AES-256-GCM with a 96-bit nonce constructed from a per-session salt and a strictly monotonic 64-bit frame counter. The frame size is 20 ms of Opus at 16 kHz, producing roughly 320-byte ciphertexts. Replay protection is enforced by a sliding window of 1024 frames. Out-of-order frames within the window are accepted if they have not been seen before; frames outside the window are dropped without further processing. This design tolerates the jitter inherent to BLE without weakening replay defences.
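The nonce layout and the replay window can be sketched as follows. The assumptions are labelled: a 96-bit nonce with a 64-bit counter leaves 32 bits for the salt, but the field order and big-endian encoding shown here are guesses, and the ReplayWindow class is a generic bitmap window of the kind used in IPsec rather than the GEN-1 implementation. A real receiver would also verify the GCM authentication tag before marking a counter as seen; that step is omitted because the standard library has no AES-GCM.

```python
import struct


def build_nonce(session_salt: bytes, frame_counter: int) -> bytes:
    """96-bit nonce = 32-bit per-session salt || 64-bit big-endian frame counter."""
    if len(session_salt) != 4:
        raise ValueError("salt must be exactly 4 bytes")
    if not 0 <= frame_counter < 2**64:
        raise ValueError("frame counter out of range")
    return session_salt + struct.pack(">Q", frame_counter)


class ReplayWindow:
    """Sliding anti-replay window over frame counters, tracked as a bitmap."""

    def __init__(self, size: int = 1024):
        self.size = size
        self._mask = (1 << size) - 1
        self.highest = -1    # highest counter accepted so far
        self.bitmap = 0      # bit i set means counter (highest - i) was seen

    def check_and_update(self, counter: int) -> bool:
        """Return True and record the counter if the frame is fresh."""
        if counter > self.highest:
            shift = counter - self.highest
            if shift >= self.size:
                self.bitmap = 1          # everything previously seen fell out
            else:
                self.bitmap = ((self.bitmap << shift) | 1) & self._mask
            self.highest = counter
            return True
        offset = self.highest - counter
        if offset >= self.size:
            return False                 # older than the window: drop
        if self.bitmap & (1 << offset):
            return False                 # already seen: replay
        self.bitmap |= 1 << offset       # late but fresh: accept once
        return True
```

At one frame every 20 ms the sender emits 50 frames per second, so a 1024-frame window spans about 20.5 seconds of audio, and an 8-hour call consumes roughly 1.44 million counter values out of the 2^64 available.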

The MEMS array and acoustic isolation

Microphone selection is rarely treated as a security decision, but for a device that claims to protect a conversation against a remote adversary it is one of the most consequential. Q-Audion GEN-1 uses a three-element MEMS array with beamforming targeted at the user's mouth, positioned to maximise rejection of ambient sources beyond about 40 centimetres. The beamformer runs inside the Secure Domain DSP, on raw PDM samples that never cross the domain boundary in plaintext form.

The acoustic objective is to make remote eavesdropping through a laser microphone, a parametric beamforming attack from a nearby device or a compromised host microphone substantially harder. None of these threats can be eliminated by a headset alone, but each can be raised in cost. The MEMS array reduces the signal available to an off-axis attacker by 20 to 30 dB compared to a single-element omnidirectional microphone, which translates into a meaningful change in the engagement envelope of a typical eavesdropping scenario.
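To put the 20 to 30 dB figure in linear terms, the short calculation below converts decibels to power and amplitude ratios using the standard definitions. The function names are illustrative only.

```python
def db_to_power_ratio(db: float) -> float:
    """Power ratio corresponding to a level difference in decibels."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude (sound pressure) ratio corresponding to a level difference in decibels."""
    return 10 ** (db / 20)


for attenuation_db in (20, 30):
    print(
        f"{attenuation_db} dB: power / {db_to_power_ratio(attenuation_db):.0f}, "
        f"amplitude / {db_to_amplitude_ratio(attenuation_db):.1f}"
    )
```

An off-axis attacker therefore receives between one hundredth and one thousandth of the acoustic power that a single omnidirectional element would deliver, or between a tenth and roughly a thirtieth of the pressure amplitude.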

The same DSP path includes a hardware mute that physically interrupts the analog frontend of the microphone array. Software-only mute, the dominant pattern in consumer headsets, is unacceptable in our threat model: a compromised firmware can ignore it. The hardware mute is exposed through a dedicated button whose state is independently sensed by both domains; a mismatch triggers a fail-secure shutdown of the audio path.
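The mismatch rule can be expressed as a small decision function. This is only a logical sketch: the names AudioPath and resolve_audio_path are hypothetical, and in the device the decision is described as a hardware-level behaviour rather than Python logic.

```python
from enum import Enum


class AudioPath(Enum):
    LIVE = "live"
    MUTED = "muted"
    SHUTDOWN = "shutdown"    # fail-secure state, audio path disabled


def resolve_audio_path(app_sees_muted: bool, secure_sees_muted: bool) -> AudioPath:
    """Combine the two independent readings of the mute button.

    Any disagreement is treated as a fault or an attack and shuts the
    audio path down instead of trusting either reading.
    """
    if app_sees_muted != secure_sees_muted:
        return AudioPath.SHUTDOWN
    return AudioPath.MUTED if secure_sees_muted else AudioPath.LIVE
```

The design choice worth noting is that there is no tie-break: neither domain's reading is privileged, so a compromised Application Domain cannot talk the device into a live microphone.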

Side-channel and tamper resistance

All PQC primitives are implemented inside the Secure Element using a constant-time arithmetic library certified against the relevant side-channel resistance profiles. The ML-KEM decapsulation path in particular is audited against timing, power and electromagnetic emanation channels, and the implementation is masked at first order against differential power analysis [6]. We rely on the Secure Element vendor's certification rather than rolling a custom implementation: this is one of the few places in the design where commodity certified IP is strictly safer than bespoke work.

Physical tamper resistance is targeted at the Common Criteria EAL 4+ assurance level, in line with the certification profile of comparable commercial Secure Elements [5]. The Secure Domain is encapsulated under a tamper-evident potting compound, and the device chassis includes intrusion-detection traces that trigger zeroisation of all session and identity material upon mechanical breach. Cold-boot resistance is provided by a charge-loss volatile keystore for ephemeral material and by encrypted storage for long-term material.

Firmware is signed with an ML-DSA-87 root key held offline at BCrypto [2]. Updates are delivered over the air via the paired host but verified inside the Secure Domain before being committed to the boot bank. A rollback counter, anchored in one-time-programmable fuses, prevents downgrade to a vulnerable previous version. The boot path enforces a measured boot of both domains, with the Secure Domain measurement extending the Application Domain measurement so that a compromise of either is detectable on the next reboot. The cryptographic module is targeted at FIPS 140-3 Level 3 [3].
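The update gate and the chained boot measurement can be sketched as below. Several things are assumptions made for illustration: the verify callback stands in for ML-DSA-87 signature verification (which the Python standard library does not provide), the 4-byte version field and its inclusion in the signed message are hypothetical, SHA-384 is chosen only to match the hash used elsewhere in the design, and the extend construction is the generic TPM-style pattern rather than a documented GEN-1 format.

```python
import hashlib
from typing import Callable

# Hypothetical signature check: (message, signature) -> bool.
Verifier = Callable[[bytes, bytes], bool]


def accept_update(image: bytes, signature: bytes, version: int,
                  fuse_counter: int, verify: Verifier) -> bool:
    """Accept an update only if it is not a downgrade and its signature verifies."""
    if version < fuse_counter:
        return False                          # below the OTP-anchored floor
    signed_message = version.to_bytes(4, "big") + image
    return verify(signed_message, signature)


def extend(measurement: bytes, component_image: bytes) -> bytes:
    """TPM-style extend: new = H(old || H(component))."""
    component_digest = hashlib.sha384(component_image).digest()
    return hashlib.sha384(measurement + component_digest).digest()


def measure_boot(app_image: bytes, secure_image: bytes) -> bytes:
    """Chain both domains into one measurement, Application Domain first."""
    measurement = bytes(48)                   # all-zero initial value
    measurement = extend(measurement, app_image)
    measurement = extend(measurement, secure_image)
    return measurement
```

Because each extend step hashes the previous value together with the new component, changing either firmware image, or the order in which they are measured, produces a different final measurement.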

Power, range and the limits of the form factor

Running ML-KEM-1024 key exchange and continuous AES-256-GCM encryption inside a headset battery envelope is feasible only because the PQC accelerator inside the Secure Element performs each handshake in tens of milliseconds at sub-milliwatt average power. A naive software implementation on a general-purpose Cortex-M would burn several joules per handshake and dominate the battery budget. The hardware accelerator is the enabling component.

Realistic talk time on a single charge is approximately 8 hours for an active call, or roughly 14 days of idle standby with paging beacons active. These numbers were targets, not constraints: we sized the battery to meet them after the cryptographic load was characterised, rather than reducing security to fit a pre-chosen battery. The result is a heavier device than typical consumer headsets, which we consider an acceptable trade for the threat model.

BLE range is intentionally limited to approximately 5 metres at default transmit power, well below the BLE 5.x maximum. The reasoning is that any need to operate at longer range implies a paired host that is not in the user's immediate physical control, which is itself a security concern that no transport-layer countermeasure can address. The design pushes the user toward a deployment posture in which the host device is on their person, not across the room.

What we did not include in GEN-1

Q-Audion GEN-1 deliberately does not include voice biometrics, AI noise suppression, or any cloud-side processing. Each was evaluated and rejected. Voice biometrics add an attack surface (template extraction, replay with synthetic audio) that exceeds the security benefit they provide. AI noise suppression at the quality expected today would require model weights that we cannot fully audit and computation that we cannot constrain to the Secure Domain. Cloud processing of any kind is incompatible with the threat model.

We also do not ship a directory service, a presence indicator or any out-of-band metadata about who is online. The device performs point-to-point pairing with explicit user confirmation on both ends, and a call either connects or fails without leaking even the fact that an attempt was made to anyone outside the two endpoints. This is friction; it is also the only honest answer to a model in which the network is the adversary.

GEN-2, currently in early architecture, will revisit some of these choices in light of dedicated on-device NPU silicon that has emerged recently. For GEN-1, however, the design is intentionally minimal. It does one thing: it makes a voice call between two consenting users that an adversary holding ciphertext today cannot read in the second half of the 2030s.

What this means for buyers

Q-Audion GEN-1 is not positioned against consumer headsets and should not be compared against them on the dimensions where consumer products optimise. It is a category of one: a wearable voice endpoint engineered to the threat model of executive protection, sovereign communications and high-stakes negotiations. The right buyer evaluates it against the cost of the conversation it protects, not against the cost of comparable consumer audio hardware.

If your organisation has voice traffic whose content would materially harm you if read by an adversary in five or ten years, the conventional answer (consumer headset, soft-client, TLS) is no longer adequate. Q-Audion GEN-1 is one defensible answer. The architecture choices behind it (the two-domain split, the certified Secure Element, the hybrid PQC [7][8] and the hardware mute) are the engineering vocabulary you should expect from any vendor making credible claims in this space. The Cyber Resilience Act (CRA) [4] will, from December 2027, make many of these choices a legal floor rather than a differentiator.

References

  1. NIST FIPS 203 — Module-Lattice-Based Key-Encapsulation Mechanism Standard
  2. NIST FIPS 204 — Module-Lattice-Based Digital Signature Standard
  3. NIST Cryptographic Module Validation Program — FIPS 140-3 standards
  4. Regulation (EU) 2024/2847 — Cyber Resilience Act
  5. Common Criteria Portal — Protection Profiles and certified products
  6. Side-channel attacks on Kyber/ML-KEM implementations — IACR ePrint 2023/1933
  7. IETF — Hybrid key exchange in TLS 1.3 (draft-ietf-tls-hybrid-design)
  8. ENISA — Post-Quantum Cryptography: Integration Study
