Digital Eldest Son (Part 3)
All right, let me go through my notes. So, more recording. I was experimenting with various post-exploitation techniques, many of which can actually be executed from userland. This builds on the concept of “Digital Eldest Son,” so I guess this will be Digital Eldest Son: Part Three. I want to explain how a malware sandbox could be useful for spreading misinformation.
Malware sandboxes can be used to spread worms/clickfix attacks through misinformation with telecom APIs!
Did you know that malware sandboxes, even a standard antivirus sandbox, can be used to spread misinformation? Maybe the sandbox doesn’t allow the malware to run effectively, because most attacks these days are fileless and multi-staged. But there is another way. There are APIs that let you send pre-recorded audio messages and text messages, because a lot of phone-based telecommunications is now abstracted away behind software.
This is not like the old days. The first VoIP client I used was a dial pad. Back then, I had to start up my 56k modem (well, actually, I had to disconnect from the Internet on my 56k modem, which never reached 56k anyway) and then run the dial pad. Back then, I don’t believe it had any XML configuration, so I had to dial numbers manually. These days, you can automate it. You can route calls and drop MSI installers that automatically configure, through an XML file, a predetermined list of numbers to dial. You can even play an MP3 soundbite into a call to issue commands. All of this works from a sandbox because sandboxes allow egress traffic for threat intelligence reasons, and that same allowance means a sandbox can be abused to spread misinformation.
Previously, in my Substack article, I showed how to compute valid (or at least likely valid) phone numbers in PowerShell, C#, and most of the .NET Framework, which lets you send messages almost entirely in memory. I say “almost” because when the code compiles at run time, it calls csc.exe (the C# compiler), which briefly leaves behind an artifact that is eventually deleted. So that’s one way. I called it the “Gun Kata method,” after Christian Bale’s character John Preston, and it creates a worm-like service: you issue a malicious command that sends a message over an API (many of them are free, by the way), and the recipient gets a text message or a voice message on their phone telling them to run a command. Kind of like “ClickFix” attacks.
Sandbox evasion via audio device enumeration (only sandboxes, not VMs)
Now that we’ve got that out of the way, I want to tell you something else. What if you don’t want your malware to be detected by the sandbox? You want to escape the sandbox and run. I have started playing around with this, and surprisingly, a lot of malware sandboxes are still very easily detectable by their network cards. In case you did not know, a MAC address whose first octet is 0x52 (the 52:54:00 prefix) is the default for virtual network interface cards under QEMU/KVM, and there are other well-known virtual prefixes. You can also check for registry keys to determine if it’s a VM.
But we’re talking explicitly about sandboxes. There is another trick. Virtual machines do have audio devices. Hypervisors such as VMware, Proxmox, and VirtualBox can virtualize an audio device. But a container or a malware sandbox, which is much more lightweight and runs only briefly to sniff around the malware and decide whether it’s malicious? Those generally do not.
One thing about sandboxes is that they don’t pass a real video card through to the guest. They do expose a virtualized video card, but they generally don’t do the same for an audio device. When I was a kid, we still had physical Sound Blaster audio cards, and we also had Turtle Beach cards. The sound card was a separate device. I think it wasn’t until around the year 2000 that physical sound cards started to become obsolete; back in the day, you needed RAM, a CPU, and hard disk space, and you had to buy a sound card along with your video card. But for some reason, modern malware sandboxes do not seem to expose an audio device, and you can enumerate audio devices to determine whether one exists. That is a significant weakness, and sandboxes have not evolved very far.
For some reason, people have gone for different types of “Detection and Response,” which is mostly some evolution of EDR, right? You have EDR, NDR, and then a newer term called ITDR. Many of these are reactive: once you bypass antivirus and sandboxes, you still have to deal with evading some form of detection and response, which is not designed to replace antivirus.
But going back: first, we have to break out of the sandbox. There are very few courses on the Internet that will teach you how to evade a sandbox, although TryHackMe covers it in the Red Team path, and I think that part is free. Another trick is to check for sound devices, that is, to enumerate whether or not the sandbox has one. It’s more reliable than checking for graphics devices, which is ambiguous: a lot of machines without a dedicated GPU still have some sort of integrated graphics device and a device handle that represents it, like my Intel GPU. But for some reason, sandboxes do not really account for audio devices.
I initially thought about using the beep.sys driver, because every driver in Windows (including beep.sys, which obviously makes beeps) exposes an interface you can call. I decided to abandon that because checking whether beep.sys is loaded, or whether it returns valid states, turned out to be inconsistent. You used to be able to play tones even at frequencies you might not be able to hear, because you could specify a frequency and then check GetLastError, on top of making your dog jump. It’s also not good for OPSEC, and it’s not definitive proof of a sandbox.
But the Windows Audio Service is usually disabled or stubbed in a Defender-style sandbox, which then has no audio endpoints. In fact, if you use the IMMDeviceEnumerator interface and call EnumAudioEndpoints, you might find a dummy device or nothing at all. However, not having an audio device does not mean no sound output; I believe you can still beep on the motherboard. You can actually make a motherboard beep through beep.sys.
I’m just looking at the handle. You open a handle to the beep device with CreateFileW. There are only a few meaningful things here: setting a beep, stopping it, and the frequency and duration of the tone, which map down to the Hardware Abstraction Layer to make the beep.
I’m going to be taking more notes from this transcript. I’m going to see if I can check render endpoints, which I talked about before, using IMMDeviceEnumerator::EnumAudioEndpoints. Check IAudioSessionManager2. Query the driver name, device ID, and state. Attempting audio capture or output and measuring latency or timing jitter is also useful, because in a virtualized environment that path would have executed a hypercall.
When hypervisors (basically virtual machine hosts, right) first came out, they had all kinds of buzzwords. So you had a syscall, which is a call from user mode into the operating system kernel, and a hypercall, which is a call from the guest into the hypervisor. And because of all this shit, that’s why they call it Hyper-V (or maybe intended: “Everything Hyper”). But a hypercall is generally slower than a syscall unless you can break out of that virtual machine. So you can always attempt audio capture or output and measure latency or timing jitter. Also, when the Defender sandbox runs, it always takes a double-digit percentage of your host’s RAM.
Check for the presence of USB/PCI audio devices in Device Manager. A fully stubbed sandbox will return no endpoints, fail to open WASAPI streams, fail to enumerate mixers, or return generic device names like “Microsoft Audio” or “Render Device,” all of which are stronger signals than the beep.sys check.
Valid parameters for making beeps through beep.sys: the frequency can only be 37 to 32,767 Hz, and you cannot use a duration of zero. We could probably combine audio device enumeration with a check of whether the machine can beep, but someone’s pet could hear it. Most animals that people keep as pets, not just dogs, can hear noises that we cannot. So if the cat jumps next to a SOC analyst while she’s working from home, that’s kind of a problem.
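To make those parameter limits concrete, here is a minimal sketch in Python that only validates a frequency and duration pair against the range mentioned above. It does not touch any driver or play any sound. The names BEEP_MIN_HZ, BEEP_MAX_HZ, and is_valid_beep are hypothetical and just for illustration, and treating the duration as milliseconds is an assumption.

```python
# Limits taken from the notes above: 37 to 32,767 Hz, and a non-zero duration.
# Constant and function names are hypothetical, chosen for this sketch only.
BEEP_MIN_HZ = 37
BEEP_MAX_HZ = 32767


def is_valid_beep(frequency_hz: int, duration_ms: int) -> bool:
    """Return True if the pair falls inside the documented beep limits.

    Assumption: duration is expressed in milliseconds and must be > 0.
    This is a pure check; it never opens a device or emits a tone.
    """
    in_range = BEEP_MIN_HZ <= frequency_hz <= BEEP_MAX_HZ
    return in_range and duration_ms > 0


if __name__ == "__main__":
    for freq, dur in [(36, 100), (37, 100), (32767, 1), (32768, 100), (440, 0)]:
        print(freq, dur, is_valid_beep(freq, dur))
```

Both boundary values (37 and 32,767) are treated as valid here, which matches the inclusive range stated in the notes.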

