jacoblatonis.me


Published on 01/04/2024 00:01 by Jacob Latonis

100 Days of Yara in 2024: Day 04

Back to the regularly scheduled programming: parsing more data from the Mach-O headers. On day 02, we covered parsing UUID load commands, since the UUID is metadata that is generated and then (maybe) included in the binary. Even if it is spoofed, we could possibly cluster on it if a malware author doesn’t change it across various parts of the development cycle.

Motivation

I’d like to continue that trend of parsing more data from Mach-O binaries that gives us a similar view into the development environment or the values assigned in the binary’s metadata. As such, I chose to parse the version_min_command data structures. This covers the following load commands: LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS, LC_VERSION_MIN_TVOS, and LC_VERSION_MIN_WATCHOS.

These pieces of metadata can be used to comb through numerous binaries and surface only specific ones, to filter out binaries built for older (or newer) OS and SDK versions, and a lot more. I think there’s also some potential for clustering by combining this with other information we can glean about the development environments of threat actors (TAs).

For context on the purpose of this data in Mach-O files: it lets the developer declare the minimum OS version the application should run on. The value could be driven by system APIs that are only available from (or up to) a certain version, or it could be arbitrary and simply selected during the development cycle. Regardless, it is still an interesting piece of metadata we should have access to when writing YARA rules.

Structure

The load commands all follow the same structure; the only difference is the initial load command value (cmd), which tells us which platform the data is for: macOS, iPhoneOS, watchOS, or tvOS. We’ll open up the loader.h file that defines the Mach-O header structures for reference.

/*
 * The version_min_command contains the min OS version on which this 
 * binary was built to run.
 */
struct version_min_command {
    uint32_t	cmd;		/* LC_VERSION_MIN_MACOSX or
				   LC_VERSION_MIN_IPHONEOS  */
    uint32_t	cmdsize;	/* sizeof(struct min_version_command) */
    uint32_t	version;	/* X.Y.Z is encoded in nibbles xxxx.yy.zz */
    uint32_t	sdk;		/* X.Y.Z is encoded in nibbles xxxx.yy.zz */
};
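
As a rough sketch of how the cmd field distinguishes the four platforms, the snippet below maps the load command values from loader.h to a platform name. The constant values are the ones defined in Apple's header; the helper name platform_for_cmd is hypothetical and is not part of YARA-X.

```rust
// Load command values as defined in Apple's <mach-o/loader.h>.
const LC_VERSION_MIN_MACOSX: u32 = 0x24;
const LC_VERSION_MIN_IPHONEOS: u32 = 0x25;
const LC_VERSION_MIN_TVOS: u32 = 0x2F;
const LC_VERSION_MIN_WATCHOS: u32 = 0x30;

/// Hypothetical helper (not from YARA-X): returns the platform that a
/// version_min_command targets, or None if `cmd` is not one of the four
/// LC_VERSION_MIN_* values.
fn platform_for_cmd(cmd: u32) -> Option<&'static str> {
    match cmd {
        LC_VERSION_MIN_MACOSX => Some("macOS"),
        LC_VERSION_MIN_IPHONEOS => Some("iPhoneOS"),
        LC_VERSION_MIN_TVOS => Some("tvOS"),
        LC_VERSION_MIN_WATCHOS => Some("watchOS"),
        _ => None,
    }
}
```

Returning an Option keeps unrelated load commands from being mislabeled; a real parser would likely dispatch on cmd before it ever reads the rest of the struct.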

Parsing

We can see from the struct above that there’s not too much here to parse. There are two unsigned 32-bit integers we’re concerned with: version and sdk. This is simple enough to parse with nom:

  let (input, version) = le_u32(input)?;
  let (input, sdk) = le_u32(input)?;
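
For readers without nom at hand, here is a minimal standard-library sketch of the same two reads. It assumes a little-endian binary and that the slice starts right after the cmd and cmdsize fields; read_le_u32 and parse_version_and_sdk are hypothetical names used for illustration, not the YARA-X implementation.

```rust
use std::convert::TryInto;

/// Hypothetical helper: reads one little-endian u32 from the front of
/// `input` and returns the remaining bytes with it, or None if fewer
/// than four bytes are available.
fn read_le_u32(input: &[u8]) -> Option<(&[u8], u32)> {
    let bytes: [u8; 4] = input.get(..4)?.try_into().ok()?;
    Some((&input[4..], u32::from_le_bytes(bytes)))
}

/// Hypothetical helper: reads the raw `version` and `sdk` fields, in
/// that order, mirroring the two le_u32 calls above.
fn parse_version_and_sdk(input: &[u8]) -> Option<(u32, u32)> {
    let (input, version) = read_le_u32(input)?;
    let (_rest, sdk) = read_le_u32(input)?;
    Some((version, sdk))
}
```

Threading the remaining slice through each call is the same pattern nom uses, which is why the two versions read so similarly.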

The fun part is turning those integers into the appropriate version strings. As the header comment puts it, X.Y.Z is encoded in nibbles as xxxx.yy.zz, where a nibble is 4 bits, or half a byte. That means the major version takes the upper 16 bits, and the minor and patch versions take 8 bits each. As such, we can pull the version numbers out of the unsigned 32-bit integer and translate them into a string like so:

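fn convert_to_version_string(decimal_number: u32) -> String {
    // Upper 16 bits hold the major version (the "xxxx" nibbles).
    let major = decimal_number >> 16;
    // Next 8 bits hold the minor version (the "yy" nibbles).
    let minor = (decimal_number >> 8) & 0xFF;
    // Lowest 8 bits hold the patch version (the "zz" nibbles).
    let patch = decimal_number & 0xFF;
    format!("{}.{}.{}", major, minor, patch)
}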
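
To make the bit layout concrete, here is a small worked sketch. It repeats the conversion function so the block stands alone, and adds encode_version, a hypothetical inverse used only for illustration. With this layout, 10.9.0 is stored as 0x000A0900 (0x000A = 10, 0x09 = 9, 0x00 = 0).

```rust
/// Same conversion as above: major in the upper 16 bits, then minor
/// and patch in 8 bits each.
fn convert_to_version_string(decimal_number: u32) -> String {
    let major = decimal_number >> 16;
    let minor = (decimal_number >> 8) & 0xFF;
    let patch = decimal_number & 0xFF;
    format!("{}.{}.{}", major, minor, patch)
}

/// Hypothetical inverse, for illustration only: packs X.Y.Z into the
/// xxxx.yy.zz nibble layout described in loader.h.
fn encode_version(major: u16, minor: u8, patch: u8) -> u32 {
    (u32::from(major) << 16) | (u32::from(minor) << 8) | u32::from(patch)
}
```

For example, convert_to_version_string(0x000A0900) yields "10.9.0", and encode_version(10, 10, 0) yields 0x000A0A00, which converts back to "10.10.0".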

Final Result

Looking at the goldenfiles used for testing, we can see the following is now being parsed:

  min_version_mac_os:
      cmd: 36
      cmdsize: 16
      version: "10.9.0"
      sdk: "10.10.0"

As with all previous days, I have created a PR (#56) for YARA-X :)

Written by Jacob Latonis
