Implementing a std::function<>-like wrapper in C++, part 2: generalizing the return type and arguments

Introduction

Previously, we’ve seen a way to implement our own version of std::move_only_function<>. The implementation we ended up with is as follows:

struct MyFunctionInterface
{
    virtual ~MyFunctionInterface() = default;
    virtual int operator()(int, int) = 0;
};

template<typename Fn>
class MyFunctionImpl : public MyFunctionInterface
{
    Fn fn;

public:
    MyFunctionImpl(Fn fn) : fn(std::move(fn)) { }

    int operator()(int a, int b) override
    {
        return fn(a, b);
    }
};

class MyFunction
{
    std::unique_ptr<MyFunctionInterface> fn;

public:
    template<typename Func>
    MyFunction(Func function)
    {
        fn = std::make_unique<MyFunctionImpl<Func>>(
                          std::move(function));
    }

    int operator()(int a, int b)
    {
        return fn->operator()(a, b);
    }
};

This works for any movable function, for example:

auto i = std::make_unique<int>(3);
MyFunction fn([c = std::move(i)] (int a, int b) {
                                  return a + b + *c; });
//             ^ i is moved from the caller into the lambda as c
return fn(1, 2); // 6

Unfortunately, it is restricted to the function prototype int fn(int, int) – during this post, we’ll make the function able to return any type and take any arguments.

Getting started

It is easiest to work top-down for situations like these: what is the most specific place where the prototype is hardcoded? It is in MyFunctionInterface, so let’s make that generic:

template<typename ReturnType, typename... Args>
struct MyFunctionInterface
{
    virtual ReturnType operator()(Args...) = 0;
};

By changing all uses of MyFunctionInterface to MyFunctionInterface<int, int, int> the code compiles again.

Changing the type-erased implementation

We want to change our type-erased function implementation MyFunctionImpl: instead of just taking the stored function Fn type, we need both the return type and the argument types. Right now, we have:

template<typename Fn>
class MyFunctionImpl : public MyFunctionInterface<int, int, int>
{
    Fn fn;

public:
    MyFunctionImpl(Fn fn) : fn(std::move(fn)) { }

    int operator()(int a, int b) override
    {
        return fn(a, b);
    }
};

Our first step is to add return type and argument types as template arguments:

template<typename Fn, typename ReturnType, typename... Args>
class MyFunctionImpl : public MyFunctionInterface<ReturnType, Args...>
{
    /* ... */
};

This won’t yet compile: we need to alter the instantiation of MyFunctionImpl as well:

    template<typename Func>
    MyFunction(Func function)
    {
        fn = std::make_unique<MyFunctionImpl<Func, int, int, int>>(
                        std::move(function));
    }

As a last step, we update the call operator as follows:

    ReturnType operator()(Args... args) override
    {
        return fn(args...);
    }

This will also work if ReturnType is void: you can return from a function returning void as long as the expression result evaluates to void. This is very convenient for generic code.

Moving towards the function wrapper

We are almost there yet: we need to remove the hardcoded types from MyFunction. Currently, we have the following:

class MyFunction
{
    std::unique_ptr<MyFunctionInterface<int, int, int>> fn;

public:
    template<typename Func>
    MyFunction(Func function)
    {
        fn = std::make_unique<MyFunctionImpl<Func, int, int, int>>(
                 std::move(function));
    }

    int operator()(int a, int b)
    {
        return fn->operator()(a, b);
    }
};

We start by adding template arguments and updating the type-erased function’s type:

template<typename ReturnType, typename... Args>
class MyFunction
{
    std::unique_ptr<MyFunctionInterface<ReturnType, Args...>> fn;

public:
    template<typename Func>
    MyFunction(Func function) 
    {
        fn = std::make_unique<MyFunctionImpl<Func, ReturnType, Args...>>(
                    std::move(function));
    }
    /* ... */
};

As MyFunction is now a template and the compiler can’t inference the return type, we update our example:

MyFunction<int, int, int> fn([c = std::move(i)] (int a, int b) {
    return a + b + *c;
});

We need to update our calling operator to benefit from the generic types:

ReturnType operator()(Args... args)
{
    return fn->operator()(args...);
}

We’ve removed all notion of int in our MyFunction! However, our type is MyFunction<int, int, int> instead of std::move_only_function<int(int, int)>. We aren’t done yet!

Making the type more readable and accessible

std::function<int(int, int)> – and the other function wrappers – use int(int, int) in a clever and readable way to separate the return type from the argument types. In order to achieve this, we use a template specialization:

template<typename ReturnType, typename... Args>
class MyFunction;

template<typename ReturnType, typename... Args>
class MyFunction<ReturnType(Args...)>
{
    /* ...
};

The first declaration introduces MyFunction as a template with the named types, and then we specialize it using for instantiations matching ReturnType(Args..). This allows us to change our invocation as follows:

auto i = std::make_unique<int>(3);
MyFunction<int(int, int)> fn([c = std::move(i)] (int a, int b) {
    return a + b + *c;
});
return fn(1, 2); // 6

And we are done! Well… almost!

Handling movable only parameters

The following example doesn’t yet work with our implementation:

MyFunction<int(std::unique_ptr<int>)> pfn([] (auto v) {
    return *v + *v;
});
return pfn(std::make_unique<int>(9));

This doesn’t compile because our current implementation will attempt to copy the std::unique_ptr<int>. We need to make sure it passes the Args exactly as they are supplied when forwarding to the call operator. This is where std::forward<> comes in.

We need to update all calls as follows:

template<typename ReturnType, typename... Args>
class MyFunction<ReturnType(Args...)>
{
    /* ... */

    ReturnType operator()(Args... args) {
        return fn->operator()(std::forward<Args>(args)...);
    }
};

template<typename Fn, typename ReturnType, typename... Args>
class MyFunctionImpl : public MyFunctionInterface<ReturnType, Args...>
{
    /* ... */

    ReturnType operator()(Args... args) override
    {
        return fn(std::forward<Args>(args)...);
    }
};

This ensures the arguments are passed exactly as supplied, removing the potential copies. This allows non-copyable types to be used as arguments. Amazing!

Next time, in the final part of the series, let’s take a look how we can avoid the dynamic memory allocation and instead use a fixed-sized buffer to store the function.

Posted in Programming | Tagged | Leave a comment

Implementing a std::function<>-like wrapper in C++, part 1: type erasing

Introduction

Recently, a chat with a friend peeked my interested: how would you store an arbitrary function and call it, similar to std::function<>. It turned out a plain C function pointer would suffice for this specific use-case, but I got triggered: let’s implement a generic, move-only function wrapper in C++!

What about std::function<> ?

First of all, we won’t be re-implementing std::function<> as it requires the function to be copyable, which is not always desirable: a copy could be very expensive, or even not possible at all. This is why C++23 introduced std::move_only_function<>. Recently, a proposal for std::copyable_function<> was voted into C++26, which seeks to replace std::function<> with a more explicit and more const-correct alternative. Another proposal seeks to deprecate std::function<>, and use std::move_only_function<> and std::copyable_function<> instead.

As for our implementation, it will be closer to std::move_only_function<> than std::function<>. We’ll approach the problem piece-by-piece, and fill out details as they become important. Let’s go!

Tell me about std::move_only_function<>

The idea is that you can store any movable function, for example:

  std::move_only_function<int(int, int)> fn =
      [](int a, int b) { return a + b; };
  return fn(1, 2); // 3

Because it only needs to be movable, you can have the lambda take ownership of move-only objects:

  auto i = std::make_unique<int>(3);

  std::move_only_function<int(int, int)> fn =
      [i = std::move(i)](int a, int b) { return a + b + *i; };
  //   ^ 'i' will now be owned by the lambda stored in 'fn'
  return fn(1, 2); // 6

Getting started

Let’s start by solving a simpler problem: let’s assume whatever function we are going to manage takes two int parameters, and has a int return type (we’ll lift this limitation later – this is mainly to avoid more templates). Because every lambda has a unique type, we need to use a catch-all type Func. We start with the following:

class MyFunction
{
    template<typename Func>
    MyFunction(Func function)
    {
         /* ... store function somewhere ... */
    }

    int operator()(int a, int b)
    {
        /* ... call stored function somehow ... */
    }
};

Ideally, we’d store Func directly inside our MyFunction class, as a member. But this cannot work: Func can be any type (remember that every lambda has its own unique type), and since the size may differ depending on how much is captured, we cannot know the size up front.

Taking a step back, rather than storing the type of Func itself, we rather want to store any function which fulfills the int fn(int, int) prototype. In other words, the actual type of Func is not really relevant for us: as long as whatever we store provides an int operator()(int, int) function, it will suffice. This is the interface to which whatever implements the storage must provide.

Erasing the type

Let’s define the interface that we want to call:

struct MyFunctionInterface
{
    virtual ~MyFunctionInterface() = default;
    virtual int operator()(int, int) = 0;
};

Using this interface, we can create an implementation, which works for any invokable function:

template<typename Fn>
class MyFunctionImpl : public MyFunctionInterface
{
    Fn fn;

public:
    MyFunctionImpl(Fn fn) : fn(std::move(fn)) { }

    int operator()(int a, int b) override
    {
        return fn(a, b);
    }
};

MyFunctionImpl implements MyFunctionInterface, and will just call operator() in fn. This allows us to do the following:

MyFunctionInterface* f =
    new MyFunctionImpl([] (int a, int b) { return a + b; });
return f->operator()(1, 2); // 3

Note that the actual lambda type does not appear anywhere anymore! This is known as type erasure and is the key to implement generic callable functions such as std::move_only_function<> and others.

Back to our function wrapper

Given that we now have a way to store any function that fulfills the int fn(int, int) prototype, we can implement our function wrapper as follows:

class MyFunction
{
    std::unique_ptr<MyFunctionInterface> fn;

public:
    template<typename Func>
    MyFunction(Func function) {
        fn = std::make_unique<MyFunctionImpl<Func>>(
                 std::move(function)
             );
    }

    int operator()(int a, int b) {
        return fn->operator()(a, b);
    }
};

We get some function of type Func in our constructor, and use MyFunctionImpl<Func> to store this implementation, which will type-erase it.

Let’s try whether this works!

    MyFunction fn([] (int a, int b) { return a + b; });
    return fn(1, 2); // 3

Awesome!

Next time, let’s see how we can make it accept any function prototype, instead of only int fn(int, int).

Posted in Programming | Tagged | Leave a comment

Hacking into a Foscam FI9853EP camera, part 2

In my last post, I examined how to get U-Boot access and obtain the flash data from a Foscam FI9853EP camera. Whereas this data is very useful for offline analysis, I wanted to get a root shell so I can poke around in the system and run commands manually.

Extracting the root filesystem

Last time, we extracted a file called kernel.mtd, which contains an U-Boot uImage with the kernel itself. There will likely be an initrd (a root filesystem) embedded within this kernel, as it contains necessary files such as /sbin/init, the password file, startup scripts etc. This will be our initial target, but how do we get to it?

An easy way is to use binwalk, an utility intended to analyze, extract and reverse engineer firmware files. If we just run this on kernel.mtd, we learn the following:

$ binwalk kernel.mtd 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             uImage header, header size: 64 bytes, header CRC: 0xAF395D8A, created: 2017-11-07 02:41:11, image size: 2976536 bytes, Data Address: 0x80008000, Entry Point: 0x80008000, data CRC: 0x36BDB2E2, OS: Linux, CPU: ARM, image type: OS Kernel Image, compression type: none, image name: "Linux-3.4.35"
64            0x40            Linux kernel ARM boot executable zImage (little-endian)
15124         0x3B14          xz compressed data
15356         0x3BFC          xz compressed data

The final line sure looks interesting: there’s a large blob om xz-compressed data there (we know it is large as it is the last item found). Fortunately for us, binwalk can copy all content it finds into files (--extract) and this can be done recursively (--matryoshka), which is convenient as we don’t want it to stop after extracting just the compressed data. Let’s give that a go:

$ binwalk --extract --matryoshka kernel.mtd 

[...snip...]

Scan Time:     2023-03-12 15:10:21
Target File:   /tmp/z/_kernel.mtd.extracted/3BFC
MD5 Checksum:  ac84d449376b514839602c61d6ca2ada
Signatures:    411

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
[...snip...]
3846256       0x3AB070        Linux kernel version 3.4.35
[...snip...]

Scan Time:     2023-03-12 15:10:23
Target File:   /tmp/z/_kernel.mtd.extracted/_3BFC.extracted/4FD8E8
MD5 Checksum:  09ee7e6258feff31f07bfb9538b78478
Signatures:    411

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             ASCII cpio archive (SVR4 with no CRC), file name: "tmp", file name length: "0x00000004", file size: "0x00000000"
116           0x74            ASCII cpio archive (SVR4 with no CRC), file name: "root", file name length: "0x00000005", file size: "0x00000000"
[...snip...]

Binwalk has found a cpio archive (which is what initrd uses) and extracted it for us to _kernel.mtd.extracted/_3BFC.extracted/_4FD8E8.extracted/cpio-root. We can start looking around.

Yay, root! (or not)

Let’s poke around and see if there is anything interesting:

$ cd _kernel.mtd.extracted/_3BFC.extracted/_4FD8E8.extracted/cpio-root
$ cat etc/passwd
root:$1$uYfJBoag$N8ofdlVBVcfzOY7utbTfo0:0:0::/root:/bin/sh

Cool, we have the root password hash! Unfortunately, I’m not patient enough to feed this to a password cracker such as Hashcat.

We could try to update the passwd file and repack the initrd, but this is very tedious: we’d have to re-create the U-Boot uImage, with the proper headers, compression and all that. Perhaps there’s an easier way?

The boot process

The first userland program any UNIX-y system runs is called init (which is often implemented by systemd these days). Let’s see what our camera uses:

$ ls -ld sbin/init
lrwxrwxrwx 1 rink rink 14 Mar 12 15:10 sbin/init -> ../bin/busybox

Busybox is very common on embedded devices and this camera is no exception. Googling around for a bit learns us that Busybox’s init uses /etc/inittab, so let’s see what that holds (comments stripped)

::sysinit:/etc/init.d/rcS



::respawn:/sbin/getty -L ttyS000 115200 vt100 -n root -I "Auto login as root ..."

::restart:/sbin/init

::ctrlaltdel:/sbin/reboot
::shutdown:/bin/umount -a -r
::shutdown:/sbin/swapoff -a

I’m not sure about the auto-login bit, as it certainly greets me with a login prompt and not a root shell. However, we do learn that it runs /etc/init.d/rcS on startup, so let’s look at that:

#!/bin/sh
/bin/mount -a

[ ...snip ascii art... ]

for initscript in /etc/init.d/S[0-9][0-9]*
do
	if [ -x $initscript ] ;
	then
		echo "[RCS]: $initscript"
		$initscript
	fi
done

OK, so it runs /etc/init.d/S<number>*. What do we have there?

$ ls -l etc/init.d
-rwxr-xr-x 1 rink rink 380 Mar 12 15:10 rcS
-rwxr--r-- 1 rink rink 788 Mar 12 15:10 S00devs
-rwxr--r-- 1 rink rink 111 Mar 12 15:10 S01udev
-rwxr-xr-x 1 rink rink 828 Mar 12 15:10 S80network
-rwxr--r-- 1 rink rink 320 Mar 12 15:10 S90hibernate
-rwxr--r-- 1 rink rink 232 Mar 12 15:10 S90init

Listing all of these would be boring, so let’s summarize what happens:

S00devs creates several devices in /dev, and mounts several filesystems:

  • /dev/mtdblock2 -> /mnt/app (squashfs)
  • /dev/mtdblock3 -> /mnt/app_ext (jffs2)
  • /dev/mtdblock4 -> /mnt/para (jffs2)

It will then add some symlinks:

  • /mnt/app/mtd/boot.sh -> /mnt/mtd/boot.sh
  • [more which are not interesting for now]

S01udev, S80network and S90hiberate are not interesting for our purposes so I’ll skip them.

S90init is the final script. It will try to run either /mnt/para/Debug/boot.sh or /mnt/mtd/boot.sh (note that this is the symlink created by S00devs). These scripts reside on /mnt/app (squashfs) or /mnt/para (jffs2), which means they aren’t contained in the initrd! We ought to go with patching one of them and see what happens.

So we have a choice: we can create /Debug/boot.sh on the app_ext2 jffs2 filesystem, or patch /app/mtd/boot.sh on the app squashfs image. Whatever we put in those scripts will be run as root, and thus we can do whatever we want and hopefully end up with our own root user!

Deciding which one to use

I decided to patch the app squashfs – mainly because the para jffs2 contains settings and I wasn’t sure when it was updated (squashfs images are always read-only so no problem there). Initially, I went with the para jffs2 image but it turned out my flash dump was old and vital configuration files were missing, causing all kinds of problems. Maybe I’ll try again later.

At any rate, once we’ve patched the appropriate image, we can use U-Boot to write it into the flash and we’re done.

Extracting and patching the app squashfs

First step is to extract the image. It turns out there is a dedicated utility available, unsquashfs, which does exactly as it says on the tin:

$ unsquashfs app.mtd
Parallel unsquashfs: Using 24 processors
426 inodes (531 blocks) to write

[========================================================================================|] 957/957 100%

created 422 files
created 36 directories
created 4 symlinks
created 0 devices
created 0 fifos
created 0 sockets
created 0 hardlinks
$ ls -l squashfs-root/mtd
drwxrwxrwx 8 rink rink  4096 Nov 16  2017 app
-rwxr-xr-x 1 rink rink 31253 Nov 16  2017 boot.sh
-rwxr-xr-x 1 rink rink    29 Nov 16  2017 pkg_info
-rwxr-xr-x 1 rink rink    34 Nov 16  2017 resolv.conf

That seemed to have worked! If we look at squashfs-root/mtd/boot.sh, it’s a shell script of almost 900 lines that does a lot of stuff. It ends with the following steps:

hostname IPCamera
#if [ -f ${APP_DIR}/zbin.tar.xz ];then
#   echo "The first time boot OK"
#else
    #rtctool -rtctosys
    MsgServer &
    update_ver
    killall udevd # free 464k mem memory
    sleep 5
    /usr/bin/watchdog &
#fi

I found it interesting to see they killed udev after startup to reclaim extra memory. I added the following lines after hostname IPCamera:

echo "hello world"
echo "qq::0:0::/root:/bin/sh" >> /etc/passwd
/usr/sbin/telnetd &

This should show me that the modifications are active (by showing hello world) it would add a root user with the username qq and it would launch telnetd (which I also noticed on the image) so I can log in remotely.

Repacking the app squashfs

With the changes in place, we need to create our own squashfs image to flash into the device. However, we do need to make it compatible with the camera: it is pretty old and may not support everything squashfs supports these days.

Thankfully, unsquashfs has a -stat option to show this information:

$ unsquashfs -stat app.mtd 
Found a valid SQUASHFS 4:0 superblock on app.mtd.
Creation or last append time Thu Nov 16 08:51:11 2017
Filesystem size 11298912 bytes (11034.09 Kbytes / 10.78 Mbytes)
Compression xz
Block size 131072
Filesystem is exportable via NFS
Inodes are uncompressed
Data is compressed
Uids/Gids (Id table) are uncompressed
Fragments are not stored
Xattrs are compressed
Duplicates are removed
Number of fragments 0
Number of inodes 462
Number of ids 2
Number of xattr ids 0

A tool called mksquashfs can pack a directory into a squashfs image. Using the information above and verifying using unsquashfs -stat, I managed to find the correct flags after a few tries:

$ mksquashfs squashfs-root app-patched.mtd -comp xz -noI -no-fragments

All that remains is to flash this into the camera!

Flashing the image

Since we have U-Boot access, the easiest way is to TFTP our patched image into memory and flash it. The previous blog post detailed the offsets of the flash, but I’ll repeat them here:

Creating 5 MTD partitions on "hi_sfc":
0x000000000000-0x000000080000 : "boot"
0x000000080000-0x000000380000 : "kernel"
0x000000380000-0x000000e80000 : "app"
0x000000e80000-0x000000f80000 : "app_ext"
0x000000f80000-0x000001000000 : "para"

This means our new app-hacked.mtd must be written to offset 0xe80000 and is 0xe80000 - 0x380000 = 0xb00000 bytes in length. First, we need to download the new image to memory:

hisilicon # setenv serverip 192.168.1.1
hisilicon # setenv ipaddr 192.168.1.2
hisilicon # tftp 82000000 app-hacked.mtd

Then we have to initialize the flash, erase the correct bytes and write our new content:

hisilicon # sf probe 0
hisilicon # sf erase 380000 b00000
hisilicon # sf write 82000000 380000 b00000

Now on to the scary part: trying it out!

Trying it out

Resetting the camera, keeping our fingers crossed and watching the boot logs is promising:

init phy power successful!
load hi_mipi driver successful!
==== Your input Sensor type is ov9732 ====
com v200 ptz
ptz_init start ptz_state[1],HorPos[0],VerPos[0]
setup ptz gpio success
setup zoom gpio success111
ptz_init end ptz_state[1],HorPos[-1],VerPos[-1],flag[0]
Hisilicon Watchdog Timer: 0.01 initialized. default_margin=60 sec (nowayout= 0, nodeamon= 0)
mkdir: can't create directory '/usr/local': File exists
hello world
 #### APP_VER_0:  2 == 2 ? #### 
 #### not need to rewrite appVer_item0,old: 2 #### 
 #### APP_VER_2:  2 == 2 ? #### 
 #### not need to rewrite appVer_item2,old: 2 #### 
 #### APP_VER_3: 30 == 30 ? #### 

We see the hello world message, so we know our changed script was run. Let’s try to log in as qq:

Awesome! I like the greeting message in there, I’m sure they expected someone to tinker with this device at some point 🙂

Next step: figuring out the update process

Having root access should make it a lot easier to poke around in the device and see what we can find. In the next step, I’ll dig into the firmware update process. Stay tuned!

Posted in Reverse engineering | Tagged | Leave a comment

Hacking into a Foscam FI9853EP camera, part 1

I have a Foscam FI9853EP, which was introduced in 2014 and has long since been obsoleted. One of the things that stands out, is that all firmware is encrypted:

$ file FosIPC_B_patch_ver2.x.2.31_3_20190718_150013.bin 
FosIPC_B_patch_ver2.x.2.31_3_20190718_150013.bin: openssl enc'd data with salted password

I was curious: what could be in the firmware that needs this amount of protection? Usually, firmwares are plaintext but signed so that you can’t installed any unauthorized software in your device.

My end goal is to be able to decrypt the most recent (2019) firmware. But I’m not there yet.

Finding an UART

There is a lot of information available on how to locate a serial port on the device so you can interact with it. You tend to be looking for a 3-pin or 4-pin header. On the Foscam, there is a 4-pin header close to the ESD warning symbol on the board.

The picture below shows where the header is located. I soldered the second wire using the other side of the board:

Counting pins from left to right, we have:

  • Pin 1 = TX
  • Pin 2 = GND
  • Pin 3 = RX
  • Pin 4 = ?

Using a FTDI I had lying around and the standard 115200/8N1 settings, we have a console!

U-Boot, or not yet

It turns out the system uses standard U-Boot. However, there is a catch:

System startup

U-Boot 2010.06 (Nov 16 2017 - 11:43:05)

Check Flash Memory Controller v100 ... Found
SPI Nor(cs 0) ID: 0xc2 0x20 0x18
##uboot 020sdk patch!##
Block:64KB Chip:16MB Name:"MX25L128XX"
SPI Nor total size: 16MB
MMC:   
EMMC/MMC/SD controller initialization.
Card did not respond to voltage select!
No EMMC/MMC/SD device found !
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Hisilicon ETH net controler
init rmii phy led completed
Hit any key to stop autoboot:  1 
1st input Passwd:

There is a password protection mechanism in place: whenever you interrupt the boot process, it asks for a password that we don’t have.

Where is the firmware stored and how do we get to it?

If you take a closer look at the picture above, you’ll spot a 8-pin SOIC chip to the right. This chip is a MX25L12835F, which is a 128MBit SPI NOR flash memory IC. This is also what the boot log mentions right before we stop it. This chip will likely contain the firmware.

It would be possible to desolder the IC, dump its content and solder it back. This felt like a lot of work, and I like to avoid soldering if I can help it, since we already have a device that can interact with the IC: the device itself. But to dump the contents of the IC, we either need access from Linux or from U-Boot. The latter is easiest, but we still have that pesky password prompt…

Getting U-Boot shell, the easy way

I had an idea: if I could somehow stop the boot process, it will likely just error out and give me a prompt. To this end, I tried shorting pins 5 (SI) and 6 (SCLK) of the flash memory chip with a screwdriver while I powered the device on. Sometimes I was too quick and U-Boot wouldn’t start at all, and usually I was too late and the kernel was already loaded. However, after a few tries, success!

Wrong Image Format for bootm command
ERROR: can't get kernel image!
hisilicon # 

Since I stopped shorting the pins, I can interact with the flash IC now and inspect its contents:

hisilicon # sf probe 0
16384 KiB hi_fmc at 0:0 is now current device
hisilicon # sf read
Usage: sf read addr offset len
hisilicon # sf read 0x82000000 0 0x10000
hisilicon # md.b 0x82000000 0x10000
82000000: 17 04 00 ea 14 f0 9f e5 14 f0 9f e5 14 f0 9f e5    ................
...

Dumping the flash

The approach above with the md.b command works, but it’s very slow. I really did not fancy dumping 16MB this way. There is an ethernet port on the device. Let’s see if we can use it:

hisilicon # setenv serverip 192.168.1.1
hisilicon # setenv ipaddr 192.168.1.2
hisilicon # ping 192.168.1.1
MAC:   00-00-23-34-45-66
eth0 : phy status change : LINK=DOWN : DUPLEX=FULL : SPEED=100M
eth0 : phy status change : LINK=UP : DUPLEX=FULL : SPEED=100M
host 192.168.1.1 is alive

Hooray! The U-Boot is also built with TFTP upload and download capabilities. I decided to dump the flash in 4 parts, as I was getting TFTP errors if I tried to do it in one go and I could easily retransfer the problematic pieces this way.

hisilicon # sf read 82000000 0 400000
hisilicon # tftp 82000000 flash0 400000
hisilicon # sf read 82000000 400000 400000
hisilicon # tftp 82000000 flash400000 400000
hisilicon # sf read 82000000 800000 400000
hisilicon # tftp 82000000 flash800000 400000
hisilicon # sf read 82000000 c00000 400000
hisilicon # tftp 82000000 flashc00000 400000

I can re-combine the flash by just concatenating the flash files together:

$ cat flash0 flash400000 flash800000 flashc00000 > flash.bin

Making sense of the flash

I wanted to split the 16MB flash image up, as it isn’t used as a single blob. Allowing the device to boot and looking at the bootlog revealed the flash layout:

Creating 5 MTD partitions on "hi_sfc":
0x000000000000-0x000000080000 : "boot"
0x000000080000-0x000000380000 : "kernel"
0x000000380000-0x000000e80000 : "app"
0x000000e80000-0x000000f80000 : "app_ext"
0x000000f80000-0x000001000000 : "para"

So we can split it using the following commands:

$ dd if=flash.bin of=boot.mtd ibs=1 skip=0 count=524288
$ dd if=flash.bin of=kernel.mtd ibs=1 skip=524288 count=3145728
$ dd if=flash.bin of=app.mtd ibs=1 skip=3670016 count=11534336
$ dd if=flash.bin of=app_ext.mtd ibs=1 skip=15204352 count=1048576
$ dd if=flash.bin of=para.mtd ibs=1 skip=16252928 count=524288

Let’s see if the resulting images make any sense:

$ file *.mtd
app_ext.mtd: Linux jffs2 filesystem data little endian
app.mtd:     Squashfs filesystem, little endian, version 4.0, xz compressed, 11298912 bytes, 462 inodes, blocksize: 131072 bytes, created: Thu Nov 16 07:51:11 2017
boot.mtd:    data
kernel.mtd:  u-boot legacy uImage, Linux-3.4.35, Linux/ARM, OS Kernel Image (Not compressed), 2976536 bytes, Tue Nov  7 02:41:11 2017, Load Address: 0X80008000, Entry Point: 0X80008000, Header CRC: 0XAF395D8A, Data CRC: 0X36BDB2E2
para.mtd: Linux jffs2 filesystem data little endian

Learning the U-Boot password

The boot.mtd is just a binary data blog, which means it likely contains the bootloader as-is. I decided to look at the strings within to see if there is anything worthwhile, and indeed:

$ strings boot.mtd
[...]
Hit any key to stop autoboot: %2d 
%dst input Passwd:
ipc.fos~
%2d 
[...]

After the 1st input Passwd: we find the string ipc.fos~ ! I decided to try it, and this is indeed the U-Boot password. I no longer have to mess with the pins to get a prompt, and I have the current (factory) firmware of the device.

Next step: extracting data from the images

This post is already long enough, so I’ll leave it at that. However, I plan om extract the images to figure out how the device boots, updates and hopefully learn a thing or two on how to proceed. Stay tuned!

Posted in Reverse engineering | Tagged | Leave a comment

Behind the magic of magic_enum

Recently, a coworker pointed me towards a C++17 library to convert enumeration values to strings and vice versa. The library called magic_enum (https://github.com/Neargye/magic_enum) and it indeed feels like magic. I was immediately curious: how did they pull this off?

The code I am presenting here can be seen as a massively down-scaled version of magic_enum: the approach taken is exactly the same. I’ve mainly tried to simplify for purposes of readability and understandability. Thus, consider the code presented here licensed using MIT, the same license which governs magic_enum – just will less features and more comments. You can find my re-implementation at https://gist.github.com/zhmu/9ac375706ffbafa5d24693f8475abd79.

I would like to thank Daniil Goncharov for giving me something to study and blowing my mind!

Goal

We want the following to code to yield the value as commented:

enum class Colour {
    Red = 27,
    Green,
    Blue = 40,
};

const std::string_view s = enum_to_string(Colour::Green);
puts(s.data()); // Green

This code already has some requirements subtly embedded:

  • enum_to_string() must return owned memory, as it returns a std::string_view
  • The returned string_view must be zero-terminated for puts() to work

This means we need to provide static storage for the string values. We could do without, but it turns out the code optimizes better and is fairly easy to implement so I’ve decided to roll with it.

First steps: how would you convert a given enum value to a string?

In classic C, you have __FILE__ and __LINE__, which expand to character arrays containing the filename and current line number of the input you are compiling. As time went on, C99 introduced __func__, which contains the function you are in. As this is C, function overloading does not exist and no parameters are necessary.

C++ does not provide a standard way to determine the current function name with all parameters involved, but GCC/Clang provides __PRETTY_FUNCTION__ and Microsoft Visual Studio provides __FUNCSIG__. Let’s show some examples:

void fun1() { puts(__PRETTY_FUNCTION__); }
int fun2(int v) { puts(__PRETTY_FUNCTION__); return 0; }

fun1();
// void fun1()
fun2(123);
// int fun2(int)

This also works with template functions:

template<typename T> void fun3() { puts(__PRETTY_FUNCTION__); }

fun3<int>();
// void fun3() [with T = int]
fun3<Colour>();
// void fun3() [with T = Colour]

The clever insight here is that you can create a template that takes an enumeration and a compile-time known value of that enumeration:

template<typename E, E v> void fun4() { puts(__PRETTY_FUNCTION__); }

fun4<Colour, Colour::Red>();
// void fun4() [with E = Colour; E v = Colour::Red]

Hence, if we’d perform a compile-time instantiation of all possible enumeration values, we have their corresponding character representation!

How do you know which enumeration values are possible?

First, let’s see what happens if you try to use fun4() on a value that isn’t in the enumeration:

fun4<Colour, static_cast<Colour>(1)>();
// void fun4() [with E = Colour; E v = (Color)1 ]

This value is distinct in a sense that it doesn’t correspond with the Color::... output we saw previously. This means we can determine whether any integer corresponds to a given enumeration value or not!

So how we do obtain a list of all valid enumeration values? We try them all, one by one. If you look at the magic_enum documentation, it stands out that the enum value must reside within a certain range, which by default is MAGIC_ENUM_RANGE_MIN .. MAGIC_ENUM_RANGE_MAX. Only this range will be evaluated by default.

In magic_enum, the function n() is responsible for the conversion. It uses pretty_name() to normalize the resulting compiler-specific result of __PRETTY_FUNCTION__ / __FUNCSIG__ to the enumeration value name, or an empty string_view in case the value does not exist within the enumeration.

A incomplete implementation of pretty_name() / n() function, which works well enough for enumeration values that do not contain digits, is as follows:

constexpr auto is_pretty(char ch) noexcept
{
    return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

constexpr auto pretty_name(std::string_view sv) noexcept
{
    for(std::size_t n = sv.size() - 1; n > 0; --n) {
        if (!is_pretty(sv[n])) {
            sv.remove_prefix(n + 1);
            break;
        }
    }
    return sv;
}

template<typename E, E V>
constexpr auto n() noexcept
{
#if defined(__GNUC__) || defined(__clang__)
    return pretty_name({ __PRETTY_FUNCTION__, sizeof(__PRETTY_FUNCTION__) - 2 });
#elif defined(_MSC_VER)
    return pretty_name({ __FUNCSIG__, sizeof(__FUNCSIG__) - 17 });
#endif
}

The implementation in magic_enum supports more cases and will reject invalid names. For our purposes, the implementation above suffices.

Collecting the possible enumeration values

Now to we have a way to query enumeration values one by one, we need a way to assemble them together into an array. We want the following code to compile and yield the output in comments:

auto f = values<Colour>();
for(const auto& i: f)
    std::cout << static_cast<int>(i) << '\n'
// 27
// 28
// 40

First things first, we need a function to determine whether a given enumeration value is valid. Remember the magic n() function above? All we need to check is whether the value returned is non-empty:

template<typename E, E V>
constexpr auto is_valid()
{
    constexpr E v = static_cast<E>(V);
    return !n<E, V>().empty();
}

We also need a way to convert an integer v to the v-th enumeration value. Within magic_enum, this function is called ualue(), so I’ll stick with that. Thankfully, this is pretty straight forward:

template<typename E>
constexpr auto ualue(std::size_t v)
{
    return static_cast<E>(ENUM_MIN_VALUE + v);
}

Now things get more tricky. We want to create a compile-list sequence of all integers it needs to try (this is simply the list of integers between MAGIC_ENUM_RANGE_MIN and MAGIC_ENUM_RANGE_MAX. We know there are ENUM_MAX_VALUE - ENUM_MIN_VALUE + 1 such values. Using std::make_index_sequence<>, we can generate a compile-time list containing all std::size_t values from 0 up to and including ENUM_MAX_VALUE - ENUM_MIN_VALUE.

This allows us to write the values<E>() function, which generates the appropriate list and feeds it into a helper function:

template<typename E>
constexpr auto values() noexcept
{
    constexpr auto enum_size = ENUM_MAX_VALUE - ENUM_MIN_VALUE + 1;
    return values<E>(std::make_index_sequence<enum_size>({}));
}

Let’s give the function prototype of the values() helper function:

template<typename E, std::size_t... I>
constexpr auto values(std::index_sequence<I...>) noexcept;

Here, I is a sequence of std:size_t‘s. There can be zero up to a lot of them, and each one corresponds with an enumeration value we want to try to see if it is valid. C++17 gives us fold expressions, which allow us to conveniently express this using the is_valid() and ualue() functions:

constexpr bool valid[sizeof...(I)] = { is_valid<E, ualue<E>(I)>()... };

Our Colours enumeration has only three values, which means most of valid will be false. We want to condense it to a std::array<E, N> which contains only the values present in the enumeration. The first step is to introduce a helper function to count the number of items in valid that are true:

template<std::size_t N>
constexpr auto count_values(const bool (&valid)[N])
{
    // Cannot use std::count_if(), it is not constexpr pre C++20
    std::size_t count = 0;
    for(std::size_t n = 0; n < N; ++n)
        if (valid[n]) ++count;
    return count;
}

Which makes the remainder of the values() function pretty straight-forward:

constexpr auto num_valid = count_values(valid);
static_assert(num_valid > 0, "no support for empty enums");

std::array<E, num_valid> values = {};
for(std::size_t offset = 0, n = 0; n < num_valid; ++offset) {
    if (valid[offset]) {
        values[n] = ualue<E>(offset);
        ++n;
    }
}

return values;

We’ll be needing values<E>() quite a bit. We’ll introduce values_v as a shorthand:

template<typename E>
inline constexpr auto values_v = values<E>();

We can test our implementation by iterating over all values_v<>. It indeed yields all value enumeration values of Colour, exactly as intended:

auto f = values_v<Colour>;
for(const auto& i: f)
     std::cout << static_cast<int>(i) << '\n'
// 27
// 28
// 40

From values to entries

Our intention is to implement an entries_v<E> variable, which yields a std::array<std::pair<E, string_view>, ...>: that is, for a given enum E, it yields an array containing tuples with each valid value within that enum and its corresponding string representation.

First, we’ll introduce another helper, enum_value_v<V, E> to obtain the string representation of a enumeration value V within enum E. For now, this will simply be a call to the n() function:

template<typename E, E V>
constexpr auto enum_name()
{
    constexpr auto name = n<E, V>();
    return name;
}

template<typename E, E V>
inline constexpr auto enum_name_v = enum_name<E, V>();

We can then introduce the entries() function as follows: given a sequence of all possible enumeration values I, we yield a std::array<> with the enumeration value and the corresponding enumerating name string. Fold expressions make this convenient:

template<typename E, std::size_t... I>
constexpr auto entries(std::index_sequence<I...>) noexcept
{
    return std::array<std::pair<E, std::string_view>, sizeof...(I)>{
        {{ values_v<E>[I], enum_name_v<E, values_v<E>[I]>}...}
    };
}

This allows us to finally express entries_v<E> as follows:

template<typename E>
inline constexpr auto entries_v =
    entries<E>(std::make_index_sequence< values_v<E>.size()>());

We can prove that this works by printing the contents of entries_v<Colour>:

auto q = entries_v<Colour>;
for(const auto [a, b]: q)
    std::cout << static_cast<int>(a) << ' ' << b << '\n'
// 27 Red
// 28 Green
// 40 Blue

Awesome!

Implementing enum_to_string()

Given all the work we did previously, enum_to_string() is trivial:

template<typename E>
constexpr std::string_view enum_to_string(E value)
{
    for (const auto& [ key, name ]: entries_v<E>) {
        if (value == key) return name;
    }
    return {};
}

Of course, C++20’s constexpr algorithms would make this a lot nicer.

Static string storage and zero-termination

This implementation has a major drawback and an annoying bug.

As for the drawback, the complete strings as output by __PRETTY_FUNCTION__ will be stored in the executable (this can be seen using tools like Compiler Explorer and examining the assembly output). We have:

.LC0:
        .string "constexpr auto n() [with E = Colour; E V = Colour::Green]"
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:.LC0+51
        call    puts
        xor     eax, eax
        add     rsp, 8
        ret

This prints Green] – which illustrates the bug: if we use std::cout instead, we’d get the correct string. This is because we do not properly insert a \0-character – hence, we’ll just end up with whatever was in memory.

We can introduce a helper class to store a compile-time zero-terminated string. magic_enum calls static_string, which I’ll also do. The idea is to provide N + 1 bytes of storage, which are initially zero. In the constructor, we’ll copy the bytes of the string_view:


template<std::size_t N>
struct static_string
{
    constexpr static_string(std::string_view sv) noexcept
    {
        // std::copy() is not constexpr in C++17, hence...
        for(std::size_t n = 0; n < N; ++n)
            content[n] = sv[n];
    }
    constexpr operator std::string_view() const noexcept
    { return { content.data(), N }; }
    
private:
    std::array<char, N + 1> content{};
};

All that remains is to use static_string<> in enum_name(), as follows:

template<typename E, E V>
constexpr auto enum_name()
{
    constexpr auto name = n<E, V>();
    return static_string<name.size()>(name);
}

Which yields the desired assembly output:

main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:enum_name_v<Colour, (Colour)28>
        call    puts
        xor     eax, eax
        add     rsp, 8
        ret

enum_name_v<Colour, (Colour)28>:
        .byte   71       // G
        .byte   114      // r
        .byte   101      // e
        .byte   101      // e
        .byte   110      // n
        .zero   1

Closing words

I’m extremely grateful for Daniil Goncharov’s work. Initially, I would use the tried and proven C macro-style approach, which feels clumsy given the state C++ is in these days. Studying his approach has taught me some wonderful things, and I hope I’ve share some of them by writing this post.

Posted in Programming | Tagged | Leave a comment

On NetWare 3.x password hashing

Way back when, I was involved in trying to obtain passwords for a Novell NetWare 3.12 server. I won’t go into details here, suffice to say that the topic has always interested me – sufficiently to return to it 30-ish years later and write down an algorithm description.

The research presented here is based on my independent disassembly of the embedded server.nlm in NetWare 3.12. I have compared these findings to ncpfs 2.2.0, mars_nwe-0.99.p121 and itsme’s sources, and believe the information as presented here to be correct. However, if you spot mistakes or omissions please let me know!

Note that this only covers password storage and client login. Password changes are not covered in this entry as this is a separate mechanism.

You can find code implementing these algorithms, including test vectors, at https://github.com/zhmu/nw-tools/blob/main/nw-crypt.c.

Introduction

NetWare 3.x stores passwords in the bindery, which is an internal database containing objects (users, groups) with properties which can have a value. Every user should have a password property, which can only be accessed by the server itself. This password property contains a password hash as well as the password length.

Passwords are not stored in plain-text, but in a hashed form. The hash function is used both for password hashing as well as login credentials.

NetWare hash function

The NetWare-specific hash function is one-way and cannot be inverted. It uses the following variables:

  • Salt (salt): 4 bytes, commonly referred to as a key
  • Input (in): 32 bytes
  • Output (out): 16 bytes
  • Temporary storage (temp): 32 bytes
  • Carry value (last): 1 byte, initialized to 0
  • Shuffle keys (keys_table): 32 bytes, provided as an appendix
  • Hash nibble table (nibble_table): 256 bytes, provided as an appendix

The operation of nw_hash(salt, in) is the following:

  1. Apply salt to input data, by Exclusing Or-ing the input with the corresponding salt value, for n in [ 0 .. 31 ]:
    temp[n] = in[n] ExclusiveOr salt[n % 4]
  2. Two rounds
    Execute step 3 twice
  3. Shuffle the content on a per-byte basis, for index in [ 0 .. 31 ]:
    v = temp[(last + index) % 16] - key_table[index]
    new_value = (temp[index] + last) ExclusiveOr v
    last = last + new_value
    temp[index] = new_value
  4. Combine the 32-byte temporary data to a 16-byte result, for index in [ 0 .. 15 ]:
    out[index] = nibble_table[temp[index * 2 + 0] BitwiseOr
    (nibble_table[temp[index * 2 + 1] ShiftLeft 4)
  5. Return out

The addition and subtraction used in step 3 must be performed with wrapping, i.e. 255 + 1 = 0 and 0 - 1 = 255.

Input stretching

The hashing algorithm just described always takes a 32-byte input. This means the password must be transformed in such a way so that it occupies exactly 32 bytes. I’ve dubbed this step input stretching as it is similar in spirit to key stretching. We have the following values:

  • Input (input): arbitrary bytes with a known length
  • Input length (in_len): Length of input, in bytes
  • Output (out): 32 bytes
  • Input position (input_pos): Offset of input byte to process

The operation of stretch_input(in, in_len) is as follows:

  1. Discard trailing zero-bytes from the input
    while (in_len > 0 && in[in_len - 1] == 0)
    in_len = in_len - 1;
  2. Set the initial output to zero, for n in [ 0.. 31 ]:
    output[n] = 0
  3. While the input exceeds 32 bytes, Exclusive Or the input bytes to the output.
    For n in [ 0 .. 31 ]:
    out[n] = out[n] ExclusiveOr input[0]
    discard input[0] from the input stream
    Decrease the input length by 32
  4. Process input data, for n in [ 0.. 31 ], input_pos = 0
    if position input_pos is present in the input:
    out[n] = out[n] ExclusiveOr in[input_pos]
    input_pos = input_pos + 1

    else
    out[n] = out[n] ExclusiveOr key_table[input_pos]
    input_pos = 0
  5. Return out

Calculating a password hash for a given bindery object

We have the following values:

  • Object ID (object_id), a 32-bit unsigned value uniquely identifying the bindery object
  • Password (password): The ASCII password to use, in UPPERCASE.
  • Password length (password_length): Length of password, in bytes
  • Output (output): 16 bytes
  • Salt (salt): 4-byte input salt, generated from object_id

The operation of hash_object_password(object_id, pwd) is as follows:

  1. Generate the salt by encoding the object id in a 4-byte array, big-endian
  2. Invoke stretch_input(password, password_length, expanded_in)
  3. Invoke nw_hash(salt, expanded_in, out)
  4. Return out

Server password storage

The result of hash_object_password() is used for this step. The resulting 16 bytes will be stored in the password property of the object. Byte 17 will be the length of the password. All other bytes are set to zero.

Client password encryption

Password hashes are not sent in plaintext over the network. There is a 8-byte login key involved, which is sent from the server to the client and used by both of them.

We have the following values:

  • Login key (key): a 8 byte pseudo-random value, generated by the server
  • Input hash (input), 16 bytes input value
  • Output (output): 8 bytes
  • Temporary expanded input storage (expanded_input): 32 bytes
  • Temporary data (temp): 32 bytes

The operation of nw_encrypt(key, input, output) is as follows:

  1. Expand input to 32 bytes
    Invoke stretch_input(input, 16), store result in expanded_input.
  2. Calculate NetWare hash for each subkey:
    temp[0..15] = nw_hash(key[0..3], expanded_input)
    temp[16..31] = nw_hash(key[4..7], expanded_input)
  3. Combine the 32-byte resulting values to a 8-byte value, for n in [ 0 .. 7 ]:
    out[n] = temp[n] ExclusiveOr temp[31 - n] ExclusiveOr temp[15 - n] ExclusiveOr temp[16 + n]
  4. Return out

Putting it all together

The image below illustrates how a plaintext password is set, and how the client login mechanism works. Colours are used to highlight values that are always identical, given a correct password. If the password mismatches, the red and blue values will differ.

Appendix: tables

nibble_table is used to combine two bytes. The byte values are looked up, and the first byte maps to the high nibble and the second byte maps to the low nibble (refer to step 4 of nw_hash):

const uint8_t nibble_table[256] = {
0x7, 0x8, 0x0, 0x8, 0x6, 0x4, 0xE, 0x4, 0x5, 0xC, 0x1, 0x7, 0xB, 0xF, 0xA, 0x8,
0xF, 0x8, 0xC, 0xC, 0x9, 0x4, 0x1, 0xE, 0x4, 0x6, 0x2, 0x4, 0x0, 0xA, 0xB, 0x9,
0x2, 0xF, 0xB, 0x1, 0xD, 0x2, 0x1, 0x9, 0x5, 0xE, 0x7, 0x0, 0x0, 0x2, 0x6, 0x6,
0x0, 0x7, 0x3, 0x8, 0x2, 0x9, 0x3, 0xF, 0x7, 0xF, 0xC, 0xF, 0x6, 0x4, 0xA, 0x0,
0x2, 0x3, 0xA, 0xB, 0xD, 0x8, 0x3, 0xA, 0x1, 0x7, 0xC, 0xF, 0x1, 0x8, 0x9, 0xD,
0x9, 0x1, 0x9, 0x4, 0xE, 0x4, 0xC, 0x5, 0x5, 0xC, 0x8, 0xB, 0x2, 0x3, 0x9, 0xE,
0x7, 0x7, 0x6, 0x9, 0xE, 0xF, 0xC, 0x8, 0xD, 0x1, 0xA, 0x6, 0xE, 0xD, 0x0, 0x7,
0x7, 0xA, 0x0, 0x1, 0xF, 0x5, 0x4, 0xB, 0x7, 0xB, 0xE, 0xC, 0x9, 0x5, 0xD, 0x1,
0xB, 0xD, 0x1, 0x3, 0x5, 0xD, 0xE, 0x6, 0x3, 0x0, 0xB, 0xB, 0xF, 0x3, 0x6, 0x4,
0x9, 0xD, 0xA, 0x3, 0x1, 0x4, 0x9, 0x4, 0x8, 0x3, 0xB, 0xE, 0x5, 0x0, 0x5, 0x2,
0xC, 0xB, 0xD, 0x5, 0xD, 0x5, 0xD, 0x2, 0xD, 0x9, 0xA, 0xC, 0xA, 0x0, 0xB, 0x3,
0x5, 0x3, 0x6, 0x9, 0x5, 0x1, 0xE, 0xE, 0x0, 0xE, 0x8, 0x2, 0xD, 0x2, 0x2, 0x0,
0x4, 0xF, 0x8, 0x5, 0x9, 0x6, 0x8, 0x6, 0xB, 0xA, 0xB, 0xF, 0x0, 0x7, 0x2, 0x8,
0xC, 0x7, 0x3, 0xA, 0x1, 0x4, 0x2, 0x5, 0xF, 0x7, 0xA, 0xC, 0xE, 0x5, 0x9, 0x3,
0xE, 0x7, 0x1, 0x2, 0xE, 0x1, 0xF, 0x4, 0xA, 0x6, 0xC, 0x6, 0xF, 0x4, 0x3, 0x0,
0xC, 0x0, 0x3, 0x6, 0xF, 0x8, 0x7, 0xB, 0x2, 0xD, 0xC, 0x6, 0xA, 0xA, 0x8, 0xD
};

key_table is used when transforming the input values (refer to step 3 of nw_hash), as well as additional input when stretching the input to 32 bytes (step 4 of stretch_input):

const uint8_t key_table[32] = {
0x48, 0x93, 0x46, 0x67, 0x98, 0x3D, 0xE6, 0x8D,
0xB7, 0x10, 0x7A, 0x26, 0x5A, 0xB9, 0xB1, 0x35,
0x6B, 0x0F, 0xD5, 0x70, 0xAE, 0xFB, 0xAD, 0x11,
0xF4, 0x47, 0xDC, 0xA7, 0xEC, 0xCF, 0x50, 0xC0
};

I haven’t found any specific patterns in nibble_table or key_table. They are hardcoded in both server.nlm and setpass.exe. However, perhaps there is a way to generate these tables by use of a function. If you manage to find it, let me know!

Posted in Reverse engineering | Tagged , | Leave a comment

Reverse engineering the NetWare 386 filesystem format

I decided to take a look into the NetWare 386 filesystem, which was used in NetWare 3.x and 4.x and perhaps later versions as well. This post serves to give a high-level background on the design and layout. Tools to analyze and extract content from such a filesystem can be found at https://github.com/zhmu/nwfs386.

If you have any corrections, additional information or questions, please reach out to me by email at rink@rink.nu. I’ll try to update this post as necessary. I also accept pull requests on Github!

Overview

Even though NWFS386 was introduced in the late 80-ies, it is more than a filesystem: it performs bad sector remapping, mirroring and divides the partition into volumes, which can span multiple partitions. This is indeed no ordinary filesystem. A brief overview:

  • A physical disk can have a single NetWare partition
  • A NetWare partition can be mirrored (similar to RAID-1) and performs its own bad sector remapping (redirection)
  • A NetWare volume contains files/directories and is allocated to one or more partitions

Hence, a volume can span multiple partitions, which may can be mirrored as desired for added reliability. This is indeed much more than just a filesystem!

In-depth feature list

  • NWFS386 uses 32-bit block numbers and 32-bit file/directory identifiers
  • Important NWFS386 partition structures are stored 4 times
  • A volume has its own block size, which can be 4KB, 8KB, 16KB, 32KB or 64KB in size.
  • Volumes can span multiple partitions, and volumes may be extended at any time
  • All meta-data is read in memory when mounting
  • Volume meta-data (FAT chain, directory contents) are stored twice
  • File data is stored a single time

Layout

NetWare 386 uses the MBR to locate partitions. There can be a single partition per device, and this partition must have an ID of 0x65.

  • The first 16KB of the NetWare partition is not in use
  • The next 16KB contains the hotfix/mirror area. These detail the number of redirection sectors and data sectors available, and how many sectors are allocated to remap bad sectors (the redirection area). This information is replicated 4 times.
  • The redirection area hasn’t been thoroughly explored; the idea is that bad sectors are remapped here by the NetWare OS.
  • The volume area contains which volumes are stored on this partition. There can be multiple volumes stored on a single partition, and each volume can span multiple partitions. This area contains 16KB of data which is replicated 4 times.
  • Finally, the data area stores the FAT chain and data blocks for the files/directories on the volume.

We’ll go in depth on these areas in the next sections.

Hotfix header

Being the first structure in the partition, it describes how many sectors of the partition are allocated for block data as well as redirection purposes.

FieldTypeDescription
idbyte[8]Identification string (HOTFIX00)
v_idu32Identification code
?u16[4]?
data_area_sectorsu32Number of sectors for data
redir_area_sectorsu32Number of sectors for redirection
?u32[8]?

Our main field of interest is redir_area_sectors which contains the number of sectors used for the hotfix/mirror area and the redirection area. In other words, it allows to calculate where the volume area starts within the partition.

The next sector contains the mirror header as illustrated below.

Mirror header

The second structure in the partition, this describes whether the partition is mirrored – and if so, with with other partitions.

FieldTypeDescription
idbyte[8]Identification string (MIRROR00)
create_timeu32Creation timestamp
?u32[5]?
v_id1u32Hotfix area ID #1
v_id2u32Hotfix area ID #2

v_idN contain the hotfix header v_id values of all partitions that appear within the mirror. Hence, the v_id of the partition’s hotfix header should always be appear in any of the v_idN fields.

Note: there are likely be more than 2 area ID’s allowed, but I haven’t looked into this due to my inability to add SCSI disks to my qemu instance (the dc390 driver seems the only supported one in NetWare 3.x and I can’t get it to work) as I ran out of IDE devices. Any help would be appreciated.

Volume area

The volume area is located directly past the redirection area; the offset is 16384 + redir_area_sectors * 512 bytes within the partition.

FieldTypeDescription
magicbyte[16]“NetWare Volumes” + \0
num_volumesu32Number of volume entries
?u32[3]?

This header is followed by num_volumes times a volume header, which is detailed below. Regardless of the number of volumes, 16KB will be used to store the volume information. Furthermore, the volume information is replicated 4 times, which means 64KB is used for the volume area.

Volume entry

FieldTypeDescription
name_lengthu8Volume name length, in bytes
namebyte[19]Volume name
?u16?
segment_numu16Volume segment number
first_sectoru32Always 160 (?)
num_sectorsu32Number of sectors in this segment
total_blocksu32Total number of blocks in this volume
first_segment_blocku32First data block this segment contains
?u32?
block_valueu32Used to calculate the block size
rootdir_block_nru32Block number containing the directory
rootdir_copy_block_nru32Block number containing the directory copy
?u32?

The volume’s block size is (256 / block_value) * 1024. The smallest block size is 1KB, whereas the largest would be 256KB. The installer does not allow you to create such large blocks, and I haven’t tried if they work at all.

If your volume spans multiple partitions, all partitions will have the volume listed in their volume area. However, the first segment will have first_segment_block = 0, whereas the second segment contains a non-zero value. Thus, all blocks within the segment are relative to first_segment_block, and given a block number you must determine which partition must be accessed.

FAT

The NetWare 386 filesystem was clearly inspired by the FAT file system, as it also uses a singly linked list to be able to find the next block of each file. Like FAT, this linked list is stored twice. As the entire FAT is read into memory at mount time and updated on disk as necessary, this is quite speedy.

The root directory is not fixed-length or fixed-size: the rootdir_block_nr and rootdir_copy_block_nr fields of the volume determine where the initial block of the root directory is – the FAT chain can then be used to determine the subsequent blocks as needed. This is similar to the approach taken in FAT32, except for the extra copy.

An important difference is that the root directory is the only directory stored in NWFS386. Every directory entry contains the ID of the directory in which the entry resides. There are a few magic values with their own respective entry content, such as volume information and additional trustee lists. The root directory uses ID 0.

Every FAT entry is completely incompatible with DOS FAT. Every entry is 128 bytes, which ensures it will never span across multiple sectors.

File entry

FieldTypeDescription
parent_dir_idu32Directory ID where the item resides
attru32Entry attributes (must not have directory bit set)
?byte[3]?
name_lenbyteFile name length, in bytes
namebyte[12]8.3 file name
create_timeu32File creation time
owner_idbu32Object ID of the current file owner
?u32[2]?
modify_timeu32File last modification time
modifier_idbu32Object ID of the last file modifier
lengthu32File length, in bytes
block_nru32First file block number
?u32?
trusteesTrustee[6]Trustees
?u32[2]?
delete_timeu32When was file deleted, zero if not deleted
delete_idbu32Object ID of who deleted the file
?u32[2]?
file_entryu32Unknown
?u32?

Directory entry

FieldTypeDescription
parent_dir_idu32Directory ID where the item resides
attru32Entry attributes (must have directory bit set)
?u8[3]?
name_lenu8File name length, in bytes
nameu8[12]8.3 file name
create_timeu32Directory creation time
owner_idbu32Directory owner object ID
?u32[2]?
modify_timeu32Directory last modification time
?u32?
trusteesTrustee[8]Trustees
?u16[2]?
inherited_rights_masku16Mask for inherited rights
subdir_indexu32Unknown
?u16[7]?
directory_idu32ID of this directory
?u16[2]?

Available entry

Available entries contain a parent_dir_id of 0xffff ffff. The other 124 bytes tend to be zeros.

Grant list

Whenever more trustees are added to a file/directory which cannot be stored in the corresponding FAT entry itself, a grant list will be added to the volume with the responding information. This hasn’t been decoded in too much detail.

FieldTypeDescription
parent_dir_idu320xffff fffe
?u32[5]?
trusteesTrustee[16]Trustees
?u32[2]?

Volume information

FieldTypeDescription
parent_dir_idu320xffff fffd
?u32[5]?
create_timeu32Volume creation time
owner_idu32Object ID of volume owner
?u32[2]?
modify_timeu32Last modification time
?u32?
trusteesTrustee[8]Trustees
?u32[8]?

Trustee structure

FieldTypeDescription
object_idbu32Object ID where the trustee applies to
rightsu16Bitmask containing trustee rights

The rights mask contains the following bits:

BitNetWare rightDescription
0RRead access
1WWrite acces
2Seems to be used internally?
3CAccess to create subentries
4EErase subentries
5AAccess control
6FFile scan
7MModify attributes
8SSupervisory (overrides all others)

Attribute bits

BitFILER flagDescription
0Ro (Rw if clear)Read-only
1HHidden
2SySystem
4Directory
5AArchive
7SShareable
12TTransactional
16PPurge
17RiRename Inhibit
18DiDelete Inhibit
19CiCopy Inhibit

Timestamp

A timestamp is a 32-bit value, which is to be interpreted as two 16-bit values: the high part is the date and the low part is the time.

PieceBitsDescription
Time11:15Hour
5:10Minute
0:4Second divided by 2
Date9:15Year minus 1980
5:8Month
0:4Day
Posted in Reverse engineering | Tagged , | Leave a comment

Creating a binutils/gcc toolchain for your OS

I wanted to update the binutils/gcc-based compiler toolchain of my Dogfood operating system. This entry describes some concepts and the changes that needed to be made in order to add this target to binutils 2.39 and gcc 12.2.0.

Configuration targets

Most of the GNU autotools work with configuration targets, which have the form <cpu>-<vendor>-<os>, where <os> can be <system> or <kernel>-<system>. For example, my Debian installation is x86_64-pc-linux-gnu, which decodes to a x86_64 cpu, vendor pc with a linux kernel and a gnu system.

These tend to be auto-detected, yet have to be specified if you want to do cross-compilation. Most often, --host=… is used to specify on which system the resulting binaries will run.

Since my OS is called Dogfood, runs on x86_64 and uses ELF binaries, the resulting configuration target will be x86_64-elf-dogfood.

More information regarding configuration targets can be found in the GNU autoconf manual.

config.sub

One of the programs involved is config.sub, which is a script to validate and canonicalize a configuration target. The idea is that you give it a configuration target (or system alias) and it will validate and yield a configuration target. Usually this means it will just output whatever the input was.

             | onefs* | tirtos* | phoenix* | fuchsia* | redox* | bme* \
             | midnightbsd* | amdhsa* | unleashed* | emscripten* | wasi* \
             | nsk* | powerunix* | genode* | zvmoe* | qnx* | emx* | zephyr* \
             | fiwix* | dogfood* )
                ;;
        # This one is extra strict with allowed versions
        sco3.2v2 | sco3.2v[4-9]* | sco5v6*)

Depending on the actual version of autotools used, config.sub may differ a bit from what is started above but the idea is that you add your OS without any specifics.

GNU binutils

Dogfood-specific patches to binutils

GNU binutils contains an assembler (as), a linker (ld) and various other tools. It needs to know which executable and object file format to use for the target platform. Like most *NIX-like systems and homebrew OS-es, I’ll go with ELF.

bfd/config.bfd

We need to tell Binutils to create ELF x86-64 executables by default. However, even though x86_64 is a 64-bit platform, it can run 32-bit code as well. Let’s enable 32-bit ELF support as it may be useful later on.

  x86_64-*-dogfood*)
    targ_defvec=x86_64_elf64_vec
    targ_selvecs="i386_elf32_vec"
    want64=true
    ;;

gas/configure.tgt

  i386-*-dogfood*)          fmt=elf ;;

ld/configure.tgt

x86_64-*-dogfood*)     targ_emul=elf_x86_64
                       ;;

GCC dependencies

Dogfood-specific patches to GMP, MPC and MPFR

GCC depends on three libraries: gmp (I used 6.2.1), mpc (I used 1.2.1) and mpfr (I used 4.1.0). mpc has a config.sub in the build-aux directory, whereas gmp supplies a wrapper which calls configfsf.sub. Finally, mpfr has config.sub in the root directory. All of these need to be patched to recognize the Dogfood OS.

GCC

Dogfood-specific patches to gcc

fixincludes/mkfixinc.sh

GCC contains a script to fix system header files. Our header files do not need such fixing, and to prevent strange issues we just add our OS to the list which does not need any work:

    *-musl* | *-dogfood*)

gcc/config.gcc

We need to specify which assembler, linker and some other flags are to be used for gcc. We’ll be using the standard GNU tools.

*-*-dogfood*)
  gas=yes
  gnu_ld=yes
  default_use_cxa_atexit=yes
  use_gcc_stdint=wrap
  ;;

Furthermore, for x86_64, we need to include some extra settings for the ELF support, i386 and x86_64 support and others.

x86_64-*-dogfood*)
    tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h newlib-stdint.h i386/i386elf.h i386/x86-64.h dogfood.h"
    ;;

This refers to a file dogfood.h which we need to create.

gcc/config/dogfood.h

This file contains all the definitions for our OS: which libc to link against by default, which startfile/endfile objects are to be used, which #define‘s are to be enabled, etc.

/* Useful if you wish to make target-specific GCC changes. */
#undef TARGET_DOGFOOD
#define TARGET_DOGFOOD 1

#undef LIB_SPEC
#define LIB_SPEC "-lc"

/* Files that are linked before user code.
   The %s tells GCC to look for these files in the library directory. */
#undef STARTFILE_SPEC
#define STARTFILE_SPEC "crt0.o%s crti.o%s crtbegin.o%s"

/* Files that are linked after user code. */
#undef ENDFILE_SPEC
#define ENDFILE_SPEC "crtend.o%s crtn.o%s"

/* Additional predefined macros. */
#undef TARGET_OS_CPP_BUILTINS
#define TARGET_OS_CPP_BUILTINS()       \
  do {                                 \
    builtin_define_std ("unix");       \
    builtin_define ("__dogfood__");    \
    builtin_assert ("system=unix");    \
    builtin_assert ("system=dogfood"); \
  } while(0);

libgcc

libgcc is a low-level library provided by GCC, for which “GCC generates calls to routines in this library automatically, whenever it needs to perform some operation that is too complicated to emit inline code for.”. More information in the gcc internals documentation

We need to instruct the build infrastructure how to build the library for our purposes, and which parts (crt etc) need to be build.

libgcc/config.host

For the Dogfood OS itself, we want libgcc to provide the crt files as they are fine and it avoids having to maintain them ourselves.

*-*-dogfood*)
  extra_parts="$extra_parts crti.o crtbegin.o crtend.o crtn.o"
  ;;

We also need some CPU-specific settings:

x86_64-*-dogfood*)
    tmake_file="$tmake_file i386/t-crtstuff t-crtstuff-pic t-libgcc-pic"
    ;;

libstdc++ (bundled with gcc)

libstdc++ is the C++ standard library implementation that is shipped with GCC.

libstdc++-v3/crossconfig.m4

Libstdc++ does not know anything about our OS, so we must add it to the configuration.

  *-dogfood*)
    GLIBCXX_CHECK_COMPILER_FEATURES
    GLIBCXX_CHECK_LINKER_FEATURES
    GLIBCXX_CHECK_MATH_SUPPORT
    GLIBCXX_CHECK_STDLIB_SUPPORT
    ;;

Afterwards, you must run autoreconf in the libstdc++-v3 directory to update the configure script with the changes made to the crossconfig.m4 file.

Building

All done now – we just need to build everything at this point. Both binutils and gcc use out of tree builds, which means the build directory resides outside of the source code. This ensures the source files will not be modified, which in turn helps to ensure builds are reproducible.

We start by building binutils:

$ mkdir binutils-build
$ cd binutils-build
$ ../binutils-2.39/configure --target=x86_64-elf-dogfood --disable-nls --disable-werror --prefix=/tmp/dogfood-toolchain --with-sysroot=/tmp/dogfood-sysroot
$ make
$ make install

And gcc:

$ mkdir gcc-build
$ cd gcc-build
$ ../gcc-12.2.0/configure --target=x86_64-elf-dogfood --disable-nls --with-newlib --without-headers --enable-languages='c,c++' --prefix=/tmp/dogfood-toolchain --disable-libstdcxx --disable-build-with-cxx --disable-libssp --disable-libquadmath --with-sysroot=/tmp/dogfood-sysroot --with-gxx-include-dir=/tmp/dogfood-sysroot/usr/include/c++/12.2.0

And we’re done! You should end up with something like the following:

$ x86_64-elf-dogfood-ld -v
GNU ld (GNU Binutils) 2.39
$ x86_64-elf-dogfood-gcc -v
Using built-in specs.
COLLECT_GCC=x86_64-elf-dogfood-gcc
COLLECT_LTO_WRAPPER=/opt/dogfood-toolchain/libexec/gcc/x86_64-elf-dogfood/12.2.0/lto-wrapper
Target: x86_64-elf-dogfood
Configured with: ../../userland/gcc-12.2.0/configure --target=x86_64-elf-dogfood --disable-nls --with-newlib --without-headers --en
able-languages=c,c++ --prefix=/opt/dogfood-toolchain --disable-libstdcxx --disable-build-with-cxx --disable-libssp --disable-libqua
dmath --with-sysroot=/tmp/dogfood-sysroot --with-gxx-include-dir=/tmp/dogfood-sysroot/usr/include/c++/12.2.0
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC)

Posted in Operating System Development, Toolchain | Tagged , | Leave a comment