Memory Bugs in Multiple Linux Kernel Drivers using DebugFS

Multiple drivers in the Linux Kernel using the DebugFS follow a bug pattern enabling memory disclosure and corruption of heap allocated memory. Generally, kernel memory is copied back into user space using an incorrect length field leading to memory disclosure.

There's no need for concern. These bugs are almost completely mitigated by the kernel configuration CONFIG_HARDENED_USERCOPY, which is set as a default on many Linux distributions. In the code review below, only 1 bug out of 4 is not mitigated by this configuration. Additionally, these bugs have low impact because DebugFS is normally only enabled by kernel developers. It is unlikely for production kernels to enable this feature.

Introduction

The Linux kernel is the heart of the operating system and controls things from device interfacing to process scheduling. DebugFS is a filesystem used internally by kernel developers to provide additional debugging information undesirable in production.

From Wikipedia,


debugfs is a special file system available in the Linux kernel since version 2.6.10-rc3.[1] It was written by Greg Kroah-Hartman.[2]debugfs is a simple-to-use RAM-based file system specially designed for debugging purposes. It exists as a simple way for kernel developers to make information available to user space.[3] Unlike /proc, which is only meant for information about a process, or sysfs, which has strict one-value-per-file rules, debugfs has no rules at all. Developers can put any information they want there.[4]

In the Linux kernel, debugfs is enabled using the DEBUGFS configuration variable.

Formatted String APIs in the Kernel

Before we look at the bugs, let's look at some kernel APIs which are used in triggering the bugs.
Most people are probably familiar with snprintf(). snprintf will write a formatted string and place it into a buffer. It performs bounds checking so will prevent simple buffer overflows.
snprintf — Format a string and place it in a buffer

int snprintf (char * buf,
size_t size,
const char * fmt,
...);

The return value is the number of characters which would be generated for the given input, excluding the trailing null, as per ISO C99. If the return is greater than or equal to size, the resulting string is truncated.
The return value is the number of bytes it would have written if bounds checking was ignored. The return value allows you to determine the number of characters needed in the buffer, so the correct size buffer may be allocated before trying to fully construct the output.
In the Linux kernel, an additional API is provided which slightly modifies the semantics of snprintf. In this case, the return value is altered.

scnprintf — Format a string and place it in a buffer

int scnprintf (char * buf,
size_t size,
const char * fmt,
...);

The return value is the number of characters written into buf not including the trailing '\0'. If size is == 0 the function returns 0. 

This is interesting. And if people aren't aware of the differences between scnprintf and snprintf, there is an opportunity for vulnerabilities.

Userspace Copies

Memory gets copied back and forth between userspace and kernelspace. This is a mechanism that could allow either memory corruption if too much memory was copied into kernelspace, or memory disclosure if too much memory was copied into userspace.

One API of interest is simple_copy_from_buffer.
ssize_t simple_read_from_buffer(void __user * to, size_t count, loff_t * ppos, const void * from, size_t available) 
copy data from the buffer to user space

The simple_read_from_buffer() function reads up to count bytes from the buffer from at offset ppos into the user space address starting at to.
On success, the number of bytes read is returned and the offset ppos is advanced by this number, or negative value is returned on error.

Hardened Usercopies

simple_read_from_buffer uses the copy_to_user API internally. Furthmore, copy_to_user is normally protected by the HARDENED_USERCOPY mitigation.

From https://lwn.net/Articles/727322/
Much of the heap memory used within the kernel is obtained from the slab allocator. The hardened usercopy patch set, merged for the 4.8 kernel, attempts to limit the impact of erroneous copy operations by ensuring that no single operation can cross the boundary between one slab-allocated object and the next. But the kernel gets a lot of large memory objects from the slab allocator, and it is often not necessary to copy the entire object between the kernel and user space. In cases where only part of an object needs to be copied, it would be useful to prevent a rogue copy operation from copying to or from parts of the structure that do not need to be exposed in this way. 
Ultimately, this means that if simple_copy_to_user copies heap allocated memory, then this mitigation will prevent a memory disclosure.

DebugFS Drivers Done the Right Way

Here is a kernel driver writing data to a buffer that will be copied back into userspace. Note that it uses scnprintf.
        buf = kmalloc(buf_size, GFP_KERNEL);

        if (!buf)
                return -ENOMEM;

        off = 0;
        off += scnprintf(buf + off, buf_size - off,
                        "Mp2 Device Information:\n");

        off += scnprintf(buf + off, buf_size - off,
                        "========================\n");
        off += scnprintf(buf + off, buf_size - off,
                        "\tMP2 C2P Message Register Dump:\n\n");
        u.v32 = readl(privdata->mmio + AMD_C2P_MSG0);
        off += scnprintf(buf + off, buf_size - off,
                        "AMD_C2P_MSG0 -\t\t\t%#06x\n", u.v32);

...

        u.v32 = readl(privdata->mmio + AMD_P2C_MSG_INTSTS);
        off += scnprintf(buf + off, buf_size - off,
                        "AMD_P2C_MSG_INTSTS -\t\t%#06x\n", u.v32);

        ret = simple_read_from_buffer(ubuf, count, offp, buf, off);
        kfree(buf);
        return ret;
}

There are a couple of interesting points here. Let's outline the algorithm.
  1. Allocate a kernel buffer
  2. Write formatted content to the buffer.
    1. Ensure the buffer does not overflow by using the appropriate size value in scnprintf
    2. Track how much data is actually written by summing the return values of scnprintf
  3. Copy buffer back to userspace
The code above requires scnprintf to work.

DebugFS Almost Done the Right Way

Let's look at a partially correct driver using snprintf.
        struct eg20t_port *priv = file->private_data;
        char *buf;
        u32 len = 0;
        ssize_t ret;
        unsigned char lcr;

        buf = kzalloc(PCH_REGS_BUFSIZE, GFP_KERNEL);
        if (!buf)
                return 0;

        len += snprintf(buf + len, PCH_REGS_BUFSIZE - len,
                        "PCH EG20T port[%d] regs:\n", priv->port.line);

        len += snprintf(buf + len, PCH_REGS_BUFSIZE - len,
                        "=================================\n");
        len += snprintf(buf + len, PCH_REGS_BUFSIZE - len,
                        "IER: \t0x%02x\n", ioread8(priv->membase + UART_IER));

...

        if (len > PCH_REGS_BUFSIZE)
                len = PCH_REGS_BUFSIZE;

        ret =  simple_read_from_buffer(user_buf, count, ppos, buf, len);
        kfree(buf);
        return ret;
}

We can see that an extra length check must be made if snprintf is used. If this check was not in place, then it would be possible to leak memory back into user space (providing debugfs was enabled and hardened usercopy was not).

First Bug

Lets look at a code snippet from the snprintf code I showed earlier.

        len += snprintf(buf + len, PCH_REGS_BUFSIZE - len,
                        "=================================\n");

What happens when len is greater than PCH_REGS_BUFSIZE? This could occur if we ended up writing a larger than expected size to the buffer. This is seemingly possible, otherwise, why do we need all the extra code for bounds checking? The code tries to mitigate against larger than expected results by using bounds checking in snprintf.

Lets assume len can be greater than PCH_REGS_BUFSIZE.

snprintf will protect the driver right?

When len exceeds the expected bounds, PCH_REGS_BUFSIZE - len becomes negative. And this value is passed to snprintf. That's not good.

int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
{
        unsigned long long num;
        char *str, *end;
        struct printf_spec spec = {0};

        /* Reject out-of-range values early.  Large positive sizes are
           used for unknown buffer sizes. */
        if (WARN_ON_ONCE(size > INT_MAX))
                return 0;

So size is passed as a negative as I discussed earlier. This  makes size > INT_MAX. We don't get memory corruption, but there will be a warning.

So this almost correct driver will warn, but there won't be any information disclosure or memory corruption.

Memory Corruption (1)

Here is a driver in drivers/misc/lkdtm_core.c

static ssize_t lkdtm_debugfs_read(struct file *f, char __user *user_buf, size_t count, loff_t *off) { char *buf; int i, n, out; buf = (char *)__get_free_page(GFP_KERNEL); if (buf == NULL) return -ENOMEM; n = snprintf(buf, PAGE_SIZE, "Available crash types:\n"); for (i = 0; i < ARRAY_SIZE(crashtypes); i++) { n += snprintf(buf + n, PAGE_SIZE - n, "%s\n", crashtypes[i].name); } buf[n] = '\0'; out = simple_read_from_buffer(user_buf, count, off, buf, n); free_page((unsigned long) buf); return out; }

We know that snprintf does bounds checking and if the bounds are exceeded (n is greater PAGE_SIZE), we'll get a negative size for snprintf. The kernel will issue a warning because it is greater than INT_MAX.

The allocated buffer is manually NUL terminated using buf[n]. If n is greater than PAGE_SIZE, which is a possibility, then we'll have a NUL byte written out of bounds. We achieved memory corruption that will work even without HARDENED_USERCOPY.

Memory Disclosure (2)

Here is a vulnerable driver in drivers/spi/spi-dw.c

#ifdef CONFIG_DEBUG_FS
#define SPI_REGS_BUFSIZE        1024
static ssize_t dw_spi_show_regs(struct file *file, char __user *user_buf,
                size_t count, loff_t *ppos)
{
        struct dw_spi *dws = file->private_data;
        char *buf;
        u32 len = 0;
        ssize_t ret;

        buf = kzalloc(SPI_REGS_BUFSIZE, GFP_KERNEL);
        if (!buf)
                return 0;

        len += snprintf(buf + len, SPI_REGS_BUFSIZE - len,
                        "%s registers:\n", dev_name(&dws->master->dev));
        len += snprintf(buf + len, SPI_REGS_BUFSIZE - len,
                        "=================================\n");
        len += snprintf(buf + len, SPI_REGS_BUFSIZE - len,
...
        len += snprintf(buf + len, SPI_REGS_BUFSIZE - len,
                        "DMARDLR: \t0x%08x\n", dw_readl(dws, DW_SPI_DMARDLR));
        len += snprintf(buf + len, SPI_REGS_BUFSIZE - len,
                        "=================================\n");

        ret = simple_read_from_buffer(user_buf, count, ppos, buf, len);
        kfree(buf);
        return ret;

In this case, len might be large and lead to a userspace copy larger than the kernel buffer.

Memory Disclosure (3)


Let's look at drivers/char/virtio_console.c

static ssize_t debugfs_read(struct file *filp, char __user *ubuf,
                            size_t count, loff_t *offp)
{
        struct port *port;
        char *buf;
        ssize_t ret, out_offset, out_count;

        out_count = 1024;
        buf = kmalloc(out_count, GFP_KERNEL);
        if (!buf)
                return -ENOMEM;

        port = filp->private_data;
        out_offset = 0;
        out_offset += snprintf(buf + out_offset, out_count,
                               "name: %s\n", port->name ? port->name : "");
        out_offset += snprintf(buf + out_offset, out_count - out_offset,
                               "guest_connected: %d\n", port->guest_connected);
...
        out_offset += snprintf(buf + out_offset, out_count - out_offset,
                               "is_console: %s\n",
                               is_console_port(port) ? "yes" : "no");
        out_offset += snprintf(buf + out_offset, out_count - out_offset,
                               "console_vtermno: %u\n", port->cons.vtermno);

        ret = simple_read_from_buffer(ubuf, count, offp, buf, out_offset);
        kfree(buf);
        return ret;
}

Another memory disclosure following the same bug pattern.

Potential Memory Disclosures (4)


Let's look at drivers/net/wireless/marvell/libertas/debugfs.c

static ssize_t lbs_dev_info(struct file *file, char __user *userbuf,
                                  size_t count, loff_t *ppos)
{
        struct lbs_private *priv = file->private_data;
        size_t pos = 0;
        unsigned long addr = get_zeroed_page(GFP_KERNEL);
        char *buf = (char *)addr;
        ssize_t res;
        if (!buf)
                return -ENOMEM;

        pos += snprintf(buf+pos, len-pos, "state = %s\n",
                                szStates[priv->connect_status]);
        pos += snprintf(buf+pos, len-pos, "region_code = %02x\n",
                                (u32) priv->regioncode);

        res = simple_read_from_buffer(userbuf, count, ppos, buf, pos);

        free_page(addr);
        return res;
}

This might not be possible to trigger if the string length is short, but it still presents a buggy pattern . This driver has a number of cases using this pattern however. For example:

        for (i = 0; i < num_of_items; i++) {
                if (d[i].size == 1)
                        val = *((u8 *) d[i].addr);
                else if (d[i].size == 2)
                        val = *((u16 *) d[i].addr);
                else if (d[i].size == 4)
                        val = *((u32 *) d[i].addr);
                else if (d[i].size == 8)
                        val = *((u64 *) d[i].addr);

                pos += sprintf(p + pos, "%s=%d\n", d[i].name, val);
        }

        res = simple_read_from_buffer(userbuf, count, ppos, p, pos);

        free_page(addr);
        return res;
}

Other Bugs

There a few other places following similar patterns, but they are unlikely to be triggered.

Recommendations


The solution is quite easy. Replace snprintf with scnprintf.

Conclusion


This is an interesting look into bug patterns. There is almost certainly going to be a history of cut and pasting insecure code between drivers that leads to problems. It also highlights the problems of duplicate code leading to bugs and vulnerabilities - when code is manually duplicated instead of being centralised, bugs appear.

Overall, these bugs are extremely low impact. Highlighting them is more an exercise into the code review process and the nature of bugs. Still, they are security bugs and should be fixed.

Comments

Popular posts from this blog

NetBSD kernel wscons IOCTL vulnerable bug class

Linux Kernel Infoleaks