Summary
This is the second part of the C++ memory corruption series*. In this post, we'll corrupt the std::vector class on Linux and see which exploitation primitives we can gain. We'll see that we can build arbitrary read and write primitives.
* https://blog.infosectcbr.com.au/2020/08/c-memory-corruption-part-1.html
Author: Dr Silvio Cesare
Introduction
C++ is a common target for memory corruption, yet far more has been written about exploiting C programs than C++ programs. C++ introduces classes, objects, and data structures that can all be used effectively to build exploitation primitives. In this post, we'll look at the std::vector class and see what specific primitives we can obtain.
Let's start by looking at libstdc++'s /usr/include/c++/bits/stl_vector.h:
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
_GLIBCXX_BEGIN_NAMESPACE_CONTAINER
/// See bits/stl_deque.h's _Deque_base for an explanation.
template<typename _Tp, typename _Alloc>
struct _Vector_base
{
typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
rebind<_Tp>::other _Tp_alloc_type;
typedef typename __gnu_cxx::__alloc_traits<_Tp_alloc_type>::pointer
pointer;
struct _Vector_impl_data
{
pointer _M_start;
pointer _M_finish;
pointer _M_end_of_storage;
We can see three members of importance: _M_start, _M_finish, and _M_end_of_storage. The first two are the ones we will corrupt, and they are reasonably self-explanatory: _M_start points to the beginning of the vector's contents and _M_finish points just past the last element. The third, _M_end_of_storage, points just past the end of the allocated buffer. To see this, we'll write a simple program and debug it.
#include <iostream>
#include <cstdio>
#include <vector>
#include <cstdlib>
static std::vector<long> v;
int
main()
{
v = std::vector<long>(10);
v[0] = 10;
v[1] = 20;
asm("int3");
exit(0);
}
Now let's run it inside a debugger (GDB with the GEF plugin).
10 {
11 v = std::vector<long>(10);
12 v[0] = 10;
13 v[1] = 20;
14 asm("int3");
→ 15 exit(0);
16 }
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "vector", stopped 0x5555555552e2 in main (), reason: SIGTRAP
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x5555555552e2 → main()
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
gef➤ x/gx &v
0x555555558040 <_ZL1v>: 0x000055555556aeb0
gef➤
0x555555558048 <_ZL1v+8>: 0x000055555556af00
gef➤
0x555555558050 <_ZL1v+16>: 0x000055555556af00
gef➤
0x555555558058: 0x0000000000000000
gef➤ x/gx 0x000055555556aeb0
0x55555556aeb0: 0x000000000000000a
gef➤
0x55555556aeb8: 0x0000000000000014
gef➤
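In the debugger, we can see the three members of the vector starting at 0x55...040. _M_start is 0x55...eb0, while _M_finish and _M_end_of_storage are both 0x55...f00, which is 0x50 bytes (ten 8-byte longs) past _M_start. We can also view the contents of the vector starting at 0x55...eb0. In hexadecimal, 10 and 20 are 0xa and 0x14 respectively.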
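The same three pointers can also be read from inside the program rather than from a debugger. The following is a minimal sketch, assuming the libstdc++ layout with the default allocator (three consecutive pointers and nothing else in the object). VectorRaw and peek are hypothetical names introduced here for illustration only.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper type: a plain copy of the three pointers that
// libstdc++ stores inside a std::vector<long> object.
struct VectorRaw {
    long *start;           // _M_start
    long *finish;          // _M_finish
    long *end_of_storage;  // _M_end_of_storage
};

// Copy the raw bytes of the vector object into a VectorRaw.
// Assumption: the object is exactly three pointers, which holds for
// libstdc++ with std::allocator but is not guaranteed by the standard.
inline VectorRaw peek(const std::vector<long> &v)
{
    static_assert(sizeof(std::vector<long>) == sizeof(VectorRaw),
                  "unexpected std::vector layout");
    VectorRaw raw;
    std::memcpy(&raw, static_cast<const void *>(&v), sizeof raw);
    // Sanity check that holds for an uncorrupted vector.
    assert(raw.start <= raw.finish && raw.finish <= raw.end_of_storage);
    return raw;
}
```

For a vector constructed with ten elements, peek(v).finish - peek(v).start should equal v.size(), matching the 0x50-byte gap seen in the debugger.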
At this point, we have enough information to test some exploitation techniques.
Technique 1
This technique is simple. We'll overwrite the vector's _M_start member with an arbitrary address. We'll then access the vector at index 0. Because operator[] simply dereferences _M_start plus the index without any bounds check, this is an arbitrary read/write primitive!
Here's the code:
#include <iostream>
#include <cstdio>
#include <vector>
#include <unistd.h>
/*
* std::vector consists of 3 pointers
* first pointer points to the backing contents
*/
static long x;
static std::vector<long> v;
int
main()
{
std::vector<long>::iterator it;
v = std::vector<long>(10);
v[0] = 10;
v[1] = 20;
*(long *)&v = (long)&x; // memory corruption
v[0] = 0x41414141;
printf("%lx\n", x);
_exit(0);
}
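The corruption above can also be wrapped in helpers to make the read and write primitives explicit. This is only a sketch under the same assumption as the program above (the vector object begins with its _M_start pointer, as in libstdc++). The names swap_start, arbitrary_read, and arbitrary_write are hypothetical. Unlike the program above, these helpers restore the original _M_start afterwards so that the vector's destructor still frees the real buffer.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Overwrite the first pointer-sized field of the vector object (_M_start in
// libstdc++) with `target` and return the previous value.
// Assumption: the vector object begins with its _M_start pointer.
inline long *swap_start(std::vector<long> &v, long *target)
{
    long *old;
    std::memcpy(&old, static_cast<const void *>(&v), sizeof old);
    std::memcpy(static_cast<void *>(&v), &target, sizeof target);
    return old;
}

// Read the long stored at `addr` through v[0].
inline long arbitrary_read(std::vector<long> &v, long *addr)
{
    assert(addr != nullptr);
    long *saved = swap_start(v, addr);  // simulated memory corruption
    long value = v[0];                  // dereferences the fake _M_start
    swap_start(v, saved);               // undo so the destructor is safe
    return value;
}

// Write `value` to `addr` through v[0].
inline void arbitrary_write(std::vector<long> &v, long *addr, long value)
{
    assert(addr != nullptr);
    long *saved = swap_start(v, addr);  // simulated memory corruption
    v[0] = value;                       // stores through the fake _M_start
    swap_start(v, saved);               // undo so the destructor is safe
}
```

Calling arbitrary_write(v, &x, 0x41414141) should leave 0x41414141 in x, mirroring the program above. Builds with libstdc++ assertions enabled may reject the access if the fake _M_start happens to equal _M_finish.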
Technique 2
This technique is similar to the first. We'll overwrite the vector's _M_start member with an arbitrary address and then use an iterator to access the vector.
#include <iostream>
#include <cstdio>
#include <vector>
#include <unistd.h>
/*
* std::vector consists of 3 pointers
* first pointer points to the backing contents
*/
static long x;
static std::vector<long> v;
int
main()
{
std::vector<long>::iterator it;
v = std::vector<long>(10);
v[0] = 10;
v[1] = 20;
*(long *)&v = (long)&x; // memory corruption
x = 0x42424242;
printf("%lx\n", v[0]);
it = v.begin();
*it = 0x41414141;
printf("%lx\n", x);
_exit(0);
}
Another variation of this technique is to build an arbitrary read using the front() method.
#include <iostream>
#include <cstdio>
#include <vector>
#include <unistd.h>
/*
* std::vector consists of 3 pointers
* first pointer points to the backing contents
*/
static long x;
static std::vector<long> v;
int
main()
{
long y;
std::vector<long>::iterator it;
v = std::vector<long>(10);
v[0] = 10;
v[1] = 20;
x = 0x41414141;
*(long *)((char *)&v + 0) = (long)&x; // memory corruption
y = v.front();
printf("%lx\n", y);
_exit(0);
}
Technique 3
Can we use the back() method for an arbitrary read? Yes, but we need to corrupt the _M_finish member instead. Because back() returns the element just before _M_finish, this pointer must point just past the address we want to read:
#include <iostream>
#include <cstdio>
#include <vector>
#include <unistd.h>
/*
* std::vector consists of 3 pointers
* first pointer points to the backing contents
*/
static long x;
static std::vector<long> v;
int
main()
{
long y;
std::vector<long>::iterator it;
v = std::vector<long>(10);
v[0] = 10;
v[1] = 20;
x = 0x41414141;
*(long *)((char *)&v + 8) = (long)&x + 8; // memory corruption
y = v.back();
printf("%lx\n", y);
_exit(0);
}
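The one-element offset used above can be made explicit in a small helper. This is a sketch, assuming _M_finish is the second pointer-sized field of the vector object (as in libstdc++). read_via_back is a hypothetical name, and the helper restores the original _M_finish afterwards so the vector remains usable.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Read the long at `addr` through v.back() by pointing the second
// pointer-sized field of the vector object (_M_finish in libstdc++) one
// element past `addr`.
// Assumption: _M_finish lives at offset sizeof(long *) in the object.
inline long read_via_back(std::vector<long> &v, long *addr)
{
    assert(addr != nullptr);
    char *finish_field = reinterpret_cast<char *>(&v) + sizeof(long *);

    long *saved;
    std::memcpy(&saved, finish_field, sizeof saved);

    long *fake_finish = addr + 1;  // back() dereferences _M_finish - 1
    std::memcpy(finish_field, &fake_finish, sizeof fake_finish);
    long value = v.back();

    // Restore the real _M_finish so size() is correct again.
    std::memcpy(finish_field, &saved, sizeof saved);
    return value;
}
```

With x set to 0x41414141, read_via_back(v, &x) should return that value, which is the same read the program above performs with the hard-coded + 8.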
Technique 4
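Can we use the push_back() method for an arbitrary write? Yes. We need to use the _M_finish member again. push_back() writes the new element at _M_finish as long as _M_finish differs from _M_end_of_storage; otherwise it reallocates the buffer. Once _M_finish is overwritten with our target address, the two pointers differ and the write lands on the target.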
#include <iostream>
#include <cstdio>
#include <vector>
#include <unistd.h>
/*
* std::vector consists of 3 pointers
* first pointer points to the backing contents
*/
static long x;
static std::vector<long> v;
int
main()
{
std::vector<long>::iterator it;
v = std::vector<long>(10);
v[0] = 10;
v[1] = 20;
*(long *)((char *)&v + 8) = (long)&x; // memory corruption
v.push_back(0x41414141);
printf("%lx\n", x);
_exit(0);
}
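Why push_back() writes to the corrupted pointer can be illustrated with a toy model of its fast path. This is a simplified sketch, not the libstdc++ source: ToyVector and toy_push_back are hypothetical names, and the reallocation path is reduced to returning false.

```cpp
#include <cassert>

// Toy stand-in for the three pointers inside std::vector<long>.
struct ToyVector {
    long *start;
    long *finish;
    long *end_of_storage;
};

// Simplified model of push_back(): when finish differs from end_of_storage,
// the value is stored at finish and finish advances by one element.
// Returns false where a real vector would reallocate instead.
inline bool toy_push_back(ToyVector &v, long value)
{
    if (v.finish == v.end_of_storage)
        return false;  // a real vector would grow its buffer here
    assert(v.finish != nullptr);
    *v.finish = value;
    ++v.finish;
    return true;
}
```

In this model, a full vector (finish equal to end_of_storage) refuses the fast path, but after finish is overwritten with the address of some long x, toy_push_back stores straight into x. That is the behaviour the program above relies on.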
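Note that std::vector has no push_front() method, so there is no matching front-insertion write primitive. Writes through a corrupted _M_start instead use operator[], front(), or an iterator, as in Techniques 1 and 2.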
Conclusion
This post looked at a number of techniques for converting a memory corruption of std::vector into useful exploitation primitives: arbitrary read/write, arbitrary read, and arbitrary write. Keep watching the blog for more posts on C++ memory corruption.