YARA Engine Research
Ivoripuion
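Rule generality
The following payloads were generated with CS 4.0: 10 PowerShell scripts, 12 PE backdoors, and 2 DLL files, 24 samples in total:
The MD5 hashes of these samples all differ, so the original approach of blocklisting MD5 hashes does not work:
Scanning them with the open-source CS YARA rules (GitHub - chronicle/GCTI) detects every sample, and one sample (beacon_server64.exe) is matched twice. The final detection rate is 24/24 = 100%:
When the samples are not obfuscated to evade detection, static text-based signatures generalize well, and they avoid the limitations of an MD5 blocklist.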
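How to embed YARA into a product engine
According to the official YARA documentation, there are two ways to embed it:
- invoke the prebuilt yara engine directly from the command line;
- call the YARA C API to perform the virus scan;
Both approaches end up making the same calls, because the official prebuilt yara engine is itself implemented on top of the YARA C API:
The sample file and YARA rule used here:
Contents of test.txt:
Contents of test.yar: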
```
rule test
{
    strings:
        $s1 = "111"
    condition:
        any of ($s*)
}
```
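Scanning by invoking the prebuilt yara engine from the command line
In this approach the prebuilt yara engine (yara.exe on Windows) is shipped inside the product and invoked from the command line whenever a scan is needed. A demo: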
```cpp
#include <iostream>
#include <stdlib.h>
#include <string>
using namespace std;

#define RULES_FILE "D:\\workstation\\yara_demo\\yara_demo\\test.yar"
#define VIRUS_FILE "D:\\workstation\\yara_demo\\yara_demo\\test.txt"

int main() {
    string rule_file = RULES_FILE;
    string virus_file = VIRUS_FILE;
    // Command line of the form: yara <rules file> <file to scan>
    string cmdline = "yara " + rule_file + " " + virus_file;
    system(cmdline.c_str());
    return 0;
}
```
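The scan result is as follows:
Scanning with the YARA C API
Add yara.h to the include directory and libyara64.lib to the library directory, then add libyara64.lib, ws2_32.lib, and crypt32.lib to the linker's additional dependencies.
A simple demo based on the official documentation: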
```cpp
#include <cstdlib>
#include <iostream>
#include <windows.h>
#include <yara.h>

#define RULES_FILE "D:\\workstation\\yara_demo\\yara_demo\\test.yar"
#define VIRUS_FILE "D:\\workstation\\yara_demo\\yara_demo\\test.txt"

// Scan callback: YARA calls this for each message produced during a scan.
int my_callback_function(
    YR_SCAN_CONTEXT* context, int message, void* message_data, void* user_data)
{
    // A rule matched: report it and stop the scan.
    if (message == CALLBACK_MSG_RULE_MATCHING)
    {
        std::cout << "Matched!" << std::endl;
        return CALLBACK_ABORT;
    }

    // End of scan: no match aborted it earlier.
    if (message == CALLBACK_MSG_SCAN_FINISHED)
    {
        std::cout << "Not Matched!" << std::endl;
        return CALLBACK_ABORT;
    }

    return CALLBACK_CONTINUE;
}

int main(int argc, char** argv)
{
    HANDLE hFile = CreateFileA(RULES_FILE, GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        std::cerr << "Failed to open rule file for reading" << std::endl;
        return EXIT_FAILURE;
    }

    HANDLE vFile = CreateFileA(VIRUS_FILE, GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (vFile == INVALID_HANDLE_VALUE)
    {
        std::cerr << "Failed to open virus file for reading" << std::endl;
        CloseHandle(hFile);
        return EXIT_FAILURE;
    }

    int result = yr_initialize();
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to initialize Yara engine" << std::endl;
        CloseHandle(hFile);
        CloseHandle(vFile);
        return EXIT_FAILURE;
    }

    YR_COMPILER* compiler = NULL;
    YR_RULES* rules = NULL;

    // Release everything acquired so far; used on every exit path below.
    auto cleanup = [&]() {
        if (rules != NULL) yr_rules_destroy(rules);
        if (compiler != NULL) yr_compiler_destroy(compiler);
        yr_finalize();
        CloseHandle(hFile);
        CloseHandle(vFile);
    };

    result = yr_compiler_create(&compiler);
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to create Yara compiler" << std::endl;
        cleanup();
        return EXIT_FAILURE;
    }

    // Returns the number of errors found while compiling the rule file.
    result = yr_compiler_add_fd(compiler, hFile, NULL, NULL);
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to add file contents to Yara compiler"
                  << "\nError Code:" << result << std::endl;
        cleanup();
        return EXIT_FAILURE;
    }

    result = yr_compiler_get_rules(compiler, &rules);
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to compile Yara rules" << std::endl;
        cleanup();
        return EXIT_FAILURE;
    }

    // The last argument is the scan timeout in seconds.
    result = yr_rules_scan_fd(rules, vFile, SCAN_FLAGS_FAST_MODE, my_callback_function, NULL, 1000);
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to scan file with Yara rules" << std::endl;
        cleanup();
        return EXIT_FAILURE;
    }

    cleanup();
    return EXIT_SUCCESS;
}
```
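Whatever action is taken after a scan happens inside the callback passed to the scan function.
The scan result is as follows:
Comparing the two approaches
- Performance comparison
The two approaches are fundamentally the same, but repeatedly spawning and tearing down a new process adds overhead that does affect efficiency.
- Efficiency comparison
The sample beaconx64.dll (size: 281k) was copied 1000 times, giving 1024 samples (presumably the 1000 copies plus the original 24), and the folder was scanned five times with yara. Run times: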
```
┌─[root@cars]─[/home/cars/workstation/cobaltstrike4.0-cracked/shellcode]
└──╼
warning: rule "CobaltStrike_Resources_Artifact32svc_Exe_v1_49_to_v3_14" in GCTI/YARA/CobaltStrike/CobaltStrike__Resources_Artifact32svc_Exe_v1_49_to_v4_x.yara(47): string "$decoderFunc" may slow down scanning

real    0m0.653s
user    0m0.983s
sys     0m0.499s
┌─[✗]─[root@cars]─[/home/cars/workstation/cobaltstrike4.0-cracked/shellcode]
└──╼
warning: rule "CobaltStrike_Resources_Artifact32svc_Exe_v1_49_to_v3_14" in GCTI/YARA/CobaltStrike/CobaltStrike__Resources_Artifact32svc_Exe_v1_49_to_v4_x.yara(47): string "$decoderFunc" may slow down scanning

real    0m0.603s
user    0m1.008s
sys     0m0.436s
┌─[✗]─[root@cars]─[/home/cars/workstation/cobaltstrike4.0-cracked/shellcode]
└──╼
warning: rule "CobaltStrike_Resources_Artifact32svc_Exe_v1_49_to_v3_14" in GCTI/YARA/CobaltStrike/CobaltStrike__Resources_Artifact32svc_Exe_v1_49_to_v4_x.yara(47): string "$decoderFunc" may slow down scanning

real    0m0.648s
user    0m1.040s
sys     0m0.456s
┌─[✗]─[root@cars]─[/home/cars/workstation/cobaltstrike4.0-cracked/shellcode]
└──╼
warning: rule "CobaltStrike_Resources_Artifact32svc_Exe_v1_49_to_v3_14" in GCTI/YARA/CobaltStrike/CobaltStrike__Resources_Artifact32svc_Exe_v1_49_to_v4_x.yara(47): string "$decoderFunc" may slow down scanning

real    0m0.620s
user    0m1.066s
sys     0m0.416s
┌─[✗]─[root@cars]─[/home/cars/workstation/cobaltstrike4.0-cracked/shellcode]
└──╼
warning: rule "CobaltStrike_Resources_Artifact32svc_Exe_v1_49_to_v3_14" in GCTI/YARA/CobaltStrike/CobaltStrike__Resources_Artifact32svc_Exe_v1_49_to_v4_x.yara(47): string "$decoderFunc" may slow down scanning

real    0m0.594s
user    0m1.008s
sys     0m0.446s
```
Taking the mean of the user times, the average is 1.02s.
A demo built on the YARA C API that loads rules in bulk and scans files in bulk:
```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <windows.h>
#include <yara.h>

#define RULES_DIR "D:\\workstation\\yara_demo\\yara_demo\\rules"
#define SCAN_DIR "D:\\workstation\\yara_demo\\yara_demo\\scan"

// Scan callback. user_data points to the performance-counter value taken
// just before the first file was scanned.
int my_callback_function(
    YR_SCAN_CONTEXT* context, int message, void* message_data, void* user_data)
{
    // A rule matched: report it and stop scanning this file.
    if (message == CALLBACK_MSG_RULE_MATCHING)
    {
        std::cout << "Matched!" << std::endl;
        return CALLBACK_ABORT;
    }

    // End of scan for this file: print the time elapsed since the start.
    if (message == CALLBACK_MSG_SCAN_FINISHED)
    {
        LARGE_INTEGER liFinishTime;
        QueryPerformanceCounter(&liFinishTime);
        LARGE_INTEGER liFrequency;
        QueryPerformanceFrequency(&liFrequency);
        double elapsedTime =
            static_cast<double>(liFinishTime.QuadPart - *reinterpret_cast<LONGLONG*>(user_data)) /
            liFrequency.QuadPart;
        std::cout << "Not Matched! Elapsed Time: " << elapsedTime << " s" << std::endl;
        return CALLBACK_ABORT;
    }

    return CALLBACK_CONTINUE;
}

int main(int argc, char** argv)
{
    WIN32_FIND_DATAA find_data;
    HANDLE hFind = FindFirstFileA((std::string(RULES_DIR) + "\\*").c_str(), &find_data);
    if (hFind == INVALID_HANDLE_VALUE)
    {
        std::cerr << "Failed to open rule directory" << std::endl;
        return EXIT_FAILURE;
    }

    int result = yr_initialize();
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to initialize Yara engine" << std::endl;
        FindClose(hFind);
        return EXIT_FAILURE;
    }

    YR_COMPILER* compiler;
    result = yr_compiler_create(&compiler);
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to create Yara compiler" << std::endl;
        FindClose(hFind);
        yr_finalize();
        return EXIT_FAILURE;
    }

    // Add every regular file in RULES_DIR to the compiler.
    do
    {
        if (!(find_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            std::string rule_file = std::string(RULES_DIR) + "\\" + find_data.cFileName;
            HANDLE hFile = CreateFileA(rule_file.c_str(), GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
            if (hFile == INVALID_HANDLE_VALUE)
            {
                std::cerr << "Failed to open rule file for reading: " << rule_file << std::endl;
                continue;
            }

            // Returns the number of errors found while compiling the file.
            result = yr_compiler_add_fd(compiler, hFile, NULL, NULL);
            CloseHandle(hFile);  // not needed once the rules have been read
            if (result != ERROR_SUCCESS)
            {
                std::cerr << "Failed to add file contents to Yara compiler: " << rule_file
                          << "\nError Code:" << result << std::endl;
                continue;
            }
        }
    } while (FindNextFileA(hFind, &find_data));
    FindClose(hFind);

    YR_RULES* rules;
    result = yr_compiler_get_rules(compiler, &rules);
    if (result != ERROR_SUCCESS)
    {
        std::cerr << "Failed to compile Yara rules" << std::endl;
        yr_compiler_destroy(compiler);
        yr_finalize();
        return EXIT_FAILURE;
    }

    hFind = FindFirstFileA((std::string(SCAN_DIR) + "\\*").c_str(), &find_data);
    if (hFind == INVALID_HANDLE_VALUE)
    {
        std::cerr << "Failed to open scan directory" << std::endl;
        yr_rules_destroy(rules);
        yr_compiler_destroy(compiler);
        yr_finalize();
        return EXIT_FAILURE;
    }

    // Start the timer, then scan every regular file in SCAN_DIR in turn.
    LARGE_INTEGER liStartTime;
    QueryPerformanceCounter(&liStartTime);
    do
    {
        if (!(find_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            std::string scan_file = std::string(SCAN_DIR) + "\\" + find_data.cFileName;
            result = yr_rules_scan_file(
                rules, scan_file.c_str(), SCAN_FLAGS_FAST_MODE,
                my_callback_function, &liStartTime.QuadPart, 1000);
            if (result != ERROR_SUCCESS)
            {
                std::cerr << "Failed to scan file with Yara rules: " << scan_file << std::endl;
                continue;
            }
        }
    } while (FindNextFileA(hFind, &find_data));
    FindClose(hFind);

    yr_rules_destroy(rules);
    yr_compiler_destroy(compiler);
    yr_finalize();

    return EXIT_SUCCESS;
}
```
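This demo uses no efficiency techniques such as multithreading, so the total time is longer: with the same number of rules and the same number of files, the total scan time is about 3s:
So, without any optimization, scanning through the YARA C API is actually slower than invoking the official yara engine directly.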