GuidedTracker: Track the Victims with Access Logs to Finding Malicious Web Pages

沙泓州  刘庆云  周舟  郑超 



Abstract: Malicious web pages have become a serious blight on the Internet: they spread malicious code, steal private information, and deliver spam advertisements. Distinguishing them effectively from the huge number of normal web pages remains a major challenge in the era of big data. To detect malicious pages, one first collects candidate web pages that are live on the web, then discards the bulk of legitimate pages using fast filters, and finally examines the remaining pages with a precise but slow analyzer. These conventional techniques now face new challenges, including large scale, imbalanced data, and the use of cloaking techniques. To cope with these challenges, a malicious URL detection system must operate more efficiently.

In this paper, we propose a system named GuidedTracker to search for suspected malicious pages. GuidedTracker starts from a seed set of known malicious pages. It then automatically identifies the victims of those pages using the seed set and the visit relation database. Finally, it uses the access records of these victims to identify other malicious pages. In this way, GuidedTracker increases the percentage of malicious URLs in the input URL stream submitted to the precise analyzer. To the best of our knowledge, GuidedTracker is the first system to use visit relations to tackle the malicious URL detection problem. Visit relations limit the scope of URL inspection and give the approach a self-learning ability. Experimental results show that the overall "toxicity" improves by 6.97% to 50.38% compared with full inspection of access logs.
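
The three steps described above (seed set, victim identification, candidate selection from victims' access records) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the visit relation database can be modeled as a list of (user, URL) pairs, that a "victim" is any user who visited a seed URL, and that "toxicity" is the fraction of malicious URLs in a URL set. The function names, the sample data, and the `ground_truth` labels are all hypothetical.

```python
def find_victims(visits, seeds):
    """Users who visited at least one known-malicious (seed) URL."""
    return {user for user, url in visits if url in seeds}


def candidate_urls(visits, victims, seeds):
    """Other URLs found in the access records of the victims."""
    return {url for user, url in visits if user in victims and url not in seeds}


def toxicity(urls, malicious):
    """Fraction of URLs in `urls` that are malicious (assumed definition)."""
    urls = set(urls)
    return len(urls & malicious) / len(urls) if urls else 0.0


# Hypothetical access log: (user, url) pairs.
visits = [
    ("u1", "http://bad.example/a"),
    ("u1", "http://bad.example/b"),
    ("u1", "http://news.example/"),
    ("u2", "http://news.example/"),
    ("u2", "http://shop.example/"),
    ("u3", "http://mail.example/"),
]
seeds = {"http://bad.example/a"}
# Labels a precise analyzer would eventually produce (made up for the demo).
ground_truth = {"http://bad.example/a", "http://bad.example/b"}

victims = find_victims(visits, seeds)
candidates = candidate_urls(visits, victims, seeds)
all_unseen = {url for _, url in visits} - seeds

guided = toxicity(candidates, ground_truth)   # victims' records only
full = toxicity(all_unseen, ground_truth)     # full inspection of the log
print(f"guided={guided:.2f} full={full:.2f}")
```

In this toy log, restricting inspection to the victims' records yields a smaller candidate set with a higher share of malicious URLs than inspecting every unseen URL. Newly confirmed malicious URLs could be added back to `seeds` for the next round, which is one plausible reading of the self-learning property mentioned above.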



