CONCLUSIONS
The key of IDS based on artificial immune is
generating the Detector set. Our approach
helps to reduce both time and space
complexities. This is really new result in the
area. In the future, we plan to report more
detail experimental data about the algorithm
as well as to apply them for dynamic data
space in our next papers. Because we believe
that location table A can contain not only bit
data, but also integer data for dynamic S. It
means that A[i,j] will increase or decrease one
when adding a self to S or deleting a self from
S, respectively. Furthermore, it is our belief
that our method can be used for variable
matching length r. This new idea will be
suitable for various problems in the field of
computer science.
ACKNOWLEDGMENT
The authors would like to thank Dr. Nguyen
Xuan Hoai and many colleagues in IT R&D
Center - HNU, for their useful suggestions
when authors presented the research in seminar
series on computational intelligence last year.
5 trang |
Chia sẻ: yendt2356 | Lượt xem: 418 | Lượt tải: 0
Bạn đang xem nội dung tài liệu A new approach for generating a complete detector repertoire in artificial immune systems, để tải tài liệu về máy bạn click vào nút DOWNLOAD ở trên
Nguyễn Văn Trường và đtg Tạp chí KHOA HỌC & CÔNG NGHỆ 80(04): 121 - 125
121
A NEW APPROACH FOR GENERATING A COMPLETE DETECTOR
REPERTOIRE IN ARTIFICIAL IMMUNE SYSTEMS
Nguyen Van Truong1, Pham Dinh Lam2
1 College of Education – TNU; 2 Board of Information Technology - TNU
ABSTRACT
Generation of the Detector set is a key problem in Artificial Immune Systems (AIS) due to cost of
time and space. A weakness in many detector generating algorithms was random generation of
detectors so that many candidate detectors may be discarded. We present a novel data structure for
extracting data from protected self set. That helps to produce all possible detectors, or a complete
detector repertoire, with lower time and space complexities compare to recent novel approaches.
Theoretical analysis and experimental results show that our approach is effective and feasible.
These new valuable characteristics can make complex AIS have the highest ability to detect data
changes and to reduce false detection.
Keywords: Artificial immune system, Intrusion detection system, self set, detector set, complete
detector repertoire, false detection
INTRODUCTION*
It is impractical for security software to detect
all intrusions in computer networks. Many
network technologies on security came into
being. The more successful one is based on
AIS, which illustrates biology immune system
on computers. It is evaluated as a new and
effective soft computing method in the field
of network security and information security.
Especially, it is suitable for building up the
Intrusion Detection Systems (IDS) that can
protect computer systems against intruders
and the destruction of computer viruses or
other malicious software system.
This paper is concerned with one aspect of
computer security: ability of detecting all
kinds of changes or attacks, both known and
unknown ones. The method explored involves
the most discussed immune models: Negative
selection algorithm. In the algorithm,
generating the detector set plays a very
important role and helps to increase the
overall performance of AIS. Our method is
to concentrate on producing complete
repertoire for negative selection in not only
security systems, but also in problem
requiring some tolerance of noise, or
involving dynamic streams of data (such as
system call sequences).
*
Tel: 0915016063; Email: nvtruongtn@gmail.com
There are many researches on intrusion
detection generating, such as Exhaustive
Detector Generating Algorithm, Linear Time
Detector generating algorithm, the Greedy
Detector Generating Algorithm [2], [7]. Our
algorithm involves a special data structure
called Location table. It plays an important
part in our algorithm and leads to reduce both
time and space complexities. Furthermore,
our approach has many advantages when
applying in dynamic environment as
mentioned in the 5th section.
The remaining of the paper is organized as
follows. In the 2nd section, we give a brief
literature review of AIS. Our technique of
generating complete repertoire, the main part of
the paper, is presented in the 3rd section. Some
analyses and experiments are given in the 4th
section. The 5th section concludes the paper.
LITERATURE REVIEW
In this section, we first described negative
selection algorithm, very commonly used
algorithm in AISs. After that, we present a bit
some recent researches in the field.
Artificial negative selection is a
computational imitation of self/nonself
discrimination, firstly designed as a change
detection method. This mechanism is first
given by Forrest et al. [1]. The outline of a
typical negative selection algorithm contains
two stages [10].
Nguyễn Văn Trường và đtg Tạp chí KHOA HỌC & CÔNG NGHỆ 80(04): 121 - 125
122
Figure 1. Original model of detector generation
In the generation stage (Fig. 1), the detectors
are generated by some random process and
censored by trying to match self samples
taken from set S. Those candidates that match
are eliminated and the rest are kept as
detectors in set D. In the detection stage (Fig.
2), the collection of detectors (or detector set)
is used to check whether an incoming data
instance is self or nonself. If it matches any
detector, it is claimed as nonself or an
anomaly. This description is limited to some
extent, but conveys the essential idea.
Figure 2. Detection of new instances
There have never been Vietnamese
individuals or groups have studied
systematically artificial immune system and
its application, except that NC Group (Natural
Computing) of the Military Technical
Academy. However, it only studied some
theory aspects without experiments [5]. Our
recent research evolves negative selection
algorithm, but it only concentrates on how to
detect effectively by using given location
table without generating detectors [6]. The
approach therefore is not suitable for
distributed environment. There are ongoing
studies of many foreign researches in the field
[7]. But their algorithms’ complexities are
much lager than our ones. The details of our
algorithm are described in the following
section.
GENERATING COMPLETE REPERTOIRE
In the Fig. 3, we illustrate the process for
generating a detector set by using the number
of five contiguous matches required for a
match. The string to be protected is logically
segmented into five equal-length “self”
strings. To generate the detector, random
strings are produced and matched against
each of the self strings. The first two strings,
00000111 and 00000010, are eliminated
because they both match self string 00000110
at at least five contiguous positions. The
string 11101001 fails to match any string in
the self at at least five contiguous positions,
so it is accepted into the detector set.
Figure 3. Illustration of generating Detector set
Our approach is derived from previous work
[6]. Data space is often huge, so AIS need
huge memory storage. This also leads to
increase of time complexity. Our
improvement has lower complexity than that
of current fastest algorithm. Detail notations
are described in the following before giving
our algorithm.
Nguyễn Văn Trường và đtg Tạp chí KHOA HỌC & CÔNG NGHỆ 80(04): 121 - 125
123
- U is the set of all possible binary strings of
length l, and r is the matching threshold.
- s is the binary string, s ∈ U.
- D is the detector set, D ⊂ U.
- A is two dimensions matrix presenting
location table with the size 2r.(l-r+1)
The pseudo code of the first part of the
algorithm to create table A is described as
follow.
Initialize all cells of A to 0;
For each s string to be protected, s ∈ U
For j ∈ [1, l - r+1]
Begin for
i = is an integer number of sub-string of s
whose length is r and starting position is j
within the string s;
A[i, j] = 1;
End for
For example, given self set S containing five
strings to be protected S = {s1 = 01011; s2 =
11001; s3 = 10110; s4 = 00101; s5 = 01010}; l
= 5, r = 3. Applying our algorithm we had the
table A below.
1 2 3
0 0 0 0
1 1 0 1
2 1 1 1
3 0 1 1
4 0 1 0
5 1 1 1
6 1 0 1
7 0 0 0
For s1 = 01011: with j = 1 we calculate i =
(010)2 = (2)10. Thus, A[2,1] equals to 1. A[6,2]
equals to 0 because there are no sub-string
110 (or 6 in integer form) of all five strings
starting in 2nd position. Other values in the
table are calculated in a similar way.
The pseudo code of the second part of the
algorithm for generating complete repertoire
is as follow. In the algorithm, the function
right(d, r-1) returns a r – 1 bit string from the
end of the bit string d.
D = ∅ ;
Function Try(d,i)
begin
if (i = l)D ← d;
else
begin
d1=right (d,r-1) + “1”;
d0 = right (d,r-1) + “0”;
k = integer form of d1;
if (A[k,i] = 0) Call Try(d1,i+1);endif
k = integer form of d0;
if (A[k,i] = 0) Call Try(d0,i+1);endif
end
endif
end
for i satisfying A[i,1] = 0
d = binary form of integer i having r bit;
Call Try(d,r);
endfor
For example, with A[4,1] = 0, we initialize d
= 100 as first three bits of candidate detectors.
We calculate d1 = 001 and d0 = 000.
With d1 = 001, we have k = 1, A[1,2] = 0 so
that d1 will be used in next step. Similarly, in
the next step, we calculate d0 = 010 and d1 =
011. These two strings have values 2 and 3 in
integer form, respectively. Both A[2,3] and
A[3,3] equal to 1, so that we cannot find out
any detector in this step.
With d0 = 000, we have k = 0, A[0,2] = 0 so
that d0 will be used in next step. It is similar
and easy to find out one detector d = 10001 in
the step.
By examining A[0,1], A[3,1] and A[7,1] in
similar way, we can find out a complete
detector repertoire D containing six detectors
D = {d1 = 00000; d2 = 01100; d3 = 01111; d4
= 10000; d5 = 11111; d6 = 11100}.
ANALYSIS AND EXPERIMENT
The principle data structure used in this
algorithm consists of the large (l − r) × 2r A
array to save data extracted from self. This
has an impact on the time and space
complexities of the algorithm: time: O((l –
r).|S| + O(2l-r)), and space: O(2r.(l-r)).
Nguyễn Văn Trường và đtg Tạp chí KHOA HỌC & CÔNG NGHỆ 80(04): 121 - 125
124
The above algorithm runs in time linear in the
size of the self set (for given parameters l and
r). This is much lower than algorithm presents
in [7]. Our algorithm’s complexities are 18
times lower than that of the algorithms in [7]
in some cases. The detailed comparison is in
table 1 below.
We now present our experiment on AIS for
detecting real viruses. Our viruses are quite
strange to some well known antivirus
software; and they can affect .COM and .EXE
file types in the current directory only. The
experiments were carried out by compared the
viruses detecting ability of our AIS-IDS and
that of famous commercial antivirus software
KIS, Kaspersky Internet Security. The
software was licensed and updated on Feb 15,
2011. The software can not detect our virus
and can only detect the viruses after a few
days. This can be explained as follows: 1.
Database of the software has not had or has
not updated signatures of the viruses; 2.
Mechanism of virus detection based on
behavior is not good enough. In contrast, our
approach allows to detect changes even one
bit of data immediately. Our system can alert
100% accurately, it means that the rate of
false positive reduces to zero and the rate of
false negative reduces to minimum.
Picture 1. Our system found 3 files infected with virus
Picture 2. KIS can not detect virus
Table 1. The result comparison
|S| l R
Our alg.’s
space comp.:
O(2r.(l-r))
Space comp. of
the algo. in [7]:
O(2r.(l-r)2)
Our algo.’s time
complexity:
O((l – r).|S| +
O(2l-r))
Time comp. of
the algo. in [7]:
O((l – r).|S| +
O(2r(l-r))
250 16 10 6,144 36,864 1,564 7,644
500 16 11 10,240 51,200 2,532 12,740
500 32 14 294,912 5,308,416 271,144 303,912
1,000 16 12 16,384 65,536 4,016 20,384
1,000 32 14 294,912 5,308,416 280,144 312,912
1,000,000 32 15 557,056 9,469,952 17,000,362 17,557,056
Nguyễn Văn Trường và đtg Tạp chí KHOA HỌC & CÔNG NGHỆ 80(04): 121 - 125
125
CONCLUSIONS
The key of IDS based on artificial immune is
generating the Detector set. Our approach
helps to reduce both time and space
complexities. This is really new result in the
area. In the future, we plan to report more
detail experimental data about the algorithm
as well as to apply them for dynamic data
space in our next papers. Because we believe
that location table A can contain not only bit
data, but also integer data for dynamic S. It
means that A[i,j] will increase or decrease one
when adding a self to S or deleting a self from
S, respectively. Furthermore, it is our belief
that our method can be used for variable
matching length r. This new idea will be
suitable for various problems in the field of
computer science.
ACKNOWLEDGMENT
The authors would like to thank Dr. Nguyen
Xuan Hoai and many colleagues in IT R&D
Center - HNU, for their useful suggestions
when authors presented the research in seminar
series on computational intelligence last year.
REFERENCES
[1].Forrest et al, Self-Nonself Discrimination in a
Computer, in Proceedings of 1994 IEEE
Symposium on Research in Security and Privacy,
Oakland, CA, 202-212, 1994.
[2].F. Liu, Q. Wang and X. Gao, Survey of
Artificial Immune System, in Proceedings of The
First IEEE International Symposium on Systems
and Control in Aerospace and Astronautics, 985-
989, 2006.
[3].Jamie Twycross at al., Detecting Anomalous
Process Behavior using Second Generation
Artificial Immune Systems, Journal of
Unconventional Computing, Vol. 1, pp. 1–26,
2010.
[4].L. N de Castro and J. Timmis, Artificial
Immune Systems: A New Computational
Intlligence Approach, Springer-Verlag, 2002.
[5].Nguyễn Xuân Hoài, Nguyễn Văn Trường, Vũ
Mạnh Xuân, “Hệ miễn dịch nhân tạo và ứng
dụng”, Tạp chí Khoa học và Công nghệ - ĐH Thái
Nguyên, số 2, 2007, 13-18.
[6].Nguyen Van Truong, Pham Dinh Lam,
Improving negative selection algorithm in
artificial immune systems for computer virus
detection, Journal of Science and Technology,
Thai Nguyen University, 72(10), 2010, 53-58.
[7].Patrik D’haeseleer et al., An Immunological
Approach to Change Detection: Algorithms,
Analysis and Implications, IEEE Symposium on
Security and Privacy, 1996.
[8].U. Aickelin, J. Greensmith and J. Twycross,
Immune System Approaches to Intrusion
Detection - A Review, in The Proceedings of the
Third International Conference on Artificial
Immune Systems, LNCS 3239, 316-329, 2004.
[9].T. Lin et al. Research on The Network
Intrusion Detection Based on The Immune
System, in Proceedings of the Fifth IEEE
International Conference on Machine Learning
and Cybernetics, 4479-4482, 2006.
[10]. Zhou Ji, Dipankar Dasgupta, Revisiting
Negative Selection Algorithms, Evolutionary
Computation 15(2), 2007.
TÓM TẮT
MỘT CÁCH TIẾP CẬN MỚI ĐỂ SINH RA MỘT KHO ĐẦY ĐỦ CÁC BỘ DÒ
TRONG HỆ MIỄN DỊCH NHÂN TẠO
Nguyễn Văn Trường1*, Phạm Đình Lâm2
1Trường Đại học Sư phạm – ĐH Thái Nguyên; 2 Đại học Thái Nguyên
Vấn đề chủ yếu trong các Hệ miễn dịch nhân tạo là việc sinh ra tập bộ dò tốn nhiều chi phí về thời
gian và bộ nhớ. Điểm yếu trong nhiều thuật toán sinh bộ dò là do các bộ dò được sinh ra ngẫu
nhiên nên nhiều ứng cử viên cho bộ dò bị loại bỏ. Chúng tôi trình bày một cấu trúc dữ liệu mới để
trích rút dữ liệu từ tập tế bào cần bảo vệ. Nó cho phép sinh ra tất cả các bộ dò, hay kho đầy đủ các
bộ dò, với chi phí thấp hơn về thời gian và bộ nhớ so với những nghiên cứu gần đây. Những phân
tích lý thuyết và kết quả thực nghiệm chứng tỏ cách tiếp cận của chúng tôi hiệu quả và khả thi.
Những đặc điểm giá trị này cho phép các hệ thống AIS phức tạp có khả năng cao nhất để phát hiện
những thay đổi trong dữ liệu và làm giảm phát hiện âm cực.
Từ khóa: Hệ miễn dịch nhân tạo, Hệ phát hiện xâm nhập, tập tế bào, tập bộ dò, kho đầy đủ các
bộ dò, phát hiện âm cực
*
Tel: 0915016063; Email: nvtruongtn@gmail.com
Các file đính kèm theo tài liệu này:
- brief_33255_37081_318201281315tap80so04_nam2011_split_24_0794_2052435.pdf