GA-FFNN: An Intelligent Classification Approach for Signature-Based IDS

Abstract. An Intrusion Detection System (IDS), the second line of defense, plays a major role in safeguarding network infrastructure from the threats posed by “black hat” attackers. The ever-advancing nature of cyber-attacks makes designing and developing an efficient IDS a complex task. This paper therefore presents an intelligent IDS based on a Feed Forward Neural Network (FFNN) and a Genetic Algorithm (GA) for parameter optimization and for classifying malicious and normal data. GA-FFNN was evaluated on the NSL-KDD dataset, and its performance was validated using classification accuracy, detection rate, and false alarm rate.

Keywords: Genetic Algorithm, Feed Forward Neural Network, Parameter optimization, IDS

1 Introduction

With the growth of the Internet and computer networks, digital transformation has led to sensitive information being generated on a massive scale, and this information is at risk whenever intrusions or vulnerabilities occur in the network [1]. Recent security incidents such as the Nexus repository breach [2], ransomware attacks [3], WannaCry [4], the Yahoo password leak [5], and the Adobe data theft [6] underline the importance of protecting sensitive information against intruders. Traditionally, security measures such as antivirus software, access control, and firewalls were used to protect networks from various threats. However, these mechanisms have become obsolete because of the dynamic nature of intrusions, which has motivated researchers to develop a more robust security mechanism, the Intrusion Detection System, to fight ever-advancing intrusions [7]. According to NIST, “Intrusion detection is defined as an automated process which identifies any suspicious activities that compromise the Confidentiality, Integrity, and Availability (CIA) of the computer or network resources”. Based on the detection methodology, IDS are classified into two types: (i) misuse detection and (ii) anomaly detection. The former detects intrusions based on predefined patterns and produces fewer false positives, but it fails to identify new anomalies. The latter identifies both known and unknown attacks, but its false-positive rate is high [8]. Several researchers prefer misuse detection over anomaly detection in order to achieve high classification accuracy.

In general, intrusion detection is treated as a classification problem that discriminates between “normal” and “malicious” data [9]. This has led researchers to combine machine learning algorithms such as Artificial Neural Networks (ANN), K-Nearest Neighbor, and Random Forest with IDS to achieve better classification accuracy and detection rates [10]. Among these, ANN has been significant in designing intelligent IDS because it can handle imbalanced or incomplete datasets. The major problem in existing ANN-based IDS is that the ANN architecture is unstable due to the high dimensionality of the dataset, and training may become trapped in local minima [11]. To overcome this challenge, GA-FFNN IDS is proposed, in which the hyperparameters of the FFNN (learning rate, number of hidden units, dropout, and penalty) are optimized using a genetic algorithm to improve the stability of the ANN-based IDS. The major contributions of this work are:

1. The proposed GA-FFNN is designed to classify normal and malicious data.

2. The hyperparameters of the FFNN are optimized with a GA, which helps avoid premature convergence.

3. The effectiveness of the proposed algorithm is evaluated on the benchmark IDS dataset, NSL-KDD, and its performance is validated in terms of accuracy and detection rate.

2 Materials and Methods:

2.1 Genetic Algorithm:

The Genetic Algorithm is an adaptive, meta-heuristic optimization approach inspired by Darwin’s theory of evolution, in which stronger individuals are selected in a competitive environment, as happens in biological processes [19]. GA postulates that a potential solution to a problem is an individual chromosome that can be expressed as a set of parameters. Because it searches over a large sample space with a whole population of candidates, GA is less likely than a single-point local search to settle on a poor local optimum, although it does not strictly guarantee the global optimum. The working of the Genetic Algorithm is described in Algorithm 1.

Procedure

Step 1: Begin the algorithm by initializing a random population.

Step 2: At each step, GA uses the current individuals to generate the next population.

Step 3: Compute the fitness value.

Step 4: Select the best individuals in the current population.

Step 5: Apply crossover and mutation operations.

Step 6: Replace the current population by crossover to create the next generation.

Step 7: Terminate the algorithm when the stopping criterion is satisfied.
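
The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration, not the article's implementation: the real-valued encoding in [0, 1], tournament selection, one-point crossover, Gaussian mutation, elitism, and every default value (`pop_size`, `generations`, the rates) are assumptions chosen for brevity, and `sphere` is a made-up fitness function used only to exercise the loop.

```python
import random


def tournament(scored, rng):
    """Step 4: pick the fitter of two randomly drawn (fitness, individual) pairs."""
    a, b = rng.sample(scored, 2)
    return a[1] if a[0] >= b[0] else b[1]


def run_ga(fitness, n_genes, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.1, seed=0):
    """Maximise `fitness` over chromosomes whose genes lie in [0, 1]."""
    rng = random.Random(seed)
    # Step 1: random initial population
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # Step 7: stop after a fixed budget (assumed criterion)
        # Step 3: fitness of every individual in the current population
        scored = [(fitness(ind), ind) for ind in population]
        children = [list(best)]  # elitism (assumption): keep the best individual
        while len(children) < pop_size:
            p1, p2 = tournament(scored, rng), tournament(scored, rng)
            # Step 5: one-point crossover
            if n_genes > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                child = p1[:cut] + p2[cut:]
            else:
                child = list(p1)
            # Step 5: Gaussian mutation, clipped back into [0, 1]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children  # Step 6: the offspring become the next generation
        best = max(population, key=fitness)
    return best, fitness(best)


def sphere(individual):
    """Toy fitness with its maximum (0) at the point (0.5, 0.5, ...)."""
    return -sum((g - 0.5) ** 2 for g in individual)


best, score = run_ga(sphere, n_genes=3)
```

Because the best individual is copied into each new generation, the returned fitness can never be worse than the best fitness in the initial population. That is a property of this sketch, not a claim about the article's GA.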

2.2 Feed Forward Neural Network:

FFNN is a deep learning model often called a Multilayer Perceptron (MLP). The FFNN architecture comprises an input layer, hidden layers, and an output layer (Figure 1). Each neuron in one layer is connected to all the neurons of the next layer, so it is a fully connected network, and it learns through supervised algorithms. FFNN operates with the ReLU (Rectified Linear Unit) activation function in its hidden layers [20].

The net function (Eqn. 1) is the weighted sum of a neuron's inputs plus a bias term.
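
The text refers to Eqn. (1) for the net function and Eqn. (2) for ReLU without stating them. In the standard multilayer-perceptron formulation, which is assumed here, they can be written as below. The symbols are illustrative choices: $x_i$ is the $i$-th input feature, $w_{ji}$ the weight from input $i$ to hidden neuron $j$, $b_j$ the bias of neuron $j$, and $n$ the number of features.

```latex
\begin{align}
net_j &= \sum_{i=1}^{n} w_{ji}\, x_i + b_j \tag{1} \\
f(net_j) &= \max\left(0,\; net_j\right) \tag{2}
\end{align}
```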

Working of FFNN:

Step 1: Initialize the input as the number of samples and the number of features in the dataset, and the output as the decision class.

Step 2: Initialize the number of features in the input layer and compute the net function using Eqn. (1).

Step 3: Set the maximum number of epochs to 100 and the error threshold to 0.01.

Step 4: Use ReLU as the activation function for the neurons of the hidden layer (Eqn. 2).

Step 5: Compute the error.

Step 6: If the error is greater than or equal to 0.01, update the weights of the network and repeat the iteration (i.e., epoch = epoch + 1).

Step 7: Otherwise, return the decision class.
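
The loop described in these steps can be sketched without any libraries. This is a toy under stated assumptions, not the article's Keras model: a single hidden layer, a sigmoid output unit, mean squared error as the error measure, plain per-sample gradient descent, and the names `train_ffnn`, `n_hidden`, and `lr` are all hypothetical. The 100-epoch cap and the 0.01 error threshold follow the steps above.

```python
import math
import random


def relu(value):
    """ReLU activation: max(0, value)."""
    return value if value > 0.0 else 0.0


def forward(x, W1, b1, W2, b2):
    """Return (hidden activations, output in [0, 1]) for one sample."""
    # Net function per hidden neuron: weighted sum of inputs plus bias, then ReLU.
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    z = sum(w * h for w, h in zip(W2, hidden)) + b2
    z = max(-60.0, min(60.0, z))  # keep exp() in a numerically safe range
    return hidden, 1.0 / (1.0 + math.exp(-z))


def train_ffnn(X, y, n_hidden=4, lr=0.5, max_epochs=100, tol=0.01, seed=0):
    """Train until the epoch error drops below `tol` or `max_epochs` is reached."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    error = float("inf")
    epoch = 0
    while epoch < max_epochs and error >= tol:
        epoch += 1
        squared = 0.0
        for x, target in zip(X, y):
            hidden, out = forward(x, W1, b1, W2, b2)
            squared += (out - target) ** 2
            # Gradient of 0.5 * (out - target)^2 w.r.t. the output pre-activation.
            delta_out = (out - target) * out * (1.0 - out)
            for j in range(n_hidden):
                # ReLU derivative is 1 where the neuron fired, 0 elsewhere.
                delta_h = delta_out * W2[j] if hidden[j] > 0.0 else 0.0
                W2[j] -= lr * delta_out * hidden[j]
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h * x[i]
                b1[j] -= lr * delta_h
            b2 -= lr * delta_out
        error = squared / len(X)  # mean squared error accumulated over this epoch
    return (W1, b1, W2, b2), epoch, error


# Toy data (logical OR), standing in for "normal = 0" and "attack = 1".
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 1]
model, epochs, error = train_ffnn(X, y)
_, probability = forward([1, 1], *model)
decision = 1 if probability >= 0.5 else 0
```

Whether the loop stops on the error threshold or on the epoch cap depends on the data and the learning rate, so the sketch returns both `epochs` and `error` for inspection.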

Proposed Methodology:

Step 1: Initialize the number of features as input, and optimize the number of hidden neurons, learning rate (l), momentum (m), dropout (d), number of epochs, and batch size.

Step 2: Initialize the maximum number of iterations, the population size, and fitness = 0.

Step 3: Optimize l, m, and d using Algorithm 1.

Step 4: Compute the error using Eqn. (3).

Step 5: Calculate fitness = accuracy(best).

Step 6: Terminate when the optimal parameters are obtained or the maximum number of iterations is reached.

Step 7: Based on the best fitness value, update the position of the population.
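
One way to connect the GA to the FFNN is to let each chromosome encode a hyperparameter setting and to use the resulting accuracy as its fitness. The sketch below shows only that encoding and one round of fitness evaluation. The search ranges in `BOUNDS` are hypothetical (the article does not state them), and `stand_in_score` is a made-up placeholder for "train the FFNN with these settings and return its accuracy".

```python
import random

# Hypothetical search ranges; the article does not state the bounds it used.
BOUNDS = {
    "learning_rate": (1e-4, 1e-1),
    "momentum": (0.0, 0.99),
    "dropout": (0.0, 0.5),
}


def decode(chromosome):
    """Map genes in [0, 1] onto the hyperparameter ranges in BOUNDS."""
    return {name: lo + gene * (hi - lo)
            for gene, (name, (lo, hi)) in zip(chromosome, BOUNDS.items())}


def fitness(chromosome, train_and_score):
    """Step 5: fitness is the accuracy obtained with the decoded settings."""
    return train_and_score(**decode(chromosome))


def stand_in_score(learning_rate, momentum, dropout):
    """Placeholder for real training; returns a fake 'accuracy' in (0, 1]."""
    return (1.0 - abs(learning_rate - 0.01)
            - 0.1 * abs(momentum - 0.9)
            - 0.1 * abs(dropout - 0.2))


rng = random.Random(0)
# Step 2: initial population of chromosomes, one gene per hyperparameter.
population = [[rng.random() for _ in BOUNDS] for _ in range(10)]
# Steps 3 to 5 for a single generation: score everyone, keep the best.
best = max(population, key=lambda c: fitness(c, stand_in_score))
best_settings = decode(best)
best_fitness = fitness(best, stand_in_score)
```

In a full run, `fitness` would be passed to a GA loop and `stand_in_score` would be replaced by actual training and evaluation on held-out data.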

Proposed Flowchart

**Fig. 2.** Flowchart for Proposed GA-FFNN

Experimental Analysis and Discussions:

Experimental Setup

The NSL-KDD dataset was used to carry out the GA-FFNN experiments. The GA-FFNN algorithm was implemented in Python 3.4 on an Intel® Core™ i5 processor @ 2.40 GHz with 8 GB RAM, running the Windows 10 operating system. The Weka tool was also used for validation. The experiments were divided into three phases: (i) data preprocessing, (ii) training and testing, and (iii) evaluating the performance of GA-FFNN based on classification accuracy.

2. Data Preprocessing:

NSL-KDD dataset:

Tavallaee et al. proposed NSL-KDD, an improved version of the KDD ’99 dataset, to address shortcomings of KDD-CUP [21]. Unlike the KDD ’99 dataset, its train and test sets contain no duplicate records. This dataset consists of approximately 1,074,992 single connection vectors, each of which contains a total of 41 features, including basic features, content-related features, time-related traffic features, and host-based traffic features. The attribute values are of three types: nominal, binary, and numeric. Each connection vector is labeled as either an attack or normal, and attacks are further classified as DoS, U2R, R2L, or Probe. Data mapping and data normalization were carried out as in our previous work [10].
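
The exact mapping and normalization techniques are deferred to [10]. As a rough illustration only, the sketch below assumes two common choices: integer codes for nominal attributes and min-max scaling to [0, 1] for numeric ones. The function names and the four sample values are made up for the example.

```python
def map_nominal(values):
    """Map each distinct nominal value to an integer code (sorted for reproducibility)."""
    codes = {value: index for index, value in enumerate(sorted(set(values)))}
    return [codes[value] for value in values], codes


def min_max(values):
    """Scale numeric values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(value - lo) / (hi - lo) for value in values]


# Made-up sample values for one nominal and one numeric attribute.
protocol_type = ["tcp", "udp", "tcp", "icmp"]
src_bytes = [491, 146, 0, 232]

encoded, codes = map_nominal(protocol_type)
scaled = min_max(src_bytes)
```

Integer codes impose an artificial order on nominal values, which is one reason one-hot encoding is often preferred for inputs to a neural network.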

3. Training and testing:

Subsequently, the entire dataset was partitioned into 80% for training (TrainNSL) and 20% for testing (TestNSL).
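
An 80/20 partition of this kind can be sketched with the standard library alone. The function name `split_80_20`, the shuffling, and the fixed seed are assumptions; the article does not say how the split was drawn. In practice, `train_test_split` from scikit-learn with `test_size=0.2` serves the same purpose.

```python
import random


def split_80_20(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of `rows` and cut it into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for connection records.
records = list(range(100))
train_nsl, test_nsl = split_80_20(records)
```

A plain shuffle does not preserve the ratio of attack to normal records in each part; a stratified split would, at the cost of a little more code.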

4. Evaluate the performance of GA-FFNN based on classification accuracy:

The proposed GA-FFNN was designed to classify whether an incoming network traffic pattern is malicious or normal. It was evaluated and validated with the following metrics: classification accuracy, detection rate, and false alarm rate. The proposed GA-FFNN architecture was designed with one input layer, two hidden layers, and an output layer. The “Adam” optimizer was used to optimize the hidden layers. Figure 3 shows the classification accuracy of the proposed GA-FFNN, which outperforms existing classifiers such as random forest, BayesNet, K-Star, and BFFO-CNN. Table 2 compares the detection rate and false alarm rate of the different classifiers, where the proposed approach again outperforms the existing approaches.
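
The three metrics can be computed from the confusion matrix. The article does not spell out its formulas, so the sketch below assumes the definitions usual in the IDS literature, with attack = 1 as the positive class: accuracy = (TP + TN) / total, detection rate = TP / (TP + FN), and false alarm rate = FP / (FP + TN). The labels and predictions are made up.

```python
def ids_metrics(y_true, y_pred):
    """Return (accuracy, detection_rate, false_alarm_rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    accuracy = (tp + tn) / len(pairs)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, detection_rate, false_alarm_rate


# Made-up labels: 1 = attack, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
accuracy, detection_rate, false_alarm_rate = ids_metrics(y_true, y_pred)
# Here TP = 3, TN = 3, FP = 1, FN = 1, giving 0.75, 0.75 and 0.25.
```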

**Fig. 3.** Classification Accuracy

Table 2. Performance Evaluation- Detection Rate and False Alarm Rate

Conclusions

This paper has presented a Genetic Algorithm based Feed Forward Neural Network, both for optimizing the parameters of the FFNN and for separating malicious samples from normal samples. The NSL-KDD dataset was used to evaluate the proposed GA-FFNN, and the results were validated with classification accuracy, detection rate, and false alarm rate. In the experiments, the proposed classification approach, GA-FFNN, provided better accuracy than the existing approaches. This work can be extended to feature selection by varying the genetic operations used to optimize the parameters of the FFNN.

References

[1] M. Raman, K. Kannan, S. Pal.: Rough set-hypergraph-based feature selection approach for intrusion detection systems, Def. Sci. (2016).
[2] Kacy Zurkus (2019).: www.infosecurity-magazine.com/news/thousands-left-vulnerable-in-nexus (accessed July 2019)
[3] Armerding T (2018) The 18 biggest data breaches of the 21st century. https://www.csoonline.com/article/2130877/data-breach/the-biggest-data-breaches-of-the-21st-century.html (accessed July 2019)
[4] G. Swenson, Bolstering Government Cybersecurity Lessons Learned from WannaCry, (2017). https://www.nist.gov/speech-testimony/bolstering-government-cybersecurity-lessons-learned-wannacry (accessed July 2019)
[5] Yahoo Password leak (2017).: https://www.cnet.com/news/massive-breach-leaks-773-million-emails-21-million-passwords/ (accessed July 2019)
[6] Adobe breach (2013).: https://krebsonsecurity.com/tag/adobe-breach/ (accessed July 2019)
[7] M.R. Gauthama Raman, K. Kirthivasan, V.S. Shankar Sriram.: Development of rough set-hypergraph technique for key feature identification in intrusion detection systems, Comput. Electr. Eng. 1–12, 2017
[8] K. Scarfone, P. Mell.: Guide to Intrusion Detection and Prevention Systems (IDPS), NIST Spec. Publ. (2007)
[9] Almseidin, Mohammad, et al.: Evaluation of machine learning algorithms for intrusion detection system, 2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY), IEEE, 2017.
[10] Gauthama Raman MR, Somu N, Kirthivasan K, V. S. S. Sriram.: An efficient intrusion detection system based on hypergraph - Genetic algorithm for parameter optimization and feature selection in support vector machine. Knowledge-Based Syst 134:1–12 (2017)
[11] Beghdad, R.: Critical study of neural networks in detecting intrusions, Computers & Security, 27(5–6), 168–175 (2008).
[12] Shin, Yeonju, et al.: Development of NOx reduction system utilizing artificial neural network (ANN) and genetic algorithm (GA), Journal of Cleaner Production (2019).
[13] Xu, Feiyi, et al.: Training Feed-Forward Artificial Neural Networks with a Modified Artificial Bee Colony Algorithm, Neurocomputing (2019).
[14] Blum, Christian, and Krzysztof Socha.: Training feed-forward neural networks with ant colony optimization: An application to pattern classification, Fifth International Conference on Hybrid Intelligent Systems (HIS’05), IEEE, 2005.
[15] Chiba, Zouhair, et al.: Intelligent Approach to Build a Deep Neural Network Based IDS for Cloud Environment Using Combination of Machine Learning Algorithms, Computers & Security (2019).
[16] Vijayanand, R., D. Devaraj, and B. Kannapiran.: Intrusion detection system for wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based feature selection, Computers & Security 77, 304–314, 2018
[17] Akashdeep, I. Manzoor, and N. Kumar.: A feature reduced intrusion detection system using ANN classifier, Expert Syst. Appl., vol. 88, pp. 249–257, 2017.
[18] M. R. G. Raman, N. Somu, K. Kirthivasan, and V. S. S. Sriram.: A Hypergraph and Arithmetic Residue-based Probabilistic Neural Network for classification in Intrusion Detection Systems, Neural Networks, vol. 92, pp. 89–97, 2017
[19] L. Davis, Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.
[20] Engel J (1988) Teaching feed-forward neural networks by simulated annealing. Complex Syst 2:641–648
[21] Tavallaee M, Bagheri E, Lu W, Ghorbani AA.: A detailed analysis of the KDD CUP 99 data set. In: IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA, IEEE, pp 1–6, 2009

Python package imports

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib.colors
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, mean_squared_error
    from tqdm import tqdm_notebook
    from keras import regularizers  # to avoid overfitting
    from sklearn.preprocessing import OneHotEncoder  # binary classification 0 and 1
    from sklearn.datasets import make_blobs
    from sklearn import preprocessing

KDD-Train and Test

    # Load the NSL-KDD train and test files; they have no header row.
    df_train = pd.read_csv('KDDTrain+.txt', header=None, index_col=None)
    df_test = pd.read_csv('KDDTest+.txt', header=None, index_col=None)
    df_train.head()

Each row of the dataset has 43 columns (indices 0 to 42): the 41 features, the class label in column 41, and a final column that we won’t need. We can therefore remove the last column.

    df_train.drop(42, axis=1, inplace=True)
    df_test.drop(42, axis=1, inplace=True)

    # Label every attack record as 1 and every normal record as 0.
    df_train.loc[df_train[41] != 'normal', 41] = 1
    df_test.loc[df_test[41] != 'normal', 41] = 1
    df_train.loc[df_train[41] == 'normal', 41] = 0
    df_test.loc[df_test[41] == 'normal', 41] = 0
    df_train.groupby(41).count()

    # Separate features from labels and check the resulting shapes.
    X_train = df_train.drop(41, axis=1)
    y_train = df_train.loc[:, [41]]
    X_test = df_test.drop(41, axis=1)
    y_test = df_test.loc[:, [41]]
    print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

We can apply one-hot encoding (categorical encoding) to the data to handle the non-numeric, categorical columns.

    le = preprocessing.LabelEncoder()
    enc = OneHotEncoder()

Guys, a complete version of the code can be found here.

I hope you guys understood it!!

Translated from: https://medium.com/swlh/ga-ffnn-an-intelligent-classification-approach-for-signature-based-ids-b18a8dd2158d
