The Role of AI in Cyber Security

Artificial Intelligence (AI) is gaining increasing prominence in the field of cyber security. Statista forecasts that AI in cyber security will be worth USD 46.3 billion by 2027, more than four times the USD 10.5 billion recorded in 2020. Its adoption is driven by an ever-evolving threat landscape in which malicious actors are constantly finding new ways to attack. Attackers have become highly adept at weaponizing AI in their malware to improve the potency of their attacks. For example, we now have malware that intelligently learns the environment it’s in and executes malicious code only when specific conditions are met. With both attackers and defenders using AI, the side with the more capable AI tends to gain the upper hand. As a result, we’ve seen remarkable advancements in AI-based cyber security systems in recent years, particularly in threat detection. By leveraging AI technology, organizations can significantly augment their threat detection capabilities and strengthen their security posture. This is backed up by IBM's 2022 Cost of a Data Breach report, which found that "organizations that had a fully deployed AI and automation program were able to identify and contain a breach 28 days faster than those that didn’t, saving USD 3.05 million in costs."

We can all agree that the use of artificial intelligence in cyber security is a good thing, but have you ever wondered how it works? If you asked someone who uses or plans to use AI-enabled security technology to explain how it works, they likely would not be able to. But why does it matter? Well, this lack of understanding can lead to several problems. These include:

  • An inability to understand the value of AI-enabled cyber security products compared to non-AI products
  • An inability to articulate the value of investing in AI-enabled cyber security products to management
  • An inability to set the right expectations for AI-enabled cyber security products in terms of what they can and cannot do
  • An inability to differentiate AI-enabled cyber security products offered by different vendors
  • An inability to properly configure and operate AI-enabled cyber security products to maximize their effectiveness

This article bridges these gaps and enhances your understanding of how AI works in cyber security. But before we delve into the intricacies, it’s a good idea to back up and make sure we have the correct understanding of AI and some of its related jargon.

What is Artificial Intelligence?

Artificial Intelligence (AI) is the ability of computer systems to perform tasks that would typically require human intelligence, such as learning, reasoning, and problem-solving. This is achieved by using AI algorithms to process and analyze large amounts of data, which produces a trained AI model. The trained model uses patterns and insights discovered from the data to perform tasks like making predictions and generating output.

AI algorithm: An AI algorithm is a set of instructions that define how an AI system processes data. For example, the CNN (convolutional neural network) algorithm defines how an AI system analyzes visual imagery.

AI model: An AI model is a trained instance of an algorithm that captures the learned knowledge and is used to perform tasks. For example, a trained CNN model is used to recognize and classify new and unseen images.

What is Machine Learning?

Machine Learning (ML) and AI are often discussed as if they were separate technologies, but this is not the case. In fact, ML is a specific approach within AI that focuses on enabling machines to learn and improve from experience without explicit programming. Instead of giving precise instructions, ML algorithms learn patterns in large volumes of data to make predictions or decisions.

To sum up, AI is a broader concept that includes the entire field of developing intelligent machines, while ML serves as a crucial component within AI, providing the means for machines to achieve artificial intelligence.

AI Techniques Used in Cyber Security

AI in cyber security serves a variety of purposes, including threat detection, threat hunting, threat intelligence, and incident response. This article will focus exclusively on threat detection, which has truly embraced the use of AI. Among the widely adopted AI techniques in threat detection are Anomaly Detection, Natural Language Processing (NLP), Random Forests, and Graph Analysis. We will discuss these in greater detail, providing analogies and examples to help you understand their application and advantages. Let’s kick off with the easiest one to grasp – Natural Language Processing.

Natural Language Processing (NLP)

Natural language processing (NLP) is a branch of AI that enables computers to understand, interpret, and generate human language. It makes use of a variety of AI and machine learning algorithms to process and analyze textual data. Trained NLP models are used for tasks like translation, text classification, named entity recognition (NER), sentiment analysis, and question-answering.

NLP plays a crucial role in detecting threats that involve language-related elements. One example is the detection of malicious domains, which attackers use to distribute malware, steal credentials, exfiltrate data, and more. A popular way for attackers to trick us into trusting malicious domains is typosquatting, also known as URL hijacking. This technique involves creating domain names that are similar to legitimate ones but contain slight typographical variations, such as replacing the letter "o" with the number 0. To counteract this, we apply NLP to a large body of known domain names to learn elements like keywords, patterns, and character combinations that differentiate benign domains from malicious ones. This enables security tools to identify and block malicious domains like "g00gle.com" and "paypall.com". Apart from this, NLP is an important component in defense strategies against other text-based threats, such as phishing emails, webshells, and website defacements.
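To make the typosquatting idea concrete, here is a minimal Python sketch. It is a simple rule-based stand-in, not a trained NLP model: it normalizes a few digit-for-letter swaps and then measures edit distance to a short list of protected domains. The names KNOWN_DOMAINS, LOOKALIKES, and looks_like_typosquat are hypothetical and chosen only for this illustration.

```python
# Hypothetical list of domains to protect; a real system would learn from a large corpus.
KNOWN_DOMAINS = ["google.com", "paypal.com", "microsoft.com"]

# A few digit-for-letter swaps often seen in typosquatting (illustrative, not exhaustive).
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def looks_like_typosquat(domain: str, max_distance: int = 1):
    """Return the legitimate domain being imitated, or None if nothing matches."""
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None  # an exact match is the real domain
    normalized = domain.translate(LOOKALIKES)
    for legit in KNOWN_DOMAINS:
        if normalized == legit or edit_distance(normalized, legit) <= max_distance:
            return legit
    return None
```

With these assumptions, looks_like_typosquat("g00gle.com") returns "google.com" and looks_like_typosquat("paypall.com") returns "paypal.com". A learned model would replace the hand-picked swaps and threshold with patterns discovered from data.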

Random Forests

Before we dive into random forests, it’s a good idea first to have a basic understanding of decision trees.

A decision tree is a model used for making a prediction about a particular problem. Say you were asked to predict whether a random car will break down within the next year. You have a decision tree to help you make the prediction. The decision tree was created based on data of 1,000 "instances" of whether a car broke down or not within a year and its associated "features," such as car age, mileage, brand, service history, and driving conditions. The decision tree is modeled in a way that best represents the dataset, with the most predictive feature at the top of the tree and the least predictive feature at the bottom. To make a prediction about a new car, you simply take the features of this instance and work your way through the decision tree.
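Working through a decision tree can be sketched in a few lines of Python. The tree below is written by hand purely for illustration; in practice it would be learned from the 1,000 instances, and the feature names and thresholds here are assumptions.

```python
# A hand-written tree for illustration only; real trees are learned from data.
# Each internal node tests one feature against a threshold; leaves are predictions.
TREE = {
    "feature": "age_years", "threshold": 10,
    "below": {
        "feature": "mileage_km", "threshold": 150_000,
        "below": "no breakdown",
        "at_or_above": "breakdown",
    },
    "at_or_above": {
        "feature": "serviced_regularly", "threshold": 1,  # 0 = no, 1 = yes
        "below": "breakdown",
        "at_or_above": "no breakdown",
    },
}


def predict(node, car):
    """Walk from the root to a leaf, following the branch each test selects."""
    while isinstance(node, dict):
        branch = "below" if car[node["feature"]] < node["threshold"] else "at_or_above"
        node = node[branch]
    return node


new_car = {"age_years": 4, "mileage_km": 60_000, "serviced_regularly": 1}
old_car = {"age_years": 12, "mileage_km": 90_000, "serviced_regularly": 0}
```

Here predict(TREE, new_car) follows the "below" branch twice and ends at "no breakdown", while predict(TREE, old_car) ends at "breakdown".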

In the above example, a random forest will take many random samples of the dataset, say, 100 samples with 10 cars per sample. A decision tree is then created for each sample, with each decision tree using only a random subset of features (e.g., car age and brand for Tree 1, mileage and service history for Tree 2, and so on). The relevant features of a new car are passed through each decision tree to produce 100 predictions. In the end, the prediction with the most votes is taken as the final prediction.
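The sampling-and-voting procedure can be sketched as follows. This is a deliberately tiny version under stated assumptions: the data is made up, each "tree" is a one-level decision stump, and each stump sees a random subset of features as described above. Production random forests usually grow deeper trees and pick a random feature subset at every split.

```python
import random
from collections import Counter

# Made-up toy data for illustration: label 1 = broke down within a year.
CARS = [
    ({"age": 12, "mileage": 180, "services": 0}, 1),
    ({"age": 15, "mileage": 220, "services": 0}, 1),
    ({"age": 11, "mileage": 160, "services": 0}, 1),
    ({"age": 14, "mileage": 200, "services": 0}, 1),
    ({"age": 13, "mileage": 190, "services": 0}, 1),
    ({"age": 2, "mileage": 30, "services": 2}, 0),
    ({"age": 4, "mileage": 60, "services": 1}, 0),
    ({"age": 3, "mileage": 45, "services": 2}, 0),
    ({"age": 1, "mileage": 15, "services": 1}, 0),
    ({"age": 5, "mileage": 70, "services": 2}, 0),
]


def train_stump(sample, features):
    """Pick the feature/threshold split that misclassifies the fewest sampled rows."""
    best = None
    for f in features:
        for row, _ in sample:
            t = row[f]
            for above_label in (0, 1):
                errors = sum(
                    (above_label if r[f] >= t else 1 - above_label) != y
                    for r, y in sample
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above_label)
    return best[1:]  # (feature, threshold, label predicted at or above threshold)


def predict_stump(stump, row):
    f, t, above_label = stump
    return above_label if row[f] >= t else 1 - above_label


def train_forest(data, n_trees=25, n_features=2, seed=0):
    rng = random.Random(seed)
    all_features = sorted(data[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]        # random sample, with replacement
        features = rng.sample(all_features, n_features)  # random feature subset
        forest.append(train_stump(sample, features))
    return forest


def predict_forest(forest, row):
    """Each tree votes; the most common prediction wins."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


forest = train_forest(CARS)
```

Calling predict_forest(forest, {"age": 20, "mileage": 300, "services": 0}) collects one vote per stump and returns the majority label. An odd number of trees avoids ties when there are two classes.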

Random forests can be used in various cyber security scenarios to detect whether new instances of an event are malicious. One such scenario is the detection of brute force attacks, where attackers attempt to gain unauthorized access to an account or system by trying different combinations of login credentials. A popular target of brute force attacks is the remote desktop service. To detect such attacks, a random forest algorithm is applied to learn the patterns of RDP (remote desktop protocol) logs, which record information about remote desktop connections. The random forest model can then predict whether a new remote desktop connection is malicious based on features like the number of failed login attempts, login success ratio, the time interval between login attempts, and login location. Other applications of random forests include the detection of command-and-control communication and DNS tunneling.
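Before a model can learn from RDP logs, the raw events have to be turned into features like those listed above. The sketch below assumes a hypothetical, simplified log format (a timestamp and a success flag per login attempt from one source address); real RDP logs carry more fields, and login location is omitted here.

```python
from datetime import datetime
from statistics import mean


def login_features(events):
    """Summarize one source's login attempts.

    events: list of (ISO timestamp string, success flag) tuples, sorted by time.
    This format is an assumption made for the example.
    """
    times = [datetime.fromisoformat(ts) for ts, _ in events]
    failures = sum(1 for _, ok in events if not ok)
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    return {
        "failed_attempts": failures,
        "success_ratio": (len(events) - failures) / len(events),
        "mean_gap_seconds": mean(gaps) if gaps else 0.0,
    }


# Five failed attempts, two seconds apart: a pattern typical of automated guessing.
attack = [
    ("2024-01-01T03:00:00", False),
    ("2024-01-01T03:00:02", False),
    ("2024-01-01T03:00:04", False),
    ("2024-01-01T03:00:06", False),
    ("2024-01-01T03:00:08", False),
]
features = login_features(attack)
```

The resulting dictionary (5 failed attempts, a success ratio of 0, and a 2-second mean gap) is the kind of feature vector a trained random forest would then classify as malicious or benign.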

Anomaly Detection

Anomaly detection is a technique used to identify unusual or abnormal instances and patterns in a dataset. The process starts by training a machine learning model to learn the normal patterns and characteristics of a dataset. The trained model is then used to detect anomalies in new, unseen data.

Anomaly detection is a crucial technique in the detection of sophisticated threats that manage to breach the network. The idea is that, despite the stealthiness of advanced techniques, there must be ways in which they differ from normal behavior and, therefore, can be picked up by powerful anomaly detection models.

Anomaly detection systems collect traffic from across the network. They use AI and ML algorithms to analyze traffic metadata over a period of time to build baselines of normal network traffic and behavior. This can include baselines of normal behavior for users, devices, and applications using what is known as User and Entity Behavior Analytics (UEBA). The system then compares real-time traffic with the learned baselines to detect and flag anomalous traffic or behavior.
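The baseline-then-compare loop can be illustrated with a very small statistical sketch. It assumes a single made-up metric (outbound megabytes per hour for one host) and a simple z-score threshold; real systems build far richer baselines with ML models, so treat this only as an outline of the idea.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Learn what 'normal' looks like: the mean and spread of past observations."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations away from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Made-up history: outbound MB per hour for one host during normal operation.
history = [48, 52, 50, 47, 53, 51, 49, 50]
baseline = build_baseline(history)
```

With this history the baseline is a mean of 50 MB and a standard deviation of 2 MB, so is_anomalous(54, baseline) is False while is_anomalous(900, baseline) is True.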

Anomaly detection is enhanced by correlation analysis. In cyber security, correlation analysis is the analysis and identification of relationships between different security events, logs, and indicators of compromise (IOCs). By learning the relationships between different data, correlation analysis enables anomaly detection to establish baselines based on the normal behavior of one indicator with respect to other indicators. This enables more accurate detection of abnormal and suspicious network activities, with fewer false positives.

To illustrate this concept, let’s imagine that one of your colleagues has been in a really good mood for the past week. You suspect that something is going on but can’t figure it out based on this single observation. However, you've also noticed that they’ve started dressing better, leaving work on time, and always using their mobile phone. By connecting these dots and using some common sense, you are confident that they’ve recently started a romantic relationship.

Turning back to cyber security, this technique is highly effective in the detection of data exfiltration, which is often the most harmful stage of cyber-attacks. Detecting data exfiltration is challenging because attackers employ sophisticated techniques to blend exfiltrated data within normal network traffic. These techniques include fragmenting data into smaller chunks, scheduling exfiltration during normal business hours, using different protocols, and sending data to trusted cloud storage services. Nevertheless, a powerful anomaly detection model that simultaneously analyzes multiple dimensions of network data is well-equipped to detect concealed data exfiltration. This could include traffic metadata, file and data access logs, user behavior, communication patterns, and so on. This enhanced approach improves the detection of data exfiltration, especially covert espionage operations carried out by APT (advanced persistent threat) groups.
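The value of judging one indicator relative to another can be shown with a small sketch. It assumes two made-up hourly metrics for a host, session count and megabytes uploaded, and fits a straight line between them. An upload volume that sits inside the historical range can still stand out once the session count is taken into account. The function names are hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def correlated_zscore(x, y, xs, ys):
    """How unusual is y given x, measured against the learned relationship?"""
    slope, intercept = fit_line(xs, ys)
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(xs, ys)]
    return (y - (slope * x + intercept)) / stdev(residuals)


def plain_zscore(y, ys):
    """How unusual is y on its own, ignoring every other indicator?"""
    return (y - mean(ys)) / stdev(ys)


# Made-up history: roughly 2 MB uploaded per session.
sessions = [10, 20, 30, 40, 50, 60]
upload_mb = [21, 39, 61, 79, 101, 119]

# Suspicious hour: only 10 sessions, yet 100 MB uploaded.
alone = plain_zscore(100, upload_mb)
in_context = correlated_zscore(10, 100, sessions, upload_mb)
```

Viewed alone, 100 MB is within one standard deviation of the historical mean and would not be flagged. Viewed against the session count, it is far outside the learned relationship, which is the kind of concealed exfiltration a multi-dimensional model is meant to surface.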

Graph Analysis

Graph analysis uses ML algorithms to analyze a graphical representation of relationships between a group of objects called "entities." In such a graph, each entity is depicted as a node, while the relationships between them are represented as lines or "edges." Graph analysis enables us to gain valuable insights by discovering patterns within complex networks.

Suppose you want to gain a deeper understanding of the friendships among students in a specific year group. In that case, you can plot a graph that represents the friendships, with the students as nodes and friendships as edges. You can then apply ML algorithms to analyze the graph and extract insights about the friendships. For instance, you can identify the individual with the most friends (the node with the highest number of connections) and friendship groups (multiple connections within a group of nodes).
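The friendship example can be sketched with plain Python data structures. The student names are made up, and friendship groups are approximated here by connected components, a simpler stand-in for the community detection algorithms a real graph analysis would use.

```python
from collections import defaultdict

# Made-up friendships; each pair is an undirected edge between two student nodes.
FRIENDSHIPS = [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Ana", "Dee"), ("Eli", "Fay")]


def build_graph(edges):
    """Adjacency sets: every node maps to the set of its neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def most_connected(graph):
    """The node with the highest degree, i.e. the most friends."""
    return max(graph, key=lambda node: len(graph[node]))


def friend_groups(graph):
    """Connected components found with a depth-first search."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


graph = build_graph(FRIENDSHIPS)
```

In this graph most_connected(graph) is "Ana", who has three friends, and friend_groups(graph) finds two groups: Ana, Ben, Cy, and Dee in one, and Eli and Fay in the other.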

Graph analysis can be used to address cyber security events that involve a complex web of connections. A perfect application scenario is the detection of botnets, which are intricate networks of malware-infected computers or devices controlled by a central attacker. Botnets are often used for malicious purposes like distributed denial-of-service (DDoS) attacks, malware distribution, and cryptomining. They are difficult to detect due to their ability to distribute control among numerous compromised devices, making it a challenge to identify the central source of malicious activity. Nonetheless, we can apply graph analysis on network connections to reveal insights about the relationships and patterns between devices, such as devices with a high degree of centrality or abnormal connections. Security teams can use this intelligence to identify the controlling device and mitigate the botnet.
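As a rough sketch of the centrality idea, the code below treats each observed connection as a directed edge from an internal host to a destination and ranks unfamiliar destinations by how many distinct hosts contact them. The addresses, the known_good allowlist, and the min_hosts threshold are all assumptions for illustration; real botnet detection combines many more signals.

```python
from collections import defaultdict


def candidate_controllers(connections, known_good, min_hosts=3):
    """Rank unknown destinations by the number of distinct internal hosts contacting them."""
    callers = defaultdict(set)
    for src, dst in connections:
        if dst not in known_good:
            callers[dst].add(src)
    ranked = sorted(callers.items(), key=lambda item: len(item[1]), reverse=True)
    return [(dst, len(srcs)) for dst, srcs in ranked if len(srcs) >= min_hosts]


# Made-up connection log using documentation address ranges.
connections = [
    ("10.0.0.1", "203.0.113.7"),
    ("10.0.0.2", "203.0.113.7"),
    ("10.0.0.3", "203.0.113.7"),
    ("10.0.0.4", "203.0.113.7"),
    ("10.0.0.1", "198.51.100.20"),
    ("10.0.0.1", "updates.example.com"),
    ("10.0.0.2", "updates.example.com"),
    ("10.0.0.3", "updates.example.com"),
]
suspects = candidate_controllers(connections, known_good={"updates.example.com"})
```

Here the only suspect is 203.0.113.7, contacted by four different internal hosts. A destination with unusually high in-degree is not proof of a controller, but it gives security teams a concrete starting point for investigation.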

A Leader in AI-Enabled Cyber Security

Sangfor Technologies is a pioneer of AI-enabled cyber security technology, releasing the world’s first AI-enabled next-generation firewall, Sangfor NGAF. Sangfor NGAF and Sangfor Endpoint Secure are both integrated with Sangfor Engine Zero, a state-of-the-art AI-powered malware detection engine. To maximize the accuracy of malware detection, Engine Zero is tested against millions of malware samples to enable it to run and teach itself, expanding its capacity to discover new and unknown malware. NGAF’s powerful capabilities and innovations have seen Sangfor recognized as a "Visionary" in the Gartner Magic Quadrant for Network Firewalls, awarded the "Recommended Rating" in the CyberRatings Enterprise Firewalls Test, and honored with the Frost & Sullivan "Company of the Year" award. Endpoint Secure received the "Top Product" award from AV-Test for achieving perfect test scores for protection, performance, and usability, as well as the "Advanced Approved Endpoint Protection" certificate for providing 100% protection against ransomware attacks.

Sangfor Cyber Command, our Network Detection & Response (NDR) solution, is the epitome of AI-driven security technology. Cyber Command harnesses multiple purpose-built AI models to detect a wide range of advanced threats hidden in network traffic. By using AI to analyze and correlate events from across the network, Cyber Command connects the dots between events from various data sources to uncover threats that are missed by point solutions. Event correlation provides security teams with highly contextualized alerts to streamline investigation and threat hunting, enabling rapid identification and response to threats. Cyber Command has earned Sangfor multiple industry recognitions, including a Top 5 NDR vendor in the world based on Gartner Market Share data, a "Representative Vendor" in Gartner Market Guide for NDR, and a "Notable Vendor" in Forrester Network Analysis and Visibility Landscape. Watch this webinar to learn how Cyber Command utilizes purpose-built AI models.

 

Sample of Purpose-Built AI Models Used in Sangfor Security Products

Scenario | Data | Algorithm | Products
DNS hidden tunnel | DNS Logs | Random Forests | Cyber Command, NGAF, Endpoint Secure
DGA domain name | DNS Logs | NLP, Graph Analysis | Cyber Command, NGAF, Endpoint Secure
New malicious domain name | DNS Logs | NLP, Anomaly Detection | Cyber Command, NGAF, Endpoint Secure
Botnet family variant tracking | DNS Logs | Graph Analysis | Cyber Command, NGAF, Endpoint Secure
HTTPS C&C | HTTPS Logs | Random Forests | Cyber Command, NGAF
Encrypted RDP and SSH slow brute force | RDP Logs, SSH Logs | Random Forests | Cyber Command, NGAF, Endpoint Secure
Website defacement | Web Access Logs | NLP | Cyber Command, NGAF
Engine Zero file antivirus | Files | XGBoost | Cyber Command, NGAF, Endpoint Secure
Webshell | HTTP Logs | NLP | Cyber Command, NGAF
Abnormal outbound behavior | Sessions | Anomaly Detection | Cyber Command
Abnormal login behavior | Login Logs | Anomaly Detection | Cyber Command
Attack path recovery | Multiple Logs | Knowledge Mapping | Cyber Command

 
