What is a Confusion Matrix?
A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.
It is generally used where the result is binary: true or false.
A confusion matrix consists of four parts: true positive, false positive, false negative, and true negative.
Consider this diagram:
The things to note from the diagram are:
- There are two possible predicted classes: “yes” and “no”. If we were predicting the presence of a disease, for example, “yes” would mean they have the disease, and “no” would mean they don’t have the disease.
- The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).
- Out of those 165 cases, the classifier predicted “yes” 110 times, and “no” 55 times.
- In reality, 105 patients in the sample have the disease, and 60 patients do not.
The definitions of the four confusion matrix terms:
- true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
- true negatives (TN): We predicted no, and they don’t have the disease.
- false positives (FP): We predicted yes, but they don’t actually have the disease. (Also known as a “Type I error.”)
- false negatives (FN): We predicted no, but they actually do have the disease. (Also known as a “Type II error.”)
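The totals listed earlier (165 predictions, 110 predicted “yes”, 105 actual “yes”) fix the margins of the table but not its individual cells. As a minimal sketch, assuming the number of true positives is 100 (a value chosen here for illustration, since the diagram's cell values are not given in the text), the other three cells follow by subtraction. The function name `cells_from_totals` is hypothetical.

```python
def cells_from_totals(total, predicted_yes, actual_yes, true_positives):
    """Derive the four confusion-matrix cells from the table totals and one known cell."""
    false_positives = predicted_yes - true_positives  # predicted yes, actually no
    false_negatives = actual_yes - true_positives     # predicted no, actually yes
    true_negatives = total - true_positives - false_positives - false_negatives
    cells = {
        "TP": true_positives,
        "FP": false_positives,
        "FN": false_negatives,
        "TN": true_negatives,
    }
    if min(cells.values()) < 0:
        raise ValueError("totals are inconsistent with the given number of true positives")
    return cells


# Totals come from the text; true_positives=100 is an assumed, illustrative value.
cells = cells_from_totals(total=165, predicted_yes=110, actual_yes=105, true_positives=100)
print(cells)
```

With that assumption, the predicted “no” total (FN + TN) comes out to 55 and the actual “no” total (FP + TN) to 60, matching the figures in the list.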
And what about cybercrime and cyber attacks?
Cybercrime, also called computer crime, is the use of a computer as an instrument to further illegal ends, such as committing fraud, trafficking in child pornography and intellectual property, stealing identities, or violating privacy. Cybercrime, especially through the Internet, has grown in importance as the computer has become central to commerce, entertainment, and government.
A cyber attack is an unauthorized attack on servers or computers on a public or private network, in which the attacker seeks to expose, damage, alter, disable, or steal data, or to change the system configuration.
Carrying out a cyber attack is a cybercrime.
Here are some of the common types of cyber attacks:
- Malware. Malware is a term used to describe malicious software, including spyware, ransomware, viruses, and worms.
- Man-in-the-middle attack.
- Denial-of-service attack.
- SQL injection.
- Zero-day exploit.
- DNS Tunneling.
From a corporate point of view, the attacks mentioned above are very dangerous and can cause heavy losses when a company falls victim to any of them.
The techniques used to prevent cyber attacks include:
- Using keys and certificates for SSH login to avoid spoofing.
- End-to-end encryption, which greatly helps to avoid attacks.
- Automating the process of cyber attack monitoring.
- Using a dedicated hardware firewall server.
- Using encrypted cloud storage.
There are many more methods that are not covered here.
Using Machine Learning
As I said, automating attack monitoring can hugely decrease the cost of monitoring, since you don’t have to pay people to do the same task. But a cyber attack can be devastating and can cause losses in the millions, so you cannot leave a single point of failure; human monitoring is always necessary.
Machines make mistakes, and so do humans, but the chance that both make a mistake at the same time may be as small as one in a million. So we deploy both for the same task, so that one’s mistake can be covered by the other. Humans are also more trusted, as they have years of experience in detecting cybercrime.
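To make the “one in a million” figure concrete: if the automated system and the human analyst each miss an attack once in a thousand times, and their mistakes are independent, the chance that both miss the same attack is the product of the two rates. Both miss rates and the independence assumption are illustrative, not measured values.

```python
from fractions import Fraction

# Assumed miss rates, for illustration only (not measured values).
machine_miss_rate = Fraction(1, 1000)
human_miss_rate = Fraction(1, 1000)

# If the two kinds of mistake are independent, the probability that both
# miss the same attack is the product of the individual miss rates.
both_miss = machine_miss_rate * human_miss_rate
print(both_miss)  # 1/1000000
```

In practice the two may fail on the same hard cases, in which case the mistakes are not independent and the combined miss rate would be higher than this product.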
This is where the confusion matrix comes in. Security systems generally report the security status as a boolean (true or false), so their effectiveness can be shown in a table called a confusion matrix. From it we get two types of error: false positive and false negative.
False Negative -
It means that the result is actually true but the system returned false.
Suppose a company’s security system starts beeping that security has been breached and the system is not safe (the negative answer to “is the system safe?”) when actually no cyber attack is taking place.
This is a false negative. It falls under “Type II” errors and is not very dangerous. Note that this takes “the system is safe” as the positive class; if “attack” is taken as the positive class instead, the same false alarm would be called a false positive.
False Positive -
It means the result is actually false but, because of the system’s inefficiency, the result shown to you is true.
Suppose a company’s system keeps reporting that you are safe, “all is well”, giving the positive message that the system is safe. Then a cyber attack takes place, the system fails to detect it, and the message stays positive. The message should be negative, alerting everyone that the system is under attack, but instead everything looks positive.
This type of error is a Type I error. It is very dangerous and can cause huge losses to a company, so companies are always on the lookout for it; these errors are the main players that ruin the system.
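Both error types can be counted from a record of what the system reported versus what was really happening. The sketch below follows this section’s convention that the positive class is “the system is safe”; the function name `count_outcomes` and the sample observations are made up for illustration.

```python
def count_outcomes(actual_safe, reported_safe):
    """Count confusion-matrix cells, treating "system is safe" (True) as the positive class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, reported in zip(actual_safe, reported_safe):
        if reported and actual:
            counts["TP"] += 1  # reported safe, really safe
        elif not reported and not actual:
            counts["TN"] += 1  # alarm raised during a real attack
        elif reported and not actual:
            counts["FP"] += 1  # reported safe during an attack (missed attack)
        else:
            counts["FN"] += 1  # alarm raised while safe (false alarm)
    return counts


# Hypothetical observations: True means "safe", False means "under attack".
actual_safe = [True, True, False, True, False, True]
reported_safe = [True, False, True, True, False, True]

counts = count_outcomes(actual_safe, reported_safe)
print(counts)
```

In the sample data, the second observation is a false alarm (counted as FN) and the third is a missed attack (counted as FP).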