Log files record details about the state of a system and about attacks against it. Suppose we have a system connected to the Internet with SSH enabled. Many attackers are trying to log in to the system. We need to design an intrusion detection system that identifies users who fail their login attempts. Such failures may come from an attacker running a dictionary attack. The script should generate a report with the following details:
We can use awk's associative arrays to solve this problem in different ways. Words are runs of alphabetic characters, delimited by spaces or periods. First, we parse all the words in a given file, and then we count the occurrences of each word. Words can be parsed using regular expressions with tools such as sed, awk, or grep.
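As a minimal sketch of the parse-then-count approach, the function below extracts words with grep and tallies them in an awk associative array. The function name word_freq is hypothetical, the -o option of grep is assumed to be available (it is in GNU and BSD grep), and words are treated as case-sensitive, which may or may not be what a given report needs.

```shell
# word_freq FILE: print each word in FILE followed by its number of occurrences.
# The name word_freq is illustrative; it is not taken from the original text.
word_freq() {
    # grep -E -o prints every maximal run of alphabetic characters on its own line,
    # so spaces, periods, and other non-letters act as delimiters.
    grep -E -o '[[:alpha:]]+' "$1" |
        # count[] is an awk associative array indexed by the word itself.
        awk '{ count[$0]++ }
             END { for (word in count) printf "%s %d\n", word, count[word] }' |
        # awk iterates over array keys in an unspecified order, so sort for stable output.
        sort
}
```

Calling word_freq with a file name prints one line per distinct word. Piping the words through tr to fold case before counting would merge "The" and "the" if that is preferred.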
Using this hash, we can compare it against a list of hashes already computed. If the hash matches, we have seen the contents of this file before, so we can delete the file. If the hash is new, we record the entry and move on to calculating the hash of the next file, until all files have been hashed.
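The loop described above might be sketched as follows. This is an illustration under several assumptions: the function name list_duplicates is hypothetical, sha256sum from GNU coreutils stands in for whichever hash the recipe uses, and file names are assumed not to contain tabs or newlines. The sketch prints duplicate paths instead of deleting them, so the list can be reviewed first.

```shell
# list_duplicates DIR: print the path of each regular file in DIR whose contents
# match a file that appeared earlier in the listing.
# The name list_duplicates and the choice of sha256sum are assumptions.
list_duplicates() {
    for file in "$1"/*; do
        # Skip directories and other non-regular entries.
        [ -f "$file" ] || continue
        # Read the file on stdin so the output is just "<hash>  -".
        hash=$(sha256sum < "$file" | awk '{ print $1 }')
        # Emit "<hash><TAB><path>" for the comparison stage.
        printf '%s\t%s\n' "$hash" "$file"
    done |
        # seen[] records every hash encountered so far; the pattern is true
        # only from the second occurrence of a hash onward.
        awk -F '\t' 'seen[$1]++ { print $2 }'
}
```

Once the printed list looks right, the paths can be passed to rm to perform the deletion step.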