Malicious Resource Detection with Python, Wireshark and Virustotal

Although Wireshark is a very useful tool for network forensics, analysing a massive number of packets may call for additional tools to find malicious resources such as domain names, IP addresses and URLs.

In this article, we will use both Wireshark (Tshark) and the Virustotal API to discover malicious resources in a packet capture.

Virustotal API

You may have heard of www.virustotal.com, which provides a web interface where you can enter a URL or IP address, or upload a file, to view threat, content and reputation analysis. It also provides an API that lets you upload and scan files and submit domain names, IP addresses and URLs for scanning. After scanning, the API returns a report in JSON format.

Even though the API is a free service, you need to sign up to get an API key that authorizes you to use the service. The steps to obtain the key are below.

1) Visit https://www.virustotal.com/gui/home/upload and click the "Sign up" button to register.

2) After signing up, log in with your credentials.

3) Click on your account in the top right corner and a menu appears as shown below.

4) Click on "API Key" and you will see your API key. Mine is shown below.

The free API key has limited access. You can make 4 requests per minute, your daily quota is 500 requests (lookups), and the monthly quota is 15.5 K lookups. There is also a paid version of the API that allows customers to examine resources or any file uploaded to the service. Of course, you can upgrade to premium, but the free version is good enough for testing purposes.
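To make the per-minute limit concrete, here is a minimal sketch of a throttle that spaces calls evenly. The names `min_interval` and `Throttle` are my own, not part of any Virustotal client, and the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time

REQUESTS_PER_MINUTE = 4  # free-tier limit quoted above


def min_interval(requests_per_minute):
    """Smallest pause, in seconds, that stays within the per-minute limit."""
    return 60.0 / requests_per_minute


class Throttle:
    """Blocks just long enough that calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


print(min_interval(REQUESTS_PER_MINUTE))  # seconds between requests
```

At 4 requests per minute the theoretical minimum gap is 15 seconds; the script later in this article pauses 16 seconds, which leaves a small safety margin.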

So how do we extract domain names, IP addresses and URLs from a packet capture file? With the Python Pyshark module.

Pyshark Module

Pyshark is a Python wrapper for Tshark, the terminal-oriented version of Wireshark. In other words, Tshark is Wireshark without a GUI. When you use the Pyshark module, Python spawns (creates) a Tshark subprocess. To understand how the module works with Tshark, I coded a simple function to filter some packets, using the “test_pcap.pcapng” file. As I executed the code, I used the "Process Explorer" tool to follow the process creation.
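As a rough illustration of what "wrapping Tshark" means, the sketch below builds a Tshark command line by hand. The helper `build_tshark_command` is hypothetical, and the arguments are not necessarily the exact ones Pyshark passes to its subprocess; they are simply standard Tshark options for reading a file, applying a display filter and printing JSON.

```python
import shutil


def build_tshark_command(pcap_path, display_filter):
    """Return an argv list that reads a capture file and applies a display filter."""
    return [
        "tshark",
        "-r", pcap_path,       # read packets from a file instead of a live interface
        "-Y", display_filter,  # display filter, same syntax as the Wireshark filter bar
        "-T", "json",          # print the dissected packets as JSON
    ]


cmd = build_tshark_command("test_pcap.pcapng", "dns")
print(" ".join(cmd))

# Pyshark needs the tshark binary; shutil.which returns None if it is not on PATH.
print("tshark found at:", shutil.which("tshark"))
```

Running the printed command in a terminal is a quick way to check that Tshark itself works before debugging anything on the Python side.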

You can download it from here: https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer

The following screenshot shows information about Tshark subprocess when it was created.

Extracting malicious resources with Pyshark

I assume you already have a capture file. We will extract the following resources from it.

Domain Names from DNS packets
IP addresses from IP headers
Server names (domain names) from TLS Client Hello packets
URLs from HTTP/HTTPS requests (unless you provide the SSL/TLS keys to Wireshark, you will not be able to obtain the URLs from HTTPS traffic. For more information, please read this article: https://www.golinuxcloud.com/wireshark-decrypt-ssl-tls-tutorial/)

Step-1: Importing required Python modules

Import modules below:

import requests 
import json
import time
import pyshark
from ipaddress import ip_address

Here,

“requests” module will be used to make GET requests to the Virustotal API (version 2).
“json” module will be used to parse the JSON response from the API into a Python dictionary.
“time” module will be used to pause 16 seconds between requests, since the free API is limited to 4 requests per minute.
“pyshark” module will be used to extract resources from the capture file.
“ip_address” will be used to filter out private IP addresses, since our capture file contains them and Virustotal has no information about them.
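The private-address check can be tried on its own before touching a capture file. This is a small sketch using only the standard library; the sample addresses are made up for illustration.

```python
from ipaddress import ip_address

# Sample addresses chosen for illustration only
samples = ["192.168.1.10", "10.0.0.5", "172.16.3.1", "8.8.8.8", "1.1.1.1"]

# Keep only the addresses that are not in a private range
public_only = [addr for addr in samples if not ip_address(addr).is_private]
print(public_only)  # ['8.8.8.8', '1.1.1.1']
```

Only the public addresses are worth sending to Virustotal, which also saves lookups from the limited quota.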

Step-2: Creating a display filter for interesting traffic

Create a function that takes a file path and a display filter. Since Pyshark is just a wrapper around Tshark, you can use the same filter syntax as in Wireshark, including display filters you have saved there.

python

def filter_packets(file_path, disp_filter):
    # capture only interesting traffic
    capture = pyshark.FileCapture(file_path, display_filter=disp_filter)
    return capture

Step-3: Creating a function for extracting DNS resource records

Create a function that extracts Domain Names from DNS packets.

python

def dns(file_path):
    # this list will store all domain names in the dns packets
    resource_list = []

    # filters only dns packets
    packets = filter_packets(file_path, "dns")
    for pkt in packets:
        # if the packet contains a query
        if pkt.dns.qry_name:
            resource_list.append(pkt.dns.qry_name)
    packets.close()
    return resource_list

Step-4: Creating a function that extracts IP addresses from IP headers

Create a function that extracts IP addresses from IP headers.

python

def ip(file_path):
    # this list will store all source IP addresses except the private ones
    resource_list = []

    # filters only IP packets
    packets = filter_packets(file_path, "ip")
    for pkt in packets:
        if pkt.ip:
            src_ip = ip_address(pkt.ip.src)

            # check if it is a private ip or not
            if not src_ip.is_private:
                resource_list.append(pkt.ip.src)
    packets.close()
    return resource_list

Step-5: Creating a function that extracts Server Names from TLS client hello packets

Create a function that extracts Server Names from TLS client hello packets.

python

def tls(file_path):
    # this list will store server names from TLS client hello
    resource_list = []

    # only TLS client hello packets over TCP, no QUIC protocol which uses UDP
    packets = filter_packets(file_path, "tls.handshake.type == 1 and tcp")

    for pkt in packets:
        # a client hello without the server name extension has no such field
        server_name = getattr(pkt.tls, "handshake_extensions_server_name", None)
        if server_name:
            resource_list.append(server_name)
    packets.close()
    return resource_list

Step-6: Creating a function that extracts URLs from http/https packets

Create a function that extracts URLs from HTTP/HTTPS packets.

python

def http(file_path):
    # this list will store URLs from http and https packets
    resource_list = []

    # only requests like get, post, delete, put, trace, options
    # no SSDP, only http methods
    packets = filter_packets(file_path, "http.request.method and tcp")

    for pkt in packets:
        if pkt.http.request_full_uri:
            resource_list.append(pkt.http.request_full_uri)
    packets.close()
    return resource_list
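The four extraction functions share the same shape: iterate over packets, read one field from one layer, and collect it. A generic helper could look like the sketch below. `extract_field` is a hypothetical name, and the stand-in packets are built with `SimpleNamespace` so the example runs without a capture file. It assumes that a missing layer or field raises `AttributeError`, which is what makes `getattr` with a default work.

```python
from types import SimpleNamespace


def extract_field(packets, layer_name, field_name):
    """Collect one field from every packet that has it, skipping the rest."""
    values = []
    for pkt in packets:
        layer = getattr(pkt, layer_name, None)
        value = getattr(layer, field_name, None) if layer is not None else None
        if value:
            values.append(value)
    return values


# Stand-in packets so the sketch runs without a capture file (made-up data).
fake_packets = [
    SimpleNamespace(dns=SimpleNamespace(qry_name="example.com")),
    SimpleNamespace(dns=SimpleNamespace()),  # DNS layer without a query name
    SimpleNamespace(tls=SimpleNamespace()),  # no DNS layer at all
    SimpleNamespace(dns=SimpleNamespace(qry_name="example.org")),
]

print(extract_field(fake_packets, "dns", "qry_name"))  # ['example.com', 'example.org']
```

With a real capture, the same helper would be called on the result of `filter_packets`, for example `extract_field(filter_packets(path, "dns"), "dns", "qry_name")`, followed by closing the capture.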

Step-7: Creating a function that uses Virustotal's API to detect the malicious resources

Create a function that asks Virustotal's API whether a resource is malicious. When it is malicious, the API returns a "positives" attribute with a number higher than 0 (zero). The following output shows the format of the data returned by Virustotal's API.

python

def ask_virustotal(resource_list):
    # this key will authorize our requests
    api_key = "465a6ec6d05f0d5c7e6b73f84c36e7f4e4a7ea7e63c294958585564e2ede6e57"

    for resource in resource_list:
        # Virustotal API endpoint
        url = 'https://www.virustotal.com/vtapi/v2/url/report'
        params = {'apikey': api_key, 'resource': resource}
        response = requests.get(url, params=params)

        # if the resource is malicious then list which antivirus vendors have marked it
        try:
            response_json = json.loads(response.content)
            if response_json['positives'] > 0:
                antivir_list = []

                for antivir in response_json['scans']:
                    if response_json['scans'][antivir]['detected']:
                        antivir_list.append(antivir)
                print(response_json['resource'])
                print("Resource above is found malicious by", antivir_list)
        except (KeyError, ValueError):
            # empty or non-JSON reply, or a reply without a report for this resource
            pass

        # since we are using the free version, we cannot make more than 4 requests per minute
        # we will limit that by making the script sleep for 16 seconds after each request
        time.sleep(16)
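The parsing step can be tested separately from the network call. The sketch below uses only the fields the function above relies on ("positives", "scans", "detected" and "resource"). The sample report is hand-written for illustration and is not real Virustotal output, and `detecting_engines` is a hypothetical helper name.

```python
def detecting_engines(report):
    """Return the sorted names of engines that flagged the resource."""
    if report.get("positives", 0) <= 0:
        return []
    scans = report.get("scans", {})
    return sorted(name for name, result in scans.items() if result.get("detected"))


# Hand-written sample for illustration; not real Virustotal output.
sample_report = {
    "resource": "http://malicious.example/payload",
    "positives": 2,
    "scans": {
        "EngineA": {"detected": True},
        "EngineB": {"detected": False},
        "EngineC": {"detected": True},
    },
}

print(sample_report["resource"], detecting_engines(sample_report))
```

Separating the parsing from the request this way lets you check the logic against saved responses without spending any of the daily quota.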

Step-8: Testing your code

Put everything together: call the resource extraction function you want, then ask Virustotal.

I will use the http function on a downloaded sample pcap file that contains malicious resources.

python

# since the capture file contains duplicate resources (IP addresses, domain names, URLs etc.)
# we will make them unique first, then send them to the Virustotal API
resources = set(http("malicious.pcap"))
ask_virustotal(resources)
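A set removes duplicates but does not keep the order in which resources appeared in the capture. If that order matters, one alternative is sketched below, together with a rough estimate of how long a run will take given the 16-second pause. The helper names and the sample URLs are made up for illustration.

```python
def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item."""
    return list(dict.fromkeys(items))


def estimated_seconds(resource_count, pause_seconds=16):
    """Rough lower bound on run time: one pause per lookup."""
    return resource_count * pause_seconds


# Made-up URLs standing in for the output of http("malicious.pcap")
urls = [
    "http://a.example/x",
    "http://b.example/y",
    "http://a.example/x",
    "http://c.example/z",
    "http://b.example/y",
]

unique_urls = unique_in_order(urls)
print(unique_urls)
print("Estimated run time:", estimated_seconds(len(unique_urls)), "seconds")
```

Deduplicating first matters with the free key, because every repeated resource would otherwise cost both a lookup and a 16-second wait.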

The result for http URLs and IP addresses is below.

Final Thoughts

Sometimes we need to combine the features of multiple tools to do effective network forensics. Combining Wireshark skills with Virustotal scans can produce excellent results.