Is it possible to log (to the access log) only the initial request to each domain? I am not interested in a log entry for every access to a domain, just which domains were accessed (and which were denied).
If it is not possible to limit the logs this way, is there some sort of filter I can use on the logs themselves? One suggestion made to me was to only pay attention to URLs that end with a ‘/’.
Perhaps a Python script could do the trick? Something like:
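import re

log_file_path = "/var/log/squid/access.log"
output_file_path = "/path/to/filtered_log.log"

# Matches a URL with nothing after the host's trailing '/', e.g. http://example.com/
root_url = re.compile(r'(\S+://)?([^/]+)/$')
initial_requests = set()  # domains already written to the output

with open(log_file_path, 'r') as log_file, open(output_file_path, 'w') as output_file:
    for line in log_file:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        # The 7th field is the URL, assuming Squid's default log format.
        # match() anchors at the start, so "http://host/dir/" is not taken for a domain.
        match = root_url.match(fields[6])
        if match:
            domain = match.group(2)
            if domain not in initial_requests:
                initial_requests.add(domain)
                output_file.write(line)

print(f"Filtered log saved to {output_file_path}")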
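Since what I really want is the set of domains that were accessed and the set that were denied, a variant that drops the trailing-'/' rule might be closer to the goal. Below is a rough sketch, not a tested solution. It assumes Squid's default native log format (result code such as TCP_DENIED/403 in the 4th field, URL in the 7th, CONNECT requests logged as host:port with no scheme). The names host_of and summarize are made up for illustration.

```python
from urllib.parse import urlsplit


def host_of(url_field):
    """Return the lowercased host from a log URL field, or None if unparseable."""
    # CONNECT entries look like "host:port"; prefix "//" so urlsplit
    # treats them as a network location instead of a path.
    target = url_field if "://" in url_field else "//" + url_field
    try:
        return urlsplit(target).hostname
    except ValueError:
        return None


def summarize(lines):
    """Split the hosts seen in the log lines into (allowed, denied) sets."""
    allowed, denied = set(), set()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # blank or truncated line
        status, url = fields[3], fields[6]
        host = host_of(url)
        if host is None:
            continue
        # Assumption: denied requests carry "DENIED" in the result code.
        if "DENIED" in status:
            denied.add(host)
        else:
            allowed.add(host)
    return allowed, denied
```

Calling summarize(open("/var/log/squid/access.log")) would return the two sets, which can then be sorted and printed. A host can appear in both sets if some requests to it were allowed and others denied.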