Building Log Analysis Workflows
Cristian Pascariu
Information Security Professional
www.cybersomething.com
Overview
Go beyond single-purpose scripts
- Design log analytics capabilities
Touch upon technical aspects
- Working with JSON
- Interacting with web services using built-in modules
Integrate with other services and technologies
- Generate alerts and store them in a database
- Index enriched data into a SIEM
Building Log Analysis Workflows
Leverage Python to gain insight from log files
- Identify and extract indicators of compromise (IoCs)
- Visualize trends
Automate manual analysis
- Focus on next steps in the process
Opportunities to enhance existing workflows
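As a rough illustration of extracting IoCs from log files, the sketch below pulls IPv4 addresses out of raw log lines with a regular expression and validates them with the standard `ipaddress` module. The sample log lines and the `extract_ips` helper are assumptions made for this example, not part of the course material.

```python
import ipaddress
import re
from collections import Counter

# Loose pattern for dotted-quad candidates; real validation happens below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ips(lines):
    """Count valid IPv4 addresses found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for candidate in IPV4_CANDIDATE.findall(line):
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # e.g. 999.1.1.1 matches the pattern but is not an address
            counts[candidate] += 1
    return counts


# Hypothetical sample lines (addresses are from documentation ranges)
sample = [
    "Failed password for root from 203.0.113.7 port 22",
    "Failed password for admin from 203.0.113.7 port 22",
    "Accepted password for alice from 198.51.100.23 port 22",
    "Malformed source 999.1.1.1",
]
ip_counts = extract_ips(sample)
```

Counting occurrences rather than only collecting unique values keeps the result usable for trend visualizations later on.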
Investigation Workflows
[Diagram: log sources that can reveal attacker infrastructure]
- DNS logs: few requests, high similarity score
- Connection logs: long duration
- HTTP logs: high number of requests
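Two of the signals in the diagram, long-lived connections and a high number of HTTP requests, could be checked along the following lines. This is only a sketch: the field names (`dst`, `duration`, `host`) and both thresholds are assumptions chosen for illustration and would need tuning against real logs.

```python
from collections import Counter


def flag_suspicious(conn_logs, http_logs, min_duration=3600.0, min_requests=100):
    """Return (long-lived connections, hosts with many HTTP requests)."""
    # Connections that stayed open at least min_duration seconds
    long_lived = [c for c in conn_logs if c["duration"] >= min_duration]
    # Hosts that received at least min_requests HTTP requests
    counts = Counter(entry["host"] for entry in http_logs)
    noisy = {host: n for host, n in counts.items() if n >= min_requests}
    return long_lived, noisy


# Hypothetical parsed log entries
conn_logs = [
    {"dst": "203.0.113.7", "duration": 7200.0},
    {"dst": "198.51.100.23", "duration": 12.5},
]
http_logs = [{"host": "evil.example"}] * 150 + [{"host": "intranet.example"}] * 3

long_lived, noisy = flag_suspicious(conn_logs, http_logs)
```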
Log Analysis Workflows in Practice
- Push results into centralized repositories
- Correlate with data from other services
- Build stand-alone solutions
Integration Workflows
Log file -> Log processing with Python -> one of:
- Raw file
- API
- Database (through an ORM)
Investigation Workflows
SMB logs -> Parse log data -> Detect known ransomware extensions -> Save alerts in database
Auth logs -> Parse log data -> Add geolocation data based on IP -> Index logs in Elastic for visualizations
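The detection step of the SMB workflow could look roughly like this. The entry fields (`filename`, `source_ip`) and the alert layout are assumptions, and the extension set is a small illustrative sample rather than a complete list.

```python
from pathlib import PureWindowsPath

# Illustrative sample only; a real list would be much longer and maintained separately.
RANSOMWARE_EXTENSIONS = {".locky", ".wncry", ".cerber"}


def detect_ransomware(entries):
    """Return one alert per parsed SMB entry whose file extension is on the list."""
    alerts = []
    for entry in entries:
        # PureWindowsPath handles backslash-separated paths on any OS
        suffix = PureWindowsPath(entry["filename"]).suffix.lower()
        if suffix in RANSOMWARE_EXTENSIONS:
            alerts.append({
                "type": "ransomware_extension",
                "extension": suffix,
                "source_ip": entry["source_ip"],
                "filename": entry["filename"],
            })
    return alerts


# Hypothetical parsed SMB log entries
smb_entries = [
    {"source_ip": "10.0.0.5", "filename": r"finance\q3.xlsx.locky"},
    {"source_ip": "10.0.0.8", "filename": r"hr\policy.pdf"},
    {"source_ip": "10.0.0.5", "filename": r"users\bob\notes.txt.WNCRY"},
]
smb_alerts = detect_ransomware(smb_entries)
```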
Interacting with REST APIs
Interacting with other services via a REST API
- Retrieve data that can be used for enriching
log data
- Send results to another service for reporting
or further analysis
Two important aspects
- Format of the data being sent and received
- Module for initiating connections
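One way to keep those two aspects apart is to pass the lookup in as a function, so the enrichment logic does not depend on which HTTP module makes the call. The sketch below assumes a hypothetical `enrich_entry` helper and uses a stand-in lookup that returns made-up data instead of calling a real service.

```python
import json


def enrich_entry(entry, lookup):
    """Return a copy of a log entry with geolocation data added under 'geo'."""
    enriched = dict(entry)  # leave the original entry untouched
    enriched["geo"] = lookup(entry["source_ip"])
    return enriched


def fake_lookup(ip):
    # Stand-in for a real REST call; the returned values are invented.
    return {"country": "NL", "lat": 52.37, "lon": 4.89}


entry = {"user": "alice", "source_ip": "203.0.113.7"}
# Serialize the enriched entry so it can be sent on to another service
payload = json.dumps(enrich_entry(entry, fake_lookup))
```

Swapping `fake_lookup` for a function that performs the actual request leaves the rest of the pipeline unchanged.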
Working with JSON
REST APIs require serialized data
- JSON is a widely used format
Python has built-in support via the json module
- Easy conversion for lists and dictionaries
# import the json module
import json
# convert a dictionary to a JSON-encoded string
json_data = json.dumps(<dict>)
# convert a JSON string back to a Python object
raw_data = json.loads(json_data)
Working with JSON
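A concrete round trip may help here. The log entry below is made up for illustration; the calls themselves are the standard `json.dumps` and `json.loads`.

```python
import json

# Hypothetical log entry
log_entry = {
    "timestamp": "2023-01-15T10:32:00Z",
    "user": "alice",
    "source_ip": "203.0.113.7",
    "tags": ["auth", "failed"],
}

# Dictionary -> JSON string (what gets sent to an API)
encoded = json.dumps(log_entry)

# JSON string -> dictionary (what gets processed in Python)
decoded = json.loads(encoded)

# JSON literals map to Python types: true -> True, numbers -> int/float
parsed = json.loads('{"user": "bob", "success": true, "attempts": 3}')
```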
Python and REST APIs
Implement with standard libraries
- Requires attention around authentication
details
Two popular modules available
- urllib (standard library)
- requests (third-party package)
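For the standard-library route, a minimal sketch with `urllib.request` might look like the following. The base URL, the bearer-token scheme, and both helper names are assumptions; a real service will document its own endpoint and authentication details.

```python
import json
import urllib.request


def build_lookup_request(ip, api_key, base_url="https://api.example.com/v1/ip/"):
    """Build a GET request for a hypothetical IP lookup endpoint."""
    return urllib.request.Request(
        base_url + ip,
        headers={
            "Accept": "application/json",
            # Assumed auth scheme; check the API's documentation
            "Authorization": f"Bearer {api_key}",
        },
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; fetch_json(req) would.
req = build_lookup_request("203.0.113.7", "my-api-key")
```

Keeping request construction separate from sending makes the authentication details easy to inspect without contacting the service.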
# import requests module
import requests
# make a GET request and save the response
r = requests.get('url')
# parse the response body as JSON
r.json()
Leveraging Python Requests
Working with a Database
Leverage a database
- Store newly processed data
- Extract information from DB to further enrich
current log analysis
Interact with relational databases using an
Object Relational Mapper
- SQLAlchemy for SQL databases
MongoDB provides a native Python client (PyMongo)
MongoDB Hierarchy
MongoDB instance
- Database: the log DB
- Collection: one per log type
- Document: individual log entries
# import pymongo module
from pymongo import MongoClient
# connect to the MongoDB instance
client = MongoClient()
# select database
db = client['test-database']
# select a collection
collection = db['test-collection']
Getting Started with PyMongo
# Get the collection
logs = db['logs']
# Insert a document into the collection
logs.insert_one({…})
# Retrieve the first document that matches the query
logs.find_one({…})
Getting Started with PyMongo
Demo
Set up a MongoDB database using Docker
- Connect from Python using PyMongo
Generate alerts based on common ransomware extensions identified in SMB logs
[Demo 4.1] Script
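The storage step of this demo could be sketched as below. `save_alerts` is a hypothetical helper, not the demo script itself. It relies only on PyMongo's `insert_many` collection method, so the example uses a small in-memory stand-in instead of a running MongoDB instance.

```python
from datetime import datetime, timezone


def save_alerts(collection, alerts):
    """Insert alert documents into a MongoDB collection; return how many were sent."""
    if not alerts:
        return 0  # PyMongo's insert_many raises on an empty list
    # Copy each alert and stamp it, leaving the caller's dictionaries unchanged
    stamped = [dict(alert, created_at=datetime.now(timezone.utc)) for alert in alerts]
    collection.insert_many(stamped)
    return len(stamped)


class InMemoryCollection:
    """Stand-in exposing the one collection method used above."""

    def __init__(self):
        self.documents = []

    def insert_many(self, documents):
        self.documents.extend(documents)


collection = InMemoryCollection()
saved = save_alerts(collection, [
    {"type": "ransomware_extension", "extension": ".locky", "source_ip": "10.0.0.5"},
])
```

Against a real instance, the same function would be called with a collection obtained from `MongoClient()`, for example one named `alerts` (an assumed name).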
Working with Elasticsearch
The Elastic Stack is a very popular framework
- Initially designed for distributed search
- Adopted for log analytics
Python client available to interact with a cluster directly from a script
- elasticsearch (official Python client package)
from elasticsearch import Elasticsearch
# Establish the connection with the Elastic cluster
# (assumes a local single-node instance; recent clients require an explicit host)
es = Elasticsearch("http://localhost:9200")
# Add a document to an index
es.index(index="auth", document={})
# Search an index
es.search(index="auth", query={"match_all": {}})
Elasticsearch and Python
Demo
Set up a single-node Elasticsearch instance using Docker
Send enriched log data
Build a map visualization
[Demo 4.2] Script
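Map visualizations need coordinates stored in a `geo_point` field, which Elasticsearch accepts as an object with `lat` and `lon` keys. The sketch below shows one way to shape enriched auth entries before indexing. The `location` field name, the entry layout, and the helper names are assumptions, and the index needs an explicit `geo_point` mapping for that field because it is not detected automatically. A recording stand-in replaces the real client so the example runs without a cluster.

```python
def to_geo_document(entry):
    """Move flat lat/lon keys into a geo_point-style 'location' object."""
    doc = {k: v for k, v in entry.items() if k not in ("lat", "lon")}
    doc["location"] = {"lat": entry["lat"], "lon": entry["lon"]}
    return doc


def index_entries(es, entries, index="auth"):
    """Index each enriched entry; return the number of documents sent."""
    sent = 0
    for entry in entries:
        es.index(index=index, document=to_geo_document(entry))
        sent += 1
    return sent


class RecordingClient:
    """Stand-in that records calls made with the index/document keywords."""

    def __init__(self):
        self.calls = []

    def index(self, index, document):
        self.calls.append((index, document))


# Hypothetical enriched auth log entry
enriched = [{"user": "alice", "source_ip": "203.0.113.7", "lat": 52.37, "lon": 4.89}]
client = RecordingClient()
sent = index_entries(client, enriched)
```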
Develop Yourself Further
- Integrations with other tools and services
- Data analysis and statistics
- Machine learning and artificial intelligence
Summary
Go beyond single-use scripts to develop log analysis workflows
Generated alerts and stored them in MongoDB for future use
Indexed enriched data into Elastic to build visualizations
Thank you!