RBO10. Leveraging LLMs for Complete Life-cycle of Cyber Security Analytical Tasks

Primary Focus Area: AI for Cybersecurity
Secondary Focus Areas: Hybrid AI pipelines, Adaptation of foundation models, Safeguards and trust in AI, Privacy and security in AI, AI for e-governance, AI for healthcare, AI for business processes

Abstract
Cyber threats are growing in scale and complexity, producing massive volumes of security data that make timely analysis difficult. This RBO studies the applicability of large language models (LLMs) to key cybersecurity analytical tasks such as real-time alert management, log analysis, and vulnerability detection in source code. The aim is to reduce the workload of security analysts and improve efficiency in Security Operations Centers (SOCs) by introducing novel LLM-based data analysis methods.

Gap

While machine learning is well established in cybersecurity (intrusion detection, malware analysis, alert prioritization), the application of LLMs is still nascent and under-explored. Existing research mostly covers anomaly detection and code analysis, and it lacks maturity and breadth. Important tasks such as alert management and software vulnerability detection have received little attention in LLM research. This project addresses these gaps by investigating LLMs’ suitability for these tasks, combining state-of-the-art prompting techniques and hybrid AI pipelines to enhance SOC operations.

Objective

To develop and evaluate LLM-based approaches that reduce security analysts’ workload by improving alert management, alert log summarization, and vulnerability detection in source code. The research will explore both LLM-centric and LLM-supported hybrid AI methods, with the goal of enhancing cybersecurity monitoring capabilities in real-world SOC environments.

Approach

  • Alert management & log analysis: Use LLMs in black-box mode with advanced prompting on anonymized production alert logs from TalTech SOC and public datasets to extract patterns, create summaries, and identify critical alerts.
  • Vulnerability detection: Apply few-shot and chain-of-thought prompting on source code datasets classified by Common Weakness Enumeration (CWE). Benchmark against static code analyzers.
  • Additional tasks: Investigate LLM-based agentic approaches for automating incident handling and other SOC tasks. Also, identify further cybersecurity problems where LLMs add value.
  • Privacy: Use locally hosted LLMs (e.g., models served through Ollama) for sensitive data scenarios, supplemented by evaluation with commercial models (GPT-4, Claude 3).
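
To make the vulnerability detection item more concrete, the following is a minimal sketch of how a few-shot, chain-of-thought prompt for CWE classification might be assembled and how a model response might be parsed. It is an illustration under stated assumptions, not the project's method: the `Example` structure, the prompt wording, the `CWE: <id>` answer convention, and the function names are all hypothetical. The sketch deliberately contains no model call, so the same prompt text could be sent to a locally hosted or a commercial model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    """One labeled demonstration (hypothetical structure, not from the proposal)."""
    code: str
    reasoning: str
    cwe: str  # e.g. "CWE-89", or "NONE" when no weakness is present


def build_prompt(examples: list[Example], target_code: str) -> str:
    """Assemble a few-shot, chain-of-thought prompt as plain text."""
    parts = [
        "You are a security code reviewer. For each snippet, reason step by "
        "step, then finish with a line 'CWE: <id>' or 'CWE: NONE'."
    ]
    # Each demonstration shows code, the reasoning, and the final label.
    for i, ex in enumerate(examples, start=1):
        parts.append(
            f"### Example {i}\nCode:\n{ex.code}\n"
            f"Reasoning: {ex.reasoning}\nCWE: {ex.cwe}"
        )
    # The task ends at 'Reasoning:' so the model continues with its own steps.
    parts.append(f"### Task\nCode:\n{target_code}\nReasoning:")
    return "\n\n".join(parts)


def parse_cwe(response: str) -> Optional[str]:
    """Return the label from the last 'CWE:' line of a response, if any."""
    for line in reversed(response.strip().splitlines()):
        if line.strip().upper().startswith("CWE:"):
            return line.split(":", 1)[1].strip().upper()
    return None
```

Parsed labels of this kind could then be compared with the findings of a static code analyzer on the same CWE-labeled samples.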

Impact

This RBO will enhance SOC productivity by automating labor-intensive cybersecurity tasks, enabling analysts to focus on critical incidents and speeding up resolution. Success will be measured by effective deployment and evaluation of methods on production SOC data, reductions in analyst workload, and improvements in incident detection accuracy and prioritization.
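
As an illustration of how the success measures could be quantified, the sketch below computes standard detection metrics and a simple workload-reduction ratio. The proposal does not define these measures; the choice of precision, recall, and F1, the definition of workload reduction as the share of alerts not escalated to an analyst, and the function names are assumptions made for this example.

```python
def detection_metrics(y_true: list[bool], y_pred: list[bool]) -> dict[str, float]:
    """Precision, recall, and F1 for binary 'critical alert' predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def workload_reduction(total_alerts: int, escalated_alerts: int) -> float:
    """Share of alerts an analyst no longer reviews (assumed definition)."""
    if total_alerts <= 0:
        raise ValueError("total_alerts must be positive")
    return 1 - escalated_alerts / total_alerts
```

Recall is the more safety-relevant figure here, since a missed critical alert is typically costlier than an unnecessary escalation.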