An educational exploration of building security tools with LLM-powered IDEs - Sentinel-AI EDR Prototype
This project is purely educational and experimental. Building a production-ready antivirus system using AI is still far too complex and requires deep expertise in cybersecurity, kernel-level programming, malware analysis, and threat intelligence. Commercial antivirus solutions are developed by teams of hundreds of security experts over many years, with access to massive threat intelligence databases, advanced machine learning models, and kernel-level drivers.
This project demonstrates the potential of AI-assisted development for rapid prototyping and learning, but it should never be used as a replacement for professional security solutions. The limitations discussed in this article highlight why AI alone cannot yet replace the sophisticated multi-layered defense systems used by commercial security vendors.
In an era where cyber threats evolve in minutes, I decided to explore a fascinating question: can LLM-powered IDEs be used to build bespoke security tools? Using Cursor, I developed a functional, lightweight, custom-tailored antivirus (an EDR prototype) for my own machine.
This project wasn't just about "generating code"—it was about architecting a multi-layered defense system using AI as a high-speed co-pilot. Here is the breakdown of how I built it, the prompts I used, and what I learned about the limits of AI in cybersecurity.
Educational Value: While this prototype demonstrates the power of AI-assisted development, it's crucial to understand that creating a truly production-ready antivirus requires expertise far beyond what AI can currently provide. This project serves as an excellent learning tool to understand the fundamental concepts of endpoint protection, but it should be viewed as an educational exercise rather than a practical security solution.
The system, which I've named "Sentinel-AI", is built in Python and operates on three fundamental pillars of endpoint protection:
The first is real-time file system monitoring: the watchdog library hooks into OS file system events, so the system detects in real time when files are created, modified, or deleted in monitored directories.
The second is a hashing engine that computes the SHA-256 hash of each new file and compares it against a database of known threats. This is the most basic form of malware detection, similar to how traditional antivirus software identifies known malicious files.
The third is a process monitor powered by psutil that flags sustained CPU spikes and unauthorized execution patterns. This helps detect potentially malicious behavior even when file signatures aren't recognized.
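To make the third pillar concrete, here is a minimal sketch of how a sustained CPU spike could be tracked. It is not the code Cursor generated. The class name `SustainedCpuTracker` and the helper `sample_processes` are hypothetical, and the 90% / 10 second defaults are taken from the prompt shown later in this article. psutil is a third-party package, so it is imported only inside the sampling helper.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class SustainedCpuTracker:
    """Flags PIDs whose CPU usage stays above a threshold for a minimum time."""

    threshold: float = 90.0      # percent (assumed default, from the prompt)
    min_duration: float = 10.0   # seconds (assumed default, from the prompt)
    _hot_since: Dict[int, float] = field(default_factory=dict)

    def update(
        self,
        samples: Iterable[Tuple[int, float]],
        now: Optional[float] = None,
    ) -> List[int]:
        """Feed one round of (pid, cpu_percent) samples; return PIDs to flag."""
        now = time.monotonic() if now is None else now
        flagged: List[int] = []
        seen = set()
        for pid, cpu in samples:
            seen.add(pid)
            if cpu > self.threshold:
                # Remember when this PID first went above the threshold.
                start = self._hot_since.setdefault(pid, now)
                if now - start > self.min_duration:
                    flagged.append(pid)
            else:
                # Usage dropped: reset the timer for this PID.
                self._hot_since.pop(pid, None)
        # Forget processes that no longer appear in the samples.
        for pid in list(self._hot_since):
            if pid not in seen:
                del self._hot_since[pid]
        return flagged


def sample_processes() -> List[Tuple[int, float]]:
    """Collect (pid, cpu_percent) pairs via psutil (pip install psutil)."""
    import psutil  # third-party; imported lazily so the tracker works without it

    samples = []
    for proc in psutil.process_iter(["pid", "cpu_percent"]):
        cpu = proc.info["cpu_percent"]
        if cpu is not None:  # None means access to this process was denied
            samples.append((proc.info["pid"], cpu))
    return samples
```

A monitoring loop would call `tracker.update(sample_processes())` every second or so. Keeping the timing logic separate from psutil makes it easy to test with synthetic samples.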
I didn't ask the AI for a "complete antivirus" in one go. I used a Modular Prompting strategy to ensure code quality and avoid "hallucinated" security holes. This approach is crucial when building security-critical applications, as AI can sometimes generate code that appears correct but contains subtle vulnerabilities.
Why Modular Prompting? Breaking down complex security requirements into smaller, focused prompts helps ensure that each component is properly implemented and tested. This is especially important in security applications where a single vulnerability could compromise the entire system.
Subject: Development of Local Security Monitoring System Prototype (Basic EDR)
I want to develop an educational prototype of a protection system for my Mac machine.
The project must be written in Python and must consist of the following modules:
1. File Watcher:
- Use the watchdog library to monitor in real-time the 'Downloads' folder and Desktop
- If a new executable file is created, the system must intercept it immediately
- Support monitoring of multiple directories simultaneously
- Handle file creation, modification, and deletion events
2. Integrity Check:
- For each new file detected, calculate the SHA-256 hash
- Compare the hash with a local JSON file called malware_db.json
(which will function as our blacklist of signatures)
- Implement efficient hash lookup using Set or Dictionary data structures
- Support both file-based and in-memory hash database for performance
3. Process Monitor:
- Implement periodic monitoring (using psutil) that lists active processes
- Signal if a process is consuming more than 90% CPU for more than 10 seconds,
suggesting potential anomalous behavior
- Track process creation and termination events
- Monitor memory usage and network connections for suspicious patterns
- Generate alerts for processes that spawn multiple child processes rapidly
4. Logging & Alert:
- Write every suspicious event to a security_log.txt file
- Include timestamp, event type, file path/process name, and severity level
- Show system notifications (use plyer or similar library)
- Support different alert levels: INFO, WARNING, CRITICAL
- Implement log rotation to prevent file size issues
Technical Requirements:
1. Generate modular and commented code:
- Separate each module into its own class/file
- Include comprehensive docstrings for all functions
- Add type hints for better code clarity
- Implement proper error handling throughout
2. Add a function to my project that connects to MalwareBazaar (abuse.ch) APIs
or downloads their CSV/text list of the most recent SHA-256 hashes:
- The system must download the list, extract only the hashes
- Save them in my malware_db.json file
- Include a check to avoid duplicates
- Implement automatic periodic updates (e.g., daily)
- Handle API rate limiting and connection errors gracefully
- Support both API and CSV/text file formats
3. Data Structure Optimization:
- For the antivirus to be fast, do not save hashes as a simple list,
but as a Set or Dictionary
- Here's how the JSON file should appear to be efficient (example):
{
  "last_update": "2023-10-27",
  "version": "1.0",
  "hash_count": 2,
  "hashes": [
    "5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8",
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
  ]
}
- Use Set for O(1) hash lookup performance
- Implement hash validation (must be 64-character hexadecimal strings)
- Support incremental updates without full database reload
4. Exception Handling:
- Include exception handling to prevent the program from crashing
if it doesn't have permissions to access certain system files
- Handle file system errors, network errors, and permission errors gracefully
- Implement retry logic for transient failures
- Log all exceptions with full stack traces for debugging
5. Additional Features:
- Implement a configuration file (config.json) for customizable settings
- Add command-line arguments for different operation modes
- Support daemon/service mode for background operation
- Include a simple CLI interface for status checking and manual scans
- Implement graceful shutdown handling (SIGTERM/SIGINT)
6. Installation & Execution:
- Explain briefly how to install dependencies (requirements.txt)
- Explain how to start the monitor in administrator mode
- Include setup instructions for macOS permissions (Full Disk Access, etc.)
- Provide example usage scenarios and test cases
Security Considerations:
- Do not run with root/admin privileges unless absolutely necessary
- Validate all file paths to prevent directory traversal attacks
- Sanitize all log entries to prevent log injection
- Use secure file permissions for sensitive files (malware_db.json, security_log.txt)
- Implement rate limiting for API calls to avoid abuse
Note: This expanded prompt demonstrates the level of detail required when using AI to build security-critical applications. Each requirement must be explicit, and edge cases must be considered to avoid vulnerabilities.
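As an illustration of two requirements from the prompt (log rotation and log-entry sanitization), here is a minimal sketch using only the standard library. The logger name `sentinel`, the 1 MB size limit, the three backups, and the `event=... target=...` message layout are assumptions made for this example, not details of the generated project.

```python
import logging
from logging.handlers import RotatingFileHandler


def sanitize(value: str) -> str:
    """Escape CR/LF so attacker-controlled names cannot forge extra log lines."""
    return value.replace("\r", "\\r").replace("\n", "\\n")


def build_security_logger(path: str = "security_log.txt") -> logging.Logger:
    """Return a logger that rotates its file once it reaches about 1 MB."""
    logger = logging.getLogger("sentinel")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = RotatingFileHandler(
            path,
            maxBytes=1_000_000,  # assumed size limit
            backupCount=3,       # assumed number of rotated files to keep
            encoding="utf-8",
            delay=True,          # open the file only when the first event is written
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def log_event(logger: logging.Logger, level: int, event_type: str, target: str) -> None:
    """Write one event line containing timestamp, severity, type, and target."""
    logger.log(level, "event=%s target=%s", sanitize(event_type), sanitize(target))
```

The prompt's INFO, WARNING, and CRITICAL alert levels map directly onto `logging.INFO`, `logging.WARNING`, and `logging.CRITICAL`, for example `log_event(logger, logging.CRITICAL, "signature_match", "/Users/me/Downloads/sample.pkg")`.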
To make the tool "real," I prompted Cursor to build a Threat Feed Integrator:
"Write a Python script that connects to the MalwareBazaar (Abuse.ch) API. It should pull the latest 100 SHA-256 hashes of confirmed malware and update my local malware_db.json automatically while avoiding duplicates."
Note on Threat Intelligence: While this approach works for educational purposes, production antivirus systems use much more sophisticated threat intelligence feeds that include behavioral indicators, network signatures, and machine learning models trained on millions of samples. The 100-hash database used here is minuscule compared to the millions of known threats that commercial solutions track.
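A rough sketch of what such an integrator might look like is shown below. The endpoint URL, the `get_recent` and `selector` form fields, the `Auth-Key` header, and the `sha256_hash` response field reflect my understanding of the MalwareBazaar API and should be treated as assumptions to verify against the abuse.ch documentation. The function names are hypothetical. The merge step validates each hash and relies on a set to avoid duplicates.

```python
import json
import re
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Iterable, List, Set

SHA256_RE = re.compile(r"[0-9a-f]{64}")
API_URL = "https://mb-api.abuse.ch/api/v1/"  # assumed endpoint; check the docs


def fetch_recent_hashes(auth_key: str, limit: int = 100) -> List[str]:
    """Request recent samples; the request and response fields are assumptions."""
    body = urllib.parse.urlencode(
        {"query": "get_recent", "selector": str(limit)}
    ).encode()
    request = urllib.request.Request(
        API_URL, data=body, headers={"Auth-Key": auth_key}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [row.get("sha256_hash", "") for row in payload.get("data") or []]


def merge_hashes(existing: Iterable[str], incoming: Iterable[str]) -> Set[str]:
    """Return the union of both inputs, keeping only valid SHA-256 hex strings."""
    merged = {h.lower() for h in existing}
    for candidate in incoming:
        candidate = candidate.strip().lower()
        if SHA256_RE.fullmatch(candidate):
            merged.add(candidate)  # a set silently ignores duplicates
    return merged


def update_db(db_path: Path, incoming: Iterable[str], today: str) -> int:
    """Merge incoming hashes into the JSON database; return how many were new."""
    if db_path.exists():
        db = json.loads(db_path.read_text())
    else:
        db = {"version": "1.0", "hashes": []}
    before = {h.lower() for h in db.get("hashes", [])}
    merged = merge_hashes(before, incoming)
    # JSON has no set type, so the hashes are stored as a sorted list
    # and converted back to a set when loaded.
    db.update(last_update=today, hash_count=len(merged), hashes=sorted(merged))
    db_path.write_text(json.dumps(db, indent=2))
    return len(merged) - len(before)
```

A daily job could then call `update_db(Path("malware_db.json"), fetch_recent_hashes(key), date.today().isoformat())`, wrapped in error handling for network failures and rate limits.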
I used Python's hashlib to generate SHA-256 signatures. I chose SHA-256 over MD5 because MD5 has practical collision attacks, whereas SHA-256 has no known practical collisions and is the de facto standard for modern IoCs (Indicators of Compromise).
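The hashing step can be sketched as follows. This is a minimal illustration rather than the project's actual code. The function names and the 64 KB chunk size are assumptions. Reading in chunks keeps memory use flat even for large downloads.

```python
import hashlib
from typing import Set


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_threat(path: str, known_hashes: Set[str]) -> bool:
    """Check a file's hash against an in-memory set (average O(1) lookup)."""
    return sha256_of_file(path) in known_hashes
```

One practical detail: the value `e3b0c442...7852b855` used in the example database above is the SHA-256 of empty input, so a real blacklist containing it would flag every empty file.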
Limitation: While SHA-256 is cryptographically secure, signature-based detection has fundamental weaknesses. Polymorphic malware can change its hash with minimal code modifications, rendering signature databases ineffective against advanced threats.
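The watchdog.observers module relies on the operating system's native file notification APIs (FSEvents or kqueue on macOS, inotify on Linux) instead of polling, so the program stays idle until the OS reports a change, which keeps it very light on system resources.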
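A minimal sketch of the watcher is shown below, assuming watchdog is installed (`pip install watchdog`). The extension list, the helper `looks_executable`, and the `watch` function are hypothetical choices for illustration. Checking the file extension is only a crude stand-in for real executable detection.

```python
from pathlib import Path
from typing import Callable, Iterable

# Extensions treated as "executable" here are an illustrative assumption.
EXECUTABLE_SUFFIXES = {".sh", ".command", ".pkg", ".dmg", ".app", ".exe"}


def looks_executable(path: str) -> bool:
    """Crude check based only on the file extension."""
    return Path(path).suffix.lower() in EXECUTABLE_SUFFIXES


def watch(directories: Iterable[str], on_hit: Callable[[str], None]) -> None:
    """Block until Ctrl+C, calling on_hit for each new executable-looking file."""
    # watchdog is third-party; importing it here keeps looks_executable
    # usable even when the package is not installed.
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class NewFileHandler(FileSystemEventHandler):
        def on_created(self, event):
            if not event.is_directory and looks_executable(event.src_path):
                on_hit(event.src_path)

    observer = Observer()
    handler = NewFileHandler()
    for directory in directories:
        # One observer can watch several directories at once.
        observer.schedule(handler, directory, recursive=False)
    observer.start()
    try:
        while observer.is_alive():
            observer.join(timeout=1)  # wake periodically so Ctrl+C is handled
    except KeyboardInterrupt:
        pass
    finally:
        observer.stop()
        observer.join()
```

For example, `watch([str(Path.home() / "Downloads"), str(Path.home() / "Desktop")], print)` would print the path of each matching file as it appears.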
Architecture Note: This user-space monitoring approach is efficient for learning, but production security solutions often require kernel-level drivers to intercept system calls before malicious code can execute. Without kernel-level access, sophisticated malware can bypass user-space monitors.
The system doesn't just look at files; it looks at telemetry. If a process exhibits "Ransomware-like" behavior (high CPU + rapid file modifications), the system triggers an immediate alert.
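One way to express this heuristic is a sliding-window counter for file modifications combined with a CPU check. The sketch below is an assumption about how such a rule could be written, not the generated code. The class name, the 50 events per 5 seconds default, and the 90% CPU threshold are illustrative values.

```python
from collections import deque
from typing import Deque


class ModificationBurstDetector:
    """Counts file-modification events inside a sliding time window."""

    def __init__(self, max_events: int = 50, window_seconds: float = 5.0) -> None:
        self.max_events = max_events          # assumed threshold
        self.window_seconds = window_seconds  # assumed window length
        self._times: Deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        """Record one event; return True if the window now exceeds max_events."""
        self._times.append(timestamp)
        # Drop events that have fallen out of the window.
        while self._times and timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) > self.max_events


def ransomware_like(cpu_percent: float, bursty: bool, cpu_threshold: float = 90.0) -> bool:
    """Alert only when high CPU and a modification burst occur together."""
    return bursty and cpu_percent > cpu_threshold
```

In practice, `record` would be called from the file watcher's modification callback with `time.monotonic()`, and the CPU figure would come from the process monitor.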
Behavioral Analysis: While this heuristic-based approach can catch some threats, modern malware is designed to evade such detection by operating slowly, using legitimate system processes, or encrypting files in ways that mimic normal system activity. Advanced behavioral analysis requires machine learning models trained on vast datasets of both malicious and benign behavior.
Building a security tool with AI provides a unique perspective on the "Defender's Dilemma." However, it's essential to understand both the capabilities and limitations of AI-assisted security development.
While AI tools like Cursor are incredibly powerful for rapid prototyping and learning, creating a production-ready antivirus requires capabilities that AI cannot yet provide:
Production antivirus systems use multiple detection layers:
Each of these layers requires deep expertise that goes far beyond what AI can generate from prompts.
Commercial antivirus solutions are developed by:
No AI tool can replicate this level of expertise and infrastructure.
Modern malware uses sophisticated techniques to evade detection:
Detecting these threats requires advanced techniques that are beyond the scope of simple signature-based or heuristic-based systems.
This project proves that AI tools like Cursor are force multipliers. They allow a single developer to build tools that would have previously required a small team of security engineers—for prototyping and learning purposes.
Educational Value: The real value of this project lies in the learning experience. By building a basic antivirus prototype, I gained deep insights into:
What's next? I plan to integrate a YARA-rule engine for better pattern matching and perhaps a local LLM (like Llama 3) to analyze suspicious PowerShell scripts in real-time. However, these enhancements will still be for educational purposes, as building a truly production-ready security solution requires expertise and resources far beyond what AI-assisted development can currently provide.
Final Note: This project demonstrates the exciting potential of AI in software development, but it also highlights the critical importance of understanding the limitations. Security is not an area where "good enough" is acceptable—production security tools must be built by experts with deep knowledge of threats, defenses, and the ever-evolving landscape of cybersecurity.
If you have specific questions about this project, please send me an email or leave a comment. I will reply as soon as I am available.