Nuclei Architecture Document
A brief overview of Nuclei Engine architecture. This document will be kept updated as the engine progresses.
pkg/templates
Template
Template is the basic unit of input to the engine which describes the requests to be made, matching to be done, data to extract, etc.
The template structure is described here. Template level attributes are defined here as well as convenience methods to validate, parse and compile templates creating executers.
Any attributes etc. required for the template, engine or requests to function are also set here.
Workflows are also compiled, their templates are loaded and compiled as well. Any validations etc. on the paths provided are also done here.
The Parse function is the main entry point which returns a template for a filePath and executorOptions. It compiles all the requests for the templates, all the workflows, as well as any self-contained requests. It also caches the templates in an in-memory cache.
Preprocessors
Preprocessors are also applied here and operate at the template level. They receive the template data, which they can alter at runtime. The engine uses this for random string generation.
Custom preprocessors can be used if they satisfy the preprocessor interface.
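A minimal sketch of what a custom preprocessor could look like. The interface shape (a single Process method over the raw template bytes), the markerReplacer type and the applyAll helper are assumptions made for illustration and may differ from the engine's actual definitions.

```go
package main

import (
	"bytes"
	"fmt"
)

// Preprocessor is an assumed shape for the interface: it receives the raw
// template bytes and returns a (possibly modified) copy.
type Preprocessor interface {
	Process(data []byte) []byte
}

// markerReplacer is a hypothetical preprocessor that swaps a fixed marker
// for a caller-supplied value.
type markerReplacer struct {
	marker string
	value  string
}

// Process replaces every occurrence of the marker in the template data.
func (m markerReplacer) Process(data []byte) []byte {
	return bytes.ReplaceAll(data, []byte(m.marker), []byte(m.value))
}

// applyAll runs every preprocessor in order over the template data.
func applyAll(data []byte, processors ...Preprocessor) []byte {
	for _, p := range processors {
		data = p.Process(data)
	}
	return data
}

func main() {
	tpl := []byte("path: /{{marker}}/probe")
	out := applyAll(tpl, markerReplacer{marker: "{{marker}}", value: "abc123"})
	fmt.Println(string(out))
}
```

A random string generator would follow the same pattern, producing the replacement value at runtime instead of taking it as a fixed field.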
pkg/model
The model package implements the Information structure for Nuclei Templates. Info contains all major metadata information for the template. The Classification structure can also be used to provide additional context to vulnerability data.
It also specifies a WorkflowLoader interface that is used during workflow loading in the template compilation stage.
pkg/protocols
Protocols package implements all the request protocols supported by Nuclei. This includes http, dns, network, headless and file requests as of now.
Request
It exposes a Request interface that is implemented by all the request protocols supported.
Many of these methods are similar across protocols while some are very protocol specific.
A brief overview of the methods is provided below -
Compile - Compiles the request with provided options.
Requests - Returns total requests made.
GetID - Returns any ID for the request
Match - Used to perform matching for patterns using matchers
Extract - Used to perform extraction for patterns using extractors
ExecuteWithResults - Request execution function for input.
MakeResultEventItem - Creates a single result event for the intermediate InternalWrappedEvent output structure.
MakeResultEvent - Returns a slice of results based on an InternalWrappedEvent internal output event.
GetCompiledOperators - Returns the compiled operators for the request.
The MakeDefaultResultEvent function can be used as a default for the MakeResultEvent function when no protocol-specific features need to be implemented for result generation.
For reference protocol requests implementations, one can look at the below packages -
Executer
All these different request interfaces are converted to an Executer, which is also an interface defined in pkg/protocols and is used during final execution of the template.
The ExecuteWithResults function accepts a callback, which is provided with results during execution in the form of an *output.InternalWrappedEvent structure.
The default executer is provided in pkg/protocols/common/executer. It takes a list of Requests and relevant ExecuterOptions and implements the Executer interface required for template execution. The executer is created from this package during the template compilation process and used as-is.
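The callback-driven execution described above can be sketched as follows. This is a simplified illustration, not the engine's code: Event stands in for *output.InternalWrappedEvent, and the reduced Request interface, the Executer struct, staticRequest and collect are all hypothetical names.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for *output.InternalWrappedEvent.
type Event struct {
	RequestID string
	Matched   bool
}

// Callback receives each event as soon as a request produces it.
type Callback func(Event)

// Request is a reduced, assumed version of the protocol request interface.
type Request interface {
	GetID() string
	ExecuteWithResults(input string, cb Callback) error
}

// Executer runs a list of requests against one input.
type Executer struct {
	requests []Request
}

// ExecuteWithResults runs the requests in order and stops at the first error.
func (e *Executer) ExecuteWithResults(input string, cb Callback) error {
	for _, r := range e.requests {
		if err := r.ExecuteWithResults(input, cb); err != nil {
			return fmt.Errorf("request %s: %w", r.GetID(), err)
		}
	}
	return nil
}

// staticRequest is a toy request that always reports a fixed match state.
type staticRequest struct {
	id      string
	matched bool
}

func (s staticRequest) GetID() string { return s.id }

func (s staticRequest) ExecuteWithResults(input string, cb Callback) error {
	cb(Event{RequestID: s.id, Matched: s.matched})
	return nil
}

// collect runs the executer and gathers the IDs of matched requests.
func collect(e *Executer, input string) ([]string, error) {
	var matched []string
	err := e.ExecuteWithResults(input, func(ev Event) {
		if ev.Matched {
			matched = append(matched, ev.RequestID)
		}
	})
	return matched, err
}

func main() {
	e := &Executer{requests: []Request{
		staticRequest{id: "http-1", matched: true},
		staticRequest{id: "dns-1", matched: false},
	}}
	matched, err := collect(e, "https://example.com")
	fmt.Println(matched, err)
}
```

Passing results through a callback lets the caller handle each event as it is produced rather than waiting for the whole template to finish.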
A different executer is the Clustered Requests executer, which implements the Nuclei request clustering functionality in pkg/templates. Where multiple templates can be clustered, there is a single HTTP request and multiple operator lists to match/extract. The first HTTP request is executed, while each template's matchers/extractors are evaluated separately.
For Workflow execution, a separate RunWorkflow function is used which executes the workflow independently of the template execution.
With this basic premise set, we can now start exploring the current runner implementation which will also walk us through the architecture of nuclei.
internal/runner
Template loading
The first process after all CLI specific initialisation is the loading of template/workflow paths that the user wants to run. This is done by the packages described below.
pkg/catalog
This package is used to get paths using mixed syntax. It takes a template directory and resolves template paths relative to both the provided template directory and the current user directory.
The syntax is very versatile and can include filenames, glob patterns, directories, absolute paths, and relative paths.
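As a rough illustration of mixed-syntax resolution, the sketch below expands file names, directories and glob patterns relative to a template directory using only the Go standard library. The resolvePaths and demo functions are hypothetical; this does not reproduce the catalog package's behaviour (for example, it ignores the current user directory).

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// resolvePaths expands files, directories and glob patterns (relative inputs
// are joined to templateDir) into a sorted, de-duplicated list of YAML files.
func resolvePaths(templateDir string, inputs []string) ([]string, error) {
	seen := map[string]struct{}{}
	for _, in := range inputs {
		p := in
		if !filepath.IsAbs(p) {
			p = filepath.Join(templateDir, p)
		}
		matches := []string{p}
		if strings.ContainsAny(p, "*?[") {
			var err error
			if matches, err = filepath.Glob(p); err != nil {
				return nil, err
			}
		}
		for _, m := range matches {
			// WalkDir handles both a single file and a whole directory tree.
			err := filepath.WalkDir(m, func(path string, d fs.DirEntry, err error) error {
				if err != nil {
					return err
				}
				if !d.IsDir() && strings.HasSuffix(path, ".yaml") {
					seen[path] = struct{}{}
				}
				return nil
			})
			if err != nil {
				return nil, err
			}
		}
	}
	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out, nil
}

// demo builds a throwaway template tree and resolves a directory plus a glob.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "catalog-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"cves/a.yaml", "cves/b.yaml", "misc/c.yaml", "misc/readme.txt"} {
		full := filepath.Join(dir, name)
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(full, []byte("id: demo\n"), 0o644); err != nil {
			return nil, err
		}
	}
	paths, err := resolvePaths(dir, []string{"cves", "misc/*.yaml"})
	if err != nil {
		return nil, err
	}
	for i, p := range paths {
		rel, err := filepath.Rel(dir, p)
		if err != nil {
			return nil, err
		}
		paths[i] = filepath.ToSlash(rel)
	}
	return paths, nil
}

func main() {
	paths, err := demo()
	fmt.Println(paths, err)
}
```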
The next step is the initialisation of the reporting modules, which is handled in pkg/reporting.
pkg/reporting
Reporting module contains exporters and trackers as well as a module for deduplication and a module for result formatting.
Exporters and Trackers are interfaces defined in pkg/reporting.
Exporters include Elasticsearch, markdown, sarif. Trackers include GitHub, GitLab and Jira.
Each exporter and tracker implements its own configuration in YAML format and is very modular in nature, so adding new ones is easy.
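To illustrate why new exporters are easy to add, here is a minimal sketch of an exporter behind a small interface. The Exporter interface shape, the ResultEvent fields and the lineExporter type are assumptions for illustration only; the actual definitions in pkg/reporting may differ.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// ResultEvent is a hypothetical, trimmed-down result structure.
type ResultEvent struct {
	TemplateID string
	Host       string
	Severity   string
}

// Exporter is an assumed interface shape: export each result, then close.
type Exporter interface {
	Export(event ResultEvent) error
	Close() error
}

// lineExporter is a toy exporter that writes one line per result.
type lineExporter struct {
	w io.Writer
}

func (l *lineExporter) Export(event ResultEvent) error {
	_, err := fmt.Fprintf(l.w, "[%s] %s on %s\n", event.Severity, event.TemplateID, event.Host)
	return err
}

func (l *lineExporter) Close() error { return nil }

// exportAll sends every event to every exporter and closes them afterwards.
func exportAll(exporters []Exporter, events []ResultEvent) error {
	for _, ev := range events {
		for _, ex := range exporters {
			if err := ex.Export(ev); err != nil {
				return err
			}
		}
	}
	for _, ex := range exporters {
		if err := ex.Close(); err != nil {
			return err
		}
	}
	return nil
}

// render exports the events into an in-memory buffer and returns the text.
func render(events []ResultEvent) (string, error) {
	var sb strings.Builder
	err := exportAll([]Exporter{&lineExporter{w: &sb}}, events)
	return sb.String(), err
}

func main() {
	out, err := render([]ResultEvent{{TemplateID: "demo-template", Host: "example.com", Severity: "high"}})
	fmt.Print(out)
	if err != nil {
		fmt.Println("error:", err)
	}
}
```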
After reading all the inputs from various sources and initialising other miscellaneous options, the next step is output writing, which is done using the pkg/output module.
pkg/output
Output package implements the output writing functionality for Nuclei.
Output Writer implements the Writer interface which is called each time a result is found for nuclei.
The ResultEvent structure is passed to the Nuclei Output Writer and contains the entire detail of a found result. Various intermediary types like InternalWrappedEvent and InternalEvent are used throughout nuclei protocols and matchers to describe results in various stages of execution.
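A writer that is called once per result can be sketched as below. The Writer interface is reduced to a single Write method, and the ResultEvent fields, their JSON keys and the jsonLineWriter type are hypothetical; the sketch only shows the general pattern of serialising one result per line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"sync"
)

// ResultEvent is a hypothetical, trimmed-down result structure.
type ResultEvent struct {
	TemplateID string `json:"template-id"`
	Host       string `json:"host"`
	Matched    string `json:"matched-at"`
}

// Writer is an assumed, reduced interface: one call per found result.
type Writer interface {
	Write(event *ResultEvent) error
}

// jsonLineWriter serialises each result as one JSON line.
type jsonLineWriter struct {
	mu  sync.Mutex
	out io.Writer
}

// Write marshals the event and appends it to the output under a lock so
// that concurrent results do not interleave.
func (w *jsonLineWriter) Write(event *ResultEvent) error {
	data, err := json.Marshal(event)
	if err != nil {
		return err
	}
	w.mu.Lock()
	defer w.mu.Unlock()
	_, err = w.out.Write(append(data, '\n'))
	return err
}

// writeDemo writes a single result into an in-memory buffer.
func writeDemo() (string, error) {
	var sb strings.Builder
	var w Writer = &jsonLineWriter{out: &sb}
	err := w.Write(&ResultEvent{
		TemplateID: "demo-template",
		Host:       "example.com",
		Matched:    "https://example.com/",
	})
	return sb.String(), err
}

func main() {
	out, err := writeDemo()
	fmt.Print(out)
	if err != nil {
		fmt.Println("error:", err)
	}
}
```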
Interactsh is also initialised if it is not explicitly disabled.
pkg/protocols/common/interactsh
Interactsh module is used to provide automatic Out-of-Band vulnerability identification in Nuclei.
It uses two LRU caches, one for storing interactions for request URLs and one for storing requests for interaction URLs. Both caches are used to correlate requests received by the Interactsh OOB server with the Nuclei instance. The Interactsh client package does most of the heavy lifting of this module.
Polling for interactions and server registration only starts when a template uses the interactsh module and is executed by nuclei. After that no registration is required for the entire run.
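The two-cache correlation can be sketched as follows. This is a simplified illustration that uses plain maps instead of LRU caches, and the Correlator type and its method names are hypothetical. The point is that a request and its interaction may arrive in either order, so each side is parked until the other shows up.

```go
package main

import (
	"fmt"
	"sync"
)

// Correlator pairs sent requests with received out-of-band interactions.
// Plain maps are used here for brevity; they stand in for the LRU caches.
type Correlator struct {
	mu           sync.Mutex
	requests     map[string]string   // interaction ID -> template ID awaiting a callback
	interactions map[string][]string // interaction ID -> interactions that arrived first
}

func NewCorrelator() *Correlator {
	return &Correlator{
		requests:     map[string]string{},
		interactions: map[string][]string{},
	}
}

// AddRequest stores a sent request and returns any interactions that were
// already received for its ID.
func (c *Correlator) AddRequest(id, templateID string) []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.requests[id] = templateID
	early := c.interactions[id]
	delete(c.interactions, id)
	return early
}

// AddInteraction returns the template waiting on this ID. If no request is
// known yet, the interaction is parked until the request is stored.
func (c *Correlator) AddInteraction(id, protocol string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if templateID, ok := c.requests[id]; ok {
		return templateID, true
	}
	c.interactions[id] = append(c.interactions[id], protocol)
	return "", false
}

func main() {
	c := NewCorrelator()
	c.AddRequest("abc", "ssrf-template")
	fmt.Println(c.AddInteraction("abc", "dns"))

	c.AddInteraction("xyz", "http")
	fmt.Println(c.AddRequest("xyz", "oob-template"))
}
```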
RunEnumeration
Next we arrive in the RunEnumeration function of the runner.
HostErrorsCache is initialised, which is used throughout the run of Nuclei enumeration to keep track of errors per host and skip further requests if the errors exceed the provided threshold. The functionality for the error tracking cache is defined in hosterrorscache.go and is fairly simple.
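A per-host error counter with a skip threshold can be sketched as below. The type and method names are hypothetical, and the sketch follows the description above (skip once errors are greater than the threshold) rather than the actual contents of hosterrorscache.go.

```go
package main

import (
	"fmt"
	"sync"
)

// HostErrorsCache counts failures per host (hypothetical, simplified).
type HostErrorsCache struct {
	mu        sync.Mutex
	maxErrors int
	failures  map[string]int
}

func NewHostErrorsCache(maxErrors int) *HostErrorsCache {
	return &HostErrorsCache{maxErrors: maxErrors, failures: map[string]int{}}
}

// MarkFailed records one more failed request for the host.
func (c *HostErrorsCache) MarkFailed(host string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.failures[host]++
}

// ShouldSkip reports whether the host has more errors than the threshold.
func (c *HostErrorsCache) ShouldSkip(host string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.failures[host] > c.maxErrors
}

func main() {
	cache := NewHostErrorsCache(2)
	for i := 0; i < 3; i++ {
		cache.MarkFailed("down.example.com")
	}
	fmt.Println(cache.ShouldSkip("down.example.com"), cache.ShouldSkip("up.example.com"))
}
```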
Next the WorkflowLoader is initialised, which is used to load workflows. It exists in pkg/parsers/workflow_loader.go.
The loader is initialised next; it is responsible for using the Catalog, passed tags, filters, paths, etc. to return compiled Templates and Workflows.
pkg/catalog/loader
First the input passed by the user as paths is normalised to absolute paths, which is done by the pkg/catalog module. Next the path filter module is used to remove the excluded template/workflow paths.
The pkg/parsers module's LoadTemplate and LoadWorkflow functions are used to check whether the templates pass validation and are not excluded via tags/severity/etc. filters. If all checks pass, the template/workflow is parsed and returned in compiled form by the pkg/templates Parse function.
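The tag and severity checks can be pictured with the small sketch below. The TemplateInfo and Filter types and the rule that exclusions take precedence are assumptions for illustration; the real filter supports more criteria and may resolve conflicts differently.

```go
package main

import "fmt"

// TemplateInfo holds the metadata a filter needs (hypothetical shape).
type TemplateInfo struct {
	ID       string
	Tags     []string
	Severity string
}

// Filter describes include/exclude criteria (hypothetical shape).
type Filter struct {
	IncludeTags []string
	ExcludeTags []string
	Severities  []string
}

// contains reports whether value is present in list.
func contains(list []string, value string) bool {
	for _, item := range list {
		if item == value {
			return true
		}
	}
	return false
}

// overlaps reports whether the two lists share at least one element.
func overlaps(a, b []string) bool {
	for _, item := range a {
		if contains(b, item) {
			return true
		}
	}
	return false
}

// Match returns true when the template passes every configured check.
// In this sketch an excluded tag always wins over an included one.
func (f Filter) Match(t TemplateInfo) bool {
	if overlaps(t.Tags, f.ExcludeTags) {
		return false
	}
	if len(f.IncludeTags) > 0 && !overlaps(t.Tags, f.IncludeTags) {
		return false
	}
	if len(f.Severities) > 0 && !contains(f.Severities, t.Severity) {
		return false
	}
	return true
}

func main() {
	f := Filter{IncludeTags: []string{"cve"}, ExcludeTags: []string{"dos"}, Severities: []string{"high", "critical"}}
	fmt.Println(f.Match(TemplateInfo{ID: "a", Tags: []string{"cve", "rce"}, Severity: "critical"}))
	fmt.Println(f.Match(TemplateInfo{ID: "b", Tags: []string{"cve", "dos"}, Severity: "high"}))
}
```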
The Parse function compiles all the requests in a template and creates Executers from them, returning a runnable Template/Workflow structure.
The clustering module comes next; its job is to cluster identical HTTP GET requests together (many templates perform the same GET requests, so this saves a large number of requests on big scans with lots of templates).
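The idea behind clustering can be sketched as below: templates with an identical GET request share one fetch, and each template's matcher then runs against the shared response. The Template fields, the method-plus-path clustering key and the word matcher are simplifications assumed for this sketch; the real clustering conditions are stricter.

```go
package main

import (
	"fmt"
	"strings"
)

// Template is a hypothetical, reduced template: one request, one word matcher.
type Template struct {
	ID     string
	Method string
	Path   string
	Word   string
}

// clusterTemplates groups templates whose GET requests are identical.
// Non-GET templates are kept in clusters of their own.
func clusterTemplates(templates []Template) [][]Template {
	index := map[string]int{}
	var clusters [][]Template
	for _, t := range templates {
		if t.Method != "GET" {
			clusters = append(clusters, []Template{t})
			continue
		}
		key := t.Method + " " + t.Path
		if i, ok := index[key]; ok {
			clusters[i] = append(clusters[i], t)
			continue
		}
		index[key] = len(clusters)
		clusters = append(clusters, []Template{t})
	}
	return clusters
}

// runClusters sends one request per cluster and evaluates every template's
// matcher against the shared response body.
func runClusters(clusters [][]Template, fetch func(method, path string) string) (matched []string, requests int) {
	for _, c := range clusters {
		body := fetch(c[0].Method, c[0].Path)
		requests++
		for _, t := range c {
			if strings.Contains(body, t.Word) {
				matched = append(matched, t.ID)
			}
		}
	}
	return matched, requests
}

func main() {
	templates := []Template{
		{ID: "tech-nginx", Method: "GET", Path: "/", Word: "nginx"},
		{ID: "tech-apache", Method: "GET", Path: "/", Word: "apache"},
		{ID: "admin-panel", Method: "GET", Path: "/admin", Word: "login"},
	}
	fetch := func(method, path string) string {
		if path == "/" {
			return "server: nginx"
		}
		return "please login"
	}
	matched, requests := runClusters(clusterTemplates(templates), fetch)
	fmt.Println(matched, requests)
}
```

Here three templates result in two requests, since the two templates targeting the root path share a single fetch.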
pkg/operators
Operators package implements all the matching and extracting logic of Nuclei.
A protocol only needs to embed the operators.Operators type to utilise all the matching/extracting functionality of nuclei.
The core of this process is the Execute function, which takes an input dictionary as well as a Match and an Extract function and returns a Result structure, which is used later during nuclei execution to check for results.
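The embedding pattern and the role of Execute can be sketched as below. Everything here is a simplified assumption: the Operators fields, the Result shape and the MatchFunc/ExtractFunc signatures are hypothetical and much smaller than the real types, but the flow (protocol embeds Operators, passes its own Match and Extract functions, receives a Result) follows the description above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Result collects the outcome of running the operators (hypothetical shape).
type Result struct {
	Matched  bool
	Extracts []string
}

// MatchFunc and ExtractFunc are supplied by the protocol (assumed signatures).
type MatchFunc func(data map[string]interface{}, word string) bool
type ExtractFunc func(data map[string]interface{}, pattern *regexp.Regexp) []string

// Operators is a reduced stand-in holding word matchers and regex extractors.
type Operators struct {
	Words    []string
	Patterns []*regexp.Regexp
}

// Execute runs the matchers and extractors over the input data. The boolean
// reports whether anything was matched or extracted.
func (o *Operators) Execute(data map[string]interface{}, match MatchFunc, extract ExtractFunc) (*Result, bool) {
	result := &Result{}
	for _, w := range o.Words {
		if match(data, w) {
			result.Matched = true
			break
		}
	}
	for _, p := range o.Patterns {
		result.Extracts = append(result.Extracts, extract(data, p)...)
	}
	return result, result.Matched || len(result.Extracts) > 0
}

// Request is a toy protocol request that embeds Operators.
type Request struct {
	Operators
	Path string
}

// Match checks the response body for a word.
func (r *Request) Match(data map[string]interface{}, word string) bool {
	body, _ := data["body"].(string)
	return strings.Contains(body, word)
}

// Extract pulls every regex match out of the response body.
func (r *Request) Extract(data map[string]interface{}, pattern *regexp.Regexp) []string {
	body, _ := data["body"].(string)
	return pattern.FindAllString(body, -1)
}

// demo runs the embedded operators over a fake response.
func demo() (*Result, bool) {
	req := &Request{
		Operators: Operators{
			Words:    []string{"admin"},
			Patterns: []*regexp.Regexp{regexp.MustCompile(`v[0-9]+\.[0-9]+`)},
		},
		Path: "/login",
	}
	data := map[string]interface{}{"body": "admin panel v2.1"}
	return req.Execute(data, req.Match, req.Extract)
}

func main() {
	result, ok := demo()
	fmt.Println(result.Matched, result.Extracts, ok)
}
```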
The internal logic for matching and extracting things like words, regexes, jq, paths, etc. is specified in pkg/operators/matchers and pkg/operators/extractors. Refer to those packages for a closer look at the topic.
Template Execution
pkg/core provides the engine mechanism which runs the templates/workflows on inputs. It exposes an Execute function which performs the execution while also doing template clustering. Clustering can optionally be disabled by the user.
An example of using the core engine is provided below.
Adding a New Protocol
Protocols form the core of the Nuclei Engine. All the request types like http, dns, etc. are implemented in the form of protocol requests.
A protocol must implement the Protocol and Request interfaces described above in pkg/protocols. We'll take the example of an existing protocol implementation, websocket, for this short reference around Nuclei internals.
The code for the websocket protocol is contained in pkg/protocols/others/websocket.
Below a high level skeleton of the websocket implementation is provided with all the important parts present.
Almost all of these protocols have boilerplate functions for which default implementations have been provided in the providers package. Examples are the implementations of Match, Extract, MakeResultEvent, GetCompiledOperators, etc., which are almost the same throughout the Nuclei protocols code. It is enough to copy-paste them unless customization is required.
The eventcreator package offers the CreateEventWithAdditionalOptions function, which can be used to create result events after doing request execution.
Step by step description of how to add a new protocol to Nuclei -
Add the protocol implementation in the pkg/protocols directory. If it's a small protocol with fewer options, consider adding it to the pkg/protocols/others directory. Add the enum for the new protocol to pkg/templates/types/types.go.
Add the protocol request structure to the Template structure fields. This is done in pkg/templates/templates.go with the corresponding import line.
Also add the protocol case to the Type function as well as the TemplateTypes array in the same templates.go file.
Add the protocol request to the Requests function and the compileProtocolRequests function in the compile.go file in the same directory.
That's it, you've added a new protocol to Nuclei. The next good step would be to write integration tests, which are described in the integration-tests and cmd/integration-tests directories.
Profiling and Tracing
To analyze Nuclei's performance and resource usage, you can generate CPU & memory profiles and trace files using the -profile-mem flag:
This command creates three files:
nuclei.cpu: CPU profile
nuclei.mem: Memory (heap) profile
nuclei.trace: Execution trace
Analyzing the CPU/Memory Profiles
View the profile in the terminal:
Display overall CPU time for processing targets:
Display top memory consumers:
Visualize the profile in a web browser:
Analyzing the Trace File
To examine the execution trace:
These tools help identify performance bottlenecks and memory leaks, allowing for targeted optimizations of Nuclei's codebase.
Project Structure
pkg/reporting - Reporting modules for nuclei.
pkg/reporting/exporters/sarif - Sarif Result Exporter
pkg/reporting/exporters/markdown - Markdown Result Exporter
pkg/reporting/exporters/es - Elasticsearch Result Exporter
pkg/reporting/dedupe - Dedupe module for Results
pkg/reporting/trackers/gitlab - GitLab Issue Tracker Exporter
pkg/reporting/trackers/jira - Jira Issue Tracker Exporter
pkg/reporting/trackers/github - GitHub Issue Tracker Exporter
pkg/reporting/format - Result Formatting Functions
pkg/parsers - Implements template as well as workflow loader for initial template discovery, validation and loading.
pkg/types - Contains CLI options as well as misc helper functions.
pkg/progress - Progress tracking
pkg/operators - Operators for Nuclei
pkg/operators/common/dsl - DSL functions for Nuclei YAML Syntax
pkg/operators/matchers - Matchers implementation
pkg/operators/extractors - Extractors implementation
pkg/catalog - Template loading from disk helpers
pkg/catalog/config - Internal configuration management
pkg/catalog/loader - Implements loading and validation of templates and workflows.
pkg/catalog/loader/filter - Filter filters templates based on tags and paths
pkg/output - Output module for nuclei
pkg/workflows - Workflow execution logic + declarations
pkg/utils - Utility functions
pkg/model - Template Info + misc
pkg/templates - Templates core starting point
pkg/templates/cache - Templates cache
pkg/protocols - Protocol Specification
pkg/protocols/file - File protocol
pkg/protocols/network - Network protocol
pkg/protocols/common/expressions - Expression evaluation + Templating Support
pkg/protocols/common/interactsh - Interactsh integration
pkg/protocols/common/generators - Payload support for Requests (Sniper, etc.)
pkg/protocols/common/executer - Default Template Executer
pkg/protocols/common/replacer - Template replacement helpers
pkg/protocols/common/helpers/eventcreator - Result event creator
pkg/protocols/common/helpers/responsehighlighter - Debug response highlighter
pkg/protocols/common/helpers/deserialization - Deserialization helper functions
pkg/protocols/common/hosterrorscache - Host errors cache for tracking erroring hosts
pkg/protocols/offlinehttp - Offline http protocol
pkg/protocols/http - HTTP protocol
pkg/protocols/http/race - HTTP Race Module
pkg/protocols/http/raw - HTTP Raw Request Support
pkg/protocols/headless - Headless Module
pkg/protocols/headless/engine - Internal Headless implementation
pkg/protocols/dns - DNS protocol
pkg/projectfile - Project File Implementation
Notes
The matching as well as interim output functionality is a bit complex; we should simplify it.