Generators
A Generator is a generic plugin that queries data and creates new nodes and relationships based on the result.
- Within your schema you could create an abstract service object that through a Generator creates other nodes.
- Want to read how Generators can be used to create a service catalog? See our blog post on How to Turn Your Source of Truth into a Service Factory.
High-level design
Generators are defined as a Generator definition within an .infrahub.yml file. A Generator definition consists of a number of related objects.
- Group of targets - Objects that the Generator will act upon
- Generator class - Python code that defines the generation logic
- GraphQL Query - Data collection specification
Running a Generator definition creates new nodes as defined by the Generator and removes old ones that are no longer required. The removal of obsolete objects is handled by the SDK tracking feature.
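The cleanup idea can be pictured with plain set arithmetic. The following is a conceptual sketch only, not the SDK's implementation: the `reconcile` function and the object names are hypothetical and exist purely for illustration.

```python
def reconcile(previous: set[str], current: set[str]) -> dict[str, set[str]]:
    """Compare the objects tracked from the last run with those generated now."""
    return {
        "create": current - previous,  # newly required objects
        "keep": current & previous,    # still required, left in place
        "delete": previous - current,  # obsolete, no longer generated
    }


# Hypothetical example: the last run produced three objects, this run only two.
previous_run = {"widget1-1", "widget1-2", "widget1-3"}
current_run = {"widget1-1", "widget1-2"}
plan = reconcile(previous_run, current_run)
```

Here `plan["delete"]` contains only `widget1-3`, the object that the latest run no longer produces.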
The targets point to a group of objects that are impacted by the Generator. The members of this group can be any type of object within your schema: service objects, devices, contracts, or anything else you want the Generator to act upon. Generator groups (CoreGeneratorGroup) serve as target collections that define which objects trigger Generator execution, while the actual tracking of generated objects is handled by individual Generator instances.
The GraphQL query defines the data that will be collected when running the Generator. Any object identified in this step is added as a member of a GraphQL query group (CoreGraphQLQueryGroup). Membership in these groups is then used to determine which Generators need to be executed as part of a proposed change during the pipeline run.
The Generator itself is a Python class based on the InfrahubGenerator class from the SDK. Like Transformations and checks, Generators are user-defined.
Generators can be executed in several ways, depending on your workflow and where you are in the lifecycle (local development vs. in Infrahub):
- During development with infrahubctl
  Use the infrahubctl generator command to iterate locally while building and testing your Generator.
- Manually from the UI
  From the Infrahub UI, open the Generator Definition detail page (Actions > Generator Definitions) and click Run to trigger the Generator on demand.
- Automatically via Proposed Changes
  When you open a Proposed Change that affects the Generator's targets, the Generator runs as part of Infrahub's CI checks. Review the results in the Checks and Data tabs of the Proposed Change. This behavior can also be disabled per Generator in the repository configuration file.
- Automatically via Events and Actions
  You can configure Infrahub Event rules and Actions to trigger Generators automatically based on changes in your data. This enables fully automated execution aligned with your workflows.
Per-target execution model
Infrahub does not run a Generator once for the entire target group. Instead, it creates one independent run per member of the target group.
When you trigger a Generator definition, Infrahub:
- Fetches the target group and enumerates its members.
- For each member, extracts scoped variables from the target object using the parameters mapping.
- Creates an independent Generator run for that member, passing the scoped variables to the GraphQL query.
A Generator definition targeting a group with 10 members produces 10 separate runs. Each run sees only the data relevant to its specific target object.
Generator Definition
        │
        ▼
   Target Group
   ├── Member A → Run A (variables from A)
   ├── Member B → Run B (variables from B)
   └── Member C → Run C (variables from C)
Each run is fully independent: it has its own query variables, its own query results, and its own Generator instance. Runs do not share state.
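The fan-out can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not Infrahub's internal code: `Run` and `plan_runs` are hypothetical names, and each group member is modeled as a flat dictionary whose keys already match the extraction paths.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One independent Generator run, scoped to a single target."""
    target: str
    variables: dict


def plan_runs(members: list[dict], parameters: dict[str, str]) -> list[Run]:
    """Build one run per group member, each with its own query variables."""
    runs = []
    for member in members:
        # Map every query variable to the value found at its extraction path.
        variables = {var: member[path] for var, path in parameters.items()}
        runs.append(Run(target=member["id"], variables=variables))
    return runs


members = [
    {"id": "A", "name__value": "widget-a"},
    {"id": "B", "name__value": "widget-b"},
    {"id": "C", "name__value": "widget-c"},
]
runs = plan_runs(members, {"name": "name__value"})
```

Three members produce three `Run` objects, and no run holds a reference to another run's variables.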
Query parameter mapping
The parameters field in .infrahub.yml controls how Infrahub extracts variables from each target object and passes them to the GraphQL query. This is the mechanism that scopes each run to its target.
How it works
Given this Generator definition:
generator_definitions:
  - name: widget_generator
    file_path: "generators/widget_generator.py"
    targets: widgets
    query: widget_query
    class_name: WidgetGenerator
    parameters:
      name: "name__value"
And this GraphQL query:
query Widgets($name: String!) {
  TestWidget(name__value: $name) {
    edges {
      node {
        name { value }
        count { value }
      }
    }
  }
}
For each member of the widgets group, Infrahub:
- Reads the parameter mapping: name → "name__value"
- Extracts the value from the target object using the defined path
- Passes it as a query variable
For example:
| Target object | Extraction path | Extracted value | Query variable |
|---|---|---|---|
| widget1 | widget1.name.value | "widget1" | $name = "widget1" |
| widget2 | widget2.name.value | "widget2" | $name = "widget2" |
Each run's GraphQL query only returns data for its specific target, keeping runs independent.
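To make the scoped result concrete, here is a sketch of what one run might do with the response to the query above. The behavior is an assumption made for illustration: the hypothetical `plan_resources` function derives one object name per unit of the widget's count, and the sample count of 2 is invented. In a real Generator this logic would live in the Generator class and create nodes through the SDK client.

```python
def plan_resources(data: dict) -> list[str]:
    """Return the names of the objects one run would create from its result."""
    # The query is scoped by $name, so exactly one widget is expected here.
    widget = data["TestWidget"]["edges"][0]["node"]
    name = widget["name"]["value"]
    count = widget["count"]["value"]
    # Hypothetical naming scheme: <widget name>-<index>, starting at 1.
    return [f"{name.lower()}-{index}" for index in range(1, count + 1)]


# Response shape for the run scoped to widget1 (the count value is made up).
response_for_widget1 = {
    "TestWidget": {
        "edges": [
            {"node": {"name": {"value": "widget1"}, "count": {"value": 2}}}
        ]
    }
}
resource_names = plan_resources(response_for_widget1)
```

The run for widget2 would receive a different response and never see the data for widget1.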
Double-underscore notation
The double-underscore (__) in parameter values traverses the object hierarchy:
- name__value: attribute name, property value
- location__name__value: relationship location (cardinality-one), then attribute name, property value
The first segment is checked against the object's schema. If it matches an attribute, the remaining segments traverse the attribute's properties. If it matches a cardinality-one relationship, Infrahub fetches the related node and continues the traversal recursively.
Only cardinality-one relationships are supported in parameter paths. Cardinality-many relationships cannot be traversed this way.
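The traversal rules can be sketched as a small recursive resolver. This is an illustrative model, not the code Infrahub runs: `Node` is a hypothetical stand-in for a schema-aware object, and the `location` and `tags` relationships on the sample widget are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an object plus its schema; not an SDK class."""
    attributes: dict[str, dict[str, object]] = field(default_factory=dict)
    one_rels: dict[str, "Node"] = field(default_factory=dict)
    many_rels: dict[str, list["Node"]] = field(default_factory=dict)


def resolve(node: Node, path: str) -> object:
    """Resolve a double-underscore path such as 'location__name__value'."""
    first, _, rest = path.partition("__")
    if first in node.attributes:
        # The remaining segment names a property of the attribute.
        return node.attributes[first][rest]
    if first in node.one_rels:
        # Follow the cardinality-one relationship and keep traversing.
        return resolve(node.one_rels[first], rest)
    if first in node.many_rels:
        raise ValueError(f"cardinality-many relationship {first!r} cannot be traversed")
    raise KeyError(f"{first!r} is neither an attribute nor a relationship")


site = Node(attributes={"name": {"value": "atl1"}})
widget = Node(
    attributes={"name": {"value": "widget1"}},
    one_rels={"location": site},
    many_rels={"tags": []},
)
direct = resolve(widget, "name__value")
via_relationship = resolve(widget, "location__name__value")
```

Calling `resolve(widget, "tags__name__value")` raises a `ValueError` in this sketch, mirroring the rule that cardinality-many relationships are not supported in parameter paths.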
Parallel execution
Because each run is independent (scoped to one target object with no shared state), Infrahub dispatches all runs for a Generator definition concurrently.
This means:
- All members of a target group are processed in parallel, not sequentially.
- Performance scales with available workers, not with target count. A group with 100 members doesn't take 100x longer than a group with 1 member.
- Different Generator definitions can also run concurrently when triggered independently.
What this means for Generator design
Because runs are concurrent:
- Your Generator code should not depend on side effects from other runs of the same Generator.
- Each run should be self-contained: it reads its scoped data, creates its objects, and finishes.
- If you need ordering (layer A must complete before layer B starts), use separate Generator definitions with a trigger mechanism rather than relying on execution order within a single definition. See modular Generators for this pattern.
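The concurrency model can be approximated with asyncio. This is a sketch of the idea, not of Infrahub's dispatcher, which hands runs to workers: `run_for_target` and `run_definition` are hypothetical names, and the short sleep stands in for the query and object creation.

```python
import asyncio


async def run_for_target(target: str) -> tuple[str, str]:
    """One self-contained run: it only touches its own inputs and outputs."""
    await asyncio.sleep(0.01)  # stands in for the query and object creation
    return target, "ready"


async def run_definition(targets: list[str]) -> dict[str, str]:
    """Start every run at once and wait for all of them to finish."""
    results = await asyncio.gather(*(run_for_target(t) for t in targets))
    return dict(results)


statuses = asyncio.run(run_definition(["widget1", "widget2", "widget3"]))
```

Because no run reads state written by another, the order in which the runs complete does not affect the result.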
Generator instances
Each per-target run creates or updates a CoreGeneratorInstance, a tracking object that links three things together:
- The Generator definition that was run
- The target object (the specific group member)
- The status of that run (pending, ready, or error)
Generator instances enable:
- Per-target status tracking: you can see which targets succeeded and which failed, without needing to inspect logs.
- Selective re-runs: you can re-run the Generator for a single target object without affecting others. Only the instance for that target gets updated.
- Object lifecycle management: the instance links the Generator to the objects it created, enabling cleanup when a target is removed.
You can view Generator instances in the Infrahub UI under the Generator Definition detail page.
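Per-target tracking and selective re-runs can be modeled as follows. This is a simplified sketch, not the CoreGeneratorInstance schema: the `GeneratorInstance` fields and the `run_target` helper are hypothetical, and success or failure is passed in as a flag instead of coming from a real run.

```python
from dataclasses import dataclass


@dataclass
class GeneratorInstance:
    """Simplified tracking record linking a definition, a target and a status."""
    definition: str
    target: str
    status: str = "pending"


def run_target(
    instances: dict[str, GeneratorInstance],
    definition: str,
    target: str,
    succeeded: bool,
) -> GeneratorInstance:
    """Create or update the instance for one target; others are untouched."""
    instance = instances.setdefault(target, GeneratorInstance(definition, target))
    instance.status = "ready" if succeeded else "error"
    return instance


instances: dict[str, GeneratorInstance] = {}
run_target(instances, "widget_generator", "widget1", succeeded=True)
run_target(instances, "widget_generator", "widget2", succeeded=False)
status_after_first_pass = {t: i.status for t, i in instances.items()}

# Selective re-run: only the failed target is executed again.
run_target(instances, "widget_generator", "widget2", succeeded=True)
status_after_rerun = {t: i.status for t, i in instances.items()}
```

After the first pass the failed target is visible from its status alone, and the re-run changes only the record for `widget2`.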
Designing groups for parallelism
Since group structure determines execution structure, how you organize your target groups directly affects parallelism and operational flexibility.
The principle
Group at the level where you want independent execution. If racks should generate independently, make racks the target, not pods. If entire sites should generate as a unit, make sites the target.
More members in the target group means more parallel runs and better utilization of available workers.