The Rules Engine: applying business logic to extracted data

Extraction gets the data out of your documents. The Rules Engine makes it right for your systems. Set conditions once — Harold applies them on every document, automatically.

Harold Team·7 March 2026·6 min read

What is the Rules Engine?

The Rules Engine is where you encode your business logic into Harold. Extraction tells you what is on the document. The Rules Engine tells Harold what to do with it.

A rule pairs a condition with an action. If a field value matches the condition, Harold applies the action to a target field. Rules run automatically at processing time: you write them once and Harold applies them to every document it processes from then on.

Why rules matter

Raw extracted data is rarely ready to go straight into a downstream system. The supplier's document says one thing; your ERP expects something different. The gap between those two states — supplier language versus your internal standard — is where the Rules Engine lives.

Some common examples. A supplier always writes the payment term as "net 30", but your system requires "NET30". A delivery note always carries a site code in a reference field, and that code should populate a cost centre. A particular supplier always omits the currency symbol, so the currency needs to default to GBP. Each of these is a rule.

How rules work

Each rule has three parts.

A condition — the trigger. You specify a field (for example, payment_terms), an operator (equals, contains, starts with, is empty, matches pattern), and a value to match against.

A target field — the field the rule should write to. This can be the same field as the condition or a different one.

An action — what to write. You can set a fixed value, transform the existing value (uppercase, lowercase, trim whitespace, replace substring), copy a value from another field, or set a default when a field is empty.
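The three parts can be sketched as a small Python data model. This is an illustration only, not Harold's implementation or API: the names Rule, matches and apply_rule are hypothetical, documents are modelled as plain dictionaries of strings, and only a subset of the actions is shown.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical model of one rule; not Harold's actual schema."""
    condition_field: str  # condition: field to inspect
    operator: str         # equals | contains | starts_with | is_empty | matches_pattern
    value: str            # condition: value to compare against (unused by is_empty)
    target: str           # field the action writes to
    action: str           # set | uppercase | lowercase | trim | copy
    argument: str = ""    # fixed value for "set", source field name for "copy"


def matches(rule: Rule, doc: dict) -> bool:
    """Return True if the document satisfies the rule's condition."""
    current = doc.get(rule.condition_field) or ""
    if rule.operator == "equals":
        return current == rule.value
    if rule.operator == "contains":
        return rule.value in current
    if rule.operator == "starts_with":
        return current.startswith(rule.value)
    if rule.operator == "is_empty":
        return current.strip() == ""
    if rule.operator == "matches_pattern":
        return re.search(rule.value, current) is not None
    raise ValueError(f"unknown operator: {rule.operator}")


def apply_rule(rule: Rule, doc: dict) -> dict:
    """Return an updated copy of doc if the condition matches, else doc unchanged."""
    if not matches(rule, doc):
        return doc
    existing = doc.get(rule.target) or ""
    if rule.action == "set":
        new_value = rule.argument
    elif rule.action == "uppercase":
        new_value = existing.upper()
    elif rule.action == "lowercase":
        new_value = existing.lower()
    elif rule.action == "trim":
        new_value = existing.strip()
    elif rule.action == "copy":
        new_value = doc.get(rule.argument) or ""
    else:
        raise ValueError(f"unknown action: {rule.action}")
    return {**doc, rule.target: new_value}


# The payment-terms and currency examples from earlier in this guide.
normalise_terms = Rule("payment_terms", "equals", "net 30", "payment_terms", "set", "NET30")
default_currency = Rule("currency", "is_empty", "", "currency", "set", "GBP")

extracted = {"payment_terms": "net 30", "currency": ""}
after_terms = apply_rule(normalise_terms, extracted)
after_currency = apply_rule(default_currency, after_terms)
# after_currency is {"payment_terms": "NET30", "currency": "GBP"}
```

In Harold itself you build the same thing with dropdown selectors rather than code; the sketch only makes the condition, target and action explicit.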

Rules are evaluated in the order they are listed. If several rules match the same document, they are applied one after another, so a later rule sees any values written by an earlier one. You can reorder rules by dragging them.
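Why order matters can be shown with a short Python sketch. It assumes that each rule sees the values left by the rules before it, and it models rules as hypothetical (condition, action) pairs of functions; run_rules and the two sample rules are illustrative names, not part of Harold.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, str]
# Hypothetical representation: each rule is a (condition, action) pair of functions.
RulePair = Tuple[Callable[[Doc], bool], Callable[[Doc], Doc]]


def run_rules(doc: Doc, rules: List[RulePair]) -> Doc:
    """Apply rules top to bottom; each rule sees the output of the ones before it."""
    for condition, action in rules:
        if condition(doc):
            doc = action(doc)
    return doc


# Rule A: if payment_terms starts with a space, trim surrounding whitespace.
trim_terms: RulePair = (
    lambda d: d["payment_terms"].startswith(" "),
    lambda d: {**d, "payment_terms": d["payment_terms"].strip()},
)
# Rule B: if payment_terms equals "net 30", set it to "NET30".
normalise_terms: RulePair = (
    lambda d: d["payment_terms"] == "net 30",
    lambda d: {**d, "payment_terms": "NET30"},
)

raw = {"payment_terms": " net 30"}
trim_first = run_rules(raw, [trim_terms, normalise_terms])
normalise_first = run_rules(raw, [normalise_terms, trim_terms])
# trim_first      is {"payment_terms": "NET30"}
# normalise_first is {"payment_terms": "net 30"}
```

With the trim rule first, the equality check sees the cleaned value and fires. With the order reversed, the check runs against " net 30", does not match, and the value is only trimmed.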

Conditional chaining

Rules can be chained. You can create a rule that only fires if a previous rule also fired, or that fires when a field reaches a certain state after another rule has run. This lets you build multi-step logic without needing a developer.

For example: Rule 1 sets payment_terms to "NET30" if the raw value contains "30". Rule 2 then checks whether supplier_name is blank after extraction and, if so, defaults it to "Unknown Supplier". Both rules run in order on every document. A chained rule could then be set to act only when payment_terms equals "NET30", which is the state Rule 1 leaves behind.
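The two rules in this example, plus one rule that depends on Rule 1 having fired, can be sketched in Python as follows. The process function, the fired list, Rule 3 and its payment_days field are all assumptions made for illustration; the guide does not describe how Harold tracks which rules fired.

```python
from typing import Dict, List, Tuple

Doc = Dict[str, str]


def process(doc: Doc) -> Tuple[Doc, List[str]]:
    """Run the rules in order and record which ones fired."""
    fired: List[str] = []

    # Rule 1: payment_terms contains "30" -> set payment_terms to "NET30".
    if "30" in doc.get("payment_terms", ""):
        doc = {**doc, "payment_terms": "NET30"}
        fired.append("rule_1")

    # Rule 2: supplier_name is empty -> default to "Unknown Supplier".
    if doc.get("supplier_name", "").strip() == "":
        doc = {**doc, "supplier_name": "Unknown Supplier"}
        fired.append("rule_2")

    # Rule 3 (hypothetical, not from the guide): chained on Rule 1.
    # It fires only if Rule 1 fired, and writes an illustrative payment_days field.
    if "rule_1" in fired:
        doc = {**doc, "payment_days": "30"}
        fired.append("rule_3")

    return doc, fired


chained, fired = process({"payment_terms": "Net 30 days", "supplier_name": ""})
untouched, none_fired = process({"payment_terms": "Due on receipt", "supplier_name": "Acme Ltd"})
# fired      is ["rule_1", "rule_2", "rule_3"]
# none_fired is []
```

Rules 1 and 2 are independent of each other and simply run in order. Rule 3 is the chained one: it never runs on the second document because Rule 1 did not fire there.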

What rules cannot do

Rules operate on individual documents. They cannot compare values across documents — for example, "flag this invoice if the same invoice number appeared last week". Cross-document logic belongs in a downstream system.

Rules also cannot make network calls or look up external data. They work on the extracted data that is already in Harold.

Ease of use

The Rules Engine is the most technically demanding part of Harold, not because it requires code but because it requires you to think clearly about your data requirements. Users who invest that time early save a great deal of downstream manual work. Users who skip it end up making the same corrections repeatedly.

The interface uses plain-language condition builders with dropdown selectors rather than code. Most rules take under two minutes to create. We recommend building your rules incrementally as you encounter each new edge case rather than trying to anticipate everything upfront.

Limitations

Rules are applied in memory at processing time. They do not retroactively update historical documents — they only apply to documents processed after the rule was created. If you create a new rule and want it to apply to existing documents, you will need to reprocess them.
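The effect can be illustrated with a short Python sketch. It assumes the raw extraction is still available when a document is reprocessed; process, default_currency and the sample invoice are hypothetical and stand in for whatever Harold does internally.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]
RuleFn = Callable[[Doc], Doc]


def process(raw: Doc, rules: List[RuleFn]) -> Doc:
    """Apply the rules that exist right now to a copy of the raw extraction."""
    doc = dict(raw)
    for rule in rules:
        doc = rule(doc)
    return doc


def default_currency(doc: Doc) -> Doc:
    """If currency is empty, default it to GBP."""
    if doc.get("currency", "").strip() == "":
        return {**doc, "currency": "GBP"}
    return doc


rules: List[RuleFn] = []
raw_invoice = {"invoice_number": "INV-001", "currency": ""}

stored = process(raw_invoice, rules)       # processed before the rule existed
rules.append(default_currency)             # rule created afterwards
reprocessed = process(raw_invoice, rules)  # reprocessing picks up the new rule
# stored["currency"]      is still ""
# reprocessed["currency"] is "GBP"
```

Creating the rule changes nothing about the result already stored; only the second pass through process applies it.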

The aim

The aim of the Rules Engine is to eliminate the gap between what Harold extracts and what your systems need. Once your rule library is built, Harold's output should require no manual adjustment before it enters your ERP, accounting system, or Zapier workflow. Extraction gets the data. Rules make it right.
