Developer Documentation

Harold API

Harold’s public API is the surface exposed to external platforms — primarily Zapier — allowing authenticated clients to list user resources, receive document-completion webhooks, and exchange OAuth credentials.

Base URL: https://www.useharold.com
Protocol: HTTPS only
Format: JSON, except the OAuth token exchange, which uses application/x-www-form-urlencoded per RFC 6749

1. Authentication

Harold uses OAuth 2.0 with the Authorization Code + PKCE flow. This is the flow Zapier’s Platform UI uses by default.

OAuth endpoints

Purpose                Method  URL
Authorization URL      GET     https://www.useharold.com/oauth/authorize
Access Token URL       POST    https://www.useharold.com/api/oauth/token
Test URL (auth test)   GET     https://www.useharold.com/api/oauth/me

Scopes

Harold issues scoped access tokens. Clients should request only the scopes they need:

  • read — Access processed documents, extracted fields, and run history
  • schemas — Read the user’s DocuTrain schemas (required for dynamic dropdowns)
  • webhooks — Register REST Hook subscriptions for document-completion events

A typical Zapier integration requests all three: scope=read schemas webhooks.

Authorization request

Redirect the user to:

https://www.useharold.com/oauth/authorize
  ?client_id={your_client_id}
  &redirect_uri={your_callback_url}
  &response_type=code
  &scope=read schemas webhooks
  &state={opaque_state_string}
  &code_challenge={pkce_challenge}
  &code_challenge_method=S256

The user signs in to Harold and approves access. Harold redirects back to redirect_uri with ?code={auth_code}&state={state}.

Token exchange

POST https://www.useharold.com/api/oauth/token
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code
&code={auth_code}
&redirect_uri={same_as_authorize}
&client_id={your_client_id}
&client_secret={your_client_secret}
&code_verifier={pkce_verifier}

Success response (200):

{ "access_token": "harold_xxxxx", "token_type": "bearer", "scope": "read schemas webhooks" }

Error responses:

  • 400 invalid_request — missing required field
  • 400 invalid_grant — code is invalid, expired, or already used
  • 400 unsupported_grant_type — grant_type is not authorization_code
  • 401 invalid_client — client_id/client_secret mismatch
  • 500 server_error — token issuance failed

Access tokens do not currently expire but may be revoked by the user at any time.
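
As a minimal sketch, the token exchange can be prepared with the Python standard library as shown below. The helper name `build_token_request` is hypothetical; only the URL, the form fields, and the content type come from this document.

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://www.useharold.com/api/oauth/token"


def build_token_request(code, redirect_uri, client_id, client_secret,
                        code_verifier):
    # The body is form-encoded (not JSON), as required for this endpoint.
    form = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,  # must match the authorize request
        "client_id": client_id,
        "client_secret": client_secret,
        "code_verifier": code_verifier,
    }
    return Request(
        TOKEN_URL,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; on success the JSON body carries the access_token field shown above.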

Using the access token

All authenticated endpoints require:

Authorization: Bearer {access_token}

Test endpoint — verify the connection works

GET https://www.useharold.com/api/oauth/me
Authorization: Bearer {access_token}

Response:

{
  "id": "user-uuid",
  "email": "user@example.com",
  "name": "Charlie Example",
  "plan": "starter",
  "scopes": ["read", "schemas", "webhooks"]
}

Zapier uses the email field as the connection label (shown to users in the Zap editor as “Connected as …”).
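
A client might use the test endpoint both to confirm that a token works and to check which scopes were granted. The sketch below assumes the response shape shown above; `build_me_request` and `missing_scopes` are illustrative names, not part of the API.

```python
from urllib.request import Request

BASE_URL = "https://www.useharold.com"


def build_me_request(access_token: str) -> Request:
    # Every authenticated call carries the token as a Bearer credential.
    return Request(
        BASE_URL + "/api/oauth/me",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def missing_scopes(profile: dict,
                   required=("read", "schemas", "webhooks")) -> list:
    # Compare the "scopes" array in the response with what the client needs.
    granted = set(profile.get("scopes", []))
    return [scope for scope in required if scope not in granted]
```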

2. Rate Limiting

Harold applies per-endpoint rate limits. The relevant limits for external integrations:

  • OAuth token exchange: 30 per 15 minutes per IP
  • GET endpoints (schemas, inboxes, runs): 100 per 15 minutes per user
  • Webhook subscribe / unsubscribe: 60 per 15 minutes per user

Rate limit headers are returned on every response:

  • X-RateLimit-Limit — maximum requests in the window
  • X-RateLimit-Remaining — requests left in the current window
  • X-RateLimit-Reset — Unix timestamp when the window resets

Exceeding the limit returns 429 Too Many Requests. Clients should implement exponential backoff.
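
One way to combine exponential backoff with the X-RateLimit-Reset header is sketched below. The base delay, the cap, and the choice to wait at least until the reset time are assumptions made for illustration; Harold does not prescribe a specific strategy. The headers argument is assumed to be a plain dict keyed exactly as the header names are written above.

```python
def retry_delay(attempt, headers, now, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    # Exponential backoff: base, 2*base, 4*base, ... limited to `cap`.
    backoff = min(cap, base * 2 ** attempt)
    try:
        # X-RateLimit-Reset is a Unix timestamp; `now` is time.time().
        until_reset = float(headers.get("X-RateLimit-Reset", "")) - now
    except ValueError:
        until_reset = 0.0  # header missing or not numeric
    # Retrying before the window resets would only produce another 429.
    return max(backoff, until_reset)
```

In production code, adding a small random jitter to the returned delay helps avoid many clients retrying at the same instant.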

3. Error Format

All error responses return JSON with at least an error field:

{ "error": "Unauthorized" }

Some endpoints include additional context in an error_description field (OAuth endpoints follow RFC 6749).
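
A client can turn an error response into a readable message along the lines of the sketch below. The function name and the "unknown_error" fallback (used when the body is not the documented JSON object) are assumptions, not values Harold returns.

```python
import json


def describe_error(status: int, body: bytes) -> str:
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}  # body was empty or not valid JSON
    if not isinstance(payload, dict):
        payload = {}
    error = payload.get("error", "unknown_error")  # fallback is hypothetical
    detail = payload.get("error_description")      # optional extra context
    message = f"HTTP {status}: {error}"
    return f"{message} ({detail})" if detail else message
```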

Standard status codes

Status  Meaning
200     Success
204     Success, no response body (used by unsubscribe)
400     Bad request — validation error, malformed JSON
401     Unauthorized — missing, invalid, or revoked token
404     Not found — resource does not belong to user
429     Too many requests
500     Internal server error

4. Resource Endpoints — Dynamic Dropdowns

These endpoints return lists of user-owned resources and are used by Zapier’s dynamic dropdowns in the Zap editor.

List schemas

GET /api/zapier/schemas
Authorization: Bearer {access_token}

Returns the user’s active DocuTrain schemas (invoice templates, receipt templates, PO templates, and any user-trained supplier schemas).

Response (200):

[
  { "id": "3f2b…-uuid", "name": "UK Standard Invoice" },
  { "id": "a91d…-uuid", "name": "Acme Corp Monthly PO" }
]

Zapier input-designer configuration:

  • Type: Dynamic Dropdown
  • Label field: name
  • Value field: id

List inboxes

GET /api/zapier/inboxes
Authorization: Bearer {access_token}

Returns the user’s active inboxes that have Zapier enabled.

Response (200):

[
  {
    "id": "inbox-uuid",
    "name": "Purchase Invoices",
    "address": "harold-inbox+xxxx@useharold.com",
    "document_type": "invoice",
    "confidence_threshold": 80
  }
]

5. Trigger Endpoint — Document Completion

Harold uses Zapier REST Hooks for triggers (not polling). The flow is:

  1. User enables a Zap — Zapier calls POST /api/zapier/subscribe
  2. Harold processes a document — Harold POSTs the payload to Zapier’s stored hookUrl
  3. User disables the Zap — Zapier calls DELETE /api/zapier/unsubscribe

Subscribe

POST /api/zapier/subscribe
Authorization: Bearer {access_token}
Content-Type: application/json

{
  "hookUrl": "https://hooks.zapier.com/hooks/standard/xxxx/yyyy/",
  "schemaId": "3f2b…-uuid",
  "event": "inbox.document.complete",
  "zapId": "optional-zapier-zap-id"
}

Harold creates a webhook_connection scoped to the given schemaId, or activates a pending placeholder connection if one already exists. There is one connection per schema, so multiple sender rules feeding the same schema share one Zap.

Response (200):

{ "id": "webhook-connection-uuid" }

Zapier stores this id and passes it back on unsubscribe.

Required fields:

  • hookUrl — Zapier’s callback URL (must be a valid HTTPS URL)
  • schemaId — UUID of a Harold schema the user owns

Error responses:

  • 400 — hookUrl missing or malformed, or invalid JSON body
  • 401 — token invalid
  • 404 — schemaId does not belong to the authenticated user
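
The subscribe body can be validated on the client before it is sent, which avoids a round trip that would end in a 400. The sketch below is illustrative: the helper name is hypothetical, and defaulting event to inbox.document.complete is an assumption based on the example request, since only hookUrl and schemaId are listed as required.

```python
import json
from urllib.parse import urlparse


def build_subscribe_body(hook_url, schema_id,
                         event="inbox.document.complete", zap_id=None):
    parsed = urlparse(hook_url)
    # Mirror the documented rule: hookUrl must be a valid HTTPS URL.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("hookUrl must be a valid HTTPS URL")
    if not schema_id:
        raise ValueError("schemaId is required")
    body = {"hookUrl": hook_url, "schemaId": schema_id, "event": event}
    if zap_id is not None:
        body["zapId"] = zap_id  # optional field from the example request
    return json.dumps(body).encode("utf-8")
```

The returned bytes are sent as the body of POST /api/zapier/subscribe with Content-Type: application/json and the Bearer token.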

Unsubscribe

DELETE /api/zapier/unsubscribe
Authorization: Bearer {access_token}
Content-Type: application/json

{
  "id": "webhook-connection-uuid"
}

Deactivates the connection. No further payloads are sent to the Zap.

Response: 204 No Content on success. If the connection has already been removed, the call still succeeds and returns 200 { "ok": true }, so unsubscribe is idempotent.
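
Because unsubscribe can answer with either 204 or 200 { "ok": true }, a client should treat both as success. A minimal sketch, with a hypothetical function name:

```python
import json


def unsubscribe_succeeded(status: int, body: bytes = b"") -> bool:
    if status == 204:
        return True  # connection deactivated, no response body
    if status == 200:
        try:
            # Documented shape for the already-removed case.
            return json.loads(body).get("ok") is True
        except (ValueError, AttributeError):
            return False  # not JSON, or not a JSON object
    return False
```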

Sample / preview payload

GET /api/zapier/runs

Zapier calls GET /api/zapier/runs?schemaId={schemaId} during Zap setup to fetch a sample payload for field mapping. Harold returns either:

  • Recent real documents processed under that schema (if any exist), OR
  • A synthetic sample built from the schema’s full set of output fields (header fields, line item fields, formula rule outputs, KeyMatch fields, GL match suggestions) so users can map every column immediately after setup without needing to process a document first.

GET /api/zapier/runs?schemaId={schemaId}&limit=3
Authorization: Bearer {access_token}

Response shape (flattened — all schema fields at top level):

[
  {
    "id": "document-uuid",
    "run_id": "run-uuid",
    "file_name": "invoice-acme-2026-0412.pdf",
    "processed_at": "2026-04-22T12:34:56Z",
    "inbox_id": "inbox-uuid",
    "schema_id": "schema-uuid",
    "confidence": 0.94,
    "status": "complete",
    "supplier_name": "Acme Corp",
    "invoice_number": "INV-2026-0412",
    "invoice_date": "2026-04-12",
    "total_amount": "1234.56",
    "currency": "GBP",
    "vat_amount": "246.91",
    "net_amount": "987.65",
    "line_items": [
      {
        "description": "Widgets",
        "quantity": "10",
        "unit_price": "98.76",
        "line_total": "987.60"
      }
    ]
  }
]

Webhook payload (when Harold fires the Zap)

When a document finishes processing and the schema has an active subscription, Harold POSTs to the stored hookUrl with the same flattened payload as above. Headers include:

Content-Type: application/json
User-Agent: Harold-Webhook/1.0
X-Harold-Event: inbox.document.complete
X-Harold-Signature: sha256={hmac-of-body-with-connection-secret}

Consumers should verify the signature.
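
A signature check might look like the sketch below. It assumes the header carries a hex-encoded HMAC-SHA256 of the raw request body, keyed with the connection secret; the encoding and the way the secret reaches the consumer are not specified in this document, so treat both as assumptions to confirm before relying on this.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    # Header format shown above: "sha256=<digest>".
    scheme, _, received = header_value.partition("=")
    if scheme != "sha256" or not received:
        return False
    # Assumption: the digest is hex-encoded and computed over the raw body.
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid timing leaks.
    return hmac.compare_digest(
        expected.encode("ascii"),
        received.strip().lower().encode("utf-8"),
    )
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since a re-encoded body will generally not produce the same digest.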

6. Connection Management

Disconnect (user-initiated, called from Harold’s UI)

POST /api/zapier/disconnect

Revokes all of the current user’s active OAuth tokens. The user must be logged in to Harold; this is not an OAuth-authenticated endpoint.

Connection status (internal)

GET /api/zapier/status

Returns whether the current logged-in user has an active Zapier connection. Used by Harold’s Settings page to render connect/disconnect UI.

7. Field Types and Data Format

Harold returns all extracted values as strings by default — currency values have symbols, commas, and whitespace stripped, leaving a plain decimal string (e.g. "1234.56" not "£1,234.56"). This is deliberate: downstream Zap actions handle type coercion differently per destination (Sheets, QuickBooks, Xero) and keeping values as strings avoids lossy conversions.

Line items are always returned as an array under the line_items key, even when the schema has no line items configured (empty array).
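
Since amounts arrive as plain decimal strings, a consumer that needs arithmetic can parse them with Python's decimal module rather than float, which avoids binary rounding errors. The helper names below are illustrative, and line_total is the field name from the sample payload; a given schema may use different keys.

```python
from decimal import Decimal, InvalidOperation


def to_decimal(value):
    """Return a Decimal for a plain decimal string, or None if it fails."""
    try:
        return Decimal(value)
    except (InvalidOperation, TypeError, ValueError):
        return None  # empty string, None, or non-numeric text


def line_items_total(document: dict, key: str = "line_total") -> Decimal:
    total = Decimal("0")
    # line_items is always present as an array, possibly empty.
    for item in document.get("line_items", []):
        amount = to_decimal(item.get(key))
        if amount is not None:
            total += amount
    return total
```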

Confidence scores are floats between 0 and 1:

  • 0.9–1.0 — clearly visible, unambiguous
  • 0.7–0.9 — readable but slightly unclear
  • 0.5–0.7 — partially obscured or ambiguous label match
  • Below 0.5 — not found or inferred

Per-field confidence is available as {field_key}_confidence. Document-level confidence is available as overall_confidence.
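
The per-field keys make it straightforward to flag values for manual review. The sketch below collects fields whose {field_key}_confidence falls under a threshold; the 0.7 default and the function name are assumptions chosen for illustration.

```python
def low_confidence_fields(document: dict, threshold: float = 0.7) -> list:
    suffix = "_confidence"
    flagged = []
    for key, value in document.items():
        # Skip the document-level score and any non-confidence keys.
        if key == "overall_confidence" or not key.endswith(suffix):
            continue
        if isinstance(value, (int, float)) and value < threshold:
            flagged.append(key[: -len(suffix)])  # recover the field key
    return sorted(flagged)
```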

8. Changelog

  • 2026-04 — Initial public API release for Zapier integration.

9. Support

For API questions, integration help, or to report an issue:

For Zapier-specific support: the Zap editor’s built-in support channels route to Zapier, who may escalate to Harold where relevant.