State, Lifecycle & Deployment

Understanding Workflow State & Lifecycle

Workflows and their individual steps transition through various states during execution. Understanding these helps in debugging and managing your automated processes.

Workflow Instance States

A workflow instance represents a single, end-to-end execution of a workflow definition. Its status reflects the overall progress of that execution. The primary instance states, as reflected in the GraphQL API (WorkflowInstanceStatus), are:

ACTIVE: The workflow instance has started and is currently processing, or is waiting for one of its steps to complete (e.g., a step is SLEEPING, PENDING, REQUESTING, or RELYING). It has not yet reached a final state.
PAUSED: The instance was explicitly paused via an API call (e.g., GraphQL pause mutation). It can be resumed later via the resume mutation, returning it to its previous operational state (which would typically be ACTIVE while its current step continues).
COMPLETED: The workflow’s execute function has successfully run to completion and returned a value. This is a final state.
FAILED: An unrecoverable error occurred during the workflow’s execution (e.g., an error thrown after exhausting all retries for a step, or an internal engine error). This is a final state.
TERMINATED: The instance was explicitly stopped via an API call (e.g., GraphQL terminate mutation). This is a final state.

You can query the current status of any workflow instance using the GraphQL API.
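
The split between final and non-final instance states can be captured in a small helper, for example to decide whether a polling loop should keep going. This is a minimal sketch: the `InstanceStatus` union is declared locally to mirror the names listed above, and `isFinalStatus` is a hypothetical helper, not part of the SDK.

```typescript
// Local mirror of the instance states listed above (WorkflowInstanceStatus).
type InstanceStatus = 'ACTIVE' | 'PAUSED' | 'COMPLETED' | 'FAILED' | 'TERMINATED';

// The three states the guide describes as final.
const FINAL_STATUSES: ReadonlySet<InstanceStatus> = new Set<InstanceStatus>([
  'COMPLETED',
  'FAILED',
  'TERMINATED',
]);

// Hypothetical helper: true once an instance can no longer change state.
function isFinalStatus(status: InstanceStatus): boolean {
  return FINAL_STATUSES.has(status);
}
```

A caller polling the API could stop as soon as `isFinalStatus(instance.status)` returns true; a PAUSED instance is not final because it can still be resumed.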

Workflow Step States

Each activity invoked within a workflow (such as flow.do, flow.sleep, or flow.dialog) becomes a “step” with its own lifecycle. The state of the current step often explains why an ACTIVE instance is waiting. Key step states (WorkflowStepStatus) include:

ACTIVE: The step’s main logic (e.g., the task function in flow.do) is currently executing.
SLEEPING: The step was initiated by flow.sleep() and is paused until the specified time. The instance remains ACTIVE.
PENDING: The step was initiated by flow.dialog() and is waiting for an external continuation (e.g., user input via UI). The instance remains ACTIVE.
REQUESTING: The step was initiated by flow.request() and is waiting for an external callback or for its polling condition to be met. The instance remains ACTIVE.
RELYING: The step was initiated by flow.start() and is waiting for the initiated sub-workflow instance to complete. The instance remains ACTIVE.
CONTINUING: A transient state when a PENDING or REQUESTING step receives a continuation request and is processing the incoming data before resolving or potentially failing.
COMPLETED: The step’s logic has finished successfully.
ERRORED: The step’s logic encountered an error. If retries are configured, it might transition back to ACTIVE or a waiting state. If all retries are exhausted, the step remains ERRORED, which may then cause the parent workflow instance to transition to FAILED.

Understanding both instance and step states is key. For example, a workflow instance can be ACTIVE because its current step is SLEEPING.
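
When debugging an ACTIVE instance that appears stuck, it can help to translate the current step status into the activity that caused the wait. The lookup below is a sketch built only from the descriptions above; the `StepStatus` union and the `explainWait` helper are illustrative names, not SDK exports.

```typescript
// Local mirror of the step states listed above (WorkflowStepStatus).
type StepStatus =
  | 'ACTIVE'
  | 'SLEEPING'
  | 'PENDING'
  | 'REQUESTING'
  | 'RELYING'
  | 'CONTINUING'
  | 'COMPLETED'
  | 'ERRORED';

// Waiting states mapped to the activity that started them and what they wait for.
const WAITING_ON: Partial<Record<StepStatus, string>> = {
  SLEEPING: 'flow.sleep: waiting until the specified time',
  PENDING: 'flow.dialog: waiting for an external continuation',
  REQUESTING: 'flow.request: waiting for a callback or polling condition',
  RELYING: 'flow.start: waiting for the sub-workflow instance to complete',
};

// Hypothetical helper: returns undefined for states that are not waiting states.
function explainWait(status: StepStatus): string | undefined {
  return WAITING_ON[status];
}
```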

Lifecycle & Event Sourcing

Every transition between these states (for both instances and steps), every activity completion, every error, and every variable update (flow.vars) is recorded as an immutable event in the instance’s history.

When a workflow needs to resume (e.g., after a flow.sleep step completes or a flow.dialog step is externally continued), the engine replays the relevant events to reconstruct the exact state (including flow.vars and dependency instances from flow.use) where it left off.
This event sourcing mechanism is what provides the durability, auditability, and resilience of IdentityFlow workflows.
You can query the full event history for any instance via the GraphQL API (instance.events) for detailed debugging and auditing.

While you don’t typically interact with the event log directly within your workflow code, understanding its existence and purpose helps grasp how the engine manages state and resumes execution reliably.
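
To make the replay idea concrete, here is a toy model of event sourcing: state is never stored directly but is rebuilt by folding over an append-only event list. The event shapes (`VARS_UPDATED`, `STEP_STATUS_CHANGED`) and the `replay` function are assumptions made for this illustration; the engine's actual event types and replay logic are not shown in this guide.

```typescript
// Hypothetical event shapes, invented for this sketch.
type WorkflowEvent =
  | { type: 'VARS_UPDATED'; vars: Record<string, unknown> }
  | { type: 'STEP_STATUS_CHANGED'; step: string; status: string };

interface ReplayedState {
  vars: Record<string, unknown>;
  steps: Record<string, string>;
}

// Rebuild state by applying every recorded event in order.
function replay(events: readonly WorkflowEvent[]): ReplayedState {
  const state: ReplayedState = { vars: {}, steps: {} };
  for (const event of events) {
    switch (event.type) {
      case 'VARS_UPDATED':
        // Later updates overwrite earlier values for the same key.
        state.vars = { ...state.vars, ...event.vars };
        break;
      case 'STEP_STATUS_CHANGED':
        state.steps[event.step] = event.status;
        break;
    }
  }
  return state;
}

// Example history: a sleep step starts, progress is exposed, the sleep finishes.
const history: WorkflowEvent[] = [
  { type: 'STEP_STATUS_CHANGED', step: 'wait', status: 'SLEEPING' },
  { type: 'VARS_UPDATED', vars: { progress: 50 } },
  { type: 'STEP_STATUS_CHANGED', step: 'wait', status: 'COMPLETED' },
];

const replayed = replay(history);
```

Because the history is immutable and the fold is deterministic, replaying the same events always yields the same state, which is the property that lets an instance resume after a pause.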

Workflow Metadata (`meta`)

You can attach arbitrary JSON-serializable metadata to workflow definitions and instances:

Definition Metadata: Set in defineWorkflow config. See Defining Workflows.

defineWorkflow(
  {
    // ... other config
    meta: { ownerTeam: 'billing', criticality: 'high' },
  } /* ... */,
);

Instance Metadata: Passed via the meta property when starting an instance (e.g., via the GraphQL start mutation) or when starting a sub-workflow with flow.start.

// Inside flow.start activity function
(ctx) => ({
  workflow: 'sub-workflow-name',
  params: {
    /* ... */
  },
  meta: {
    correlationId: flow.instance.meta.correlationId || flow.instance.id,
    parentId: flow.instance.id,
  },
});

Accessing Metadata: Read via flow.definition.meta and flow.instance.meta.
System Metadata: IdentityFlow automatically includes system metadata like traceId, spanId, correlationId, causationId in flow.instance.meta for observability and tracing across instances and steps. While optional according to the type definitions, these fields are typically populated or propagated by the standard IdentityFlow server implementation based on engine configuration and incoming requests.

Metadata is useful for categorization, routing, reporting, and tracing workflows.
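
The correlation pattern from the flow.start snippet above can be pulled into a small pure function, which makes the fallback behavior easy to see and to test. This is a sketch: `InstanceLike` and `childMeta` are hypothetical names, and only the `id` and `meta.correlationId` fields used in the snippet are modeled.

```typescript
// Minimal stand-in for the parts of flow.instance used in the snippet above.
interface InstanceLike {
  id: string;
  meta: { correlationId?: string; [key: string]: unknown };
}

// Hypothetical helper: build the meta object to pass to a sub-workflow.
function childMeta(parent: InstanceLike): { correlationId: string; parentId: string } {
  return {
    // Reuse the parent's correlation ID, or start a new chain from the parent's own ID.
    correlationId: parent.meta.correlationId || parent.id,
    parentId: parent.id,
  };
}

// A root instance without a correlation ID starts the chain with its own ID.
const rootInstance: InstanceLike = { id: 'root-1', meta: {} };
const firstLevel = childMeta(rootInstance);

// A nested sub-workflow keeps the same correlation ID but points at its direct parent.
const secondLevel = childMeta({ id: 'child-1', meta: firstLevel });
```

With this pattern every instance in a tree of sub-workflows shares one correlationId, while parentId records only the direct parent.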

Deployment & Execution (High-Level)

This guide focuses on developing workflows using the @identity-flow/sdk. Once you have defined your workflow in a TypeScript file, you need to make it available to the IdentityFlow engine and then manage its execution (instances).

Details on workflow definition discovery by the engine (compilation, bundling, search paths) can be found in the Defining Workflows guide.

Managing Instances via GraphQL API:

While the SDK defines how a workflow behaves, you typically manage its execution—starting new instances, querying status, and handling interactions—through the IdentityFlow GraphQL API (usually available at http://localhost:4000/graphql locally).

Here are some key operations you’ll perform via the API:

Starting a Workflow Instance: Use the workflowDefinition mutation to find your registered definition and then call start.

mutation StartSimpleFlow($params: JSON!) {
  workflowDefinition(by: { name: "@carv/simple-flow", releaseChannel: "latest" }) {
    start(input: { params: $params, label: "My Simple Instance" }) {
      success
      instance {
        id
        status
      }
      error {
        message
      }
    }
  }
}

Querying Instances: Find instances by ID, status, definition, etc., using the workflowInstance or workflowInstances queries.

query GetInstanceStatus($instanceId: ID!) {
  workflowInstance(by: { id: $instanceId }) {
    id
    label
    status
    message
    createdAt
    updatedAt
    finishedAt
    vars # Access exposed variables
    steps {
      nodes {
        id
        name
        status
        kind
      }
    }
    events {
      nodes {
        id
        type
        message
        createdAt
      }
    }
  }
}

Continuing Pending Steps (dialog/request): When a step is PENDING or REQUESTING after flow.dialog or flow.request (the instance itself remains ACTIVE), find the step’s token (often included in step.data or instance.vars depending on your define function) and pass it to the workflowActivity mutation to continue the step.

mutation ContinueDialog($token: ID!, $responseData: JSON!) {
  workflowActivity(token: $token) {
    continue(input: { status: COMPLETED, data: $responseData }) {
      success
      instance {
        id
        status # ACTIVE while the workflow continues, or a final state if it finished
      }
      error {
        message
      }
    }
  }
}

Controlling Instances (pause, resume, terminate): Use mutations on workflowInstance or workflowInstances to control the lifecycle.

mutation PauseInstance($instanceId: ID!) {
  workflowInstance(by: { id: $instanceId }) {
    pause(input: { reason: "Manual intervention needed" }) {
      success
      instance {
        status
      }
    }
  }
}
# Similar mutations exist for resume and terminate
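
Any of the operations above can be sent from code as a standard GraphQL-over-HTTP POST request. The sketch below only builds the request, so it runs without a server. It assumes the local endpoint mentioned earlier and no authentication (your deployment may require headers not covered in this guide); `buildGraphQLRequest` is a hypothetical helper, not an SDK function.

```typescript
// Local endpoint mentioned in this guide; adjust for your deployment.
const ENDPOINT = 'http://localhost:4000/graphql';

// The pause mutation from above, with the reason passed inline.
const PAUSE_INSTANCE = `
  mutation PauseInstance($instanceId: ID!) {
    workflowInstance(by: { id: $instanceId }) {
      pause(input: { reason: "Manual intervention needed" }) {
        success
        instance {
          status
        }
      }
    }
  }
`;

interface GraphQLRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical helper: package a query and its variables as a JSON POST request.
function buildGraphQLRequest(
  query: string,
  variables: Record<string, unknown>,
  url: string = ENDPOINT,
): GraphQLRequest {
  return {
    url,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, variables }),
    },
  };
}

// 'instance-123' is a placeholder ID for illustration.
const pauseRequest = buildGraphQLRequest(PAUSE_INSTANCE, { instanceId: 'instance-123' });
```

To send the request in an environment with a global fetch (for example Node.js 18 or later), call `fetch(pauseRequest.url, pauseRequest.init)` and read the JSON response. Keeping request construction separate from sending makes the query and variables easy to unit test.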

Summary:

SDK (@identity-flow/sdk): You write .ts files using defineWorkflow to define the logic.
Engine: Registers these definitions (details in Defining Workflows).
GraphQL API: You interact with the engine to start, query, continue, and control instances of those definitions.

Keep this separation in mind as you build and operate your workflows.

Troubleshooting & Best Practices

Here are some tips for developing and debugging IdentityFlow workflows. For a comprehensive set of guidelines, also refer to our Rules of Workflows document.

Common Pitfalls:

Non-Deterministic Code: Avoid logic in your execute function that depends on non-deterministic sources, such as Math.random(), new Date() without caching, or external API calls not wrapped in flow.do or flow.request. The engine relies on deterministic replay for resilience, so a value that differs between the original run and a replay can send the workflow down a different path. (See Workflow Rule #5).
Incorrect flow.use() Usage: Using inline anonymous functions for bindings (flow.use(() => new Client())) will bypass caching and potentially create resource leaks. Always use stable function references. (See Dependencies & Logging).
Missing await: Forgetting await on flow.do, flow.sleep, etc., causes unexpected behavior because the execute function continues past the step instead of pausing until it resolves.
Unclear Step Names: Using generic or duplicate names for steps (flow.do('step1', ...) multiple times) breaks idempotency and makes debugging harder. (See Workflow Rule #5).
Large flow.vars: Avoid storing excessive or complex data in flow.vars. Use it primarily for exposing essential status or progress information externally. (See Workflow Rule #3).
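
The flow.use pitfall is easier to see with a toy cache. The sketch below assumes that bindings are cached by function identity, which is consistent with the advice above but is not a description of the SDK's actual implementation; `use`, `Client`, and `createClient` are all illustrative names.

```typescript
// Toy model only: a cache keyed by the binding function's identity.
const bindingCache = new Map<() => unknown, unknown>();

function use<T>(binding: () => T): T {
  if (!bindingCache.has(binding)) {
    bindingCache.set(binding, binding());
  }
  return bindingCache.get(binding) as T;
}

// Stand-in for an expensive external client; counts how many were constructed.
class Client {
  static created = 0;
  constructor() {
    Client.created += 1;
  }
}

// Stable reference: the same function object is passed on every call.
const createClient = () => new Client();
use(createClient);
use(createClient);
const createdWithStableRef = Client.created;

// Inline functions: each call passes a brand-new function object, so the cache never hits.
use(() => new Client());
use(() => new Client());
const createdWithInlineFns = Client.created - createdWithStableRef;
```

In this model the stable reference constructs one client for two calls, while the inline functions construct a new client on every call and leave stale entries in the cache, which is how a resource leak can arise.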

Debugging Tips:

Use Logging Extensively: Add flow.log, flow.debug, flow.info, flow.warn, flow.error calls throughout your workflow logic. These logs are associated with the instance and step in the engine. (See Dependencies & Logging).
Query Instance History: Use the GraphQL API to inspect the status, vars, steps, and especially the events history of a workflow instance. The event log provides a detailed trace of execution.
Check Engine Logs: Consult the logs of the IdentityFlow engine itself for lower-level errors or processing details.
Isolate Issues: If a complex workflow fails, try commenting out sections or simplifying logic to pinpoint the problematic step or interaction.
Test Individual Steps: Write unit tests for complex logic within flow.do tasks where possible.

Best Practices Summary:

Keep Workflows Focused: Aim for workflows that model a single, well-defined business process.
Use Sub-Workflows (flow.start): Break down complex processes into smaller, reusable sub-workflows. (See Core Workflow Activities).
Validate Inputs/Outputs: Use schema validation (schema option) for workflow params and critical activity results. (See Defining Workflows).
Handle Errors Gracefully: Use try...catch for expected errors and configure sensible retries. (See Error Handling & Retries).
Use Descriptive Names: Give clear, unique names to workflow definitions and steps.
Manage Dependencies Correctly: Use flow.use with stable binding functions for external services.
Leverage flow.vars Appropriately: Expose meaningful status or progress via flow.vars, but manage internal processing state with standard variables.

By following these guidelines, you can build robust, maintainable, and debuggable workflows with IdentityFlow.

Next Steps

Now that you have a comprehensive understanding of developing workflows, explore advanced topics and practical examples:

Deep Dive Topics: Explore advanced concepts like Workflow Rules, Testing, and Observability.

Workflow Examples: See practical implementations of common workflow patterns.