Back to Research
RESEARCH PAPER

InvoiceMix

A full-stack finance workflow studio that uses subset search to match grouped invoices to a target payment.

May 15, 2026
Complexity: Bounded exponential search with pruning
Math Topics
Subset SumBacktrackingSearch PruningComplexity Analysis
Application
AlgorithmsSpring BootReactFinance Workflow

Problem Evidence

Finance pain turned into search constraints

InvoiceMix is not a generic calculator. Each product surface exists because reconciliation has messy inputs, expensive search, and audit expectations.

Constraint

One payment covers multiple invoices

Finance teams manually guess which invoices match the bank total.

System response

Exact combination search

Constraint

Spreadsheet exports vary by source

Amounts arrive as XLSX, XLS, CSV, TSV, or formatted strings.

System response

Upload parser and validation boundary

Constraint

Large invoice sets explode combinatorially

Naive subset generation becomes too expensive.

System response

Backtracking with pruning and result limits

Constraint

Auditors need reproducible answers

A match is not useful unless IDs and amounts are clear.

System response

Combination results and CSV export

Search Pipeline

From finance input to bounded subset search

1
Normalize input

IDs, amounts, target, and optional filters.

2
Validate guardrails

Unique IDs, positive amounts, request size, result limits.

3
Sort invoices

Use ordered amounts so impossible branches stop early.

4
Backtrack

Explore each invoice at most once per combination.

5
Return matches

Expose IDs plus amount lookup for display and export.

Complexity

Bounded exponential search

Reality

Exact subset matching can grow quickly as invoice count increases.

Mitigation

Sorting, pruning, filters, and result caps make normal cases practical.

Boundary

The backend validates uploads before search logic runs.

The full app wraps this search core with uploads, saved scenarios, and CSV export.
Input modes

Manual entry for small sets; upload for XLSX, XLS, CSV, and TSV exports.

Output

Matching invoice IDs, amount lookup, combination count, and exportable results.

Interactive Model

Subset Search Workspace

A compact version of the InvoiceMix engine. It sorts invoices, explores combinations once, and prunes branches when the running total exceeds the remaining target.

Invoices
Matches
1
Nodes
35
Pruned
23
Combination 1$5,320.50
INV-1041 $1,756.50INV-1024 $3,564.00
search trace
visit [root] remaining $5,320.50
visit [INV-1108] remaining $4,880.50
visit [INV-1108, INV-1099] remaining $4,000.00
visit [INV-1108, INV-1099, INV-1083] remaining $2,680.00
visit [INV-1108, INV-1099, INV-1083, INV-1041] remaining $923.50
prune [INV-1108, INV-1099, INV-1083, INV-1041] remaining $923.50
visit [INV-1108, INV-1099, INV-1083, INV-1077] remaining $640.00
prune [INV-1108, INV-1099, INV-1083, INV-1077] remaining $640.00
prune [INV-1108, INV-1099, INV-1083, INV-1024] remaining $2,680.00
visit [INV-1108, INV-1099, INV-1041] remaining $2,243.50
visit [INV-1108, INV-1099, INV-1041, INV-1077] remaining $203.50
prune [INV-1108, INV-1099, INV-1041, INV-1077] remaining $203.50
prune [INV-1108, INV-1099, INV-1041, INV-1024] remaining $2,243.50
visit [INV-1108, INV-1099, INV-1077] remaining $1,960.00

Matching Payments To Invoice Groups

InvoiceMix solves a common reconciliation problem: a finance team receives one payment, deposit, or accounting total, but the payment may cover several invoices. The system needs to find every unique invoice group that exactly reaches the target amount without reusing the same invoice twice.

The full application supports manual entry, spreadsheet upload, saved scenarios, CSV export, min and max invoice counts, and required invoice IDs.

Product Surface

The visual model above shows the product boundary: InvoiceMix turns messy finance inputs into a bounded subset-search workflow, then presents matches in a way finance teams can inspect or export.

The portfolio demo keeps only the core search model. The production app adds spreadsheet parsing, API validation, saved scenarios, CSV export, deployment, and UI workflows around the same problem.

Why This Is A Search Problem

The naive solution is to generate every possible invoice subset and test whether its total equals the target. That search space grows exponentially in the worst case, which becomes expensive quickly.

InvoiceMix makes the problem practical for normal finance workflows by applying guardrails and pruning:

  1. Validate invoice IDs, amounts, target, and filters before searching.
  2. Sort invoices by amount so branches can stop early.
  3. Walk the search tree with backtracking.
  4. Stop a branch when the next invoice is larger than the remaining amount.
  5. Apply min, max, and required-ID filters before returning a match.
  6. Cap request size and result count to avoid runaway searches.

Backend Design

The backend is a Spring Boot API with a focused service boundary:

CombinationController
  -> validates JSON or multipart upload requests
  -> calls CombinationService
  -> returns combinations, counts, and invoice amount lookup
 
ExcelInvoiceParser
  -> reads XLSX, XLS, CSV, and TSV inputs
  -> extracts invoice ID and amount columns
 
CombinationService
  -> sanitizes invoices
  -> enforces unique IDs and positive amounts
  -> runs the backtracking search

The important decision is that file parsing and combination search are separate concerns. Parsing turns messy finance exports into a clean list of { id, amount }; the search service only handles validated domain input.

API Shape

The JSON workflow sends a target, optional filters, and invoice rows:

{
  "target": 150,
  "minInvoices": 1,
  "maxInvoices": 3,
  "requiredInvoiceIds": ["INV-001"],
  "invoices": [
    { "id": "INV-001", "amount": 70 },
    { "id": "INV-002", "amount": 80 },
    { "id": "INV-003", "amount": 50 },
    { "id": "INV-004", "amount": 100 }
  ]
}

The response returns combinations by ID and a lookup table for display/export:

{
  "combinations": [
    ["INV-003", "INV-004"],
    ["INV-001", "INV-002"]
  ],
  "combinationCount": 2,
  "invoiceAmounts": {
    "INV-001": 70,
    "INV-002": 80,
    "INV-003": 50,
    "INV-004": 100
  }
}

Trade-Offs

The algorithm remains exponential in the worst case. That is honest and unavoidable for exact subset matching. The engineering work is about reducing practical cost and protecting the system:

  • Request guardrail: maximum invoices per request.
  • Result guardrail: maximum combinations returned.
  • Domain filters: min invoice count, max invoice count, and required IDs.
  • Early pruning: stop branches that cannot reach the target.
  • Exact arithmetic: backend uses BigDecimal to avoid floating-point accounting errors.

What The Project Demonstrates

InvoiceMix is not only an algorithm exercise. It connects mathematical reasoning to a real finance workflow:

  • Model an accounting problem as subset search.
  • Turn NP-complete theory into a practical bounded tool.
  • Separate parsing, validation, search, and presentation.
  • Build a full-stack product around a narrow business pain.
  • Preserve auditability through clear combinations and CSV export.

Future Improvements

The current roadmap includes custom column mapping, multiple worksheet support, locale-aware currency formatting, partial-match mode, progress and cancel support for large searches, and API authentication for production deployments.