Data Flow
This page documents the end-to-end data flows for core operations in PIP AI.
PIP Upload → Auto-Match
```mermaid
flowchart LR
U[User uploads PIP PDF] --> ST[Supabase Storage]
U --> DOC[Insert document record]
DOC --> JOB[Create upload job]
JOB --> N8N[N8N consumes job]
N8N --> PARSE[OCR/LLM extraction]
PARSE --> UPSERT[Upsert pip_items]
UPSERT --> VEC["Vector search<br/>(brand + area filters)"]
VEC --> MATCH[Insert matches<br/>auto, pending]
MATCH --> UI[UI review + approve]
```

Steps
- Upload: User uploads PIP PDF, stored in `pips/{project_id}/{document_id}/`
- Queue: `pip_ai_documents` row created, `pip_ai_upload_jobs` entry queued
- Process: N8N claims job, downloads PDF via signed URL, runs OCR/LLM extraction
- Extract: Structured items upserted into `pip_ai_pip_items` with `(project_id, code)` uniqueness
- Match: For each item, query `pip_ai_search_specs()` filtered by `brand_name` and `areas[]`
- Store: Top candidates stored in `pip_ai_matches` with scores and evidence
- Review: User approves, rejects, or manually assigns matches in the UI
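
The Match and Store steps can be illustrated with a small in-memory sketch. This is not the project's implementation: the real `pip_ai_search_specs()` is a database function whose signature is not shown on this page, so the `search_specs` helper, the `SpecSection` fields, the `top_k` parameter, and the `source`/`status` keys below are hypothetical. The sketch also assumes that the area filter means "at least one area in common".

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SpecSection:
    # Hypothetical shape of a spec section row, for illustration only.
    spec_number: str
    brand_name: str
    areas: list[str]
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_specs(query, brand_name, areas, specs, top_k=3):
    """Rank same-brand specs by similarity and return auto/pending matches."""
    # Brand filter is always applied.
    candidates = [s for s in specs if s.brand_name == brand_name]
    # Area filter only narrows results when the item carries areas
    # (assumed here to mean any overlap between the two area lists).
    if areas:
        candidates = [s for s in candidates if set(s.areas) & set(areas)]
    scored = [(s.spec_number, cosine(query, s.embedding)) for s in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {"spec_number": number, "score": score, "source": "auto", "status": "pending"}
        for number, score in scored[:top_k]
    ]
```

Filtering before ranking keeps specs from other brands out of the candidate set entirely, rather than relying on a low score to hide them.
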
Spec Upload (Admin)
```mermaid
flowchart LR
A[Admin uploads SPEC PDF] --> ST[Supabase Storage]
A --> DOC[Insert document record]
DOC --> SU[Insert spec_upload]
SU --> JOB[Create upload job]
JOB --> N8N[N8N consumes job]
N8N --> EX[Extract spec sections]
EX --> UP[Upsert spec_sections]
UP --> EMB[Generate embeddings]
EMB --> DONE[Mark completed]
```

Steps
- Upload: Admin uploads spec PDF with brand and area metadata
- Queue: Document and spec upload records created, job queued
- Process: N8N extracts structured sections (spec number, title, vendor, category, keywords)
- Embed: OpenAI `text-embedding-3-small` generates 1536-dim vectors
- Store: Sections upserted with `(spec_upload_id, spec_number)` uniqueness
- Index: HNSW index enables fast vector similarity queries
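
The uniqueness rule in the Store step means that re-processing the same upload overwrites existing sections instead of duplicating them. Here is a minimal sketch of that behaviour, using an in-memory dict in place of the real table. The `upsert_section` helper and the row field names are hypothetical; the length check reflects the 1536-dim embedding size mentioned in the Embed step.

```python
EMBEDDING_DIM = 1536  # vector size stated for text-embedding-3-small


def upsert_section(table: dict, row: dict) -> str:
    """Insert or overwrite a row keyed by (spec_upload_id, spec_number)."""
    if len(row["embedding"]) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dims, got {len(row['embedding'])}"
        )
    key = (row["spec_upload_id"], row["spec_number"])
    action = "updated" if key in table else "inserted"
    table[key] = row  # the last write wins for a given key
    return action
```

Because the key is the pair rather than `spec_number` alone, the same spec number can appear in two different uploads without conflict.
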
Floor Plan Processing
```mermaid
flowchart LR
U[User uploads floor plan] --> ST[Storage]
U --> DOC[Document record]
DOC --> FP[Floor plan record]
FP --> JOB[Upload job]
JOB --> N8N[N8N processes]
N8N --> RENDER[Render PNG + thumbnail]
RENDER --> OCR[OCR spec codes]
OCR --> LINK[Link to spec sections]
LINK --> DONE[Mark completed]
```

Manual Matching Flow
```mermaid
flowchart TD
SELECT[User selects PIP item] --> SEARCH[Search specs by text]
SEARCH --> FILTER["Filter by brand + area<br/>(enforced)"]
FILTER --> RESULTS[Show candidates with scores]
RESULTS --> PICK[User selects spec]
PICK --> UPSERT["Upsert match<br/>(manual, approved)"]
UPSERT --> LEARN["Append to past_pip_requests<br/>(learning loop)"]
```

Document Builder Export
```mermaid
flowchart LR
BUILD[Arrange blocks on canvas] --> PREVIEW[Preview pages]
PREVIEW --> EXPORT[Trigger PDF export]
EXPORT --> SERVER[Server renders pages]
SERVER --> STORE[Store PDF in storage]
STORE --> DOWNLOAD[User downloads]
```

Processing Status Model
All documents follow a consistent status lifecycle:
```mermaid
stateDiagram-v2
[*] --> pending
pending --> processing : Job claimed
processing --> completed : Success
processing --> failed : Error
failed --> processing : Retry
```

Status is tracked in `processing_status` columns and surfaced in the UI via Supabase Realtime subscriptions.
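
The lifecycle is small enough to encode as a transition table. The sketch below is illustrative only: the `ALLOWED` mapping and the `transition` helper are hypothetical names, and the real system may enforce these rules in the database or in the N8N workflow instead.

```python
# Allowed next states for each processing_status value, as in the diagram.
ALLOWED = {
    "pending": {"processing"},               # job claimed
    "processing": {"completed", "failed"},   # success or error
    "failed": {"processing"},                # retry
    "completed": set(),                      # terminal state
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not in the diagram."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```

A guard like this rejects moves the diagram does not allow, such as reopening a `completed` document or skipping straight from `pending` to `completed`.
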
Brand Isolation
All data flows enforce brand isolation:
- Spec queries must include `brand_name = project.brand_name`
- Area filtering narrows results when `pip_item.areas` is available
- RLS policies prevent cross-brand data access at the database level
- User scoping: users can only read specs for brands they have projects in
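
As a rough illustration of the brand filter and user scoping rules, the following sketch applies both checks in application code. It is an assumption-laden example rather than the actual enforcement path, which this page places in RLS policies at the database level. The `readable_brands` and `specs_for_project` helpers, the `member_ids` field, and the dict shapes are all hypothetical.

```python
def readable_brands(user_id: str, projects: list[dict]) -> set[str]:
    """Brands the user may read: those of projects the user belongs to."""
    return {p["brand_name"] for p in projects if user_id in p["member_ids"]}


def specs_for_project(user_id: str, project: dict,
                      projects: list[dict], specs: list[dict]) -> list[dict]:
    """Return only specs whose brand matches the project's brand."""
    brand = project["brand_name"]
    # User scoping: refuse brands the user has no project in.
    if brand not in readable_brands(user_id, projects):
        raise PermissionError(f"user {user_id} has no project for brand {brand}")
    # Mandatory brand filter: brand_name = project.brand_name
    return [s for s in specs if s["brand_name"] == brand]
```

An application-level check like this would only be a second line of defence; the database policies remain the boundary that prevents cross-brand reads.
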