Technical deep-dive · ADAIS

Building AI Data Analysis & Indexing System from Scratch

Joycee Catamora Paragas May 2026 15 min read

A step-by-step guide to building ADAIS — a full-stack AI system that analyses images, video, and documents using a provider switcher for OpenAI, Google Cloud, and HuggingFace. FastAPI backend · React + Vite + Tailwind frontend · fully typed with TypeScript.

Python FastAPI React Vite Tailwind CSS OpenAI HuggingFace Google Cloud TypeScript Docker
// table of contents
  1. Why I built this
  2. Architecture overview
  3. Backend: FastAPI + provider pattern
  4. Implementing the three AI providers
  5. Frontend: React + Vite + Tailwind
  6. TypeScript types from Pydantic schemas
  7. Running it locally
  8. Lessons learned

Why I Built This

Content libraries grow fast. Manually tagging thousands of video, image, and audio files is slow, error-prone, and doesn't scale — every file needs a human to describe it before it becomes searchable. I built ADAIS to automate that entirely.

ADAIS is a full-stack system that accepts any file, detects its type automatically, routes it through an AI provider of your choice, and returns structured, searchable metadata — labels, transcripts, entities, sentiment, dominant colours, and more. No manual tagging required.

// source code

All code is available at github.com/foobearer/ai-content-pipeline. Clone it and follow along or build from scratch with this guide.

Architecture Overview

ADAIS has two parts: a FastAPI backend that handles file ingestion, provider routing, and analysis, and a React + Vite frontend that provides the upload UI and renders results. The key architectural decision was the Provider Pattern — an abstract interface that all three AI providers implement.

React + Vite + Tailwind (port 5173)
         │
         │  POST /analyse/auto  (multipart/form-data)
         ▼
FastAPI Backend (port 8000)
  ┌──────────────────────────────────┐
  │  Route → FileHandler → Provider  │
  │                                  │
  │  ┌──────────────────────────┐    │
  │  │    Provider Factory      │    │
  │  └────┬──────────┬─────┬────┘    │
  └───────┼──────────┼─────┼─────────┘
          │          │     │
     OpenAI    Google  HuggingFace
     GPT-4o   Vision   BLIP+BART
     Whisper  Speech   Whisper-base
Project structure shell
adais/
├── backend/
│   └── src/
│       ├── main.py              # FastAPI app + all routes
│       ├── config.py            # Settings from .env
│       ├── providers/
│       │   ├── base.py          # Abstract interface
│       │   ├── __init__.py      # Provider factory
│       │   ├── openai_provider.py
│       │   ├── google_provider.py
│       │   └── huggingface_provider.py
│       ├── models/
│       │   └── schemas.py       # Pydantic models
│       └── utils/
│           └── file_handler.py  # Upload validation
└── frontend/
    └── src/
        ├── App.tsx
        ├── components/          # One file per component
        ├── hooks/useAnalysis.ts # All state logic
        ├── types/api.ts         # TypeScript types
        └── utils/api.ts         # fetch wrapper

Backend: FastAPI + the Provider Pattern

FastAPI is a good fit for AI pipelines: it supports async handlers natively, generates OpenAPI (Swagger) docs from type hints, and uses Pydantic for request and response validation. Start with the data models, because everything else is built around them.

Define your schemas first

All request and response shapes live in schemas.py — the single source of truth. The frontend TypeScript types mirror these exactly.

backend/src/models/schemas.py Python
from enum import Enum

from pydantic import BaseModel, Field

class Provider(str, Enum):
    OPENAI      = "openai"
    GOOGLE      = "google"
    HUGGINGFACE = "huggingface"

class Label(BaseModel):
    name: str
    confidence: float = Field(..., ge=0.0, le=1.0)  # must lie in [0, 1]

class ImageAnalysis(BaseModel):
    labels:           list[Label] = []
    extracted_text:   str | None = None
    dominant_colours: list[str] = []
    description:      str | None = None
    tags:             list[str] = []

# ProviderInfo, VideoAnalysis, and TextAnalysis also live in this file and
# must be defined before AnalysisResult; they are omitted here for brevity.

class AnalysisResult(BaseModel):
    job_id:         str
    content_type:   str
    provider:       ProviderInfo
    image_analysis: ImageAnalysis | None = None
    video_analysis: VideoAnalysis | None = None
    text_analysis:  TextAnalysis  | None = None
// tip

Using str | None (Python 3.10+ union syntax) instead of Optional[str] is cleaner and now standard. Pydantic v2 supports both.

The abstract base class

This is the heart of the architecture. All three providers implement the same interface — routes never need to know which provider they're calling.

backend/src/providers/base.py Python
from abc import ABC, abstractmethod
from pathlib import Path

from src.models.schemas import ImageAnalysis, TextAnalysis, VideoAnalysis

class BaseProvider(ABC):
    """Interface that every AI provider implements."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    async def analyse_image(self, path: Path) -> ImageAnalysis: ...

    @abstractmethod
    async def analyse_video(self, path: Path) -> VideoAnalysis: ...

    @abstractmethod
    async def analyse_text(self, text: str) -> TextAnalysis: ...

The provider factory

The factory is the only place that knows which class maps to which provider. Lazy imports mean missing optional dependencies don't crash the app on startup.

backend/src/providers/__init__.py Python
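from src.models.schemas import Provider
from src.providers.base import BaseProvider

def get_provider(provider: Provider) -> BaseProvider:
    # Imports are deferred so a missing optional SDK only fails
    # when that provider is actually requested.
    if provider == Provider.OPENAI:
        from src.providers.openai_provider import OpenAIProvider
        return OpenAIProvider()

    if provider == Provider.GOOGLE:
        from src.providers.google_provider import GoogleProvider
        return GoogleProvider()

    if provider == Provider.HUGGINGFACE:
        from src.providers.huggingface_provider import HuggingFaceProvider
        return HuggingFaceProvider()

    # Without this, an unhandled enum member would silently return None.
    raise ValueError(f"Unsupported provider: {provider!r}")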
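
To make the extension path concrete, here is a minimal, self-contained sketch of adding a new provider against the interface above. `EchoProvider` and the `_REGISTRY` dict are hypothetical and not part of the repository, and results are plain dicts rather than the Pydantic models so the snippet runs with the standard library alone. Note that a registry dict trades away the lazy-import benefit of the if-chain, so treat it as an illustration of the pattern rather than a replacement.

```python
import asyncio
from abc import ABC, abstractmethod
from pathlib import Path


class BaseProvider(ABC):
    """Trimmed copy of the interface; results are plain dicts in this sketch."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    async def analyse_image(self, path: Path) -> dict: ...

    @abstractmethod
    async def analyse_video(self, path: Path) -> dict: ...

    @abstractmethod
    async def analyse_text(self, text: str) -> dict: ...


class EchoProvider(BaseProvider):
    """Hypothetical provider that performs no real analysis."""

    @property
    def name(self) -> str:
        return "echo"

    async def analyse_image(self, path: Path) -> dict:
        return {"description": f"image file {path.name}", "tags": []}

    async def analyse_video(self, path: Path) -> dict:
        return {"transcript": "", "source": path.name}

    async def analyse_text(self, text: str) -> dict:
        return {"summary": text[:40], "word_count": len(text.split())}


# Hypothetical registry: the only place that maps a key to a class.
_REGISTRY: dict[str, type[BaseProvider]] = {"echo": EchoProvider}


def get_provider(key: str) -> BaseProvider:
    """Return a provider instance, or raise ValueError for unknown keys."""
    try:
        return _REGISTRY[key]()
    except KeyError:
        raise ValueError(f"Unknown provider: {key!r}") from None


if __name__ == "__main__":
    provider = get_provider("echo")
    print(asyncio.run(provider.analyse_text("hello provider pattern")))
```

Because the caller only depends on `BaseProvider`, swapping `"echo"` for another key changes nothing else in the calling code.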

The main route

With the provider pattern in place, each route is a thin wrapper — validate, get provider, call method, return result.

backend/src/main.py Python
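from fastapi import FastAPI, File, Form, UploadFile

from src.models.schemas import AnalysisResult, Provider
from src.providers import get_provider
# save_upload, extract_text, cleanup, and ContentType are project helpers;
# their imports are omitted here.

app = FastAPI()

@app.post("/analyse/auto", response_model=AnalysisResult)
async def analyse_auto(
    file: UploadFile = File(...),
    provider: Provider = Form(Provider.HUGGINGFACE),
):
    path, content_type = await save_upload(file)

    try:
        ai = get_provider(provider)
        if content_type == ContentType.IMAGE:
            result = await ai.analyse_image(path)
        elif content_type == ContentType.VIDEO:
            result = await ai.analyse_video(path)
        else:
            text = await extract_text(path)
            result = await ai.analyse_text(text)
    finally:
        cleanup(path)  # delete the upload even if analysis raises

    # Simplified: to satisfy response_model, the analysis still has to be
    # wrapped in an AnalysisResult (job_id, content_type, provider).
    return result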

Implementing the Three AI Providers

Each provider implements the same three methods. Here's how each works internally.

Feature           OpenAI         Google Cloud          HuggingFace
Image analysis    GPT-4o Vision  Vision API            BLIP + ViT
Video transcript  Whisper-1      Speech-to-Text        Whisper-base (local)
Text summary      GPT-4o-mini    Not supported         BART-large-CNN
Named entities    GPT-4o-mini    Natural Language API  BERT-NER
API key required  Yes            Yes                   No (runs locally)
Cost              Pay per use    Pay per use           Free

OpenAI — structured JSON from GPT-4o

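The key technique is response_format: {"type": "json_object"} (JSON mode), which constrains GPT-4o to emit syntactically valid JSON. It does not guarantee that the keys or value types match your schema, so pair it with a system prompt that names the expected keys and let Pydantic validate the parsed result.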

backend/src/providers/openai_provider.py Python
import base64
import json
import mimetypes

async def analyse_image(self, path: Path) -> ImageAnalysis:
    image_b64 = base64.standard_b64encode(path.read_bytes()).decode()
    # Guess the MIME type from the file name instead of assuming JPEG.
    mime = mimetypes.guess_type(path.name)[0] or "image/jpeg"

    resp = await self._client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # JSON mode: syntactically valid JSON
        messages=[
            {"role": "system", "content": "Return JSON with keys: labels, "
             "objects, extracted_text, description, tags, is_safe"},
            {"role": "user", "content": [{
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{image_b64}"}
            }]}
        ]
    )
    raw = json.loads(resp.choices[0].message.content)
    # Pydantic validates the shape; any keys not declared on the model
    # are ignored under its default config.
    return ImageAnalysis(**raw)

HuggingFace — lazy-loading local models

Loading a transformer model can take 5 to 10 seconds. Cache each pipeline after its first load so later calls skip that cost, and use run_in_executor so the load does not block FastAPI's async event loop.

backend/src/providers/huggingface_provider.py Python
import asyncio

class HuggingFaceProvider(BaseProvider):

    def __init__(self):
        # Cache keyed by (task, model) so each model loads only once
        self._pipelines: dict = {}

    async def _load_pipeline(self, task: str, model: str):
        key = (task, model)
        if key not in self._pipelines:
            from transformers import pipeline
            # run_in_executor runs the blocking load in a worker thread,
            # so the event loop stays free to serve other requests
            loop = asyncio.get_running_loop()
            self._pipelines[key] = await loop.run_in_executor(
                None, lambda: pipeline(task, model=model)
            )
        return self._pipelines[key]
// key insight

Never call blocking code directly in an async function. Always wrap with run_in_executor so FastAPI can continue serving other requests while a model loads.
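
One gap in the cache above: two requests that arrive while a model is still loading will both start a load. A per-key `asyncio.Lock` is one way to close that gap. The sketch below uses `time.sleep` as a stand-in for the blocking `pipeline(...)` call so it runs without `transformers` installed; `PipelineCache`, `slow_load`, and `demo` are hypothetical names.

```python
import asyncio
import time
from collections import defaultdict


def slow_load(task: str, model: str) -> str:
    """Stand-in for transformers.pipeline(task, model=model); blocks its thread."""
    time.sleep(0.2)
    return f"{task}:{model}"


class PipelineCache:
    def __init__(self, loader=slow_load):
        self._loader = loader
        self._pipelines: dict = {}
        self._locks = defaultdict(asyncio.Lock)  # one lock per (task, model)
        self.load_count = 0

    async def get(self, task: str, model: str):
        key = (task, model)
        async with self._locks[key]:
            if key not in self._pipelines:
                loop = asyncio.get_running_loop()
                # The blocking load runs in the default thread pool.
                self._pipelines[key] = await loop.run_in_executor(
                    None, self._loader, task, model
                )
                self.load_count += 1
        return self._pipelines[key]


async def demo() -> tuple[int, int]:
    """Fire five concurrent requests; return (loads performed, heartbeat ticks)."""
    cache = PipelineCache()
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1  # only advances while the event loop is free

    beat = asyncio.create_task(heartbeat())
    await asyncio.gather(
        *(cache.get("summarization", "demo-model") for _ in range(5))
    )
    beat.cancel()
    return cache.load_count, ticks


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The heartbeat keeps ticking during the load, which shows the event loop is not blocked, and the load counter stays at one despite five concurrent callers.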

Frontend: React + Vite + Tailwind

The frontend follows a clear separation: one custom hook owns all state and API logic, components are purely presentational. Here's the component tree:

Component hierarchy shell
App.tsx
├── ProviderSelector    # three provider cards
├── DropZone            # drag-and-drop file input
├── AnalyseButton       # submit + loading state
└── ResultsPanel        # routes by content type
    ├── MetaCard        # job ID, provider, time
    ├── ImageResults    # labels, OCR, colours
    ├── VideoResults    # transcript + segments
    └── TextResults     # summary, sentiment, entities

The useAnalysis hook — all state in one place

Every piece of state lives in a single custom hook. App.tsx just calls useAnalysis() and passes values to components — zero business logic in the component tree.

frontend/src/hooks/useAnalysis.ts TypeScript
import { useCallback, useState } from 'react'
import type { AnalysisResult, Provider } from '../types/api'
import { analyseFile } from '../utils/api'  // fetch wrapper (assumed location)

type Status = 'idle' | 'analysing' | 'done' | 'error'

export function useAnalysis() {
  const [status, setStatus] = useState<Status>('idle')
  const [result, setResult] = useState<AnalysisResult | null>(null)
  const [file, setFile]     = useState<File | null>(null)
  const [error, setError]   = useState<string | null>(null)

  const analyse = useCallback(async (provider: Provider) => {
    if (!file) return
    setStatus('analysing')
    setError(null)  // clear any error left over from a previous run
    try {
      const data = await analyseFile(file, provider)
      setResult(data)
      setStatus('done')
    } catch (e) {
      setError(e instanceof Error ? e.message : 'Failed')
      setStatus('error')
    }
  }, [file])

  return { status, result, error, file, setFile, analyse }
}

Vite proxy — no CORS headaches in development

Configure Vite to proxy API calls to FastAPI. Your frontend calls /analyse/auto and Vite silently forwards it to localhost:8000.

frontend/vite.config.ts TypeScript
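import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'

export default defineConfig({
  plugins: [react()],
  server: {
    // Forward API paths to the FastAPI backend during development
    proxy: {
      '/analyse':   'http://localhost:8000',
      '/health':    'http://localhost:8000',
      '/providers': 'http://localhost:8000',
    }
  }
})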

TypeScript Types from Pydantic Schemas

Keep your backend Pydantic schemas and frontend TypeScript types in sync. When the API shape changes, TypeScript immediately tells you every component that needs fixing.

frontend/src/types/api.ts TypeScript
// Mirrors backend/src/models/schemas.py exactly

export type Provider = 'openai' | 'google' | 'huggingface'

export interface Label {
  name: string
  confidence: number  // 0.0 – 1.0
}

export interface AnalysisResult {
  job_id:         string
  content_type:   ContentType
  provider:       ProviderInfo
  image_analysis: ImageAnalysis | null
  video_analysis: VideoAnalysis | null
  text_analysis:  TextAnalysis  | null
}
// scaling tip

For larger projects, use openapi-typescript to auto-generate types from FastAPI's OpenAPI spec at /openapi.json.
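
Short of a full generator, a few lines of Python can keep the simplest types aligned. The sketch below emits a TypeScript string-literal union from a Python `Enum`. `ts_union` is a hypothetical helper, not part of the repository, and it only covers string enums, not full Pydantic models.

```python
from enum import Enum


class Provider(str, Enum):
    OPENAI = "openai"
    GOOGLE = "google"
    HUGGINGFACE = "huggingface"


def ts_union(enum_cls: type[Enum]) -> str:
    """Render a string Enum as an exported TypeScript union type."""
    members = " | ".join(f"'{member.value}'" for member in enum_cls)
    return f"export type {enum_cls.__name__} = {members}"


if __name__ == "__main__":
    print(ts_union(Provider))
    # export type Provider = 'openai' | 'google' | 'huggingface'
```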

Running It Locally

You only need Python 3.10+ and Node 18+. Open two terminal windows and follow these steps.

  1. Clone the repo. Get the code from GitHub and navigate into the project.
  2. Start the backend. Create a virtual environment, install dependencies, copy .env.example to .env, and start FastAPI.
  3. Start the frontend. Install Node dependencies and start the Vite dev server in a second terminal.
  4. Open the app. The UI is at localhost:5173, and the auto-generated API docs are at localhost:8000/docs.

Terminal 1 — Backend shell
git clone https://github.com/foobearer/ai-content-pipeline.git
cd ai-content-pipeline/backend
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
uvicorn src.main:app --reload
Terminal 2 — Frontend shell
cd ../frontend  # assumes this terminal starts in ai-content-pipeline/backend
npm install
npm run dev
// first run warning

Using the HuggingFace provider for the first time downloads approximately 2 GB of model weights. This is a one-time download: the weights are cached locally, so later runs skip it and only pay the model load time.

Lessons Learned

Use the provider pattern from day one

A common mistake is coupling your code directly to one AI provider. The AI landscape moves fast: a model that is best today may not be in six months, and you might need to swap for cost, performance, or compliance reasons. The abstract base class pattern handles this from day one. To add a new provider, implement the name property and the three analysis methods, then register it in the Provider enum and the factory. The routes stay untouched. This was the single most valuable architectural decision in this project.

Validate file types by content, not extension

A user can rename virus.exe to photo.jpg. Always read the file's magic bytes — the first few bytes that identify the format — rather than trusting the extension. The file_handler.py utility does this before any file touches an AI model.

run_in_executor is non-negotiable for blocking calls

FastAPI runs async handlers on a single event loop, so calling a blocking function directly (model loading, synchronous SDK calls) stalls every other request on that worker until it returns. Every blocking operation in ADAIS runs in a thread pool via loop.run_in_executor(None, ...).

Keep TypeScript types in sync with Pydantic

For a project this size, manually mirroring the types is fast and readable. The discipline of updating both files when a schema changes is worth it — TypeScript's compiler immediately surfaces every broken component.


The full project is on GitHub at github.com/foobearer/ai-content-pipeline — clone it, run it, and use it as a starting point for your own data analysis and indexing projects. Questions or want to extend it with a new provider? Open an issue or reach out directly.

Joycee Catamora Paragas
Full Stack & AI Engineer with 11 years of experience across React, Angular, TypeScript, Python, and cloud platforms. Currently studying Applied AI & Data Science at MIT Professional Education.