Build a working information pipeline: discover content, score it, cluster it, extract what matters. All on your GPU.
Course orientation and environment setup
Reproducible pipelines, scoring, feedback loops
Build a multi-stage data pipeline where each stage declares its inputs and outputs. Run ML inference locally on consumer hardware.
25 min
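A stage that declares its inputs and outputs can be scheduled automatically: run whatever is ready, repeat until done. A minimal sketch of the idea — all names here are illustrative, not the course's actual API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    """One pipeline step that declares what it reads and writes."""
    name: str
    inputs: list
    outputs: list
    run: Callable[[dict], dict]

def execute(stages, store):
    """Run any stage whose declared inputs are available, until all finish."""
    pending = list(stages)
    while pending:
        ready = [s for s in pending if all(k in store for k in s.inputs)]
        if not ready:
            raise RuntimeError("unsatisfied inputs: " + ", ".join(s.name for s in pending))
        for s in ready:
            store.update(s.run({k: store[k] for k in s.inputs}))
            pending.remove(s)
    return store

# Hypothetical two-stage pipeline: fetch produces "raw", score consumes it.
fetch = Stage("fetch", [], ["raw"], lambda _: {"raw": ["doc a", "doc b"]})
score = Stage("score", ["raw"], ["scored"],
              lambda d: {"scored": [(t, len(t)) for t in d["raw"]]})

# Declaration order doesn't matter; the dependency declarations do.
result = execute([score, fetch], {})
```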
Score content across independent quality dimensions where each axis tells you something different. Graduated classification over binary keep/discard.
20 min
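Graduated classification can be as simple as mapping averaged axis scores onto tiers instead of a single yes/no gate. A toy sketch — the axes and cutoffs here are made up for illustration:

```python
def tier(scores, cutoffs=(0.75, 0.5, 0.25)):
    """Map the mean of per-axis scores onto graded tiers
    rather than a binary keep/discard decision."""
    avg = sum(scores.values()) / len(scores)
    if avg >= cutoffs[0]:
        return "keep"
    if avg >= cutoffs[1]:
        return "review"
    if avg >= cutoffs[2]:
        return "archive"
    return "discard"

# Hypothetical quality axes — each scored independently upstream.
axes = {"novelty": 0.9, "depth": 0.6, "credibility": 0.8}
label = tier(axes)
```

The point of the middle tiers: content that fails one axis but excels on another lands in "review" rather than being silently dropped.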
Build feedback loops that measure accuracy, identify blind spots, and improve the system through its own operation.
25 min
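One concrete form that measurement can take: per-class accuracy against human labels, which surfaces blind spots a single overall number hides. A minimal sketch with hypothetical labels:

```python
from collections import defaultdict

def accuracy_by_class(pairs):
    """pairs: (predicted, actual) label tuples from human review.
    Per-class accuracy exposes where the classifier is blind,
    which an aggregate accuracy figure would average away."""
    hits, totals = defaultdict(int), defaultdict(int)
    for pred, actual in pairs:
        totals[actual] += 1
        hits[actual] += int(pred == actual)
    return {c: hits[c] / totals[c] for c in totals}

# Overall accuracy is 2/3, but the breakdown shows the real problem:
# the system misses half of what humans marked "discard".
labels = [("keep", "keep"), ("keep", "discard"), ("discard", "discard")]
report = accuracy_by_class(labels)
```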
Discovery mechanisms, interactive review, scale, extraction
Discovery mechanisms that find what you didn't know to search for. Measure the unexpected systematically.
30 min
Build interactive review tools where human judgments feed back into the pipeline. Browsing becomes quality improvement.
25 min
Scale pipelines to larger data with selective execution control. Freeze what's stable, iterate on what's changing.
25 min
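Selective execution usually comes down to caching stage outputs keyed on their inputs, plus a freeze switch for stages you've declared stable. A rough sketch of that pattern — assuming JSON-serializable inputs and pickleable outputs, not the course's actual mechanism:

```python
import hashlib
import json
import os
import pickle

def run_stage(name, fn, inputs, cache_dir=".cache", frozen=False):
    """Rerun fn only when its inputs change; frozen stages
    reuse their latest cached result and never rerun."""
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()
    path = os.path.join(cache_dir, f"{name}-{key}.pkl")
    if frozen:
        hits = sorted(f for f in os.listdir(cache_dir) if f.startswith(name + "-"))
        if hits:
            with open(os.path.join(cache_dir, hits[-1]), "rb") as fh:
                return pickle.load(fh)
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    result = fn(inputs)
    with open(path, "wb") as fh:
        pickle.dump(result, fh)
    return result
```

Freeze the embedding stage while you iterate on clustering, and only the cheap stages rerun.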
Extract structured resources from scored content. Catch data integrity problems before they propagate downstream.
20 min
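An integrity gate can be a small validator that quarantines bad records instead of passing them downstream. A sketch with hypothetical field names:

```python
def validate(records, required=("title", "url", "score")):
    """Split extracted records into clean rows and quarantined ones,
    so integrity problems stop here instead of propagating."""
    clean, quarantined = [], []
    for r in records:
        missing = [k for k in required if not r.get(k)]
        bad_score = not isinstance(r.get("score"), (int, float))
        (quarantined if missing or bad_score else clean).append(r)
    return clean, quarantined

# One good record, one with an empty field, one with a malformed score.
records = [
    {"title": "a", "url": "u", "score": 0.5},
    {"title": "b", "url": "", "score": 0.3},
    {"title": "c", "url": "v", "score": "high"},
]
clean, quarantined = validate(records)
```

Quarantined records stay inspectable, which is what makes the failure mode debuggable rather than silent.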
Clustering, architectural evaluation, visualization, domain transfer
Choose the right dimensionality reduction and clustering algorithms for your data. UMAP, PCA, HDBSCAN, k-means, and why parameters matter as much as algorithm choice.
30 min
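In practice this lesson's territory belongs to libraries like umap-learn, scikit-learn, and hdbscan. As a dependency-free stand-in, here is a minimal k-means that makes the parameter point concrete: k, the iteration budget, and the seed shape the output as much as the algorithm does.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means over tuples of floats. The choice of k is
    a claim about the data's structure, not a tuning detail."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center (squared distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[i].append(p)
        # recompute each center as its group's mean; keep empty centers as-is
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, groups

# Two well-separated blobs: k=2 recovers them, k=3 would invent a split.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, groups = kmeans(pts, 2)
```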
Evaluate whether architectural decisions produce meaningful results or structural artifacts. Strip complexity while preserving what works.
30 min
Build interactive hierarchical visualizations for navigating large classified datasets, from product taxonomies to knowledge bases.
30 min
Apply the full pipeline to any domain by swapping the source adapter. The architecture is the reusable part.
20 min
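Swapping the source adapter works because downstream stages only ever see a fixed record shape, never the source itself. A minimal sketch of that contract — the record fields and names here are illustrative:

```python
class ListAdapter:
    """Hypothetical adapter: any object with a fetch() method that
    yields records of a fixed shape can feed the pipeline."""
    def __init__(self, docs):
        self.docs = docs

    def fetch(self):
        for i, text in enumerate(self.docs):
            yield {"id": i, "text": text}

def run_pipeline(adapter):
    # Downstream stages depend on the record shape, not the source:
    # swap ListAdapter for an RSS, arXiv, or filesystem adapter and
    # nothing below this line changes.
    return [r["text"].lower() for r in adapter.fetch()]

out = run_pipeline(ListAdapter(["Hello", "World"]))
```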