Topic Cluster Generator

Free · No account needed

Free keyword clustering tool

Upload a CSV export or paste raw text. Then adjust the settings and hit "Generate clusters" to see your topic map.

Add your data

Upload CSV

Adjust settings


Cluster visualization

Your cluster map renders here. Export to PNG or CSV, or share a link that stays live for 30 days.

Upload a CSV or paste some text above, then click "Generate clusters" to see your topic map here. Click a cluster to zoom in.

Workflow

How to cluster keywords from a CSV or pasted list

Upload a spreadsheet or paste lines directly. Lexical mode runs locally in your browser; Semantic mode sends text to Gemini embeddings. Nothing is stored unless you choose to share.

  1. Upload a CSV from your crawler or CMS, or paste one title, keyword, or URL per line.

  2. Pick Lexical for spreadsheet-style overlap, or Semantic when similar intent uses different wording.

  3. Adjust similarity, cluster size limits, and optional AI labels, then run Generate clusters.

  4. Swap visualizations, export CSV, or publish a sanitized share link for your team.

Controls

Settings available in the keyword clustering tool

Fine-tune grouping tightness, cluster sizes, and how results appear on screen.

Similarity threshold

Controls how tight each keyword group is in Lexical mode, or in Semantic mode when Custom threshold is selected.

Min / max cluster size

Prevents tiny noise clusters or oversized catch-all buckets.

Lexical vs Semantic

Lexical runs TF-IDF locally; Semantic sends text to Gemini embeddings — nothing is stored unless you share.

Show cluster labels

Overlays group names on charts for screenshots and decks.

K-means & AI labels

In Semantic mode, pick a target cluster count and let the model name each group.

Visualization

Treemap, force tree, tidy tree, icicle, circle packing, or 3D — same clusters, different lens.

Keyword clustering that respects confidentiality

Ideal when procurement blocks SaaS uploads but leadership still expects polished charts. Pair this flow with AI generation when you need net-new angles.

Six chart styles for spreadsheet skeptics

Treemaps communicate weightings; trees expose overlap between buckets — toggle freely without recomputing Lexical scores.

Treemap

See cluster sizes at a glance. The nested rectangles make it immediately obvious which topic groups dominate your site.

Force-directed tree

Explore how pages connect to each other. Drag nodes and zoom in to trace relationships across the full content graph.

Tidy tree

A clean hierarchical view that shows the depth and branching of every cluster, ideal for presenting site architecture.

Icicle chart

Space-efficient stacked bars that let you click into any cluster and zoom through the hierarchy level by level.

Circle packing

Circles within circles. A visually rich layout that makes nested cluster relationships intuitive to read at a glance.

3D knowledge graph

An immersive three-dimensional network where nodes float in space and edges flow between them. Rotate, zoom, and orbit to navigate clusters from any angle.

Practical safeguards for enterprise keyword lists

Ship stakeholder-ready visuals without sacrificing compliance. Lexical clustering stays client-side until you opt into sharing.

Keyword clustering without handing data to a third party

Lexical TF-IDF vectors are computed inside a dedicated worker thread on your machine. In Lexical mode, your spreadsheets stay on your device unless you publish a sanitized snapshot; Semantic mode sends text to Gemini embeddings.

Flexible inputs from CMS or crawler exports

Mix Title, Meta Description, URL, Keywords, or Unique Inlinks columns — only one textual column is required to seed similarity scoring.

Share cluster outlines without leaking URLs

Generate a share URL that stays live for 30 days so teammates can review grouping logic without exposing proprietary titles.

Frequently asked questions

Keyword uploads, CSV semantics, and collaboration guardrails.

What CSV columns do you accept?

Title, Meta Description, URL, Keywords, and Unique Inlinks may appear in any combination. Provide at least one text-bearing column.
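As a rough illustration of how a CSV with any mix of these columns could be turned into one text string per row, here is a minimal Python sketch. The function name `rows_to_documents`, the choice to treat Unique Inlinks as a non-text column, and the error message are all assumptions made for this example; the tool's actual parsing rules are not documented here.

```python
import csv
import io

# Column names come from the answer above. Treating "Unique Inlinks" as
# numeric (and therefore not text-bearing) is an assumption for this sketch.
TEXT_COLUMNS = ["Title", "Meta Description", "URL", "Keywords"]


def rows_to_documents(csv_text: str) -> list[str]:
    """Join whichever text-bearing columns are present into one string per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = [c for c in TEXT_COLUMNS if c in (reader.fieldnames or [])]
    if not present:
        raise ValueError("provide at least one text-bearing column")
    docs = []
    for row in reader:
        parts = [(row.get(c) or "").strip() for c in present]
        text = " ".join(p for p in parts if p)
        if text:  # skip rows whose text columns are all empty
            docs.append(text)
    return docs


sample = "Title,Unique Inlinks\nBest running shoes,12\nTrail shoe guide,4\n"
print(rows_to_documents(sample))
# → ['Best running shoes', 'Trail shoe guide']
```

A file that only contains Unique Inlinks would raise the `ValueError` in this sketch, mirroring the "at least one text-bearing column" requirement.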

How large can my keyword list be?

Lists of roughly ten thousand rows stay responsive because clustering runs in a worker thread. Larger lists still work but take longer to process.

What is the difference between Lexical and Semantic mode?

Lexical builds sparse vectors with TF-IDF over the text we derive from titles, keywords, meta descriptions, and URL paths from your CSV or pasted text. Items land in the same cluster when they literally share important words or stems, which is great for messy spreadsheets, overlapping product names, or URLs that encode topics in slugs.

Semantic sends combined text snippets to Gemini embeddings so similarity reflects meaning, not spelling. Different wording about the same intent can still merge.

Start with Lexical when you want deterministic, offline-friendly grouping; switch to Semantic when synonyms and paraphrases split the Lexical map too finely.
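To make the Lexical idea concrete, the sketch below computes TF-IDF vectors and cosine similarity in plain Python. It is an illustration under stated assumptions, not the tool's implementation: the tokenizer, the smoothed IDF formula, and the function names are choices made for this example, and no stemming is applied.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase alphanumeric tokens; no stemming in this sketch.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (an assumed variant) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = [
    "trail running shoes",
    "running shoes for trails",
    "sneakers for jogging off-road",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]) > 0)  # shared words "running" and "shoes"
print(cosine(vecs[0], vecs[2]))      # no shared words, so similarity is 0.0
```

The third phrase expresses a similar intent to the first but shares no tokens with it, so its lexical similarity is zero. That is the kind of split that Semantic mode is meant to close.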

What does K-means do in Semantic mode?

K-means partitions embedding vectors into groups automatically. Roughly speaking, cluster count scales with how many documents you have versus your minimum cluster size — raising minimum size yields fewer, broader themes without tuning a cosine cutoff by hand. Use it when you want the algorithm to infer group boundaries while embeddings stay fixed for your dataset.
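One way to picture the relationship between document count, minimum cluster size, and cluster count is the sketch below. The `estimate_k` heuristic (documents divided by minimum size) is a hypothetical reading of "roughly speaking" above, not the tool's formula, and the toy 2-D points stand in for real embedding vectors. The evenly spaced initial centroids are also an assumption, chosen so the example is reproducible.

```python
import math


def estimate_k(n_docs: int, min_cluster_size: int) -> int:
    # Hypothetical heuristic: a larger minimum size yields fewer clusters.
    return max(1, n_docs // max(1, min_cluster_size))


def kmeans(points: list[tuple[float, ...]], k: int, iterations: int = 20) -> list[int]:
    # Deterministic start: pick k evenly spaced input points as centroids.
    step = len(points) / k
    centroids = [list(points[int(i * step)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return labels


points = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (1.0, 0.9), (0.9, 1.0), (0.95, 0.95)]
k = estimate_k(len(points), min_cluster_size=3)
print(k, kmeans(points, k))
# → 2 [0, 0, 0, 1, 1, 1]
```

With this heuristic, raising `min_cluster_size` from 3 to 6 on the same six points would drop `k` to 1, which matches the "fewer, broader themes" behavior described above.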

What does Custom threshold do in Semantic mode?

Custom threshold skips automatic K-means and instead merges pages only when their embedding cosine similarity meets your cutoff — you effectively dial how aggressively clusters fuse. Higher values demand tighter semantic matches (fewer merges); lower values join looser neighborhoods. Around 0.95 cosine similarity is a sensible starting point with Gemini embeddings before you widen or tighten by hand.
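The effect of the cutoff can be sketched with a small threshold-merge routine. This example assumes a single-linkage strategy (any pair at or above the cutoff joins two groups) implemented with union-find; the tool's real merge rule is not specified here, and the 2-D vectors are toy stand-ins for embeddings.

```python
import math


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def threshold_clusters(vectors: list[tuple[float, ...]], cutoff: float) -> list[list[int]]:
    """Group item indices whose cosine similarity meets the cutoff (single linkage)."""
    parent = list(range(len(vectors)))

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= cutoff:
                parent[find(j)] = find(i)  # merge the two groups

    groups: dict[int, list[int]] = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


vectors = [(1.0, 0.0), (0.98, 0.2), (0.0, 1.0)]
print(threshold_clusters(vectors, 0.95))  # → [[0, 1], [2]]
print(threshold_clusters(vectors, 0.99))  # → [[0], [1], [2]]
```

The first two vectors have a cosine similarity of about 0.98, so they merge at a 0.95 cutoff and stay separate at 0.99, illustrating how a higher value demands tighter matches.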