
Graph-based wordlist generator using WordNet, an LLM, and the Wikipedia API
Hi all, I built a wordlist generator that uses a semantic knowledge graph instead of pure string manipulation.
You feed it a list of keywords and it builds a hypernym DAG using WordNet, expands it with LLM-generated hyponyms, scores leaf pairs by semantic similarity (Wu-Palmer), and permutes synonyms to produce the final wordlist. For terms WordNet doesn't know (brand names, games, slang) an LLM iteratively finds a valid hypernym using a Wikipedia summary as context.
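To make the scoring step concrete, here is a minimal sketch of Wu-Palmer similarity on a tiny hand-written hypernym tree. It is not the code from the repo: the `PARENT` table, the `THRESHOLD` value, and the helper names are all made up for illustration, and a single-parent tree stands in for the real WordNet DAG to keep it short.

```python
from itertools import combinations

# Toy hypernym tree (child -> parent). Hypothetical data, not taken from WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "mammal": "animal",
    "rabbit": "mammal",
    "hare": "mammal",
    "artifact": "entity",
    "software": "artifact",
    "video_game": "software",
}

def path_to_root(node):
    """Return the chain of hypernyms from `node` up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # First ancestor of b (walking upward) shared with a is the deepest common one.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))

THRESHOLD = 0.5  # assumed cut-off, chosen only for this example
leaves = ["rabbit", "hare", "video_game"]
kept = [(a, b) for a, b in combinations(leaves, 2) if wu_palmer(a, b) >= THRESHOLD]
print(kept)
```

In this toy tree "rabbit" and "hare" share the deep ancestor "mammal" and score 0.75, while "rabbit" and "video_game" only meet at the root and score 0.25, so only the first pair survives the cut-off.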
The wordlist use case is the obvious one, but honestly the core engine is just a semantic expander: given a few seed words, it grows a contextually coherent vocabulary around them. I can see it being useful for:
- NLP / ML: data augmentation, building domain-specific vocabularies, corpus enrichment
- Ontology / knowledge graphs: quick concept mapping from a small seed set
Supports OpenAI or local models via llama.cpp.
Code: https://github.com/ivegotanheadache/WonaBee
Curious whether anyone sees other uses for this kind of approach, or just likes it.
There is a proof of concept in the repo that shows how, when given [“Cyberpunk2077”, “Rabbit”] as input, it completely excludes combinations of the two because they are semantically unrelated.
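A rough sketch of what that exclusion could look like at the combination stage. This is an illustration under stated assumptions, not the repo's implementation: `build_wordlist`, the synonym sets, the `related` set, and plain string concatenation as the combining rule are all hypothetical.

```python
from itertools import product

def build_wordlist(pairs, synonyms, is_related):
    """Combine synonyms of each related pair; unrelated pairs contribute nothing."""
    words = set()
    for a, b in pairs:
        if not is_related(a, b):
            continue  # semantically unrelated pair: skip every combination
        for x, y in product(synonyms.get(a, [a]), synonyms.get(b, [b])):
            words.add(x + y)
            words.add(y + x)
    return sorted(words)

# Hypothetical inputs: synonym sets and the set of pairs that passed scoring.
synonyms = {"rabbit": ["rabbit", "bunny"], "hare": ["hare"]}
related = {frozenset(("rabbit", "hare"))}
pairs = [("rabbit", "hare"), ("Cyberpunk2077", "rabbit")]

wordlist = build_wordlist(
    pairs, synonyms, lambda a, b: frozenset((a, b)) in related
)
print(wordlist)
```

Because the ("Cyberpunk2077", "rabbit") pair is not in the related set, no candidate mixing the two ever gets generated, which keeps the list small compared with a plain cartesian product over all inputs.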