LLM Tooling for Portfolio Writing

6th article in Building My Portfolio Into a Publishing System

2026-05-27

Built LLM-backed editorial tooling that turns repo history, notes, and existing markdown into concrete portfolio-writing workflows instead of generic text generation.

This is the point where the repo stopped feeling like just a site generator and started feeling like an operating system for technical writing.

I got a graphics card capable of serious local work, started investing in local LLM workflows, and realized I could apply those capabilities to a problem I had been carrying around for a long time.

The problem was not that I lacked material. I had years of project artifacts, commit history, notes, photos, half-written articles, and a lot of stored context about why particular projects mattered. The problem was that none of that was shaped into a reliable workflow for producing public writing.

What changed was not just speed. It was the shape of the work.

Before this, the hardest part of writing was often turning scattered evidence into a visible next step. The useful LLM pattern was not “small prompt in, giant article out.” It was closer to this:

start with real project evidence
define a constrained task
use the model to extract structure or draft from that evidence
inspect the output
iterate until the next step is obvious

That feels much more like tool building than content vending.

Git-narrative turned repository history into article scaffolding

One of the first useful LLM tools in this phase was git-narrative, a project skill I added in ~/projects/local-llm/.pi/skills/git-narrative/SKILL.md.

The task definition in the original session was very direct:

Create a new skill for “git narrative”. The goal is to create a blog post describing the challenges encountered and problems solved, and the input is a git repo. First analyze the git repo to determine the ‘unique value’ delivered. Then look through each commit, files changed, and lines changed per file to determine the ‘development arc’ of the project. Create a skeleton of a markdown blog post. A successful post doesn’t just describe the work, it tells a story of insight and capability, with utility in a final result. Do not make up details, instead keep it sparse and let follow up work provide details.

That prompt captures what I actually wanted from the model.

I did not want it to hallucinate motivation or turn the repo into polished marketing copy. I wanted a repeatable way to extract a narrative frame from repository evidence.

The resulting skill is opinionated in a useful way. It explicitly tells the agent to:

determine the repo’s unique value before reading the whole history
reconstruct a development arc from commits, changed files, and churn
synthesize recurring problem arcs instead of just listing chronology
produce either a sparse skeleton or a grounded first-person article
avoid inventing details the repo cannot support

That design matters because git history on its own is not yet a story. It is evidence. The tool’s job is to shape that evidence into something reviewable.
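
A minimal sketch of what the "commits, changed files, and churn" step might look like, assuming the history is read from `git log --numstat` with a custom `--pretty=format:` line. The `@@` marker, the `|` separator, and both function names are illustrative assumptions and are not taken from the skill itself.

```javascript
// Hypothetical helper: parse the text produced by
//   git log --numstat --date=short --pretty=format:"@@%h|%ad|%s"
// into commits with per-file added/deleted line counts.
function parseNumstatLog(text) {
  const commits = [];
  let current = null;
  for (const line of text.split("\n")) {
    if (line.startsWith("@@")) {
      // Header line: short hash, date, subject (the subject may contain "|").
      const [hash, date, ...subject] = line.slice(2).split("|");
      current = { hash, date, subject: subject.join("|"), files: [] };
      commits.push(current);
    } else if (current && line.trim() !== "") {
      const [added, deleted, path] = line.split("\t");
      if (path === undefined) continue; // not a numstat row
      // Binary files are reported as "-"; count them as zero churn.
      current.files.push({
        path,
        added: added === "-" ? 0 : Number(added),
        deleted: deleted === "-" ? 0 : Number(deleted),
      });
    }
  }
  return commits;
}

// Total lines touched per file, sorted so the most-changed files come first.
function churnByFile(commits) {
  const totals = new Map();
  for (const commit of commits) {
    for (const file of commit.files) {
      const previous = totals.get(file.path) || 0;
      totals.set(file.path, previous + file.added + file.deleted);
    }
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}
```

The output is still only evidence: a ranked list of files and a commit sequence that an agent or a human can then read for recurring problem arcs.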

A good example from the skill itself is this rule:

Evidence first. Base every claim on repository structure, commit history, diffs, docs, scripts, configuration, tests, examples, or assets.

And this one:

Do not invent details. If the repo does not prove motivation, impact, or user outcomes, leave placeholders or write neutral observations.

That is very close to how I want all LLM-assisted writing tools to behave.

The utility of git-narrative is that it converts a repo from “too much history to hold in my head” into a skeleton with usable sections:

unique value delivered
development arc
challenges encountered
problems solved
what the final result enables
open questions to fill with firsthand detail

That is exactly the kind of narrowing move I need. It lets me start from evidence-backed structure and then add the missing human context, instead of staring at hundreds of commits and trying to infer an article shape from scratch.

It also reflects a value I care about a lot in this repo: if a problem is mostly about repeated natural-language transformation, I want a tool for it, not just a one-off prompt.
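
As an illustration of the skeleton described above, here is a small sketch that renders those sections as markdown and leaves an explicit placeholder wherever no evidence was supplied. The function name, the `evidence` shape, and the placeholder wording are assumptions for the example, not the skill's actual output format.

```javascript
// Section names mirror the skeleton sections listed earlier in this article.
const SKELETON_SECTIONS = [
  "Unique value delivered",
  "Development arc",
  "Challenges encountered",
  "Problems solved",
  "What the final result enables",
  "Open questions to fill with firsthand detail",
];

// Hypothetical helper: `evidence` maps a section name to a list of
// evidence-backed notes. Sections without notes get a placeholder
// instead of invented detail.
function buildSkeleton(title, evidence = {}) {
  const lines = [`# ${title}`, ""];
  for (const section of SKELETON_SECTIONS) {
    lines.push(`## ${section}`, "");
    const notes = evidence[section] || [];
    if (notes.length === 0) {
      lines.push("_TODO: needs firsthand detail._", "");
    } else {
      for (const note of notes) lines.push(`- ${note}`);
      lines.push("");
    }
  }
  return lines.join("\n");
}
```

Keeping the placeholder visible is the point: the gaps are where the human context has to go.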

Quality reporting turned the writing backlog into ranked work

The next step was building a quality-report workflow for the articles themselves.

If git-narrative helps turn repo history into an article scaffold, the quality tools help decide which existing article deserves attention next.

The intent here was not to grade writing for its own sake. It was to make backlog work discussable and actionable.

The base weights live in ops/quality/article-quality.js:

const DEFAULT_WEIGHTS = {
  mixture: 25,
  captions: 15,
  specificity: 10,
  structure: 10,
  links: 5,
  intent: 15,
  trust: 10,
  utility: 10,
}

Those dimensions are how I narrowed the problem.

Instead of asking a model whether an article is “good,” I broke the question into inspectable parts:

does the article have a useful mix of explanation and evidence?
are images actually explained?
does the piece feel focused, grounded, and useful?
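
To make "are images actually explained?" concrete, here is one possible deterministic check. It assumes images use standard markdown syntax and treats non-empty alt text as the explanation, which may not match how captions are actually authored on this site; the function name is hypothetical.

```javascript
// Hypothetical heuristic: the share of markdown images that carry
// non-empty alt text. Returns a value between 0 and 1.
function captionCoverage(markdown) {
  const imagePattern = /!\[([^\]]*)\]\([^)]*\)/g;
  let total = 0;
  let explained = 0;
  for (const match of markdown.matchAll(imagePattern)) {
    total += 1;
    if (match[1].trim().length > 0) explained += 1;
  }
  // Assumption: an article with no images has nothing left unexplained.
  return total === 0 ? 1 : explained / total;
}
```

A check like this needs no model call, so it is cheap to rerun on every article.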

The prompt refinement work was part of the engineering. One of the explicit prompt changes was:

The soft rubric seems to be too focused on a specific goal. Let’s change the soft rubric to be more broad, with “intent” - how coherent and cohesive it is and delivers on the intended purpose of the article. “trust” - confidence in the competence and content. and “utility” - how useful the article is while respecting the readers time.

That is the pattern I keep finding useful: name the dimensions, tighten the interface, and move the model toward a task that can be inspected.

The quality checker then combines two layers:

deterministic structural scoring from the markdown itself
constrained LLM scoring for the softer editorial dimensions
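
A rough sketch of how the two layers could be folded into one number using the weights shown earlier. It assumes every dimension is normalized to the range 0..1 before weighting, and that intent, trust, and utility come from the model while the rest are structural; the function name and that split are assumptions.

```javascript
const DEFAULT_WEIGHTS = {
  mixture: 25,
  captions: 15,
  specificity: 10,
  structure: 10,
  links: 5,
  intent: 15,
  trust: 10,
  utility: 10,
};

// Hypothetical combiner: merge structural and soft scores (each 0..1)
// into a single 0..100 score using the weights above.
function combineScores(structural, soft, weights = DEFAULT_WEIGHTS) {
  const scores = { ...structural, ...soft };
  let total = 0;
  let weightSum = 0;
  for (const [dimension, weight] of Object.entries(weights)) {
    const value = scores[dimension];
    if (typeof value !== "number" || Number.isNaN(value)) {
      // Fail loudly rather than silently scoring a missing dimension as zero.
      throw new Error(`Missing score for dimension: ${dimension}`);
    }
    total += Math.min(1, Math.max(0, value)) * weight;
    weightSum += weight;
  }
  return Math.round((total / weightSum) * 100);
}
```

Dividing by the weight sum keeps the result on a 0..100 scale even if the weights are later retuned and no longer add up to 100.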

The softer prompt in buildSoftPrompt() is intentionally rigid:

"Score this article and return exactly one JSON object."
"Audience: semi-technical reader."
"Goal: reward articles that are coherent, trustworthy, useful, and respectful of the reader's time."

That is not accidental. I do not want a vibe check. I want a bounded interface that can be wrapped with code.
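
For illustration, a bounded prompt builder along these lines might look like the sketch below. Only the first three prompt lines come from the excerpt above; the response shape, the 0-10 scale, and the name `buildSoftPromptSketch` are assumptions, and this is not the actual body of buildSoftPrompt().

```javascript
// Hypothetical sketch of a rigid scoring prompt. The first three lines are
// quoted from the article; everything after them is assumed for the example.
function buildSoftPromptSketch(articleMarkdown) {
  return [
    "Score this article and return exactly one JSON object.",
    "Audience: semi-technical reader.",
    "Goal: reward articles that are coherent, trustworthy, useful, and respectful of the reader's time.",
    // Assumed response contract: three numeric fields on a 0-10 scale.
    'Respond with: {"intent": 0-10, "trust": 0-10, "utility": 0-10}',
    "Do not include any text outside the JSON object.",
    "",
    "ARTICLE:",
    articleMarkdown,
  ].join("\n");
}
```

Spelling out the response contract in the prompt is what lets the calling code validate the answer instead of interpreting it.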

The workflow got better once it was built around iteration

The quality tool became genuinely useful only after I tested it against real articles and hardened the failure modes.

One of the evaluation prompts in the session was:

Run the report and sample a few low/high/mid quality articles and judge the fitness of this workflow to drive the quality of the website up.

That was the important move. I was not just verifying that the script ran. I was checking whether the output actually helped me choose work.

That led to the next refinement: low score alone was not enough. Some weak articles were easy wins. Others were barely articles yet.

So I added a priority layer in ops/quality/priority.js:

const priority = Math.round(importance * ease * 100) / 10

That logic came from a very explicit requirement:

Add a new “priority” indicator that uses the score and frontmatter to indicate which article requires effort next. Balance “ease of improvement” with “importance to overall goal”.

This is the part that turned the tool from measurement into workflow support.

The report no longer just says which articles are weak. It tries to answer a more useful question: if I spend an hour here, how much does the portfolio improve?
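
One way the two inputs to that formula could be derived is sketched below. Only the final rounding line comes from ops/quality/priority.js as quoted above; the frontmatter fields (`featured`, `draft`), the importance values, and the shape of the ease curve are all assumptions made for the example.

```javascript
// Hypothetical: importance (0..1) read from frontmatter flags.
function importanceOf(frontmatter) {
  if (frontmatter.draft) return 0.2;
  return frontmatter.featured ? 1 : 0.6;
}

// Hypothetical: ease (0..1) peaks for mid-range scores. Very low scores
// suggest "barely an article yet"; very high scores leave little to gain.
function easeOf(score) {
  const s = Math.min(100, Math.max(0, score)) / 100;
  return 4 * s * (1 - s); // 0 at both extremes, 1 at a score of 50
}

function priorityOf(score, frontmatter) {
  const importance = importanceOf(frontmatter);
  const ease = easeOf(score);
  // Same rounding as the quoted line: a 0..10 scale with one decimal place.
  return Math.round(importance * ease * 100) / 10;
}
```

With these assumed inputs, a featured article scoring 50 ranks at 10, while the same score on a draft ranks at 2.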

I also ended up hardening the wrapper around the model because JSON output was not reliable enough to trust directly. The tool now extracts the last JSON object, validates it, coerces malformed output when needed, and can step up to a stronger model. I also moved caching into ops/quality-check and made ops/quality-report resumable with progress reporting.
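
The "extract the last JSON object, validate it, coerce when needed" step can be sketched roughly as follows. The function names, the three field names, and the 0-10 clamp are assumptions; the real wrapper may differ in all of them.

```javascript
// Hypothetical: scan for top-level {...} spans and return the last one that
// parses as a JSON object. Braces inside JSON strings are ignored.
function extractLastJsonObject(text) {
  const candidates = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    // Only track strings inside an object, so quotes in prose are harmless.
    if (ch === '"' && depth > 0) { inString = true; continue; }
    if (ch === "{") {
      if (depth === 0) start = i;
      depth += 1;
    } else if (ch === "}" && depth > 0) {
      depth -= 1;
      if (depth === 0) candidates.push(text.slice(start, i + 1));
    }
  }
  for (let i = candidates.length - 1; i >= 0; i--) {
    try {
      const value = JSON.parse(candidates[i]);
      if (value && typeof value === "object" && !Array.isArray(value)) return value;
    } catch (err) {
      // Not valid JSON; fall back to the previous candidate.
    }
  }
  return null;
}

const SOFT_DIMENSIONS = ["intent", "trust", "utility"];

// Hypothetical: coerce numeric strings and clamp to 0..10. Returns null when
// a field cannot be recovered, which is the caller's cue to retry.
function coerceSoftScores(raw) {
  if (!raw) return null;
  const scores = {};
  for (const key of SOFT_DIMENSIONS) {
    if (raw[key] === null || raw[key] === "") return null;
    const value = Number(raw[key]);
    if (!Number.isFinite(value)) return null;
    scores[key] = Math.min(10, Math.max(0, value));
  }
  return scores;
}
```

A `null` result is a natural trigger for stepping up to a stronger model. Note that a stray unmatched `{` in the surrounding prose would confuse this simple scanner.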

That work was driven by another concrete prompt:

Move the caching to quality-check so that I can quickly view the detailed results by re-running quality-check. Make the quality report have a progress bar and be resumable.

That is a good summary of my values in this space. I do not just want something clever enough to work once. I want something inspectable, rerunnable, and cheap enough to stay in the loop.
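
A minimal sketch of the caching and resumability idea, assuming results are keyed by article slug plus a hash of the markdown so that only changed articles are rescored. The names, the Map-like cache, and the stand-in hash are all assumptions; a real tool would more likely use a cryptographic digest and an async model call.

```javascript
// Stand-in content hash (FNV-1a style, 32-bit, over UTF-16 code units).
// Used here only to keep the sketch dependency-free.
function hashText(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Hypothetical report loop. `cache` is any Map-like store; persisting it
// between runs is what would make an interrupted report resumable.
// `scoreArticle` stands for the expensive model-backed check.
function runReport(articles, cache, scoreArticle, onProgress = () => {}) {
  const results = [];
  articles.forEach((article, index) => {
    const key = `${article.slug}:${hashText(article.markdown)}`;
    let result = cache.get(key);
    const cached = result !== undefined;
    if (!cached) {
      result = scoreArticle(article);
      cache.set(key, result);
    }
    results.push({ slug: article.slug, cached, ...result });
    onProgress(index + 1, articles.length);
  });
  return results;
}
```

Because the key includes the content hash, editing an article invalidates only that article's entry, and a rerun after an interruption skips everything already scored.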

Result

This is the phase where LLM tooling stopped feeling like a novelty and started feeling like an engineering material.

The most useful outputs were not finished articles. They were workflow tools:

a repo-history tool that turns commits into article scaffolding
a quality tool that turns a writing backlog into ranked work
a growing set of prompts and wrappers that convert vague writing tasks into bounded transforms

That is the larger shift I care about.

The repo now does more than hold content and publishing rules. It also contains the natural-language tools I use to shape, prioritize, and draft the work itself.

That is what made the backlog feel divisible for the first time.
