
LLM Cost and Latency Notes: reducing token waste without hurting answer quality

I worked on smoothing the handoff between data engineering and AI teams: standardizing feature contracts, embedding validation, and adding lightweight integration tests.

The friction I kept seeing was simple: performance conversations are often really architecture conversations.

Instead of adding more moving parts, I tested a review pass focused on maintainability over novelty.

March for me has been about tightening execution after an idea-heavy February.

What I changed today

  • I replaced a vague process step with a concrete, testable checkpoint.
  • I reduced unnecessary variability by standardizing one recurring pattern.
  • I cut one source of rework by tightening upstream validation.
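
As an illustration only, here is a minimal sketch of what a concrete, testable checkpoint for upstream validation against a feature contract could look like. Everything in it is hypothetical: the field names, the `FieldSpec` and `FEATURE_CONTRACT` names, and the `validate_record` helper are assumptions made for the example, not the contract or code used in the work described above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldSpec:
    """One field in a (hypothetical) feature contract."""
    name: str
    dtype: type
    nullable: bool = False


# Hypothetical contract; real field names and types would come from the teams involved.
FEATURE_CONTRACT = (
    FieldSpec("user_id", str),
    FieldSpec("session_length_s", float),
    FieldSpec("segment", str, nullable=True),
)


def validate_record(record: dict[str, Any], contract=FEATURE_CONTRACT) -> list[str]:
    """Return a list of contract violations; an empty list means the record passes."""
    errors = []
    for spec in contract:
        if spec.name not in record:
            errors.append(f"missing field: {spec.name}")
            continue
        value = record[spec.name]
        if value is None:
            if not spec.nullable:
                errors.append(f"null not allowed: {spec.name}")
            continue
        if not isinstance(value, spec.dtype):
            errors.append(
                f"{spec.name}: expected {spec.dtype.__name__}, got {type(value).__name__}"
            )
    # Fields outside the contract are reported too, so drift is caught upstream.
    known = {spec.name for spec in contract}
    for name in sorted(set(record) - known):
        errors.append(f"unexpected field: {name}")
    return errors
```

A check like this can run in a lightweight integration test at the handoff point: records that return a non-empty list are rejected before they reach downstream consumers, which is one way to reduce rework.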

What I want to keep doing

I came away convinced that constraint clarity beats optimization tricks most days. Good systems feel calm because decision paths are explicit before incidents happen.

Tomorrow’s focus

Tomorrow I want to verify this pattern under a busier workload before I call it stable.

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.