Day 37 - June 7, 2026: Turning Thai Dictionary Friction into Product and Data Decisions

Using manual UseThai testing to separate app-tier UX improvements from core search work, assess Volubilis ingestion readiness, and establish tone as a governed product requirement.

Day 37 moved UseThai from showing lookup screens to producing evidence that can guide real product and data decisions.

Manual testing exposed concrete learner friction. The first UX friction review separated app-tier improvements from fixture limitations and unwarranted core search work. A real-data spike turned Volubilis from a vague candidate into a dataset with measurable ingestion challenges. Tone also moved from a possible enhancement into a product requirement that needs a governed technical path.

The shape of the day was:

Test the app -> record friction -> authorize only the narrow app-tier fix
-> inspect real dictionary data -> define tone and licensing gates
-> prepare the next governance decision

The most important result was discipline. The application is now useful enough to reveal meaningful friction, but observed friction does not automatically authorize a new core capability.

Turning Manual Lookup Tests Into Evidence

The day began with manual testing through:

pnpm --filter usethai dev

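The tests confirmed several important lookup states: a successful Thai lookup, an English query that returned multiple entries, and a punctuation-attached query that missed.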

These results showed both product progress and useful friction.

The successful กิน lookup demonstrated the working happy path. The old results showed that returning multiple valid entries is not enough when a learner cannot easily distinguish between them. The to eat! miss showed how an exact-key rule that is technically predictable can still surprise a user.
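The to eat! miss is easier to reason about with a small sketch of exact-key lookup. The index shape, the entry type, and the function name below are assumptions made for illustration, not the project's actual code or fixture data.

```typescript
// Hypothetical entry and index shapes; not the project's real types.
type Entry = { headword: string; gloss: string };

// Illustrative fixture with a single English key.
const englishIndex = new Map<string, Entry[]>([
  ["to eat", [{ headword: "กิน", gloss: "to eat" }]],
]);

// Exact-key lookup: the query must match an index key character for character.
// No trimming, punctuation stripping, or fuzzy matching is applied.
function lookupExact(index: Map<string, Entry[]>, query: string): Entry[] {
  return index.get(query) ?? [];
}

const hit = lookupExact(englishIndex, "to eat"); // finds the entry
const miss = lookupExact(englishIndex, "to eat!"); // empty: "!" makes it a different key
```

The rule is predictable for an implementer, because a key either matches or it does not. A learner who pastes a phrase with trailing punctuation sees only an empty result.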

The app is no longer only an abstract shell. It can now surface specific dictionary behavior that can be observed, discussed, and triaged.

Building The First Meaningful UX Friction Batch

The manual tests fed the first substantial batch of observations into the UX friction log.

The evidence covered both lookup directions, and each observation was assigned a category.

The categories matter because these observations do not all describe the same problem.

A missing fixture entry is not evidence that lookup needs fuzzy matching. Two valid results with insufficient learner context are not a tokenizer defect. A punctuation-attached query miss may reveal product friction without proving which layer should eventually address it.

The friction log keeps those distinctions visible.
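One way to keep those distinctions visible is to make the category an explicit field on every log entry. The sketch below is an assumption about how such a log could be typed. The category names are derived from the three distinctions above, and the field names and direction codes are hypothetical.

```typescript
// Hypothetical category names, one per distinction described above.
type FrictionCategory =
  | "missing-fixture-entry"
  | "insufficient-learner-context"
  | "punctuation-attached-miss";

// Hypothetical direction codes for Thai-to-English and English-to-Thai.
type Direction = "th-en" | "en-th";

interface FrictionObservation {
  query: string;
  direction: Direction;
  category: FrictionCategory;
  note: string;
}

// Count observations per category so triage starts from the kind of problem,
// not from the raw number of complaints.
function countByCategory(
  log: FrictionObservation[],
): Map<FrictionCategory, number> {
  const counts = new Map<FrictionCategory, number>();
  for (const o of log) {
    counts.set(o.category, (counts.get(o.category) ?? 0) + 1);
  }
  return counts;
}

const sampleLog: FrictionObservation[] = [
  {
    query: "old",
    direction: "en-th",
    category: "insufficient-learner-context",
    note: "Multiple valid entries, hard to tell apart.",
  },
  {
    query: "to eat!",
    direction: "en-th",
    category: "punctuation-attached-miss",
    note: "Exact-key rule misses when punctuation is attached.",
  },
];

const frictionCounts = countByCategory(sampleLog);
```

Because the category is a closed union, an observation cannot be logged without deciding which kind of problem it describes.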

Issuing No Core Search Warrant

The warrant review produced an important non-decision:

No core search warrant was issued.

The evidence remained fixture-based. That was not enough to authorize fuzzy search, stemming, punctuation normalization, tokenizer expansion, or a wider search architecture.

This restraint protects the platform from converting every observed app friction into durable core behavior. Before search capability expands, the project needs stronger evidence from real data and repeated learner behavior.

The review did identify one narrow issue that was clearly actionable at the application tier: the page chrome should communicate the selected lookup direction.

Completing Direction-Aware App Chrome

The document title and on-page heading now react to the selected lookup direction.

That resolves the earlier mismatch where the page chrome did not clearly reflect whether the user was performing Thai-to-English or English-to-Thai lookup.

The implementation stayed deliberately within the application tier and did not change lookup behavior.

This closed the first immediately actionable issue from the fixture friction batch without promoting presentation behavior into core.
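A direction-aware title and heading can be expressed as a pure function of the selected direction, which keeps it in the application tier. This is a minimal sketch under assumptions: the direction codes, the label wording, and the title format are placeholders, not the strings the app actually uses.

```typescript
// Hypothetical direction codes.
type LookupDirection = "th-en" | "en-th";

interface PageChrome {
  title: string; // intended for the document title
  heading: string; // intended for the on-page heading
}

// Placeholder wording; the real labels are not specified in this log.
const DIRECTION_LABELS: Record<LookupDirection, string> = {
  "th-en": "Thai to English",
  "en-th": "English to Thai",
};

// Derive both pieces of chrome from the same direction value so they
// cannot drift apart.
function chromeFor(direction: LookupDirection): PageChrome {
  const label = DIRECTION_LABELS[direction];
  return {
    title: `UseThai | ${label} lookup`,
    heading: `${label} lookup`,
  };
}

const thaiToEnglishChrome = chromeFor("th-en");
const englishToThaiChrome = chromeFor("en-th");
```

In a browser, the app would assign the returned title to the document title and render the heading whenever the direction changes. Nothing here touches lookup, so no core behavior is involved.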

Moving From Fixture UX To Real-Data Readiness

The day then shifted from app behavior to the data that could eventually power it.

Earlier research had characterized Volubilis as a weak or mostly Thai-French candidate. Project-specific review and a direct data-shape spike showed that conclusion was unreliable.

The spike inspected VOLUBILIS Database.xlsx version 25.3 from November 2025. The file contains 114,177 rows across 15 columns.

Several findings from the spike make Volubilis a serious Thai-English candidate.

These findings replaced a vague data-source question with bounded engineering questions.

Duplicate headwords will need a merge policy. Whitespace-bearing headwords conflict with the current Thai lexical-key policy. Multi-sense gloss cells need explicit parsing rules. The source POS vocabulary needs a governed mapping rather than being forced into the smaller core enum.
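The first two questions, duplicate headwords and whitespace-bearing headwords, can be measured before any policy is written. The profiling pass below is a sketch. The row shape and field names are assumptions, since the real Volubilis columns are not listed here, and the sample rows are invented for illustration rather than taken from the dataset.

```typescript
// Hypothetical row shape; real column names may differ.
interface SourceRow {
  headword: string;
  gloss: string;
}

interface ShapeProfile {
  rowCount: number;
  duplicateHeadwords: string[]; // headwords appearing on more than one row
  whitespaceHeadwords: string[]; // headwords containing any whitespace
}

function profileRows(rows: SourceRow[]): ShapeProfile {
  // Count how many rows share each headword.
  const seen = new Map<string, number>();
  for (const row of rows) {
    seen.set(row.headword, (seen.get(row.headword) ?? 0) + 1);
  }
  const duplicateHeadwords = Array.from(seen.entries())
    .filter(([, count]) => count > 1)
    .map(([headword]) => headword);
  const whitespaceHeadwords = Array.from(seen.keys()).filter((h) =>
    /\s/.test(h),
  );
  return { rowCount: rows.length, duplicateHeadwords, whitespaceHeadwords };
}

// Invented sample rows, not actual Volubilis data.
const sampleRows: SourceRow[] = [
  { headword: "กิน", gloss: "to eat" },
  { headword: "กิน", gloss: "to consume" },
  { headword: "กิน ข้าว", gloss: "to eat rice" },
];

const shapeProfile = profileRows(sampleRows);
```

A pass like this only measures the shape. Deciding how to merge duplicates, or how to treat whitespace in lexical keys, remains a separate policy decision.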

Volubilis is not drop-in ready, but it is realistic enough to keep evaluating.

Making Tone A Product Requirement

The day’s most important learner-facing decision was that tone should be treated as required for UseThai.

Romanization without tone is significantly less useful to a non-native Thai learner. A learner may be able to approximate consonants and vowels from a romanized form while still missing the tone that determines how the word should actually be pronounced.

Visual tone information can support pronunciation learning now and create a stronger foundation for future pronunciation or text-to-speech features.

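The feasibility review clarified a likely technical direction: offline generation with explicit provenance, validation, and an override strategy, rather than runtime inference.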

This turns tone generation into more than a display enhancement. It creates a new data-governance responsibility.

Before implementation, the platform likely needs an ADR or equivalent architecture grounding for derived linguistic artifacts. At minimum, that model should preserve:

generator identity
generator version
input headword lineage

That provenance would make generated pronunciation data versioned, reproducible, and reviewable rather than an unexplained field committed beside source data.
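As a rough illustration of what such a provenance record could look like, the sketch below attaches the three fields above to a derived value. Every name in it is hypothetical: the interfaces, the generator identifier, the source identifier, and the sample values. The eventual model would come from the ADR, not from this example.

```typescript
// Hypothetical generator reference: identity plus version.
interface GeneratorRef {
  id: string;
  version: string;
}

// Hypothetical record for a derived tone artifact.
interface DerivedToneArtifact {
  inputHeadword: string; // headword the value was derived from (lineage)
  sourceId: string; // assumed pointer to the source row or dataset
  generator: GeneratorRef;
  value: string; // the derived tone or pronunciation value
}

// An artifact is stale when it was produced by a different generator or
// generator version than the one currently approved.
function needsRegeneration(
  artifact: DerivedToneArtifact,
  current: GeneratorRef,
): boolean {
  return (
    artifact.generator.id !== current.id ||
    artifact.generator.version !== current.version
  );
}

// Illustrative values only.
const artifact: DerivedToneArtifact = {
  inputHeadword: "กิน",
  sourceId: "example-source-row",
  generator: { id: "example-tone-generator", version: "1.0.0" },
  value: "mid",
};

const sameGenerator = needsRegeneration(artifact, {
  id: "example-tone-generator",
  version: "1.0.0",
});
const newerGenerator = needsRegeneration(artifact, {
  id: "example-tone-generator",
  version: "1.1.0",
});
```

Recording the generator alongside the value makes regeneration a mechanical check. A reviewer can also see which generator version produced a given value.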

Preserving Commercial Optionality

The data-source review also clarified the project’s licensing posture:

UseThai should remain commercial-capable and ShareAlike-cautious.

The product may eventually support ads, memberships, donations, sponsorships, or another revenue path. A CC BY-SA source such as Volubilis may still be usable, but ingestion should not begin until the legal and product implications are explicitly reviewed and accepted.

That posture does not reject open data. It prevents the project from accidentally committing to obligations before understanding their effect on the product and its derived artifacts.

The current decision remains narrow: Volubilis stays a candidate, and ingestion stays closed until that review is complete.

Separating The Decisions

Day 37 reinforced that several adjacent concerns must remain separate:

App-tier presentation can improve quickly.
Fixture friction does not automatically authorize core search work.
Real-data ingestion remains gated by shape, policy, and licensing.
Tone is product-critical but must be derived through a governed process.

That separation is what allows the project to move without confusing motion with authorization.

The direction-aware heading could be completed because the evidence clearly supported an app-tier fix. Search expansion stopped because fixture evidence could not justify it. Volubilis remained a candidate because the spike showed real potential, but ingestion stayed closed because important transformation and licensing questions remain. Tone became a requirement without pretending that the generation and validation model is already settled.

Why The Day Mattered

Day 37 crossed an important product boundary.

The question is no longer only:

Can the app show lookup results?

It is now:

What does real use reveal, and what does that evidence actually justify?

The app shell is mature enough to expose meaningful learner friction. The first friction cycle produced a real application improvement. Dataset research turned uncertainty into concrete ingestion questions. Tone moved from a nice-to-have idea into a product requirement with a plausible governed path.

The most important result was not a large code change. It was a clearer method for making the next decision.

Outcome

Day 37 moved UseThai from fixture demonstration toward evidence-driven product discovery and real-data readiness.

Manual lookup testing confirmed the happy path, exposed exact-key punctuation friction, and showed the learner-context problem in multi-entry English results. The UX friction review kept those observations separate and declined to authorize speculative core search work.

Direction-aware page chrome resolved the first clearly actionable app-tier issue without changing lookup behavior. The Volubilis spike established a serious candidate dataset while identifying duplicate-headword, whitespace, multi-sense, POS-mapping, and licensing gates.

Tone became a product requirement. The likely path is offline generation with explicit provenance, validation, and an override strategy rather than runtime inference.

The project is now positioned for its next major decision: how to govern derived pronunciation and tone data before moving toward real dictionary ingestion.

Definition Of Done

Day 37 reached an evidence, data-readiness, and product-governance checkpoint.

The day closed with a more useful application, a better understanding of real dictionary data, and a clearer boundary between what the evidence supports now and what still needs governance.