F.2 — Term Harvesting & Normalisation
“Harvest words inside Contexts, name them in the Context’s own idiom, and stop there.”
Status. Architectural pattern.
Depends on. E.10.D1 Lexical Discipline for “Context” (D.CTX); F.0.1 Contextual Lexicon Principles (Source - Local Meaning - Bridge‑Only Crossing); A.7 Strict Distinction; A.11 Ontological Parsimony.
Coordinates with. F.1 Context Map via Context Cards; F.3 Intra‑Context Sense Clustering; F.4 Role Description; F.9 Alignment & Bridge Across Contexts.
Aliases (informative). context‑local harvesting; local normalisation.
Intent & applicability
Intent. Provide a conceptual (notation‑free) discipline for turning Context‑internal usage into context‑local lexical units ready for later reasoning—without Cross‑context merging and without slipping into governance or tooling. The result is a small, auditable set of context‑local names and glosses that faithfully reflect how the canon speaks.
Applicability. Use whenever a unification line (from F.1) needs actual words to be referenced by patterns in Part C (Extension patterns) or by Role Descriptions (F.4). Re‑enter F.2 when a canon/edition changes or when a new Context is admitted in F.1.
Non‑goals. No global labels; no Cross‑context equivalence; no workflow or role descriptions; no storage/API talk. F.2 specifies how to think, not how to “run a pipeline”.
Problem Frame
Even with Contexts fixed (F.1), three mistakes recur:
- Word‑centrism. Treating a string as if it carried its meaning across Contexts (e.g., process, role, service).
- Over‑normalisation. Forcing one spelling/morphology across different canons, erasing Context‑specific cues.
- Premature structure. Smuggling behaviour, deontics, or type structures into what should remain lexical.
F.2 prevents these by localising meaning and naming strictly inside each Context.
Forces
Core idea (didactic)
Harvest inside each Context; name in that Context’s idiom; do not cross Contexts. For every Context (a U.BoundedContext from F.1), you gather attested phrases as thought‑cues, choose a Local Normal Form (LNF) that matches the Context’s idiom, attach a two‑register label (Tech/Plain), and write a one‑sentence gloss. That’s all. You do not claim sameness with any other Context; you do not embed behaviour or deontics; you do not mint U.Types here. These local lexical units will become Local‑Senses in F.3 and later addressable SenseCells (Context × Local‑Sense).
Minimal vocabulary (this pattern only)
- Context — Tech‑register alias for U.BoundedContext (per E.10.D1).
- Attested phrase — A short, verbatim cue from the canon that shows how a word is used in this Context (citation idea, not a record format).
- Local Normal Form (LNF) — The Context‑specific canonical surface you will use when referring to the term in this Context (minimal editing: spelling/hyphenation/casing per the canon).
- Two‑register label — Tech (engineer‑facing) and Plain (pedagogic) forms for the same Context‑local meaning.
- Gloss (one‑sentence) — A Context‑faithful description of how the canon uses the term, at minimal generality.
- Local lexical unit — The quintet (Context, LNF, Tech, Plain, Gloss). This is F.2’s only outcome.
- Homonymy (signal) — Awareness that the same string has different local lexical units across Contexts (no relation asserted).
- SenseCell (appears downstream) — Address (Context × Local‑Sense) minted in F.3; mentioned here so you know what you’re preparing.
Everything above is a way of thinking. None of it implies a database, statuses, or roles.
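For readers who find a concrete shape helpful, the quintet can be pictured as an immutable record. This is only a reading aid under stated assumptions: F.2 prescribes no data format, and the class name `LocalLexicalUnit` and its field names are invented here for illustration. The sample values are the BPMN unit from the micro-examples in this pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalLexicalUnit:
    """One F.2 outcome: the quintet (Context, LNF, Tech, Plain, Gloss)."""

    context: str  # name of a Context (U.BoundedContext) fixed in F.1
    lnf: str      # Local Normal Form, written in the Context's own idiom
    tech: str     # Tech-register label (engineer-facing)
    plain: str    # Plain-register label (pedagogic)
    gloss: str    # one-sentence, Context-faithful gloss


# Values copied from the BPMN micro-example in this pattern.
bpmn_process = LocalLexicalUnit(
    context="BPMN 2.0 (2011)",
    lnf="process",
    tech="process",
    plain="workflow process",
    gloss="Directed graph of flow nodes and sequence flows enacted by participants.",
)
```

The record is frozen on purpose: a unit is not edited into something else later; a changed canon leads to a new unit under a new Context.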
Solution — three mental moves (notation‑free)
Move A — Localise the word
Question to ask. “In which Context am I hearing this word?”
Action (mental). Point to a specific Context (from F.1). Grab 1–2 attested phrases that are representative in this Context.
Outcome. You stop thinking “global word” and start thinking “context‑local usage”.
Micro‑cue. If you cannot name the Context, do not harvest the word.
Move B — Name it in the Context’s idiom
Question to ask. “How would this Context itself write it?”
Action (mental). Choose the LNF (Context‑conformant spelling/hyphenation). Then write the two‑register label and a one‑sentence gloss that says what the canon means here, and nothing more.
Outcome. You have a local lexical unit (Context, LNF, Tech, Plain, Gloss).
Micro‑cues.
- Prefer the canon’s head noun; keep canonical hyphens; avoid invented compounds.
- The Plain label should help a non‑specialist; the Tech label should match engineers’ eyes.
- The Gloss must fit on a single line; defer details to F.3.
Move C — Fence it off
Question to ask. “What must I refuse to conclude here?”
Action (mental). Explicitly refuse to: (1) compare across Contexts, (2) fold morphology that the canon treats as meaningful, (3) embed behaviour, deontics, or type structure.
Outcome. A clean, context‑local lexical unit that will be safe to cluster in F.3 and safe to bridge (or not) in F.9.
Guard‑rails (normative, lightweight)
- Context‑locality. Every local lexical unit MUST cite a Context (U.BoundedContext from F.1).
- Context‑idiom normalisation. LNF MUST respect the Context’s idiom (spelling/hyphenation/casing) and use minimal edits.
- Two registers. Each unit SHOULD carry both Tech and Plain labels for didactics; if one is missing, justify.
- Minimal generality (G‑1). The gloss MUST be as specific as the Context’s canon requires—no broader.
- I/D/S layer hygiene (A.7). MUST NOT include behaviour equations, deontic rules, measurement math, or type axioms; those belong to patterns.
- No Cross‑context claims. MUST NOT assert equivalence, subsumption, or similarity with terms in other Contexts (F.9 only).
- Edition honesty. If the Context’s canon has multiple editions with shifting usage, treat them as distinct Contexts in F.1 before harvesting.
- Parsimony. Prefer few, telling lexical units over long tails; keep head terms that will power F.3/F.4/F.9.
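A few of these guard-rails are mechanical enough to picture as checks; the rest (minimal generality, lexical-only glosses, idiom fidelity) need a human reading the canon. The sketch below covers only the mechanical ones and is an illustration, not part of the pattern: the names `Unit`, `register_note`, and `guard_rail_findings` are hypothetical, and the "stray" unit is a made-up failing case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    context: str
    lnf: str
    tech: str
    plain: str
    gloss: str
    register_note: str = ""  # hypothetical field: reason a label is missing


def guard_rail_findings(unit, f1_contexts):
    """Return findings for the mechanically checkable guard-rails only."""
    findings = []
    # Context-locality (MUST): the unit cites a Context fixed in F.1.
    if unit.context not in f1_contexts:
        findings.append("context-locality")
    # Two registers (SHOULD): both labels present, or a justification given.
    if not (unit.tech and unit.plain) and not unit.register_note:
        findings.append("two-registers")
    # Move B micro-cue: the gloss is non-empty and fits on a single line.
    gloss = unit.gloss.strip()
    if not gloss or "\n" in gloss:
        findings.append("single-line-gloss")
    return findings


f1_contexts = {"NIST RBAC (2004)"}
rbac_role = Unit("NIST RBAC (2004)", "role", "access-role", "permission role",
                 "Named grouping of permissions assignable via sessions.")
stray = Unit("", "role", "role", "", "Some kind of role.")  # made-up failing case

print(guard_rail_findings(rbac_role, f1_contexts))
print(guard_rail_findings(stray, f1_contexts))
```

The RBAC unit yields no findings; the stray unit is flagged for context-locality and for a missing Plain label without justification. An empty result means only that nothing mechanical was caught, not that the unit conforms.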
Micro‑examples (illustrative, context‑local)
Each line is one local lexical unit. No relations are implied across lines.
- Context: BPMN 2.0 (2011); LNF: process; Tech: process; Plain: workflow process; Gloss: “Directed graph of flow nodes and sequence flows enacted by participants.”
- Context: PROV‑O (2013); LNF: activity; Tech: activity; Plain: temporal occurrence; Gloss: “Time‑bounded occurrence that uses and generates entities and is linked to agents.”
- Context: ITIL 4 (2020); LNF: service‑level‑objective; Tech: service‑level‑objective; Plain: service target; Gloss: “Target value for a service characteristic within a service promise vocabulary.”
- Context: NIST RBAC (2004); LNF: role; Tech: access‑role; Plain: permission role; Gloss: “Named grouping of permissions assignable via sessions.”
- Context: SOSA/SSN (2017); LNF: observation; Tech: observation; Plain: measurement act; Gloss: “Act applying a procedure to a feature of interest to produce a result.”
- Context: IEC 61131‑3; LNF: task; Tech: task; Plain: runtime program execution; Gloss: “Cyclic or event‑driven execution unit for control programs.”
Didactic heuristics (informative)
- Keep the Context prefix in your inner speech. Say “process (BPMN)”, “activity (PROV)”.
- Prefer head nouns. If the canon says “service‑level objective”, do not shorten it to “objective”.
- Resist elegance that erases signal. Hyphens and case often carry the Context’s culture; keep them.
- Gloss from use, not from opinion. Quote in your mind, then compress; avoid importing definitions from neighbouring Contexts.
Anti‑patterns & remedies
Worked examples (context‑local only)
Each line is a local lexical unit (Context, LNF, Tech, Plain, Gloss). No Cross‑context relation is implied. Later clustering (F.3) and bridges (F.9) may connect them.
F.2:11.1 Enactment + sensing
- Context: BPMN 2.0 (2011); LNF: process; Tech: process; Plain: workflow process; Gloss: “Directed graph of flow nodes and sequence flows enacted by participants.”
- Context: PROV‑O (2013); LNF: activity; Tech: activity; Plain: temporal occurrence; Gloss: “Time‑bounded occurrence that uses and generates entities and links to agents.”
- Context: SOSA/SSN (2017); LNF: observation; Tech: observation; Plain: measurement act; Gloss: “Act applying a procedure to a feature of interest to produce a result.”
- Context: ITIL 4 (2020); LNF: service‑level‑objective; Tech: service‑level‑objective; Plain: service target; Gloss: “Target value for a service characteristic within a service promise vocabulary.”
Thinking pay‑off: you can phrase “compare observation to service‑level‑objective” without importing workflow or provenance semantics.
F.2:11.2 Sys‑CAL / LCA‑CAL + services
- Context: State‑space control texts; LNF: actuation; Tech: actuation; Plain: control output; Gloss: “Signal applied to the plant to influence state/output.”
- Context: IEC 61131‑3; LNF: task; Tech: task; Plain: runtime program execution; Gloss: “Cyclic or event‑driven execution unit for control programs.”
- Context: ITIL 4 (2020); LNF: incident; Tech: incident; Plain: reported disruption; Gloss: “Unplanned interruption or reduction in the quality of a service.”
Thinking pay‑off: avoids calling a plant fault an “incident” unless you cross Contexts later with an explicit bridge.
F.2:11.3 Kind-CAL + Method‑CAL + KD‑CAL
- Context: OWL 2 (profiles); LNF: subclass‑of; Tech: subclass‑of; Plain: is‑a (type hierarchy); Gloss: “C ⊑ D: every instance of C is an instance of D.”
- Context: FCA corpus; LNF: formal‑concept; Tech: formal‑concept; Plain: extent–intent node; Gloss: “Maximal (objects, attributes) pair under a Galois connection.”
- Context: SPEM 2.0 / ISO 24744; LNF: method; Tech: method; Plain: abstract way of doing; Gloss: “Abstract how‑to independent of specification or execution.”
- Context: SOSA/SSN (2017); LNF: procedure; Tech: procedure; Plain: measurement recipe; Gloss: “Specification guiding how an observation is produced.”
Thinking pay‑off: discourages treating an FCA “concept” as a U.Type, or a procedure as a method without later proof.
Reasoning primitives (judgement schemas, notation‑free)
Read each as a permitted mental move over the outcomes of F.2.
Symbols: R = Context (U.BoundedContext); u = local lexical unit; s = surface string.
- Localisation
  heard(s) ∧ R chosen ⊢ localize(s, R)
  You decide to hear s only in Context R.
- Context‑idiom normalisation
  localize(s, R) ⊢ LNF_R(s) = ℓ
  Within R, the Local Normal Form for s is ℓ.
- Unit formation
  LNF_R(s) = ℓ ∧ labelTech = t ∧ labelPlain = p ∧ gloss = g ⊢ unit(u) = ⟨R, ℓ, t, p, g⟩
  A local lexical unit is formed (quintet).
- Lexical‑only guard
  unit(u) ⊢ lexicalOnly(u)
  No behavioural/deontic/type math is attached to the gloss.
- Homonymy signal (Cross‑context)
  LNF_Ra(s) = ℓa ∧ LNF_Rb(s) = ℓb ∧ Ra ≠ Rb ⊢ homonymy(s) ⊇ {Ra, Rb}
  Same string across Contexts is flagged as different by default.
- Minimal generality check
  unit(u) ⊢ minimal(u) ⇔ gloss(u) says no more than the Context’s usage requires
  The gloss fits the Context; broader claims are withheld.
- Two‑register adequacy
  unit(u) ⊢ didactic(u) ⇔ (tech(u) faithful) ∧ (plain(u) explanatory)
  Tech stays canonical; Plain helps non‑specialists.
- No Cross‑context conclusion
  unit(u@Ra), unit(v@Rb), Ra ≠ Rb ⊢ ¬(u ≡ v) (within F.2)
  F.2 never asserts Cross‑context equivalence.
- Ready‑for‑F.3 signal
  lexicalOnly(u) ∧ minimal(u) ∧ didactic(u) ⊢ readyF3(u)
  A unit is suitable input for intra‑Context clustering in F.3.
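The homonymy signal is the one schema here that reads naturally as a small computation: collect, per surface string, the Contexts in which it was localised, and flag strings heard in two or more. The sketch below assumes localisations are available as (surface string, Context) pairs; the function name `homonymy_signals` and the Context names "Ctx-A" and "Ctx-B" are placeholders, not real canons. Note that the result records co-occurrence only and asserts no relation between the units.

```python
from collections import defaultdict


def homonymy_signals(localisations):
    """Map each surface string to the Contexts it was localised in,
    keeping only strings heard in two or more Contexts."""
    contexts_by_string = defaultdict(set)
    for surface, context in localisations:
        contexts_by_string[surface].add(context)
    return {s: ctxs for s, ctxs in contexts_by_string.items() if len(ctxs) >= 2}


# "Ctx-A" and "Ctx-B" are placeholder Context names, not real canons.
heard = [("role", "Ctx-A"), ("role", "Ctx-B"), ("task", "Ctx-A")]
signals = homonymy_signals(heard)
print(signals)
```

Here only "role" is flagged, with both placeholder Contexts; "task" was heard in a single Context and produces no signal. What to do with a flagged string is a question for F.9, not for F.2.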
Relations
Builds on: F.1 (Contexts fixed; heterogeneity/parsimony in place). E.10.D1 D.CTX (Context ≡ U.BoundedContext; “Problem Frame” reserved for narrative). F.0.1 (Source - Local Meaning - Bridge‑Only Crossing).
Constrains: F.3 (Intra‑Context Sense Clustering): operates only on units from one Context; produces Local‑Senses and addressable SenseCells. F.4 (Role Description Definition): may cite SenseCells, not raw strings. F.9 (Alignment & Bridge): consumes homonymy signals; declares explicit Cross‑context mappings with loss policies.
Used by: Extension patterns in Part C when referencing domain idioms (labels stay context‑local).
Migration notes (conceptual)
- New edition appears. Add a Context in F.1; harvest afresh in F.2 using that Context; do not overwrite earlier units.
- Idiomatic update discovered. If your LNF fought the canon’s idiom, revise the LNF within the same Context; keep labels/glosses steady unless the canon itself differs.
- Ambiguity inside a Context. If use splits, mint two units with distinct glosses; F.3 will sort their relation (same/different Local‑Sense).
- Language split. Treat each language canon as its own Context; resist cross‑language merges in F.2.
- Tail pruning. If units accumulate without feeding F.3/F.4/F.9, drop them from the working set; keep head terms that carry bridges.
- DSL quarantine. If a tool dialect is unavoidable, keep it as one Context among others; never let it define the idiom for other Contexts.
Acceptance tests (SCR/RSCR — concept‑level)
Static conformance (SCR)
- SCR‑F2‑S01 (Context‑locality). Every unit cites a Context from F.1.
- SCR‑F2‑S02 (Idiomatic LNF). Each LNF reflects the Context’s spelling/hyphenation/casing with minimal edits.
- SCR‑F2‑S03 (Two registers). Each unit carries both Tech and Plain labels; if not, a reason exists tied to didactics.
- SCR‑F2‑S04 (Lexical‑only). No gloss contains behaviour, deontics, measurement math, or type axioms.
- SCR‑F2‑S05 (No Cross‑context claims). Nowhere does F.2 assert equivalence/similarity/subsumption across Contexts.
- SCR‑F2‑S06 (Minimal generality). Glosses match the Context’s use; no globalisation.
- SCR‑F2‑S07 (Temporal honesty). For Contexts with fixed DesignRunTag, units and glosses respect it.
Regression (RSCR)
- RSCR‑F2‑E01 (Edition split). Introducing a new edition yields new units under a new Context; earlier units persist unchanged.
- RSCR‑F2‑E02 (Normaliser stability). Adjusting an LNF does not silently widen/narrow the gloss.
- RSCR‑F2‑E03 (Language split). Adding a second language yields a second Context; no bilingual collapse in F.2.
- RSCR‑F2‑E04 (No stealth bridges). After updates, F.2 still contains zero Cross‑context identity claims; any mapping appears only in F.9.
- RSCR‑F2‑E05 (Head‑term focus). Periodic check shows the unit set remains small and oriented to F.3/F.4/F.9 needs.
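As one illustration of how a regression check might be read, RSCR‑F2‑E01 (Edition split) can be phrased as a comparison of the unit set before and after a new edition is harvested. This is a sketch under assumptions, since F.2 itself mandates no tooling: the names `Unit` and `edition_split_holds` are hypothetical, Tech/Plain labels are omitted for brevity, and "Canon X (ed. 1)" and "Canon X (ed. 2)" are placeholder Contexts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    context: str
    lnf: str
    gloss: str  # Tech/Plain labels omitted to keep the sketch short


def edition_split_holds(before, after, new_context):
    """True when earlier units persist unchanged, at least one unit was
    added, and every added unit cites the new Context."""
    before_set, after_set = set(before), set(after)
    added = after_set - before_set
    return (
        before_set <= after_set
        and len(added) > 0
        and all(u.context == new_context for u in added)
    )


# Placeholder Contexts and glosses, invented for illustration.
before = (Unit("Canon X (ed. 1)", "term", "Old usage."),)
split = before + (Unit("Canon X (ed. 2)", "term", "New usage."),)
overwritten = (Unit("Canon X (ed. 1)", "term", "New usage."),)

print(edition_split_holds(before, split, "Canon X (ed. 2)"))
print(edition_split_holds(before, overwritten, "Canon X (ed. 2)"))
```

The first call succeeds because the old unit survives alongside a new one under the new Context; the second fails because the earlier gloss was overwritten in place, which is exactly what the edition split forbids.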
Didactic distillation (60‑second script)
“In F.2 you harvest inside Contexts. For each Context, pick the canon’s own phrasing, choose a Local Normal Form in that idiom, add Tech and Plain labels, and write a one‑sentence Gloss that matches how that Context talks. Stop there. No bridging, no behaviour, no equations. If the same string appears in another Context, treat it as a different unit. These units feed F.3, where you’ll sort senses within a Context, and F.9, where you’ll relate Contexts explicitly. This keeps meaning local, names faithful, and later reasoning clean.”