SoTA Harvester & Synthesis

Pattern G.2 · Stable · Architectural (A) · Normative (unless explicitly marked informative) Part G - Discipline SoTA Patterns Kit

Type: Architectural (A) Status: Stable Normativity: Normative (unless explicitly marked informative)

Purpose. Provide a repeatable, auditable way to discover, triage, and synthesize state‑of‑the‑art (SoTA) across competing Tradition lineages before minting CHR/CAL/LOG assets for a CG‑Frame. The primary output is a SoTA Synthesis Pack@CG‑Frame that feeds:

  • naming/publication (UTS),
  • CHR authoring (G.3),
  • CAL authoring (G.4),
  • method/generator registries and dispatch (G.5).

Scope note. This pattern owns the harvesting + synthesis generator in Part G. Shipping ownership is in G.10, refresh orchestration ownership is in G.11.

Terminology note (normative). In normative clauses below, Tradition refers to the Tech token Tradition (a plural lineage with internally coherent commitments). Plain “tradition” is allowed only as a 1:1 synonym.

A team extends FPF into a new CG‑Frame. The relevant literature is typically:

Keywords

  • SoTA harvest
  • synthesis
  • SoTA Synthesis Pack@CG-Frame
  • SoTA_Set@CG-Frame
  • SoTAPaletteDescription
  • Tradition
  • ClaimSheets
  • CorpusLedger
  • PRISMA Flow Record
  • BridgeMatrix
  • describedEntity
  • micro-examples
  • hand-off manifests
  • RSCRTriggerKindId.

Relations

G.2explicit referenceEvidence Graph Referring (C-4)
G.2explicit referenceUnified Term Sheet (UTS)
G.2explicit referenceCG-Frame-Ready Generator

Content

Problem frame

A team extends FPF into a new CG‑Frame. The relevant literature is typically:

  • plural (multiple Tradition lineages with incompatible commitments),
  • context‑sensitive (results depend on U.BoundedContext and declared describedEntity),
  • method‑heterogeneous (different evidence styles, operator sets, and validity regions),
  • time‑sensitive (rapid drift post‑2015; frequent benchmark/protocol shifts).

Downstream Part‑G work (CHR/CAL/selection/shipping/refresh) depends on the team producing consumable, citation‑ready artefacts without collapsing semantic boundaries across contexts or planes.

Problem

How can we systematically assemble a SoTA view that is:

  1. pluralist but comparable (plurality preserved; comparability is achieved only via explicit crossings),
  2. evidence‑addressable (claims cite auditable evidence surfaces and anchors),
  3. actionable (produces inventories and cards that G.3/G.4/G.5 can consume),
  4. refreshable (editions/policies/windows are pinned so RSCR/refresh can re‑audit and re‑run without semantic drift)?

Forces

  • Pluralism vs. consolidation. Consolidation is valuable, but unqualified fusion destroys meaning.
  • Breadth vs. load‑bearing depth. Too broad becomes shallow; too deep misses rival lineages.
  • Recency vs. stability. Freshness matters, yet durable “backbone” claims must be identified and kept visible.
  • Pedagogy vs. rigour. Outputs must be teachable enough to support review, while remaining audit‑ready.
  • Authoring vs. operations. This pattern lives in the authoring plane; operational runs and decisions belong to Work planes and to owner patterns.

Solution

G.Core linkage (normative)

Builds on: G.Core (Part‑G core invariants; routing hub)

GCoreLinkageManifest (normative). (Canonical form, Nil‑elision, and Expansion rule are defined in G.Core.)

`GCoreLinkageManifest := ⟨ CoreConformanceProfileIds := { GCoreConformanceProfileId.PartG.AuthoringBase, GCoreConformanceProfileId.PartG.UTSWhenPublicIdsMinted }, RSCRTriggerSetIds := {GCoreTriggerSetId.SoTAHarvestSynthesis}, CorePinSetIds := {GCorePinSetId.PartG.CrossingVisibilityPins},

CorePinsRequired := { // Scope pins (G.2‑specific) CG-FrameContext, Tradition[], describedEntity := ⟨GroundingHolon, ReferencePlane⟩, SoTA_SetId, SoTAPaletteDescriptionId,

// Evidence / provenance pins (G.2‑specific)
CorpusLedgerId,
FlowRecordId,
EvidenceAnchorRef[],
EvidenceGraphId?,

// Crossing / synthesis pins (delta beyond CorePinSetIds; only when used)
GammaEpistSynthId[]?,

// Edition / policy pins (only when used)
HarvestPolicyRef?,
DistanceDefRef.edition?,
InclusionCriteriaId?,
ScreeningRubricId?

},

DefaultsConsumed := ∅, TriggerAliasMapRef := ∅ ⟩`

(RSCR payload pins: ClaimSheetId[], SoTA_SetId, SoTAPaletteDescriptionId, BridgeMatrixId?, GammaEpistSynthId[]?, UTSRowId[]?, DistanceDefRef.edition?, HarvestPolicyRef?, InclusionCriteriaId?, ScreeningRubricId?, PathId/PathSliceId? when path‑citable evidence or a stable freshness window is pinned.)

Pattern‑local default rules (owned by this pattern; not a Part‑G‑wide DefaultId).

FamilyCoverageFloorK := 3 (unless explicitly overridden by HarvestPolicyRef and recorded in FlowRecord)

Kit: SoTA Synthesis Pack@CG‑Frame (pattern‑owned surface)

A conforming G.2 publication produces a notation‑independent pack whose internal organisation is free, but whose exported named components / views are stable and citable:

Each named component is addressable via a stable pack‑local identifier (e.g., CorpusLedgerId, ClaimSheetId, FlowRecordId) for citation and RSCR scoping. If any component is minted/evolved as a public id, it is published and cited via UTSRowId[] per CC‑GCORE‑UTS‑1 (delegation).

  1. SoTA_Set@CG‑Frame (export view; “M2 output” consumed downstream)
    A read‑optimised view over the harvested candidate set that downstream generator/selector work treats as the “harvester output set”.
    Constraint (normative): SoTA_Set@CG‑Frame MUST be reconstructible from pack components by id (no “hidden extra set”).

  2. G.2a CorpusLedger Ledger of candidate sources with Context and triage status (e.g., include / park / retire) and explicit rationale hooks.

  3. G.2b ClaimSheets[Tradition] Typed Claim Sheets per Tradition, each with:

  • explicit home context and describedEntity,
  • explicit evidence anchors/citations (A.10 and/or EvidenceGraph refs when available),
  • explicit freshness window notes and risk/trust cues (cite B.3 owners when using trust/decay language).
  1. G.2c OperatorAndObjectInventory Inventory of candidate CHR terms (characteristics/scales/coordinates) and candidate CAL operators/flows as stubs for downstream authoring.

  2. G.2d BridgeMatrix A citable alignment/divergence surface across Tradition×Tradition, with explicit losses and row scopes. If any row asserts cross‑source / cross‑Tradition substitution or fusion, the pack MUST attach a GammaEpistSynthId record (alias: G.2‑F) per G.2:Ext.GammaEpistSynthesis (no silent fusion).

  3. G.2e MicroExamples Worked micro‑examples for load‑bearing claims, each citing A.10 carriers, declaring context + describedEntity, and annotating assurance type(s) (TA/VA/LA, where applicable).

  4. G.2f UTSProposals Draft Name Cards + Minimal Definitional Sheets (MDS) + alias proposals (incl. concept‑set linkage where applicable), with the required publication pins.

  5. G.2g describedEntity Map Map from key terms/claims/public ids to GroundingHolon, ReferencePlane, and minimal reference cues for later CHR/CAL authoring.

  6. G.2h PRISMA Flow Record A screening/eligibility trail for how sources entered the pack (method‑profile is allowed; see Extensions).
    (Name is historical; the artefact remains notation‑independent.)

  7. G.2i SoSIndicatorFamilies Indicator families as variants (windows/constraints/assumptions) with explicit Acceptance branches per variant (branch ids/labels only; threshold semantics belong to CAL owners).

  8. G.2j MethodFamilyCards Candidate method families with a shared signature and a plurality of implementations, each with validity regions, cost/complexity notes, and known failure modes. When the pack targets downstream registry/dispatch, MethodFamily cards SHOULD include wiring stubs needed by G.5 (eligibility predicate refs, assurance profile cues, and the pack ids that justify the family).

  9. G.2k GeneratorFamilyCards (if applicable) Candidate generator families for environment/task generation with declared validity regions and transfer hooks.

  10. G.2l Annexes (optional; owner‑routed; see Extensions) For example: QD/NQD annexes, discipline‑specific indicator annexes, interop forms.

SoTAPaletteDescription (export view; required downstream)
A view‑friendly description object (pack‑local SoTAPaletteDescriptionId) that binds together:

  • the SoTA_Set@CG‑Frame view,
  • ClaimSheetId[], OperatorAndObjectInventory, BridgeMatrixId?,
  • SoSIndicatorFamilies (with variant/branch structure),
  • MethodFamilyCards / GeneratorFamilyCards?,
  • MicroExamples, UTSProposals,
  • and the describedEntity Map for citation and later CHR/CAL authoring.
    Note (normative intent): this is the primary “consumable surface” for G.3/G.4/G.5; it prevents downstream patterns from scraping free prose.

Editorial template: 1‑page “SoTA Sheet” per Tradition (informative).
When authoring ClaimSheets[Tradition], teams often benefit from a single‑page template: scope + claims + evidence anchors + validity region + failure modes + freshness window + cross‑Tradition reuse notes + pointers to micro‑examples.

Harvester loop (conceptual choreography; pattern‑owned)

A conforming G.2 work product is built by iterating the following conceptual loop until the declared gates are satisfied:

  1. Declare scope and plurality. Declare CG-FrameContext, the initial Tradition set, and the describedEntity surface for each intended claim region. Record these declarations in the pack pins (not as implicit assumptions).

  2. Discover and triage sources (ledger‑first). Populate CorpusLedger via:

  • seed sources,
  • expansion via citation chaining and keyword family exploration,
  • pruning using load‑bearing relevance tests tied to the declared CG‑Frame scope.
  1. Distill claims per Tradition. For each Tradition, author a Claim Sheet that preserves internal commitments and cites evidence anchors. Do not fuse cross‑Tradition claims at this stage.

  2. Inventory operators/objects for downstream authoring. Extract candidate measurement terms and operator stubs for later CHR/CAL authoring (without asserting legality or thresholds locally).

  3. Build alignment/divergence surfaces. Where reuse across Tradition is desired, author Bridge‑backed alignment records and explicit loss notes in BridgeMatrix. Any consolidation is explicitly marked as requiring alignment proof.

  4. (Alias: G.2‑F) Produce Γ_epist synthesis records when fusion/substitution is asserted. If a work product asserts cross‑source / cross‑Tradition fusion or substitution (beyond mere “parallel divergent claims”), it MUST emit GammaEpistSynthId records per G.2:Ext.GammaEpistSynthesis (provenance union + explicit object alignment refs + assurance tuple refs), and it MUST keep penalties routed to R_eff only by delegation (CC‑GCORE‑PEN‑1).

  5. Publish teachable micro‑groundings. Attach worked micro‑examples to load‑bearing claims, each tied to A.10 carriers and declaring context + describedEntity.

  6. Apply gates and record repairs. Enforce FamilyCoverageFloorK (and any optional diversity‑by‑distance gate). If a gate fails, the pack MUST:

    • record the failure and the repair iteration in FlowRecord and CorpusLedger,
    • pin the updated HarvestPolicyRef / criteria ids (if changed),
    • iterate the loop rather than silently weakening the gate.
  7. Emit hand‑off manifests and export views. Produce explicit manifests to:

  • G.3 (CHR authoring),
  • G.4 (CAL authoring),
  • G.5 (registry/dispatch), so that downstream work can cite pack components by id rather than re‑authoring them. The pack MUST also export SoTA_Set@CG‑Frame and SoTAPaletteDescription as the default downstream consumption surfaces (ids pinned).

Interfaces (minimal I/O Standard)

InterfaceConsumesProduces
G.2‑1 HarvestCG-FrameContext, initial Tradition[], HarvestPolicyRef?SoTA Synthesis Pack@CG‑Frame (G.2a–G.2l)
G.2‑2 Extendexisting Pack + new sources/anchors + updated policy pinsupdated Pack + RSCR‑relevant trigger emissions (canonical kinds)
G.2‑3 HandOffPackCHR‑handoff (to G.3), CAL‑handoff (to G.4), Registry‑handoff (to G.5)

Note: Orchestration of re‑runs is owned by G.11; this pattern only defines what a conforming (re)harvest produces and what pins it must expose.

Extensions (pattern‑scoped; non‑core)

Extensions are pattern‑scoped modules. They do not introduce Part‑G‑wide norms; they provide wiring/pins and cite semantic owners.

GPatternExtension: GammaEpistSynthesis

PatternScopeId: G.2:Ext.GammaEpistSynthesis
GPatternExtensionId: GammaEpistSynthesis
GPatternExtensionKind: GeneratorSpecific
SemanticOwnerPatternId: G.2 (this pattern owns synthesis semantics; module exists for modularity + later extraction)
Uses: {G.Core, B.3, F.9, G.6} (penalty routing + trust/decay cues + bridges/CL + evidence path citation when used)
⊑/⊑⁺:
RequiredPins/EditionPins/PolicyPins (minimum):

  • GammaEpistSynthId[] (pack‑local ids of synthesis records; emitted iff fusion/substitution is asserted)
  • EvidenceAnchorRef[] (provenance union; A.10 carriers)
  • BridgeMatrixId and BridgeCardId[] (explicit object alignment references when crossing is involved)
  • CL/CL^plane + Φ/Ψ/Φ_plane policy-ids (ids only; semantics routed via owners; penalties → R_eff only by delegation)
  • PathId/PathSliceId? (only when citing via G.6)

RSCRTriggerKindIds: {RSCRTriggerKindId.EvidenceSurfaceEdit, RSCRTriggerKindId.CrossingBundleEdit, RSCRTriggerKindId.ReferencePlaneEdit, RSCRTriggerKindId.PenaltyPolicyEdit, RSCRTriggerKindId.PolicyPinChange, RSCRTriggerKindId.EditionPinChange}

Notes (normative intent; duplication‑avoidant):

  • Γ_epist^synth is an auditable record that binds: (i) provenance union, (ii) explicit object alignment refs, (iii) assurance tuple refs (via existing owners) for each asserted fusion/substitution.
  • This module does not redefine Γ‑fold, Φ, or penalty semantics; it only requires the pins/refs needed for replayability and auditability (see G.Core delegations).
GPatternExtension: HarvestProtocols

PatternScopeId: G.2:Ext.HarvestProtocols GPatternExtensionId: HarvestProtocols GPatternExtensionKind: Phase3Seed SemanticOwnerPatternId: owner TBD (Phase‑3 seed: harvesting protocol taxonomy not yet extracted into a dedicated owner) Uses: {B.3, A.10} (for freshness/decay and provenance anchors, when protocol requires them explicitly) ⊑/⊑⁺: RequiredPins/EditionPins/PolicyPins (minimum):

  • HarvestPolicyRef (declares the chosen protocol family and its parameters)
  • FlowRecordId (protocol‑specific profile id or rubric id may be attached here)
  • InclusionCriteriaId / ScreeningRubricId (ids only; semantics remain local to the protocol family)

RSCRTriggerKindIds: {RSCRTriggerKindId.PolicyPinChange, RSCRTriggerKindId.EditionPinChange, RSCRTriggerKindId.FreshnessOrDecayEvent}

Notes (wiring‑only):

  • This module binds a declared protocol profile to the pack’s FlowRecord without redefining evidence semantics.
GPatternExtension: DHCAlignmentHooks

PatternScopeId: G.2:Ext.DHCAlignmentHooks GPatternExtensionId: DHCAlignmentHooks GPatternExtensionKind: DisciplineSpecific SemanticOwnerPatternId: C.21 (DHC semantics are owned by C.21) Uses: {C.21, G.6, G.7} (DHC series + evidence path citations + bridge/CL regimes when alignment density is claimed) ⊑/⊑⁺: RequiredPins/EditionPins/PolicyPins (minimum):

  • DHCMethodRef.edition
  • WindowRef? (if the DHC series is windowed)
  • DHCSenseCellId[] (pack‑local ids for emitted DHC SenseCells; if any are public, cite via UTSRowId[])
  • UTSRowId[]? (only if any DHC SenseCells / series ids are minted/evolved as public ids)
  • PathId[] / PathSliceId[] (when alignment summaries cite evidence paths via G.6)

RSCRTriggerKindIds: {RSCRTriggerKindId.EditionPinChange, RSCRTriggerKindId.EvidenceSurfaceEdit, RSCRTriggerKindId.TelemetryDelta}

Notes (wiring‑only):

  • If DHC alignment summaries are emitted, this module ensures the DHC method edition and the cited evidence paths are visible.
  • Units/constraints (semantic owner: C.21) must be pinned, not redefined here (e.g., bridges_per_100_DHC_SenseCells, CL_min = 2 for cross‑Context counting, and the “CL=3 implies free substitution” interpretation when used).
GPatternExtension: NQDAnnex

PatternScopeId: G.2:Ext.NQDAnnex GPatternExtensionId: NQDAnnex GPatternExtensionKind: MethodSpecific SemanticOwnerPatternId: C.18 (NQD‑CAL semantics owned by C.18; explore/exploit logging by C.19 when used) Uses: {C.18, C.19} ⊑/⊑⁺: RequiredPins/EditionPins/PolicyPins (minimum):

  • DescriptorMapRef.edition
  • DistanceDefRef.edition
  • InsertionPolicyRef (policy‑id/ref)
  • EmitterPolicyRef (policy‑id/ref)
  • TaskSignatureRef? (when QD mode is trait‑gated)

RSCRTriggerKindIds: {RSCRTriggerKindId.EditionPinChange, RSCRTriggerKindId.PolicyPinChange, RSCRTriggerKindId.TelemetryDelta, RSCRTriggerKindId.FreshnessOrDecayEvent}

Notes (wiring‑only):

  • This module only pins the required references for replayability; it does not redefine QD semantics, dominance, or acceptance rules.
GPatternExtension: InteropForms

PatternScopeId: G.2:Ext.InteropForms GPatternExtensionId: InteropForms GPatternExtensionKind: InteropSpecific SemanticOwnerPatternId: G.13 Uses: {G.13} ⊑/⊑⁺: RequiredPins/EditionPins/PolicyPins (minimum):

  • ExternalIndexRef.edition
  • ClaimMapperRef.edition
  • MappingPolicyRef (policy‑id/ref)
  • UTSRowId[] (for published external ids/aliases where relevant)

RSCRTriggerKindIds: {RSCRTriggerKindId.EditionPinChange, RSCRTriggerKindId.PolicyPinChange, RSCRTriggerKindId.TokenizationOrNameChange, RSCRTriggerKindId.EvidenceSurfaceEdit}

Notes (wiring‑only):

  • Interop affects only representation and citation routes; it must not introduce alternate legality gates or acceptance semantics.

Archetypal Grounding (System / Episteme)

Template elementU.System illustrationU.Episteme illustration
TellA safety engineering team needs to choose a control stack across multiple engineering “schools” (robust control, learning‑based control, formal verification), under a declared operational context and a concrete describedEntity (the vehicle + operating envelope).A research group must synthesize SoTA on “decision quality” across competing lineages (causal decision theory, evidential variants, bounded rationality, and active‑inference‑style formalisms), each with distinct evidence norms and semantics.
Show (failure)The team merges terms across contexts, treats incompatible test protocols as comparable, and collapses multiple partially ordered trade‑offs into one unqualified score. The resulting design cannot explain why a later safety review disagrees.The group produces a single “best” metric of decision quality and retrofits definitions to fit it. Later, conflicting claims cannot be traced because evidence anchors and crossing losses were never made explicit.
Show (repair)A conformant G.2 pack keeps parallel Claim Sheets per Tradition, publishes explicit alignment/loss notes where reuse is attempted, and emits hand‑offs so CHR/CAL/selection can be authored without re‑inventing semantics.A conformant G.2 pack preserves plural claims, publishes explicit bridge‑backed alignment where justified, represents indicators as families/variants, and makes evidence anchors and freshness windows visible so downstream re‑audits are possible.

Bias-Annotation (informative)

Lenses tested: Gov, Arch, Onto/Epist, Prag, Did.

  • Selection bias (Gov/Onto). Any harvesting protocol can over‑represent certain venues, languages, or evidence styles. Mitigation: pluralism floor + explicit CorpusLedger + explicit protocol pins.

  • Consolidation bias (Onto/Epist). Pressure to “merge” lineages can erase incompatible commitments. Mitigation: keep Claim Sheets disjoint by default; require explicit alignment proof for fusion; preserve loss notes.

  • Recency bias (Prag). Overweighting newest papers can hide durable backbone results; underweighting them misses SoTA drift. Mitigation: publish freshness windows and make them RSCR‑relevant.

  • Didactic bias (Did). Micro‑examples can steer interpretation toward familiar domains. Mitigation: require heterogeneous substrates and explicit A.10 anchors.

Conformance Checklist (normative) — CC‑G2

ConformanceIdRequirementPurpose / Notes
CC‑G2‑CoreRefA conforming G.2 artefact MUST satisfy the effective core obligations declared by the GCoreLinkageManifest in G.2:4.1 (per G.Core Expansion rule).Phase‑2 bridge clause: ensures universal invariants are not redefined inside G.2.
CC‑G2‑Pluralism‑1A conforming pack MUST include at least two Tradition lineages and at least three distinct home U.BoundedContext entries across the corpus.Prevents single‑lineage “SoTA” from masquerading as synthesis.
CC‑G2‑Ledger‑1A conforming pack MUST include G.2a CorpusLedger with inclusion/triage status and explicit rationale hooks per entry.Makes discovery/triage auditable.
CC‑G2‑FlowRecord‑1A conforming pack MUST include G.2h FlowRecord that traces identification → screening → eligibility → included at a minimum granularity sufficient to reproduce the corpus boundary.Prevents “mystery inclusion” and supports refresh.
CC‑G2‑ClaimSheets‑1For each included Tradition, a conforming pack MUST include a ClaimSheetId that declares home context, describedEntity, evidence anchors, and freshness notes; it MUST NOT fuse cross‑Tradition claims by default.Keeps plurality explicit and prevents hidden crossings.
CC‑G2‑Palette‑1A conforming pack MUST export SoTA_Set@CG‑Frame and SoTAPaletteDescription as citable views (via SoTA_SetId, SoTAPaletteDescriptionId) and ensure both are reconstructible from pack components by id (no hidden extra structure).Prevents downstream scraping of prose; keeps “M2 output” explicit.
CC‑G2‑describedEntityMap‑1A conforming pack MUST include G.2g describedEntity Map, mapping (at minimum) each load‑bearing claim family and each minted/evolved public id to describedEntity := ⟨GroundingHolon, ReferencePlane⟩, and citing the relevant ClaimSheetId and evidence anchors (A.10 and/or G.6 paths when used).Keeps plane/holon boundaries explicit and citable.
CC‑G2‑Alignment‑1Any cross‑Tradition consolidation SHALL be presented as either (i) disjoint parallel claims with explicit divergence, or (ii) an explicitly justified alignment proof; any reuse across Tradition boundaries MUST use explicit crossing bundles per CC‑GCORE‑CROSS‑1 (delegation).Prevents silent semantic leakage.
CC‑G2‑GammaSynth‑1If the pack asserts cross‑source / cross‑Tradition fusion/substitution (not merely “parallel divergent claims”), it MUST emit GammaEpistSynthId records satisfying G.2:Ext.GammaEpistSynthesis (provenance union + explicit alignment refs + assurance tuple refs). If no fusion/substitution is asserted, the pack SHALL state so explicitly.Restores the load‑bearing synthesis artefact (alias: G.2‑F) without shadow specs.
CC‑G2‑Inventory‑1A conforming pack MUST include G.2c OperatorAndObjectInventory, sufficient for downstream CHR/CAL authoring to begin without re‑harvesting terms.Ensures the pack is actionable.
CC‑G2‑Inventory‑2G.2c OperatorAndObjectInventory entries MUST be treated as stubs for downstream authoring: they MUST NOT embed acceptance thresholds or claim legality decisions locally. If an entry is not a citation of an already‑owned CHR/CAL artefact, it MUST be explicitly marked as stub (typing/lawfulness TBD) and MUST NOT be used as if lawful. Legality/threshold semantics are routed to owner patterns (G.3 for CHR, G.4 for CAL) via explicit ids/pins.Prevents “shadow CHR/CAL” and preserves lawfulness discipline without redefining it locally.
CC‑G2‑MeasurementLawful‑1If any inventory entry is presented as non‑stub (i.e., already lawful/typed), the pack MUST cite the owning lawfulness discipline (e.g., A.17–A.19/C.16 as applicable) and provide the minimal evidence anchors needed to justify that typing claim.Prevents “quietly lawful” measurement claims inside the harvester pack.
CC‑G2‑MicroExamples‑1For every load‑bearing claim family, a conforming pack MUST include at least two worked micro‑examples on heterogeneous substrates, each with explicit A.10 carrier anchors, declared context + describedEntity, and an assurance tag (TA/VA/LA, where applicable).Makes the synthesis teachable and anchor‑grounded.
CC‑G2‑UTS‑1If the pack proposes or evolves any public ids, it MUST publish UTS proposals (Name Cards + MDS where applicable) and cite them via UTSRowId[], satisfying CC‑GCORE‑UTS‑1 (delegation).Keeps naming and evolution disciplined.
CC‑G2‑Families‑1SoS indicators and candidate evaluation constructs SHALL be represented as families/variants (windows/constraints/assumptions) with explicit Acceptance branch structure per variant (branch ids/labels only), not as single unqualified scalars; any scalar summary MAY be included only as report‑only unless explicitly promoted by owner patterns. (Set/portfolio discipline is delegated to CC‑GCORE‑SET‑1.)Prevents covert scalarization and keeps acceptance downstream-owned.
CC‑G2‑HandOff‑1A conforming pack MUST emit hand‑off manifests to G.3, G.4, and G.5 that cite pack components by id and identify which families/operators are intended for downstream formalisation or registry entry.Prevents downstream re‑authoring and drift.
CC‑G2‑CoverageGate‑1The pack MUST declare FamilyCoverageFloorK and enforce it as a harvesting gate. It MUST either (i) specify k explicitly in an explicit HarvestPolicyRef, or (ii) use the pattern‑local default rule owned by CC‑G2‑CoverageGate‑1. Default rule (owner‑local): k=3. If the gate fails, the pack MUST (a) record the repair iteration in FlowRecord, and (b) broaden the search radius (new venues/corpora/contexts/traditions) rather than silently weakening the gate; if an exploration policy is used for this broadening, it MUST be pinned as a policy id/ref.Makes “coverage floor” explicit and prevents “silent narrowing” under failure.
CC‑G2‑DistanceGate‑1If a diversity‑by‑distance gate is used, the pack MUST pin DistanceDefRef.edition and the declared threshold (δ), and treat edits as RSCR‑relevant per CC‑GCORE‑TRIG‑* (delegation). If no such gate is used, the pack SHALL explicitly state that it is not used.Avoids implicit distance defaults and improves refreshability.
CC‑G2‑RSCR‑1A conforming pack MUST emit canonical RSCRTriggerKindId causes (not free text) for edits to evidence surfaces, name/tokenization surfaces (e.g., UTS proposals/aliases), crossings, planes, edition pins, and harvesting policy pins (HarvestPolicyRef), per CC‑GCORE‑TRIG‑1…TRIG‑4 (delegation).Keeps refresh reason codes stable and typed.
CC‑G2‑Ext‑GammaEpist‑1If G.2:Ext.GammaEpistSynthesis is used (i.e., any fusion/substitution is asserted), the pack SHALL expose the required pins listed in that extension and SHALL NOT redefine Γ‑fold/Φ/penalty semantics locally (route via owners by delegation).Keeps synthesis auditable without creating shadow specs.
CC‑G2‑Ext‑HarvestProtocols‑1If G.2:Ext.HarvestProtocols is used, the pack SHALL expose the required pins/criteria ids listed in that extension and SHALL NOT redefine evidence/quality semantics outside the declared protocol profile.Keeps protocol variation modular and Phase‑3‑extractable.
CC‑G2‑Ext‑DHC‑1If G.2:Ext.DHCAlignmentHooks is used, the pack SHALL (a) expose the required pins listed in that extension, including DHCSenseCellId[], and (b) declare the unit/constraint pins required by C.21 (e.g., bridges_per_100_DHC_SenseCells, CL_min=2) without redefining their semantics locally (semantic owner: C.21).Keeps DHC wiring auditable and non‑shadowing.
CC‑G2‑Ext‑NQD‑1If G.2:Ext.NQDAnnex is used, the pack SHALL expose the required pins/editions/policies listed in that extension and SHALL NOT redefine QD semantics locally.Keeps QD/OEE wiring replayable and modular.
CC‑G2‑Ext‑Interop‑1If G.2:Ext.InteropForms is used, the pack SHALL expose the required interop pins and SHALL NOT introduce alternative legality/acceptance semantics.Prevents “foreign gate” shadowing.

Common Anti‑Patterns and How to Avoid Them

  • AP‑G2‑1: “One true SoTA score.” Avoid: selecting a single unqualified scalar metric as “the” SoTA. Do instead: represent evaluation constructs as families/variants; keep partial orders set‑returning (delegated).

  • AP‑G2‑2: Fusion without explicit alignment proof. Avoid: merging rival Tradition claims into one statement “by common sense.” Do instead: preserve parallel Claim Sheets; if consolidation is required, publish explicit alignment proof or keep a divergence record.

  • AP‑G2‑3: Hidden protocol drift. Avoid: changing the harvesting protocol (inclusion criteria, windowing, screening rubric) without pins. Do instead: pin harvesting policy/profile ids and treat changes as RSCR‑relevant.

  • AP‑G2‑4: Unanchored pedagogy. Avoid: micro‑examples without carriers (they become folklore). Do instead: bind micro‑examples to A.10 anchors and declare describedEntity.

Consequences

  • Positive: Downstream CHR/CAL/dispatch work becomes faster and less ambiguous because the pack is citable and structured.
  • Positive: Plurality is preserved while still enabling disciplined comparability through explicit crossings.
  • Positive: Refresh becomes tractable because pins and typed causes exist.
  • Negative: Adds authoring overhead (ledger, flow record, micro‑examples, explicit pins).
  • Negative: Requires governance discipline to prevent the pack from becoming an uncontrolled “everything bucket”.

Rationale

SoTA synthesis is a bottleneck for new CG‑Frame work: without a disciplined harvest, downstream formalization (CHR/CAL) and operational selection (G.5) either (i) inherit hidden semantic collisions, or (ii) re‑invent incompatible “mini‑standards.” G.2 resolves this by treating SoTA work as a publishable kit: explicit plurality, explicit crossings, explicit evidence anchors, and explicit hand‑offs.

SoTA-Echoing (informative)

This pattern aligns its method options (via Extensions and authoring practice) with widely used post‑2015 SoTA practices, while keeping FPF’s semantics stable and id‑based:

  1. PRISMA 2020 reporting discipline (Page et al., 2021) Status: Adopt (adapted) — we adopt the idea of a transparent screening trail as FlowRecord, but keep it notation‑independent and concept‑level.

  2. Living systematic reviews (Elliott et al., 2017 and subsequent living‑review practice) Status: Adopt (as optional protocol family) — the “living” stance is expressed as a harvesting protocol profile (Extension), with explicit freshness windows and RSCR‑relevant change causes.

  3. AMSTAR 2 critical appraisal (Shea et al., 2017) Status: Adapt — we adapt the idea of structured quality appraisal into Claim Sheet evidence cues, without turning it into a single scalar rating.

  4. Science of Science synthesis (Fortunato et al., 2018) Status: Adopt (as content discipline) — SoS indicators are treated as families/variants and wired as citable artefacts, not as a single “score”.

  5. Disruption / team‑structure indicators (Wu, Wang & Evans, 2019 and follow‑on work) Status: Adopt (as exemplar family) — useful as an example of a SoS‑indicator family with strong dependence on windowing and corpus definition.

  6. Quality‑Diversity and open‑ended generation (e.g., Fontaine et al., 2020 for CMA‑ME; Wang et al., 2019 for POET) Status: Adopt (as optional annex wiring) — when QD/OEE is relevant for the CG‑Frame, we include generator/method family cards and pin the required edition/policy surfaces via G.2:Ext.NQDAnnex, without embedding those semantics into the core pack.

Relations

  • Builds on:

    • G.Core (core invariants, typed RSCR causes, default ownership routing)
    • E.8 (pattern template discipline)
    • E.10 (lexical/ontological rules; strict distinction; kind‑suffix discipline)
    • E.19 (conformance discipline)
    • A.10 (provenance anchors / carriers)
    • B.3 (trust, freshness/decay as cited owners)
    • F.9 (bridges and CL as cited owners)
    • F.17 (UTS publication discipline; via delegation)
    • G.0 (CG‑Spec legality gate; cited when legality surfaces are referenced)
    • G.6 (EvidenceGraph / path citation surfaces when used)
  • Used by:

    • G.1 (generator chassis consumes harvested SoTA sets)
    • G.3 (CHR authoring consumes operator/object inventory and claim sheets)
    • G.4 (CAL authoring consumes operator stubs, acceptance branch scaffolding)
    • G.5 (registry/dispatch consumes MethodFamily/GeneratorFamily cards)
    • G.10 (shipping cites the pack as payload)
    • G.11 (refresh orchestration can re‑invoke harvest via typed causes)
  • Relates to:

    • G.13 (interop surfaces when external indices are used)

G.2:End