Zorthex — Methodology & Policy

// methodology & policy · public document · v2.0

How Zorthex measures and what it promises

Every methodological choice, every data policy, every limitation — declared publicly. This is the implicit contract between Zorthex and anyone who uses, cites, or purchases its outputs. Nothing is hidden. Everything can be verified.

Quick start — before you read

Zorthex is a methodological framework. Claude (Anthropic) is a tool used within it — for the free app and report drafting. Every verified value in a Custom Report is human-checked. Read Section 9 for the exact boundary between AI-estimated and human-verified outputs.

Contents

00 Theoretical Foundations
01 Core Definitions: L, t_start, t_peak
02 t_start Source Policy: Levels A–D
03 Google Trends CSV Policy
04 Cluster Renormalization & the Rock Rule
05 Three-Source Verification
06 The 25/100 Threshold
07 Classifications
08 Products: What Each Promises
09 How AI Is Used in This System
10 Declared Limitations
11 Version Changelog

// 00

Theoretical Foundations

Zorthex builds on three bodies of work without claiming to replace them.

Everett Rogers — Diffusion of Innovations (1962)

Rogers established that innovations spread through social systems over time, identifying five adopter categories and making time an explicit variable in diffusion theory. His definition remains precise after sixty years: "Diffusion is the process by which an innovation is communicated through certain channels, over time, among the members of a social system."

Zorthex extends this by measuring the temporal gap empirically with digital proxies unavailable in 1962. Where Rogers describes who adopts and how they decide, Zorthex measures when a phenomenon becomes publicly visible — a question Rogers could not answer with the instruments of his time.

Frank Bass — A New Product Growth Model (1969)

Bass formalised diffusion mathematically, modelling time as a continuous variable driven by innovators (p) and imitators (q). His differential equation dN/dt gave diffusion a mathematical clock for the first time. The Bass model assumes endogenous, gradual diffusion — the wave builds from within, word of mouth does the work, the curve is smooth.

Zorthex documents cases where diffusion is discontinuous and exogenously triggered — a pattern the Bass model does not capture. In Policy-Trigger regimes, the curve is flat for years, then switches regime on a regulatory event. This is not a refutation of Bass; it is a regime distinction his model was not designed to accommodate.

Geoffrey Moore — Crossing the Chasm (1991)

Moore identified the structural gap between Early Adopters and the Early Majority: the Chasm that most innovations never cross. Pragmatists require peer references and a complete ecosystem before they move, and visionaries are not credible references for pragmatists.

The Dual-Velocity pattern in the Zorthex dataset is the empirical expression of Moore's Chasm: specialist attention (Wikipedia) running high for years while mainstream attention (Google Trends) stays near zero. In Policy-Trigger regimes, the Chasm is crossed not by peer references but by regulatory mandate — and the whole product ecosystem is built after the crossing, not before.

Zorthex is descriptive where Rogers is qualitative, empirical where Bass is mathematical, and data-driven where Moore is practitioner-observational. The three frameworks are complementary, not competing.

// 01

Core Definitions

These three definitions are locked across all Zorthex outputs. They cannot be modified between reports, between versions, or between clients. Any change requires a new version number and full dataset recalculation.

t_start

First documented public appearance of the phenomenon

The earliest traceable public signal — defined as the date of a seminal paper, documented commercial launch, recognized founding event, or first operational deployment. Must be justified with a primary source (see Section 2). t_start is not the date the phenomenon became widely known — it is the date it first existed publicly in any documented form.

t_peak

First month of the first window of 12 consecutive months where Google Trends ≥ 25/100 Worldwide

If no window of 12 consecutive months at or above the threshold is reached, t_peak is defined as the month of the absolute maximum recorded score. t_peak is not the moment of highest awareness; it is the first month of sustained structural attention as defined by this framework.

L — Diffusion Lag

L = t_peak − t_start (months)

A descriptive lag metric. Not a physical constant. Not a universal law. Not a prediction. L represents the observable gap between first documented public emergence and sustained mainstream attention. L is sensitive to t_start definition — alternative t_start candidates produce different L values, which is why t_start source and rationale are always declared explicitly in every report.
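The three definitions above can be sketched in Python. This is a minimal illustration, not the Zorthex pipeline. The function names, the (month, score) row layout, the assumption that the series has one row per calendar month with no gaps, and the tie-break that picks the earliest month when the maximum occurs more than once are all assumptions made here. The scores and the t_start date are synthetic.

```python
from datetime import date

THRESHOLD = 25  # Google Trends score (0-100), Worldwide
WINDOW = 12     # consecutive months required


def find_t_peak(series: list[tuple[date, int]]) -> tuple[date, bool]:
    """Return (t_peak, sustained) for a monthly series sorted by date.

    sustained is True when a window of 12 consecutive months at or above
    the threshold exists. Otherwise t_peak falls back to the month of the
    absolute maximum score (earliest month on ties: an assumption).
    """
    if not series:
        raise ValueError("series is empty")
    run_start = None
    run_length = 0
    for month, score in series:
        if score >= THRESHOLD:
            if run_length == 0:
                run_start = month  # first month of the current run
            run_length += 1
            if run_length == WINDOW:
                return run_start, True
        else:
            run_length = 0  # run broken, start counting again
    peak_month, _ = max(series, key=lambda row: row[1])
    return peak_month, False


def diffusion_lag_months(t_start: date, t_peak: date) -> int:
    """L = t_peak - t_start, in whole calendar months."""
    return (t_peak.year - t_start.year) * 12 + (t_peak.month - t_start.month)


def month_sequence(start: date, count: int) -> list[date]:
    """Helper for the example: consecutive first-of-month dates."""
    months, year, month = [], start.year, start.month
    for _ in range(count):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# Synthetic series: a 5-month run that breaks, then a 14-month run.
scores = [10] * 6 + [30] * 5 + [10] * 2 + [40] * 14
series = list(zip(month_sequence(date(2015, 1, 1), len(scores)), scores))
t_peak, sustained = find_t_peak(series)
lag = diffusion_lag_months(date(2009, 1, 1), t_peak)  # hypothetical t_start
```

In this synthetic series the five-month run does not qualify, so t_peak is the first month of the later 14-month run, and L is counted from the hypothetical t_start to that month.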

// 02

t_start Source Policy — Levels A–D

Every t_start used in a Zorthex report must be justified with a documented primary source. Sources are classified into four defensibility levels. The level does not determine the choice — the date closest to the actual operational moment prevails. The level determines how defensible that choice is.

Level	Source Type	Example
A	Peer-reviewed academic paper with publication date	Nature, Science, arXiv with submission date
B	Official organizational announcement — government, regulatory, corporate	NIST publication, SEC filing, company press release
C	Primary journalism — named outlet, date, archived URL	Reuters, Financial Times, Nature News
D	Patent filing with registered date	USPTO, EPO, WIPO with filing date

Precedence Rule: Between two sources of different levels, the one closest to the actual operational moment prevails. Level determines defensibility — not chronological priority.

Declaration format in every report: "t_start defined as [date]. Source: [name + URL]. Level: [A/B/C/D]. Rationale: [why this source and not alternatives]."

Not acceptable as t_start sources: Wikipedia, undated web articles, AI-generated summaries, secondary aggregators, sources without a verifiable publication date.
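The declaration format can be held in a small record that refuses incomplete entries. This is a sketch under stated assumptions: the class name, field names, and validation rules are hypothetical, and the example source is a placeholder, not a real Zorthex case.

```python
from dataclasses import dataclass
from datetime import date

ALLOWED_LEVELS = {"A", "B", "C", "D"}


@dataclass(frozen=True)
class TStartDeclaration:
    """One t_start declaration (hypothetical structure)."""
    when: date
    source_name: str
    source_url: str
    level: str
    rationale: str

    def __post_init__(self):
        # Reject levels outside A-D and declarations missing a source or rationale.
        if self.level not in ALLOWED_LEVELS:
            raise ValueError(f"level must be one of {sorted(ALLOWED_LEVELS)}")
        if not (self.source_name and self.source_url and self.rationale):
            raise ValueError("source name, URL, and rationale are all required")

    def render(self) -> str:
        """Render the declaration in the report format quoted above."""
        return (
            f"t_start defined as {self.when.isoformat()}. "
            f"Source: {self.source_name}, {self.source_url}. "
            f"Level: {self.level}. "
            f"Rationale: {self.rationale.rstrip('.')}."
        )


# Placeholder values for illustration only.
declaration = TStartDeclaration(
    when=date(2009, 1, 1),
    source_name="Example Journal",
    source_url="https://example.org/paper",
    level="A",
    rationale="Earliest dated primary source found",
)
text = declaration.render()
```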

// 03

Google Trends CSV Policy — Snapshot-Locked Data

Google Trends produces normalized scores (0–100) relative to peak search volume within the selected time period and geography. The same query run on different dates or with different windows produces different absolute values. Zorthex handles this through snapshot-locking.

Snapshot-Locking: Every Zorthex report locks the specific CSV snapshot used for analysis. The analysis is reproducible from that CSV — not from a future re-run. The CSV is available on GitHub for independent verification.

Partial-month exclusion: The current (in-progress) month is never used. Google Trends does not finalize a month until it closes; the last valid data point is the most recent fully consolidated month. For the v2.0 dataset the observation cut-off is May 2026.
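The partial-month exclusion can be sketched as a filter on the downloaded rows. The function name and row layout are assumptions, and the sketch treats a month as consolidated as soon as it has closed relative to the download date. The scores are synthetic.

```python
from datetime import date


def drop_partial_month(rows: list[tuple[date, int]], downloaded: date) -> list[tuple[date, int]]:
    """Keep only rows whose month closed before the month of the download."""
    current_month = date(downloaded.year, downloaded.month, 1)
    return [
        (month, score)
        for month, score in rows
        if date(month.year, month.month, 1) < current_month
    ]


# Synthetic rows: a CSV downloaded on 10 June 2026 still contains June.
rows = [(date(2026, 4, 1), 31), (date(2026, 5, 1), 28), (date(2026, 6, 1), 12)]
usable = drop_partial_month(rows, downloaded=date(2026, 6, 10))
```

Here the in-progress June row is dropped and May is the last usable data point.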

Standard Citation Format — Google Trends Data

Google Trends data: query '[search term]', geography: [Worldwide / Country], period: [start date] – [end date], downloaded: [download date].

Note: Values are normalized scores (0–100) relative to peak search volume in the selected period. The CSV is available at github.com/zorthex2026/zorthex-diffusion-lag for independent verification.

Observation cut-off & revision policy: Every published figure is dated to an explicit observation cut-off (v2.0: May 2026) with the underlying CSV snapshots locked and archived. All values are subject to revision, and re-verification is recommended every 90 days, mirroring the dated and revisable nature of a credit rating. A value is never presented as permanent: it is the best reading at a declared date, revised on a fixed cycle.

// 04

Cluster Renormalization & the Rock Rule

Because Google Trends normalizes every series against the maximum of the downloaded window, a large recent surge in attention rescales the entire history of a term downward. A phenomenon that was already structurally active years earlier can appear, in a freshly downloaded CSV, to have "broken out" only recently. This is the cluster renormalization effect, and it is most pronounced for terms caught inside a synchronized attention wave (in this dataset, a broad August 2025 cluster across multiple domains).
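The effect can be reproduced with a toy model. The sketch below only models the step described above, rescaling a series so that the window maximum equals 100. It is not Google's actual computation, and the raw values are synthetic.

```python
def normalize_to_window(raw: list[float]) -> list[int]:
    """Rescale a series so the maximum of the window becomes 100."""
    peak = max(raw)
    if peak <= 0:
        return [0] * len(raw)
    return [round(100 * value / peak) for value in raw]


# Synthetic raw interest: steady activity, then the same history plus a surge.
early_history = [40, 44, 48, 44]
before_surge = normalize_to_window(early_history)
after_surge = normalize_to_window(early_history + [240])
```

Before the surge every month reads well above 25. After the surge is included in the window, the same months read 20 or lower, although the underlying activity has not changed.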

The Rock Rule: A phenomenon whose all-time peak lies in the past has a stable L, because the history is settled and renormalization cannot move it ("rock"). A phenomenon whose peak is recent (within the current attention wave) carries a provisional L and is marked OBSERVATION ("quicksand"), because its position may shift on the next consolidated download. A provisional L is never locked into a permanent value.

The rock rule is why the dataset separates stable structural cases from observation-stage cases on the basis of where the peak sits in time, not on the raw score alone. It is also why a second and third source are required: an independent signal that anchors a phenomenon earlier than its Google Trends breakout is direct evidence that the recent surge is a renormalization artifact, not a genuine first emergence.
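A minimal sketch of the rock / quicksand split follows. The document does not give a numeric definition of "recent", so the wave_start parameter is an assumption supplied by the caller. The example uses August 2025 only because that cluster is mentioned above, and the scores are synthetic.

```python
from datetime import date


def rock_or_quicksand(series: list[tuple[date, int]], wave_start: date) -> str:
    """Label a series by where its all-time peak sits in time.

    'rock'      : peak before the current attention wave (stable L)
    'quicksand' : peak inside the wave (provisional L, OBSERVATION)
    On ties the earliest peak month is used (an assumption).
    """
    if not series:
        raise ValueError("series is empty")
    peak_month, _ = max(series, key=lambda row: row[1])
    return "quicksand" if peak_month >= wave_start else "rock"


WAVE_START = date(2025, 8, 1)  # assumed start of the current wave

settled = [(date(2021, 3, 1), 100), (date(2025, 9, 1), 60)]
recent = [(date(2021, 3, 1), 30), (date(2025, 9, 1), 100)]
labels = (rock_or_quicksand(settled, WAVE_START), rock_or_quicksand(recent, WAVE_START))
```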

// 05

Three-Source Verification

Google Trends alone is insufficient. Each case in the v2.0 dataset is positioned using three independent attention signals, each measuring a different kind of attention. They are not averaged into a single number; they are read together.

◉ Google Trends: Operational attention, meaning who searches in order to act. L is computed here (the 12-consecutive-month rule). CSV snapshot-locked and archived.

◉ Wikipedia: Informational attention, meaning who reads to understand. Absolute pageviews via the Wikimedia REST API. A corrective to search-index normalization.

◉ Reddit: Community attention, meaning how a topic is discussed. A declared qualitative proxy (absent / nascent / mature), never converted into a number.
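For the Wikipedia signal, a request URL for the public Wikimedia pageviews endpoint can be built as follows. The path layout follows the per-article pageviews route of the Wikimedia REST API. The defaults chosen here (English Wikipedia, all access methods, human "user" traffic, monthly granularity) and the function name are assumptions, not settings stated by Zorthex. The sketch only builds the URL and sends no request.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(
    article: str,
    start: str,
    end: str,
    project: str = "en.wikipedia.org",  # assumed project
    access: str = "all-access",
    agent: str = "user",                # assumed: exclude spiders and bots
    granularity: str = "monthly",
) -> str:
    """Build a per-article pageviews URL; start and end use YYYYMMDD."""
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}"


url = pageviews_url("Deep learning", "20230101", "20231231")
```

Because the API returns absolute view counts rather than an index rescaled to the window maximum, a later surge does not shrink earlier months.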

What the second source corrects

Wikipedia pageviews are the corrective to Google's renormalization. For phenomena such as generative AI and deep learning, Wikipedia shows high and constant readership since 2023 — revealing them as mainstream and structural even where a freshly downloaded Google Trends series, rescaled by a later surge, would understate them.

The Dual-Velocity finding

For B2B-infrastructure and policy-driven phenomena, Wikipedia and Reddit signals are systematically lower than Google Trends scores. This divergence is declared as a finding, not corrected as an error: it reflects the structural difference between operational attention (active search by professionals who must use the tool) and informational or community attention (the mainstream public, not yet reached). Attention moves at two velocities, and Zorthex frequently intercepts a phenomenon while it is still confined to technical departments — before it becomes mainstream news. Cases where this pattern holds are flagged as dual-velocity in the dataset.

Convergence reporting: Each case carries its convergence state. Full convergence = the three signals agree. Partial convergence = a declared B2B / policy divergence pattern. A small number of single-case anomalies (e.g. a topic stronger on Reddit than on Wikipedia) are declared individually rather than smoothed away.

// 06

The 25/100 Threshold

The threshold of 25/100 on Google Trends (Worldwide, normalized) is the activation point for structural classification. It is empirically set based on sensitivity testing across the dataset.

How it was tested

The threshold was tested at 20, 25, and 30 across the historical dataset. At 25, the separation between confirmed structural phenomena and confirmed transient peaks is most stable. At 20, confirmed bubbles pass the threshold. At 30, confirmed structural phenomena such as Bitcoin fail the test.
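The shape of such a sensitivity test can be sketched as below. The two series are synthetic and chosen only to show how a case can pass at one threshold and fail at another. They are not Zorthex dataset cases, and the function names are hypothetical.

```python
def has_sustained_window(scores: list[int], threshold: int, window: int = 12) -> bool:
    """True if the series holds `window` consecutive months at or above threshold."""
    run = 0
    for score in scores:
        run = run + 1 if score >= threshold else 0
        if run >= window:
            return True
    return False


def sensitivity_table(cases: dict[str, list[int]], thresholds=(20, 25, 30)) -> dict:
    """For every case, record whether it passes the 12-month rule at each threshold."""
    return {
        name: {t: has_sustained_window(scores, t) for t in thresholds}
        for name, scores in cases.items()
    }


# Synthetic monthly scores.
cases = {
    "steady_moderate": [8] * 10 + [27] * 18,                   # long plateau at 27
    "brief_spike": [5] * 6 + [60, 45, 30] + [22] * 12 + [10] * 6,  # spike, then tail at 22
}
table = sensitivity_table(cases)
```

In this toy data the plateau case passes at 20 and 25 but fails at 30, while the spike case passes only at 20, which mirrors the trade-off described above.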

Important limitation: The threshold is a tested heuristic, not a validated constant, and will be re-evaluated as the dataset grows toward n=100.

// 07

Classifications

STRUCTURAL

≥ 25/100 for 12+ consecutive months confirmed · peak historical (rock rule)

Permanent classification once acquired, provided the all-time peak is historical. Indicates sustained mainstream public attention. Current Status (Active/Receding/Dormant) tracks today's attention independently.

OBSERVATION

Above threshold but <12 consecutive months · OR recent peak (provisional L)

Signal forming, or a case whose peak sits inside the current attention wave and whose L is therefore provisional under the rock rule. Classification may change. If 12 months complete and the peak settles, upgrades to STRUCTURAL. If attention drops and does not recover, becomes BUBBLE.

BUBBLE

Peaked above threshold · dropped below 25 · no 12-month window

Permanent classification. BUBBLE does not mean the technology failed — only that public attention did not consolidate into structural phase.

A fourth state, PRE-dataset / non-stationary, is used for cases whose origin precedes the Google Trends data window (2004) or whose curve is dominated by an exogenous shock (e.g. a pandemic). These are retained for completeness but excluded from L averaging.
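The three main classifications can be sketched as a decision function. Several points are assumptions rather than Zorthex rules: the peak_is_recent flag is supplied by the caller because no numeric cut-off for "recent" is given, a case still at or above the threshold in its latest month is treated as OBSERVATION rather than BUBBLE, and BELOW_THRESHOLD is a placeholder label for series that never reach 25. The PRE-dataset / non-stationary state is not handled.

```python
def has_sustained_window(scores: list[int], threshold: int = 25, window: int = 12) -> bool:
    """True if the series holds `window` consecutive months at or above threshold."""
    run = 0
    for score in scores:
        run = run + 1 if score >= threshold else 0
        if run >= window:
            return True
    return False


def classify(scores: list[int], peak_is_recent: bool, threshold: int = 25) -> str:
    """Assign STRUCTURAL, OBSERVATION, or BUBBLE to a monthly series."""
    if not scores:
        raise ValueError("scores is empty")
    if not any(score >= threshold for score in scores):
        return "BELOW_THRESHOLD"  # placeholder, not a Zorthex classification
    if peak_is_recent:
        return "OBSERVATION"      # provisional L under the rock rule
    if has_sustained_window(scores, threshold):
        return "STRUCTURAL"       # 12+ months and a historical peak
    if scores[-1] >= threshold:
        return "OBSERVATION"      # signal still forming (assumed reading)
    return "BUBBLE"               # peaked, dropped below, no 12-month window


results = [
    classify([30] * 12, peak_is_recent=False),
    classify([30] * 12, peak_is_recent=True),
    classify([10, 40, 30, 10, 5], peak_is_recent=False),
    classify([10, 30, 35], peak_is_recent=False),
]
```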

4-Regime Taxonomy

Beyond classification, each case is assigned a regime describing the shape of its diffusion: Policy-Trigger (slow ramp, breakout on a regulatory trigger), Institutional Mass (multi-year invisibility then institutional adoption), Market-Narrative (speculation- and narrative-driven cycles), and Shock/Spoke (single-event spike then collapse). The regime is a structural property of the thematic area: knowing the area lets one read the likely regime by analogy — a descriptive read of shape, never a prediction of an individual case.

// 08

Products — What Each Promises

Zorthex offers three distinct products. Each delivers a different type of output with a different data standard. The distinction between AI-estimated and manually-verified outputs is the foundation of this architecture.

Product	What it answers	Data source	Format
Free App	What is happening with this topic?	AI estimates from Claude training data — not verified	Instant · any topic
Signal Artifacts	Where is this phenomenon in the cycle?	Manually verified — Zorthex dataset cases only	Structured · published on zorthex.com/research.html
Custom Reports	What is the verified temporal position of this specific topic for my domain?	Manually verified from scratch · real CSV · primary source for t_start · three-source verification · robustness check on t_start alternatives	HTML + PDF · delivered on request · scope agreed before production begins

On published ZCR reports: The reports published on zorthex.com/research.html — ZCR-2026-001 (Stablecoins), ZCR-2026-002 (Post-Quantum Cryptography), ZCR-2026-003 (Real-World Asset Tokenization) — are demonstration reports produced by Zorthex to document the framework in practice. They are not commissioned reports. Commissioned Custom Reports are produced on request for the client's specific topic and domain. Contact: zorthex.official@gmail.com

The fundamental rule: A user of the Free App cannot reliably infer the verified temporal positioning without accessing the manually-verified tier. The paywall is semantic — the Free Layer is explicitly designed to provide observational estimates, not verified positioning.

// 09

How AI Is Used in This System

Zorthex uses Claude (Anthropic) as the analytical engine for the free app and as a support tool for report drafting. This is declared explicitly and always.

Free App — AI Estimated

The free app uses Claude with the Zorthex v2.0 framework applied to its training knowledge. Claude estimates t_start, t_peak, L, and current scores. These are estimates — not verified values. Output is clearly labelled as AI-generated.

Custom Reports — Human Verified

No AI estimates appear in Custom Report data. t_start is verified against primary sources. Google Trends CSVs are downloaded manually and locked. Wikipedia and Reddit signals are checked manually. Claude supports drafting of narrative sections — every factual value is human-verified before inclusion.

On "Powered by Claude"

Zorthex declares its use of Claude as a matter of transparency, not marketing. Claude is the tool. Zorthex is the method. This distinction is intentional and permanent.

// 10

Declared Limitations

These are not disclaimers. They are the most operationally honest part of this document.

Survivorship Bias

Only technologies that reached mainstream attention are included. Technologies that failed to diffuse are not represented.

Selection Bias

Technologies were chosen for historical prominence, not by random sampling. No statistical inference about the general population is valid from this dataset.

Small Sample

n=70 documented cases as of v2.0, structured as 7 domains × 10 cases. Sector averages are descriptive only and, where a domain has few comparable structural cases, the area read is declared provisional.

Proxy Limitation

Google Trends measures search interest — not understanding, adoption, or impact. All three sources are attention proxies, not adoption metrics. Reddit is an explicitly qualitative proxy.

t_start Subjectivity

Identifying the founding event involves judgment. Two analysts may choose different t_start candidates, producing different L values.

English-Language Bias

Google Trends Worldwide data is biased toward English-speaking search behavior. National reads are solid; sub-national and non-English reads are noisier.

No Predictive Claims

Zorthex does not predict. L is a descriptive historical metric. Positional and regime labels describe observed historical position by analogy — not what happens next for an individual case.

Renormalization & Provisional L

Recent-peak cases carry a provisional L under the rock rule and are marked OBSERVATION. Their position may change on the next consolidated download; they are not locked into permanent values.

Positioning Statement

Zorthex is descriptive only and does not constitute investment advice.

Replication statement: All data and methodology are publicly available at github.com/zorthex2026/zorthex-diffusion-lag. If you follow this pipeline and obtain different results — that is a contribution, not a problem.
Contact: zorthex.official@gmail.com

Research direction. A second metric — institutional adaptation latency (L₂) — is in development as an extension of the diffusion-lag framework. It is not part of the public v2.0 dataset and is noted here only to record its standing within the broader research programme.

// 11

Version Changelog

v2.0

June 2026 · DOI: 10.5281/zenodo.20589503

Dataset expanded to n=70 public, structured as 7 domains × 10 cases
Three-source verification formalized (Google Trends + Wikipedia + Reddit)
Cluster renormalization effect documented; the rock rule formalized (historical peak = stable L; recent peak = provisional, OBSERVATION)
Dual-velocity finding declared for B2B / policy-driven phenomena
Observation cut-off (May 2026) and 90-day revision policy adopted
Dataset double-verified: by hand on CSV snapshots + automated pipeline on May-2026 consolidated data
Research direction recorded: institutional adaptation latency (L₂), in development
Theoretical Foundations section added (Rogers, Bass, Moore)

v1.3

May 2026 · DOI: 10.5281/zenodo.20374051

4-regime taxonomy formalized (Policy-Trigger, Institutional Mass, Market-Narrative, Shock/Spoke)
Layer A / Layer B separation introduced
Non-Stationarity Flag added to all cases
Dataset expanded to n=50 public · n=82 internal
Domain coherence validated across 7 domains
Product structure simplified — demonstration reports distinguished from commissioned reports
CLAUDE.md added to GitHub repository

v1.2

May 2026 · DOI: 10.5281/zenodo.20270575

t_start Source Policy (Levels A–D) formalized
Google Trends CSV snapshot-locking policy defined
Current Status classification added (Active/Receding/Dormant)
License updated to CC-BY-NC 4.0

v1.1

May 2026 · DOI: 10.5281/zenodo.20072999

12-month consecutive window introduced
Structural Bubble category added
n increased to 12 structural cases

v1.0

April 2026 · DOI: 10.5281/zenodo.20049068

Initial framework publication · n=11 structural cases