METHODOLOGY PROOF

How we verify Canadian property data.

Most property data providers ask you to trust the output. BrightCat publishes how the output is built — what gets matched, what gets reclassified, what gets flagged. Methodology transparency for buyers who need to defend the data internally.

Built since: 2014
Weekly builds: 600+
Provinces covered: 10 of 10
Relists reclassified: 1.84M
Proof 1

The persistent property identifier

Every Canadian property in the BrightCat pipeline carries a persistent identifier — assigned the first time the property appears on market, and carried through every subsequent event the pipeline observes.

When that property re-lists with a new MLS number two years later, the persistent identifier still matches it. When it converts from a sale listing to a rental, the identifier stays the same. When a postal code is redivided, address normalization updates the mapping without breaking the chain.
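BrightCat does not publish its implementation, but the core idea can be sketched in a few lines. The sketch below assumes a hash of normalized address plus postal code as the match key; the function names and normalization rules are illustrative assumptions, not BrightCat's actual code.

```python
import hashlib

def normalize_address(raw: str) -> str:
    """Uppercase, strip punctuation, collapse whitespace so relists match."""
    return " ".join(raw.upper().replace(".", "").replace(",", "").split())

def persistent_id(address: str, postal_code: str) -> str:
    """Derive a stable identifier from normalized address + postal code."""
    key = f"{normalize_address(address)}|{postal_code.replace(' ', '').upper()}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]

# A relist under a new MLS number still resolves to the same identifier,
# because the identifier keys on the property, not the listing.
assert persistent_id("123 Main St.", "M5V 2T6") == persistent_id("123 main st", "m5v2t6")
```

The point of keying on the property rather than the listing is that every downstream field (days on market, price history) can accumulate across listing events instead of resetting with each new MLS number.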

Without persistent identification, every relist looks like new inventory. Days on market resets to zero. Price cuts disappear. Models trained on the data inherit the noise.

This is why we built the identifier first, in 2014, before any of the other tracks.

Proof 2

1.84 million listings reclassified

Between 2014 and 2021, BrightCat identified and reclassified 1.84 million Canadian listings that arrived in source feeds marked NEW — but were actually re-lists of properties already known to the pipeline.

The reclassification logic uses address normalization, postal code verification, persistent property matching, and a defined RELISTED status. Properties identified as relists retain their original DOM accumulator, original list price, and cumulative price change history — rather than appearing as fresh inventory.
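The retention rules described above can be sketched as follows. This is a hypothetical illustration under assumed field names, not BrightCat's published rules: an incoming NEW listing that matches a known persistent identifier is marked RELISTED and carries its accumulated history forward.

```python
from dataclasses import dataclass, field

@dataclass
class ListingState:
    persistent_id: str
    dom: int                      # days-on-market accumulator
    original_price: int
    price_changes: list = field(default_factory=list)
    status: str = "NEW"

def reclassify(known: dict, incoming: dict) -> ListingState:
    """Mark an incoming NEW listing as RELISTED when its persistent
    identifier matches a property already in the pipeline, carrying
    the accumulated history forward instead of resetting it."""
    prior = known.get(incoming["persistent_id"])
    if prior is None:  # genuinely new inventory
        return ListingState(incoming["persistent_id"], dom=0,
                            original_price=incoming["price"])
    changes = list(prior.price_changes)
    last = changes[-1] if changes else prior.original_price
    if incoming["price"] != last:
        changes.append(incoming["price"])
    return ListingState(prior.persistent_id, dom=prior.dom,
                        original_price=prior.original_price,
                        price_changes=changes, status="RELISTED")
```

Note that the DOM accumulator and original list price pass through unchanged; only the price-change history grows.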

Why this matters for buyers: most third-party Canadian property datasets do not perform this reclassification. The result is that “new listings” counts get inflated, DOM averages get artificially compressed, and price discovery signals get obscured. Models trained on unreclassified data systematically miss the relist pattern.

We document the reclassification rules on the methodology page. Customers receive the rule version with each weekly build.

Proof 3

600+ consecutive weekly builds

Every Sunday since 2014, BrightCat has produced a weekly build — a versioned snapshot of every tracked Canadian property, every active listing, every event captured in the prior 7 days, and every reclassification applied.

Each build carries metadata: total property count, new properties added, properties marked withdrawn, properties marked sold, properties marked converted, persistent identifier version, address normalization version, and reclassification rule version.
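The metadata listed above maps naturally onto a typed record. As a minimal sketch (field names and values are illustrative assumptions, not BrightCat's actual schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeeklyBuildMetadata:
    build_id: str                          # e.g. an ISO week label
    total_properties: int
    new_properties: int
    withdrawn: int
    sold: int
    converted: int
    persistent_id_version: str
    address_normalization_version: str
    reclassification_rule_version: str

# Illustrative placeholder values only.
build = WeeklyBuildMetadata(
    build_id="2024-W14",
    total_properties=5_978_973,
    new_properties=0,
    withdrawn=0,
    sold=0,
    converted=0,
    persistent_id_version="v3",
    address_normalization_version="v7",
    reclassification_rule_version="v12",
)
```

Versioning the identifier, normalization, and rule logic per build is what lets a customer reproduce or audit any historical snapshot against the rules in effect at the time.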

This isn’t marketing language. The cadence is the moat. You cannot retroactively produce 600 weekly snapshots of a market that already happened. The data either exists or it doesn’t.

Enterprise customers can request build-level documentation under NDA for due diligence and model validation.

Proof 4

Coverage in numbers, not adjectives

Most Canadian property datasets say “national coverage” and leave it there. Here are the actual numbers from the most recent build:

Dataset                 | Count     | Update frequency        | History since
Residential real estate | 5,978,973 | Weekly                  | 2014
Rental properties       | 875,288   | Weekly (241 snapshots)  | July 2021
Commercial real estate  | 314,884   | Weekly                  | 2014
Sold events (matched)   | 899,189   | Weekly                  | 2014
Apartment units         | 969,130   | Weekly                  | 2014
Commercial dual-listed  | 10,941    | Weekly                  | 2014
Provinces covered weekly: AB, BC, MB, NB, NL, NS, NT, NU, ON, PE, QC, SK, YT. That is all 10 provinces, plus the three territories where MLS coverage exists.
Proof 5

Data lineage by design, not afterthought

Every record BrightCat publishes carries provenance metadata: the weekly build it came from, the source files that contributed, the address normalization version applied, the persistent identifier version, and the reclassification rules in effect.
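A record with its provenance inline might look like the following. This is a hypothetical shape under assumed field names, meant only to show lineage traveling with the row rather than in separate documentation.

```python
# Illustrative record shape; all field names and values are assumptions.
record = {
    "persistent_id": "8f3a1c42d9e07b65",
    "status": "RELISTED",
    "list_price": 899_000,
    "lineage": {
        "build": "2024-W14",
        "source_files": ["feed_2024-04-07.csv"],
        "address_normalization_version": "v7",
        "persistent_id_version": "v3",
        "reclassification_rule_version": "v12",
    },
}

REQUIRED_LINEAGE = {
    "build", "source_files", "address_normalization_version",
    "persistent_id_version", "reclassification_rule_version",
}

def lineage_complete(rec: dict) -> bool:
    """True when every provenance field is present on the record."""
    return REQUIRED_LINEAGE <= set(rec.get("lineage", {}))
```

A completeness check like this is the kind of gate a buyer's model-risk or governance pipeline can run against every delivered row.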

AI agents using the BrightCat MCP connector can retrieve lineage metadata alongside the data itself. This matters because grounding LLM outputs requires knowing not just what the data says, but where it came from and how it was processed.

For traditional enterprise integration, the lineage fields are surfaced in Snowflake Marketplace, the API, and flat file delivery — not buried in a separate documentation page.

Buyers running model risk reviews, regulatory data lineage assessments, or AI governance audits work with the lineage layer directly.

See it for yourself

Methodology in writing. Data in Snowflake.

Read the methodology page in full, or start with a free sample evaluation on Snowflake Marketplace.

Read the methodology
Sample on Snowflake