METHODOLOGY PROOF

How we verify Canadian property data.

Most property data providers ask you to trust the output. BrightCat publishes how the output is built — what gets matched, what gets reclassified, what gets flagged. Methodology transparency for buyers who need to defend the data internally.

Built since: 2014
Weekly builds: 600+
Provinces covered: 10 of 10
Relists reclassified: 1.84M
Proof 1

The persistent property identifier

Every Canadian property in the BrightCat pipeline carries a persistent identifier — assigned the first time the property appears on market, and carried through every subsequent event the pipeline observes.

When that property re-lists with a new MLS number two years later, the persistent identifier still matches it. When it converts from a sale listing to a rental, the identifier stays the same. When a postal code is redivided, address normalization updates the mapping without breaking the chain.
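BrightCat does not publish its implementation, but the core idea can be sketched in a few lines. The sketch below assumes a hash of normalized address plus postal code as the match key; the function names and normalization rules are illustrative assumptions, not BrightCat's actual code.

```python
import hashlib

def normalize_address(raw: str) -> str:
    """Uppercase, strip punctuation, collapse whitespace so relists match."""
    return " ".join(raw.upper().replace(".", "").replace(",", "").split())

def persistent_id(address: str, postal_code: str) -> str:
    """Derive a stable identifier from normalized address + postal code."""
    key = f"{normalize_address(address)}|{postal_code.replace(' ', '').upper()}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]

# A relist under a new MLS number still resolves to the same identifier,
# because the identifier keys on the property, not the listing.
assert persistent_id("123 Main St.", "M5V 2T6") == persistent_id("123 main st", "m5v2t6")
```

The point of keying on the property rather than the listing is that every downstream field (days on market, price history) can accumulate across listing events instead of resetting with each new MLS number.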

Without persistent identification, every relist looks like new inventory. Days on market resets to zero. Price cuts disappear. Models trained on the data inherit the noise.

This is why we built the identifier first, in 2014, before any of the other tracks.

Proof 2

1.84 million listings reclassified

Between 2014 and 2021, BrightCat identified and reclassified 1.84 million Canadian listings that arrived in source feeds marked NEW — but were actually re-lists of properties already known to the pipeline.

The reclassification logic uses address normalization, postal code verification, persistent property matching, and a defined RELISTED status. Properties identified as relists retain their original DOM accumulator, original list price, and cumulative price change history — rather than appearing as fresh inventory.
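The retention rules described above can be sketched as follows. This is a hypothetical illustration under assumed field names, not BrightCat's published rules: an incoming NEW listing that matches a known persistent identifier is marked RELISTED and carries its accumulated history forward.

```python
from dataclasses import dataclass, field

@dataclass
class ListingState:
    persistent_id: str
    dom: int                      # days-on-market accumulator
    original_price: int
    price_changes: list = field(default_factory=list)
    status: str = "NEW"

def reclassify(known: dict, incoming: dict) -> ListingState:
    """Mark an incoming NEW listing as RELISTED when its persistent
    identifier matches a property already in the pipeline, carrying
    the accumulated history forward instead of resetting it."""
    prior = known.get(incoming["persistent_id"])
    if prior is None:  # genuinely new inventory
        return ListingState(incoming["persistent_id"], dom=0,
                            original_price=incoming["price"])
    changes = list(prior.price_changes)
    last = changes[-1] if changes else prior.original_price
    if incoming["price"] != last:
        changes.append(incoming["price"])
    return ListingState(prior.persistent_id, dom=prior.dom,
                        original_price=prior.original_price,
                        price_changes=changes, status="RELISTED")
```

Note that the DOM accumulator and original list price pass through unchanged; only the price-change history grows.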

Why this matters for buyers: most third-party Canadian property datasets do not perform this reclassification. The result is that “new listings” counts get inflated, DOM averages get artificially compressed, and price discovery signals get obscured. Models trained on unreclassified data systematically miss the relist pattern.

We document the reclassification rules on the methodology page. Customers receive the rule version with each weekly build.

Proof 3

600+ consecutive weekly builds

Every Sunday since 2014, BrightCat has produced a weekly build — a versioned snapshot of every tracked Canadian property, every active listing, every event captured in the prior 7 days, and every reclassification applied.

Each build carries metadata: total property count, new properties added, properties marked withdrawn, properties marked sold, properties marked converted, persistent identifier version, address normalization version, and reclassification rule version.
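The metadata listed above maps naturally onto a typed record. As a minimal sketch (field names and values are illustrative assumptions, not BrightCat's actual schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeeklyBuildMetadata:
    build_id: str                          # e.g. an ISO week label
    total_properties: int
    new_properties: int
    withdrawn: int
    sold: int
    converted: int
    persistent_id_version: str
    address_normalization_version: str
    reclassification_rule_version: str

# Illustrative placeholder values only.
build = WeeklyBuildMetadata(
    build_id="2024-W14",
    total_properties=5_978_973,
    new_properties=0,
    withdrawn=0,
    sold=0,
    converted=0,
    persistent_id_version="v3",
    address_normalization_version="v7",
    reclassification_rule_version="v12",
)
```

Versioning the identifier, normalization, and rule logic per build is what lets a customer reproduce or audit any historical snapshot against the rules in effect at the time.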

This isn’t marketing language. The cadence is the moat. You cannot retroactively produce 600 weekly snapshots of a market that already happened. The data either exists or it doesn’t.

Enterprise customers can request build-level documentation under NDA for due diligence and model validation.

Proof 4

Coverage in numbers, not adjectives

Most Canadian property datasets say “national coverage” and leave it there. Here are the actual numbers from the most recent build:

Dataset                 | Count     | Update frequency        | History since
Residential real estate | 5,978,973 | Weekly                  | 2014
Rental properties       | 875,288   | Weekly (241 snapshots)  | July 2021
Commercial real estate  | 314,884   | Weekly                  | 2014
Sold events (matched)   | 899,189   | Weekly                  | 2014
Apartment units         | 969,130   | Weekly                  | 2014
Commercial dual-listed  | 10,941    | Weekly                  | 2014
Provinces covered weekly: AB, BC, MB, NB, NL, NS, NT, NU, ON, PE, QC, SK, YT. That is all 10 provinces, plus the three territories where MLS coverage exists.
Proof 5

Data lineage by design, not afterthought

Every record BrightCat publishes carries provenance metadata: the weekly build it came from, the source files that contributed, the address normalization version applied, the persistent identifier version, and the reclassification rules in effect.
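A record with its provenance inline might look like the following. This is a hypothetical shape under assumed field names, meant only to show lineage traveling with the row rather than in separate documentation.

```python
# Illustrative record shape; all field names and values are assumptions.
record = {
    "persistent_id": "8f3a1c42d9e07b65",
    "status": "RELISTED",
    "list_price": 899_000,
    "lineage": {
        "build": "2024-W14",
        "source_files": ["feed_2024-04-07.csv"],
        "address_normalization_version": "v7",
        "persistent_id_version": "v3",
        "reclassification_rule_version": "v12",
    },
}

REQUIRED_LINEAGE = {
    "build", "source_files", "address_normalization_version",
    "persistent_id_version", "reclassification_rule_version",
}

def lineage_complete(rec: dict) -> bool:
    """True when every provenance field is present on the record."""
    return REQUIRED_LINEAGE <= set(rec.get("lineage", {}))
```

A completeness check like this is the kind of gate a buyer's model-risk or governance pipeline can run against every delivered row.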

AI agents using the BrightCat MCP connector can retrieve lineage metadata alongside the data itself. This matters because grounding LLM outputs requires knowing not just what the data says, but where it came from and how it was processed.

For traditional enterprise integration, the lineage fields are surfaced in Snowflake Marketplace, the API, and flat file delivery — not buried in a separate documentation page.

Buyers running model risk reviews, regulatory data lineage assessments, or AI governance audits work with the lineage layer directly.

See it for yourself

Methodology in writing. Data in Snowflake.

Read the methodology page in full, or start with a free sample evaluation on Snowflake Marketplace.

Read the methodology
Sample on Snowflake