Canadian Sold Data: Ground Truth of the Real Estate Market

Every property valuation model, every market index, every pricing estimate ultimately depends on one thing: what someone actually paid. Not what the listing price was. Not what a model estimated. Not what a comparable sale suggested. The confirmed transaction price. That is the ground truth.

In data terms, sold data represents confirmed property transaction prices — the factual record of what the market actually paid, linked to the full listing lifecycle that preceded the sale.

The difference between estimates and transactions

The Canadian property data market is full of estimates. Automated valuation models estimate what a property is worth. Assessment authorities estimate taxable value. Brokerages estimate market value using comparable sales. Indices estimate aggregate price movements by smoothing individual transactions into regional trends.

Estimates serve a purpose. They give you a directional view when a transaction has not yet occurred. But they are not facts. They are opinions derived from models, and models carry assumptions that may not hold in changing markets.

A sold price is not an opinion. It is the amount a buyer agreed to pay and a seller agreed to accept, verified through a legal transfer. It incorporates everything the models try to capture — condition, location, timing, negotiation dynamics — and resolves it into a single number. That number is the market speaking.

When an insurer needs to know what a property is worth for coverage purposes, the most reliable answer is the most recent sale price. When a lender needs to assess collateral, the most recent transaction is the anchor. When an investor needs to evaluate a portfolio, actual transaction prices are the baseline. Everything else is an approximation.

Why MLS numbers break property history

Most real estate data systems use MLS numbers as identifiers. This seems logical — every listing has a unique MLS number, so tracking properties by MLS number should work. Except it doesn't, because MLS numbers change every time a property relists.

If a property lists in January and doesn't sell, gets pulled in April, and relists in September with a new agent, it gets a new MLS number. The two listings are the same property, but they look like different properties in MLS-based systems. Any price change, any time-on-market calculation, any lifecycle analysis that spans both listing periods is broken.
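
To make the relisting problem concrete, here is a minimal sketch in Python. The row layout, the MLS numbers, the dates, and the property_id field are all hypothetical; property_id simply stands in for a stable identity and says nothing about how BrightCat actually builds one.

```python
from collections import defaultdict
from datetime import date

# Hypothetical listing rows: (property_id, mls_number, listed_on, ended_on).
# property_id stands in for a stable identity; how it is built is not shown here.
listings = [
    ("prop-1", "MLS-A100", date(2024, 1, 10), date(2024, 4, 10)),
    ("prop-1", "MLS-B200", date(2024, 9, 1), date(2024, 10, 1)),
]

def days_on_market_by(rows, key_index):
    """Sum listing-period lengths per key (0 = property_id, 1 = mls_number)."""
    totals = defaultdict(int)
    for row in rows:
        listed_on, ended_on = row[2], row[3]
        totals[row[key_index]] += (ended_on - listed_on).days
    return dict(totals)

# Keyed by MLS number, the two listing periods look like two unrelated listings.
by_mls = days_on_market_by(listings, 1)

# Keyed by property, the same rows add up to one cumulative time on market.
by_property = days_on_market_by(listings, 0)
```

Keyed by MLS number, the second listing looks like a fresh 30-day listing. Keyed by property, the same rows show 121 cumulative days on market.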

This problem compounds when you try to connect listings to sales. A sold record carries whichever MLS number was active at the time of sale. If the property relisted before it sold, that number points only to the final listing, and the earlier listing periods, with their price changes and days on market, are cut off from the sale.

BrightCat does not use MLS numbers as property identifiers. Instead, we match on a stable property identity that does not change when a property relists, changes agents, or moves between brokerages. One property. One identity. Full transaction history preserved.

What 899K+ sold records reveal

BrightCat has matched 899K+ sold events to their full listing lifecycles across Canada. Each record connects the transaction price to the property's complete history: when it first appeared, every time it was listed, every price change along the way, the final days on market, and the difference between listing price and sold price.

That connection is where the intelligence lives. A property that listed at $650,000 and sold at $620,000 after 90 days and two price drops tells a very different story than one that listed at $500,000 and sold at $530,000 in 14 days. The sold price alone doesn't capture the journey. The lifecycle-linked sold record does.
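
The two examples above can be compared with a few lines of Python. This is an illustrative sketch: the SoldRecord fields are hypothetical names, not BrightCat's schema, and the zero price drops on the second property is an assumption added for the example.

```python
from dataclasses import dataclass

@dataclass
class SoldRecord:
    # Hypothetical field names for a lifecycle-linked sold record.
    original_list_price: float
    sold_price: float
    days_on_market: int
    price_drops: int

def sale_to_list_ratio(record: SoldRecord) -> float:
    """Sold price as a fraction of the original listing price."""
    return record.sold_price / record.original_list_price

slow_sale = SoldRecord(650_000, 620_000, days_on_market=90, price_drops=2)
fast_sale = SoldRecord(500_000, 530_000, days_on_market=14, price_drops=0)

slow_ratio = sale_to_list_ratio(slow_sale)  # below 1.0: sold under asking
fast_ratio = sale_to_list_ratio(fast_sale)  # above 1.0: sold over asking
```

The first property sold at roughly 95% of its asking price after a long listing period. The second sold at 106% in two weeks. Neither figure can be computed from the sold price alone.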

A substantial share of properties in BrightCat's coverage have sold more than once, which yields repeat-sale data: the same property sold at two different points in time. Among those, 194K+ pairs have verified dual prices, meaning a confirmed earlier sale price and a confirmed later sale price for the same physical property.
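
As a rough illustration of how repeat-sale pairs can be assembled from sold events, the sketch below groups sales by a stable property identifier and pairs consecutive sales. The tuple layout is hypothetical, and treating a missing price as "unverified" is an assumption made for the example, not a description of BrightCat's verification process.

```python
from collections import defaultdict
from datetime import date

def repeat_sale_pairs(sold_events):
    """Pair consecutive priced sales of the same property, oldest first.

    sold_events: iterable of (property_id, sold_on, price), where price is
    None when no confirmed price is available (an assumption of this sketch).
    """
    by_property = defaultdict(list)
    for property_id, sold_on, price in sold_events:
        if price is not None:  # keep only sales with a confirmed price
            by_property[property_id].append((sold_on, price))

    pairs = []
    for property_id, sales in by_property.items():
        sales.sort(key=lambda sale: sale[0])  # order by sale date
        for (first_on, first_price), (second_on, second_price) in zip(sales, sales[1:]):
            pairs.append((property_id, first_on, first_price, second_on, second_price))
    return pairs

example_events = [
    ("prop-1", date(2015, 5, 1), 400_000),
    ("prop-1", date(2020, 5, 1), 500_000),
    ("prop-2", date(2018, 1, 1), 300_000),  # sold once: no pair
    ("prop-1", date(2023, 5, 1), None),     # no confirmed price: skipped
]
example_pairs = repeat_sale_pairs(example_events)
```

Only prop-1 produces a pair here, because it is the only property with two confirmed prices.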

How repeat-sale pairs build a home price index

A repeat-sale home price index (HPI) measures property appreciation by comparing the same property's sale price over time. This removes the composition bias that plagues median-price indices — where a shift in the type of properties selling (more condos vs houses, for example) can move the median even if no individual property changed in value.
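
Composition bias is easy to reproduce with made-up numbers. In the sketch below, the prices are hypothetical and no individual property changes in value. Only the mix of what sells changes between the two periods.

```python
from statistics import median

# Hypothetical prices; no individual property changes in value between periods.
HOUSE_PRICE = 800_000
CONDO_PRICE = 400_000

period_1_sales = [HOUSE_PRICE] * 3 + [CONDO_PRICE] * 2  # mostly houses sell
period_2_sales = [HOUSE_PRICE] * 2 + [CONDO_PRICE] * 3  # mostly condos sell

# The median tracks the mix of sales, so it drops by half.
median_change = median(period_2_sales) / median(period_1_sales) - 1

# A repeat-sale view compares each property with itself, so the mix cannot move it.
repeat_sale_ratios = [HOUSE_PRICE / HOUSE_PRICE, CONDO_PRICE / CONDO_PRICE]
```

The median falls 50% while every repeat-sale ratio stays at 1.0, which is the distortion a repeat-sale index is designed to avoid.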

BrightCat's HPI is built from verified repeat-sale transaction pairs. Each pair represents a property that sold at a verified price, and then sold again at a later verified price. The price difference, adjusted for the time interval, measures actual property-level appreciation.
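
One common way to adjust a price change for the time interval is to annualize the price ratio. The sketch below shows that arithmetic for a single pair. It is an assumption about how the adjustment could be done, not BrightCat's published method, and a full index would still need to aggregate many pairs, a step not covered here.

```python
from datetime import date

def annualized_appreciation(first_price, first_date, second_price, second_date):
    """Compound annual rate implied by two sale prices of the same property."""
    years = (second_date - first_date).days / 365.25
    if years <= 0 or first_price <= 0:
        raise ValueError("need a positive first price and a later second sale")
    return (second_price / first_price) ** (1 / years) - 1

# Hypothetical pair: sold for $400,000, then for $500,000 five years later.
rate = annualized_appreciation(400_000, date(2015, 5, 1), 500_000, date(2020, 5, 1))
```

For this hypothetical pair, a 25% total gain over five years works out to roughly 4.6% per year.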

This is not a model. It is not an estimate. It is arithmetic applied to confirmed transactions. The only assumption is that the property itself is substantially the same between sales — which is why the HPI methodology also incorporates renovation signal detection at the property level.

Most publicly available HPIs in Canada are either assessment-based (CMHC) or sales-mix adjusted (CREA). Both have value. But neither measures the same thing as a property-level repeat-sale index built from confirmed transactions.

The sale-to-rent signal

When a property sells and then appears as a rental listing shortly afterward, that strongly indicates an investor purchase. The signal is not a guess or a model output. The property was bought, and then it was offered for rent. Both events are observed in the data.

This signal matters for insurance underwriters who need to distinguish owner-occupied from investor-owned properties, because the risk profiles are different. It matters for lenders who want to understand whether borrowers are buying primary residences or investment properties. And it matters for anyone tracking institutional capital flowing into the residential housing market.

Cross-referencing sold data against rental listings requires both datasets at the property level, matched on address. If your sold data and your rental data use different identifiers — which they do in most systems — you cannot make this connection. BrightCat's proprietary property-level matching makes it possible.
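
Once both datasets share a property-level identifier, the cross-reference itself is simple. The sketch below is illustrative only: the tuple layouts are hypothetical, and the 180-day window is an assumed definition of "shortly afterward", not a BrightCat parameter.

```python
from datetime import date, timedelta

def sale_to_rent_matches(sold_events, rental_listings, window_days=180):
    """Find rental listings that appear within window_days after a sale.

    sold_events: iterable of (property_id, sold_on)
    rental_listings: iterable of (property_id, listed_for_rent_on)
    Returns a list of (property_id, sold_on, listed_for_rent_on).
    """
    window = timedelta(days=window_days)
    sales_by_property = {}
    for property_id, sold_on in sold_events:
        sales_by_property.setdefault(property_id, []).append(sold_on)

    matches = []
    for property_id, listed_for_rent_on in rental_listings:
        for sold_on in sales_by_property.get(property_id, []):
            # The rental listing must come on or after the sale, inside the window.
            if timedelta(0) <= listed_for_rent_on - sold_on <= window:
                matches.append((property_id, sold_on, listed_for_rent_on))
    return matches

example_sales = [("prop-1", date(2024, 3, 1)), ("prop-2", date(2024, 3, 1))]
example_rentals = [
    ("prop-1", date(2024, 4, 15)),  # 45 days after the sale: match
    ("prop-2", date(2025, 6, 1)),   # more than a year later: outside the window
    ("prop-3", date(2024, 4, 1)),   # never sold in this data: no match
]
example_matches = sale_to_rent_matches(example_sales, example_rentals)
```

The lookup only works because both inputs carry the same property_id. With mismatched identifiers, the dictionary lookup would find nothing.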

Why backward-looking data matters for forward decisions

Sold data is backward-looking by nature. A transaction is recorded after it happens. But the patterns within sold data are predictive.

When the gap between listing prices and sold prices widens across a market, that signals pricing pressure — sellers are asking more than buyers will pay. When the volume of repeat sales increases, that may signal speculative activity. When properties are selling below their previous sale price for the first time in years, that is a correction forming in real time.
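
One of these patterns, resales below the previous sale price, can be tracked as a yearly share of repeat-sale pairs. The sketch below uses hypothetical tuples and made-up prices purely to show the calculation.

```python
from collections import defaultdict

def share_sold_below_previous(pairs):
    """Share of resales per year that closed below the property's previous sale price.

    pairs: iterable of (resale_year, previous_price, resale_price)
    Returns {year: share}, with years in ascending order.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [below_count, total_count]
    for resale_year, previous_price, resale_price in pairs:
        counts[resale_year][1] += 1
        if resale_price < previous_price:
            counts[resale_year][0] += 1
    return {year: below / total for year, (below, total) in sorted(counts.items())}

# Hypothetical pairs for illustration.
example_shares = share_sold_below_previous([
    (2021, 500_000, 600_000),
    (2021, 500_000, 550_000),
    (2022, 600_000, 580_000),  # resold below its previous price
    (2022, 500_000, 520_000),
])
```

A share that rises from zero after years of gains is the kind of shift the paragraph above describes. Seeing it requires both the earlier and the later sale for each property.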

These patterns are only visible with longitudinal data — sold records stretching back years, linked to listing history, at the property level. A point-in-time snapshot of recent sales tells you what happened last month. A decade of property-level transaction data tells you what is happening to the market.

Sold data is the ground truth because it is the only data source that records what actually happened. Everything else — listings, assessments, models, indices — is either a leading indicator of what might happen or a lagging interpretation of what did. The transaction price is the fact.

Models estimate. Indices smooth. Reports lag. Transactions don't. Sold data is the factual foundation of every property decision — and the only source that tells you what the market actually paid.

Derived from BrightCat Sold data · 750K+ properties with sold history
