Canadian property data is not a single thing. It is five distinct categories of information, sourced differently, structured differently, and licensed under different terms. Enterprise teams shopping for property data for the first time tend to assume there is one market; there are several, and picking the wrong one wastes procurement cycles. This is a guide to the categories, what each is good for, and how they are delivered.
Roughly 475,000 residential properties change hands each year through the cooperative listing systems run by Canadian real estate boards. The Canadian Real Estate Association, which compiles national figures, reported a national average sale price near the $675,000 mark in early 2026, with active inventory running below long-term averages for most of the past year. Those two numbers, transaction volume and average price, anchor every downstream property data product sold into the Canadian market.
Transaction activity is heavily concentrated by province. Ontario and British Columbia account for the largest share of national volume, with Alberta and Quebec behind them. The Canada Mortgage and Housing Corporation estimates a national housing supply shortfall of roughly 3.45 million units by 2030, concentrated in Ontario, Quebec, and British Columbia. These fundamentals shape why different property data categories exist and which ones are in demand.
Every property data product in Canada fits into one of five categories. The categories are not interchangeable. Each was built for a specific use case, and that original use case still shapes how the data is structured today.
Provincial land-registry systems record the legal status of each property: who owns it, what liens or charges are registered against it, what the legal description is. These records exist because ownership has to be documented before a property can be pledged as collateral, inherited, or transferred.
Registry data is authoritative for its own purpose. If the question is who legally owns a property right now, registry records are the answer. The limits of registry data become obvious when the question is anything else. Prices recorded in registries are the consideration at closing, which may or may not match the actual sale price; transfer data lags the market by weeks or months; there is no standard schema across provinces. For use cases like valuation, market analysis, or behavioural targeting, registry data is a necessary reference but not a sufficient source.
Registry data is typically accessed through provincial portals, title-insurance companies, or specialised aggregators. Licensing is usually per-query or per-record.
Appraisal data is produced by licensed appraisers for specific properties at specific moments, usually to support a mortgage origination, a legal proceeding, or a property sale. Assessment data is produced by municipal or provincial assessment authorities to establish the assessed values on which property taxes are calculated, typically on a multi-year revaluation cycle.
Both are valuation data, and both are point-in-time. An appraisal tells you what a single qualified professional estimated the property was worth on one date. An assessment tells you what an assessment authority decided the property was worth for tax purposes on one date. Neither refreshes at market speed; an assessment roll may be three or four years old, and an appraisal is current only on the day it was signed.
Appraisal data is primarily used by lenders and is rarely licensed outside that channel. Assessment data is publicly available in most provinces and is a common input into property databases, though coverage varies. For enterprise use cases requiring current valuation, appraisal and assessment data are inputs, not primary sources.
Multiple listing systems are the cooperative marketing platforms run by Canadian real estate boards. When a property is listed for sale, the listing enters an MLS; when it sells, the transaction is recorded in the MLS. This is the primary data source for most active-market questions: what is for sale right now, what recently sold, at what price.
MLS data is current, detailed, and structured. It is also fragmented. There are dozens of boards across Canada, each with its own schema, its own rules, and its own licensing terms. Accessing MLS data at scale involves negotiating with multiple boards, accepting different data formats, and reconciling field definitions that do not always align.
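The reconciliation work described above can be sketched as a per-board mapping onto a shared schema. This is a minimal illustration only: the board names, field names, and status codes below are hypothetical and do not reflect any real board's feed.

```python
from typing import Any

# Hypothetical native-field -> common-field maps, one per board.
FIELD_MAPS: dict[str, dict[str, str]] = {
    "board_a": {"ListPrice": "list_price", "Addr": "address", "Stat": "status"},
    "board_b": {
        "asking_price": "list_price",
        "street_address": "address",
        "listing_status": "status",
    },
}

# Hypothetical status vocabularies, mapped to shared values.
STATUS_MAPS: dict[str, dict[str, str]] = {
    "board_a": {"A": "active", "S": "sold"},
    "board_b": {"Active": "active", "Closed": "sold"},
}


def normalize(board: str, record: dict[str, Any]) -> dict[str, Any]:
    """Rename a board's native fields and statuses to the common schema."""
    field_map = FIELD_MAPS[board]
    out = {common: record[native] for native, common in field_map.items() if native in record}
    if "status" in out:
        # Unmapped codes are kept visible as "unknown" rather than guessed.
        out["status"] = STATUS_MAPS[board].get(out["status"], "unknown")
    out["source_board"] = board
    return out


rec_a = normalize("board_a", {"ListPrice": 650000, "Addr": "12 Elm St", "Stat": "S"})
rec_b = normalize(
    "board_b",
    {"asking_price": 650000, "street_address": "12 Elm St", "listing_status": "Closed"},
)
```

Keeping the mappings as data rather than code means a new board can be added by extending the two dictionaries, though real feeds would also need unit, date, and address normalisation that this sketch leaves out.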
The deeper limit of MLS data is temporal. A new MLS number is typically issued each time a property relists, which means a single property that sold, came back on the market, sold again, and was then leased out may appear as three or four unrelated records in an MLS-native system. Reconstructing a property's history across these events requires joining on something more stable than the listing number.
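As a rough sketch of what joining on something more stable looks like, the example below groups events by a `property_id` field instead of the MLS number. The record layout and the identifier values are assumptions made for illustration, not an actual MLS or BrightCat schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical events: the MLS number changes across listings,
# while property_id stays the same for the same physical property.
records = [
    {"mls_number": "X100", "property_id": "P-1", "event": "sold", "date": date(2019, 5, 1)},
    {"mls_number": "X245", "property_id": "P-1", "event": "listed", "date": date(2022, 3, 10)},
    {"mls_number": "X245", "property_id": "P-1", "event": "sold", "date": date(2022, 4, 2)},
    {"mls_number": "R778", "property_id": "P-1", "event": "leased", "date": date(2022, 6, 15)},
    {"mls_number": "X300", "property_id": "P-2", "event": "sold", "date": date(2021, 1, 20)},
]


def history_by_property(rows):
    """Group events by the stable identifier and order each group by date."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["property_id"]].append(row)
    for events in grouped.values():
        events.sort(key=lambda r: r["date"])
    return dict(grouped)


histories = history_by_property(records)
```

Grouped by MLS number, property `P-1` would look like three unrelated listings; grouped by the stable identifier, it reads as one four-event history.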
A repeat-sale series is a dataset of property pairs: two verified sales of the same property at different points in time. Repeat-sale methodology is the standard input for serious home-price indices and automated valuation models because comparing a property with its own earlier sale holds its attributes largely constant, which separates genuine price change from shifts in the mix of properties that happen to sell in a given period.
Building repeat-sale pairs well is harder than it sounds. Each pair requires two confirmed sale prices, two confirmed sale dates, and a stable link between the two records that survives relisting, address variation, and MLS number reassignment. That stable link is what the industry calls a persistent property identifier. Without one, most properties that sold more than once never get paired.
BrightCat's Canadian Home Price Index dataset contains 194,167 verified repeat-sale pairs across all ten provinces, drawn from sale events reconciled through BrightCat's own pipeline since 2014. Each pair satisfies four conditions: both sales are of the same property, linked by a persistent property identifier; both sale prices are verified; both transaction dates are confirmed; and a minimum ninety-day gap separates the two sales. The AVM training data page covers the methodology in more depth.
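The four conditions can be expressed as a small filter-and-pair routine. The sketch below is an illustration under stated assumptions, not BrightCat's pipeline: the `Sale` fields are hypothetical, only consecutive sales of a property are paired, and "minimum ninety-day gap" is read as 90 days or more.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby
from math import log

MIN_GAP_DAYS = 90  # the minimum gap named in the text, read here as >= 90 days


@dataclass(frozen=True)
class Sale:
    property_id: str      # persistent property identifier (hypothetical field)
    sale_date: date
    price: float
    price_verified: bool  # hypothetical verification flags
    date_confirmed: bool


def repeat_sale_pairs(sales):
    """Pair consecutive verified sales of the same property at least 90 days apart."""
    usable = [s for s in sales if s.price_verified and s.date_confirmed and s.price > 0]
    usable.sort(key=lambda s: (s.property_id, s.sale_date))
    pairs = []
    for _, group in groupby(usable, key=lambda s: s.property_id):
        ordered = list(group)
        for first, second in zip(ordered, ordered[1:]):
            if (second.sale_date - first.sale_date).days >= MIN_GAP_DAYS:
                pairs.append((first, second))
    return pairs


def log_price_change(pair):
    """Log price relative between the two sales of a pair."""
    first, second = pair
    return log(second.price / first.price)


sales = [
    Sale("P-1", date(2016, 6, 1), 500_000, True, True),
    Sale("P-1", date(2021, 6, 1), 700_000, True, True),
    Sale("P-2", date(2020, 1, 10), 400_000, True, True),
    Sale("P-2", date(2020, 2, 20), 410_000, True, True),   # only 41 days later
    Sale("P-3", date(2018, 3, 5), 300_000, False, True),   # price not verified
    Sale("P-3", date(2022, 3, 5), 390_000, True, True),
]
pairs = repeat_sale_pairs(sales)
```

In this toy input only `P-1` yields a pair: `P-2` fails the gap condition and `P-3` has just one usable sale. The log price relative is a common starting quantity for repeat-sale index regressions, which this sketch does not attempt.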
Listing-lifecycle data is the newest category and the one most closely aligned with analytics and AI use cases. A lifecycle record is not just the sale or the current listing; it is the full sequence of events a property goes through in the market: the original listing, every price change, every status transition, every relist, every drop, every completed sale. Lifecycle data captures market behaviour rather than market snapshots.
The value of lifecycle data sits in the patterns between transactions. A property that listed at one price, dropped through three reductions, delisted, waited six months, relisted at a lower price, and eventually sold is telling a different story than a property that sold cleanly on first exposure. MLS data, queried at the transaction layer, shows each as a single sale. Lifecycle data shows the difference.
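To make the contrast concrete, the sketch below summarises the two stories from the paragraph above. The event types, field names, prices, and dates are all hypothetical, chosen only to mirror the narrative.

```python
from datetime import date

ASKING_EVENTS = {"listed", "price_change", "relisted"}  # events that carry an asking price


def summarize_lifecycle(events):
    """Reduce one property's event sequence to a few behavioural features."""
    events = sorted(events, key=lambda e: e["date"])
    summary = {
        "price_reductions": 0,
        "relists": 0,
        "days_to_sale": None,
        "sale_to_original_list": None,
    }
    first = None        # first event that carries an asking price
    last_asking = None  # most recent asking price seen
    for e in events:
        if e["type"] in ASKING_EVENTS:
            if first is None:
                first = e
            is_cut = last_asking is not None and e["price"] < last_asking
            if e["type"] == "price_change" and is_cut:
                summary["price_reductions"] += 1
            if e["type"] == "relisted":
                summary["relists"] += 1
            last_asking = e["price"]
        elif e["type"] == "sold" and first is not None:
            summary["days_to_sale"] = (e["date"] - first["date"]).days
            summary["sale_to_original_list"] = e["price"] / first["price"]
    return summary


messy = summarize_lifecycle([
    {"date": date(2021, 1, 10), "type": "listed", "price": 800_000},
    {"date": date(2021, 2, 1), "type": "price_change", "price": 780_000},
    {"date": date(2021, 2, 20), "type": "price_change", "price": 760_000},
    {"date": date(2021, 3, 15), "type": "price_change", "price": 740_000},
    {"date": date(2021, 4, 1), "type": "delisted"},
    {"date": date(2021, 10, 1), "type": "relisted", "price": 720_000},
    {"date": date(2021, 11, 5), "type": "sold", "price": 705_000},
])
clean = summarize_lifecycle([
    {"date": date(2021, 5, 1), "type": "listed", "price": 800_000},
    {"date": date(2021, 5, 9), "type": "sold", "price": 805_000},
])
```

Both sequences end in one sale, but the first carries three reductions, a relist, and a long exposure period, which are the features a transaction-only view drops.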
Lifecycle data also enables cross-track signals. A property that sells and then appears as a rental listing at the same address within a short window is almost certainly an investment property, not an owner-occupier transaction. That signal requires joining sale events and rental events through a persistent property identifier, with both tracks captured weekly or more frequently. It is not visible in any single-track data source.
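A minimal sketch of the sale-to-rent join follows. The text does not define the "short window", so the 180-day default here is an assumption, as are the field names and sample records.

```python
from datetime import date, timedelta


def sale_to_rent_flags(sales, rentals, window_days=180):
    """Flag sales followed by a rental listing of the same property within the window.

    window_days=180 is an illustrative default, not a figure from the source.
    """
    rentals_by_id = {}
    for r in rentals:
        rentals_by_id.setdefault(r["property_id"], []).append(r["listed_on"])
    window = timedelta(days=window_days)
    flagged = []
    for s in sales:
        for listed_on in sorted(rentals_by_id.get(s["property_id"], [])):
            # The rental must appear on or after the sale date and inside the window.
            if timedelta(0) <= listed_on - s["sold_on"] <= window:
                flagged.append((s["property_id"], s["sold_on"], listed_on))
                break
    return flagged


sales_events = [
    {"property_id": "P-1", "sold_on": date(2022, 4, 2)},
    {"property_id": "P-2", "sold_on": date(2021, 1, 20)},
    {"property_id": "P-3", "sold_on": date(2023, 7, 1)},
]
rental_events = [
    {"property_id": "P-1", "listed_on": date(2022, 5, 15)},  # 43 days after the sale
    {"property_id": "P-2", "listed_on": date(2022, 6, 1)},   # well outside the window
]
flags = sale_to_rent_flags(sales_events, rental_events)
```

The join key is the persistent identifier rather than the address string, since address formatting often differs between sale and rental listings.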
BrightCat's pipeline operates in the lifecycle category. It covers 5.8 million residential properties and 297,000 commercial properties, with weekly capture across all ten provinces since 2014. The methodology page describes how the pipeline is assembled.
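Use-case mapping to category is straightforward once the categories are clear: questions of legal ownership and registered charges map to registry data; point-in-time valuation for lending or taxation maps to appraisal and assessment data; active listings and recent sales map to MLS data; home-price indices and AVM training map to repeat-sale series; and behavioural analytics, AI use cases, and cross-track signals map to listing-lifecycle data.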
Most enterprise deployments combine at least two categories. An insurance carrier running a retention model on a policy book might combine registry data (to confirm ownership), MLS transaction records (to flag recent sales), and lifecycle data (to identify sale-to-rent conversions that change the underwriting profile). A bank running collateral monitoring might combine appraisal data, assessment data, and lifecycle signals for the same reason.
Delivery architecture has changed faster than the data categories themselves. Traditional property data arrived as nightly FTP files or monthly CSV extracts. Current enterprise delivery usually means one of three patterns: a cloud data-warehouse share such as a Snowflake Marketplace listing, an MCP connector, or a developer API.
The underlying data is the same regardless of channel. BrightCat ships all three patterns from the same weekly pipeline. The Snowflake Marketplace, MCP Connector, and Developer API pages cover the specifics.
Tell us what you are building. We will match the right dataset.