Why the Best Real Estate Analytics Run on Quality Data

Published by Jordan Girard on October 18th, 2020

Real estate analytics platforms need quality data to provide meaningful results. Here's how quality data can enhance your investment decisions.

With AI-based projects, you'll often hear the term "big data," as if the amount of data were the only thing that matters. More information generally helps and often correlates with better results, but the quality of the data is usually a far better predictor of a real estate analytics platform's success than its sheer volume.

Intuitively, this makes sense: lousy data leads to bad outcomes. However, the notion of quality has many nuances, and we at Whiterock deal with them regularly. Here's what you need to know about quality data, including what the term means and how it powers the best AI real estate analytics platforms today!

Real Estate Analytics: How Does Quality Data Impact Results?

When people think of data quality, they often think of self-evident errors. For example, if a real estate analytics platform lists comparable rents at $10,000 per month instead of $1,000 per month, it will return incorrect results. It would assume that you'll likely collect $10,000 in rent, and that assumption throws off every subsequent calculation. Your IRR (Internal Rate of Return) will be wrong, your profit numbers will be off, and so on.
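To see how far a single bad rent figure travels, here is a minimal sketch in Python. Everything beyond the two rent figures is an assumption made for illustration: a $200,000 purchase price, a five-year hold, a sale at the purchase price, and no expenses. The helper names (`npv`, `irr`, `deal_cash_flows`) are hypothetical, and the IRR is found with a simple bisection search rather than any particular platform's method.

```python
def npv(rate, cash_flows):
    """Net present value of yearly cash flows; the first one occurs at time zero."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """Find the rate where NPV crosses zero by bisection.

    Assumes one upfront outflow followed by inflows, so NPV falls as the rate rises.
    """
    for _ in range(200):
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid   # rate is too low, NPV still positive
        else:
            high = mid  # rate is too high, NPV is zero or negative
        if high - low < tol:
            break
    return (low + high) / 2


def deal_cash_flows(monthly_rent, price=200_000, years=5):
    """Hypothetical deal: buy at `price`, collect rent yearly, sell at `price`."""
    flows = [-price] + [12 * monthly_rent] * years
    flows[-1] += price  # sale proceeds arrive with the final year's rent
    return flows


correct = irr(deal_cash_flows(1_000))   # the real comparable rent
wrong = irr(deal_cash_flows(10_000))    # the same rent with the extra zero

print(f"IRR with correct rent: {correct:.1%}")  # 6.0%
print(f"IRR with bad rent:     {wrong:.1%}")    # 60.0%
```

Under these assumptions, one extra zero in the rent turns a 6% deal into an apparent 60% deal, which is the kind of distortion that carries through to every downstream number.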

There was a humorous example of this type of poor-quality data in Flight Simulator, a recently released Microsoft game. Users flying around Melbourne, Australia noticed an impossibly tall and narrow skyscraper in the middle of a suburb of houses. It turns out that OpenStreetMap listed this one building as having 212 floors, which was a typo. The developers pulled the data while the typo was still in place, which led to this bizarre artifact in the game itself.

However, while these types of egregious data errors can happen, they're also usually self-evident and correctable. At Whiterock, we use multiple data sources, 50% public and 50% private, for cross-verification. The error above could have been mitigated by a second data source, which presumably wouldn't have contained the same typo.
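As a rough illustration of cross-verification, the Python sketch below compares the same field across two sources and flags records where the values disagree by more than a tolerance. It is not Whiterock's actual pipeline: the function name `find_disagreements`, the 25% tolerance, the building IDs, and the floor counts are all made up for the example.

```python
def find_disagreements(primary, secondary, tolerance=0.25):
    """Return records found in both sources whose values differ by more
    than `tolerance`, measured relative to the larger of the two values."""
    flagged = {}
    for key in primary.keys() & secondary.keys():
        a, b = primary[key], secondary[key]
        baseline = max(abs(a), abs(b))
        if baseline and abs(a - b) / baseline > tolerance:
            flagged[key] = (a, b)
    return flagged


# Hypothetical floor counts for the same buildings from two sources.
public_floors = {"bldg-001": 2, "bldg-002": 212, "bldg-003": 14}
private_floors = {"bldg-001": 2, "bldg-002": 2, "bldg-003": 15}

suspect = find_disagreements(public_floors, private_floors)
print(suspect)  # {'bldg-002': (212, 2)}
```

A relative tolerance lets small, plausible differences (14 versus 15 floors) pass while a typo-sized gap gets routed to review. The check only says the sources disagree; deciding which one is right still takes a third source or a human.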

Other Data Errors

Obvious data errors like the one above are not the only type of data problem that can arise. Data can be correct without being high-quality. The input data for a real estate analytics platform should help it address the questions that investors actually have.

Data with implicit biases, for example, can have real-world implications for investors. Fidelity published a fantastic report on behavioral biases in real estate investing and the problematic consequences they can have for investors. Investors turn to an analytics platform to avoid those implicit biases, so the data itself must be free of them.

Poor-quality data can also have omissions. It's easy to spot the 212-floor data error, but it would be much tougher to notice if that building didn't exist in the data at all. Again, cross-referencing and continuous validation can help real estate analytics platforms catch this kind of error.
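One simple way to cross-reference for omissions is to compare which record IDs each source contains. The Python sketch below assumes the sources share a common identifier, which real datasets often do not without a matching step first. The function name `find_omissions`, the source names, and the building IDs are hypothetical.

```python
def find_omissions(sources):
    """Map each source name to the record IDs that appear in at least one
    other source but are missing from that source."""
    all_ids = set().union(*sources.values())
    return {name: sorted(all_ids - ids) for name, ids in sources.items()}


# Hypothetical sets of building IDs known to each source.
sources = {
    "public": {"bldg-001", "bldg-003"},
    "private": {"bldg-001", "bldg-002", "bldg-003"},
}

missing = find_omissions(sources)
print(missing)  # {'public': ['bldg-002'], 'private': []}
```

A record missing from every source stays invisible to this check, which is one reason to rerun the comparison as sources are added and refreshed.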


Without quality data, a real estate analytics platform is not only ineffective; it's downright dangerous. Insufficient or bad-quality data can lead to catastrophic investment decisions because the calculations you see will be built on false inputs. You'll not only make bad investments, but you'll miss good opportunities as well.