
Don't Overpay For That House!

Buying a home is a huge investment, the obvious one being $$$, and the Seattle housing market has been on fire for the past 5 years. Let's see what Zillow has to say:

And that's based on what Zillow thinks homes are worth on a given day. Having gone through the home buying process recently, my husband and I submitted what we thought were strong offers...and then got turned down again and again. It turned into a weekly cycle of cautious optimism followed by disappointment, and it was during this time that our rational level-headedness got tangled with emotion. I mean, sure, you DO want to fall in love with what might become your future home, but you don't want to end up living off ramen packets to do so (right?). It's hard NOT to be emotional during this process, as you write your love letter to the seller telling them how you'll care for the house, integrate with the neighborhood, and create new memories in the home.

For those of you unfamiliar with the home buying process: part of the offer is, obviously, what you want to pay for the house, but there's also this thing called an escalation clause. Essentially, it is a backup number: the maximum price you are willing to pay for the seller's home. It's a tool a buyer can use when s/he anticipates multiple bids. In Seattle, in this market, yes, you will be using the escalation clause. I wonder how many prospective buyers write an emotionally fueled number on that line ("I love this house and need to have it" vs "I've spent the last 4 months' worth of free time looking at houses and can't take this anymore").
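To make the mechanics concrete, here is a minimal sketch of how an escalation clause might play out. It assumes the common setup where your offer beats a competing bid by a fixed increment, up to your maximum. The function name, the $5,000 increment, and all dollar amounts are made up for illustration; real clauses vary by contract.

```python
from typing import Optional


def escalated_offer(base_offer: int, cap: int,
                    competing_bid: Optional[int] = None,
                    increment: int = 5_000) -> int:
    """Return the price an offer escalates to under the assumed rules.

    base_offer: what you want to pay for the house.
    cap: the escalation clause number, i.e. the most you are willing to pay.
    increment: hypothetical amount by which you beat a competing bid.
    """
    if cap < base_offer:
        raise ValueError("cap must be at least the base offer")
    # No competing bid, or one below the base offer: the base offer stands.
    if competing_bid is None or competing_bid < base_offer:
        return base_offer
    # Otherwise beat the competing bid by the increment, never exceeding the cap.
    return min(competing_bid + increment, cap)


print(escalated_offer(600_000, 650_000))                         # 600000
print(escalated_offer(600_000, 650_000, competing_bid=620_000))  # 625000
print(escalated_offer(600_000, 650_000, competing_bid=648_000))  # 650000
```

In the last call the competing bid plus the increment would exceed the cap, so the offer tops out at the cap, which is exactly why that number deserves a cool head.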

The escalation clause is what I had in mind when I started this project. Earlier in July, I used data scraped from Zillow to predict housing prices in part of North Seattle. Using BeautifulSoup4, I looked at homes sold in the last 3 months. I ended up with a small dataset, but it was a valuable exercise in moving from project inception to analysis and presentation within a 2-week period.

Pretty standard so far, I think. I lost ~15% of my data for various reasons; most of the time it was because the entry came back empty. Since I had the sold price and was predicting a house price from a number of standard features, this was a straightforward linear regression. The 4 features mentioned in the above slide predicted the sold price pretty well.
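For readers who want to see what a linear regression on a handful of features looks like, here is a minimal sketch using NumPy's least-squares solver. It is not the project's actual code. The homes are made up, the prices are generated from an invented linear rule so the fit can be checked, and `beds` and `baths` are placeholder features I'm assuming alongside square footage and year built.

```python
import numpy as np

# Made-up homes for illustration only, NOT the scraped Zillow data.
# Columns: sqft, beds, baths, year_built (beds and baths are placeholders).
features = np.array([
    [1200, 2, 1.0, 1948],
    [1850, 3, 2.0, 1962],
    [2400, 4, 2.5, 2005],
    [950, 2, 1.0, 1925],
    [3100, 5, 3.0, 1998],
    [1600, 3, 1.5, 1910],
    [2100, 4, 2.0, 2014],
], dtype=float)

# Prices come from an invented linear rule, so a correct fit recovers it.
true_intercept = 40_000.0
true_weights = np.array([250.0, 15_000.0, 30_000.0, 50.0])
sold_price = true_intercept + features @ true_weights


def fit_linear_regression(X, y):
    """Ordinary least squares; returns (intercept, weights)."""
    # Prepend a column of ones so the first coefficient is the intercept.
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef[0], coef[1:]


def predict(intercept, weights, X):
    """Predicted price for one home or for an array of homes."""
    return intercept + np.asarray(X, dtype=float) @ weights


intercept, weights = fit_linear_regression(features, sold_price)

# A hypothetical new listing: 2000 sqft, 3 beds, 2 baths, built in 1950.
new_home = [2000, 3, 2.0, 1950]
predicted_price = predict(intercept, weights, new_home)
print(f"Predicted sold price: ${predicted_price:,.0f}")
```

With real listings the prices would not sit exactly on a line, so you would also want to hold out some homes and check how far the predictions land from the actual sold prices.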

I also looked at each feature individually, and square footage correlated most strongly with sold price.

You can see that year built correlated weakly. Looking at the plot, it might be better to treat year built as part of a separate model, perhaps one looking at other period-specific home amenities, that can be combined with this one. I wonder if homes built between 1960 and 1990 were undesirable, or if the 3 months I looked at just happened not to include any sales of homes built in those years. And wow, one old home sold for a lot of money!
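The feature-by-feature comparison above can be sketched as a Pearson correlation of each column against sold price. This is only an illustration: the numbers below are invented to mimic the pattern described (strong for square footage, weak for year built), and the names `feature_columns` and `pearson_r` are my own, not from the project.

```python
import numpy as np

# Made-up numbers for illustration only, NOT the scraped Zillow data.
sold_price = np.array(
    [420_000, 510_000, 600_000, 690_000, 850_000, 1_050_000], dtype=float
)
feature_columns = {
    "sqft": np.array([950, 1200, 1600, 1850, 2400, 3100], dtype=float),
    "year_built": np.array([1925, 2005, 1910, 1998, 1948, 1962], dtype=float),
}


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
    return float(np.corrcoef(x, y)[0, 1])


correlations = {
    name: pearson_r(column, sold_price)
    for name, column in feature_columns.items()
}

# List features from strongest to weakest correlation with sold price.
for name, r in sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:>10}: r = {r:+.2f}")
```

A weak r only says there is no straight-line relationship; a gap in the data or a single expensive old home can still show up clearly in a scatter plot.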

If you would like to check out what I did, it lives on GitHub here.
