Date of Graduation

Fall 2025

Degree

Master of Science in Mathematics

Department

Mathematics

Committee Chair

Songfeng Zheng

Abstract

Cities keep their own kind of ledger. Every block, bus stop, corner store, and year that slips by leaves a small entry about what homes are worth. That ledger is what we call socio-spatial data: simple facts about what a home is (its age), where it sits (latitude/longitude), how easy it is to get around (distance to the nearest MRT station), what’s nearby (number of convenience stores), and when it sold (transaction date). This thesis asks a practical question in that everyday language: given these common clues, can we predict home prices more accurately and explain why? Using 414 recorded transactions, we compared three approaches that rise in complexity: Ordinary Least Squares (OLS), Ridge regression, and a compact Multi-Layer Perceptron (MLP). We trained with a 70–15–15 split and early stopping, then judged performance out of sample using MAE, RMSE, and R², along with visual checks (actual-vs-predicted plots, residual distributions) and a partial dependence view of how price changes with MRT distance. The results are consistent and practical. The MLP reduced typical error and cut large mistakes relative to the linear baselines: MAE 5.01 vs. 5.49 (OLS) and 5.41 (Ridge); RMSE 6.59 vs. 7.20 and 7.14; R² 0.740 vs. 0.689 and 0.694, meaning tighter price bands where it matters. The distance story is clear and human: prices fall steeply as you move from the station out to roughly 600–700 m, then flatten to a gentle decline beyond ~1.5 km, a familiar pattern of convenience paying a premium. Two lessons follow. First, much of housing value really is written in the map: accessibility and neighborhood context carry weight even in small feature sets. Second, a modest neural model, used responsibly and explained with simple graphics, can turn those everyday clues into more reliable, fewer-surprise predictions.
This balance of strong linear baselines for transparency, a compact MLP for accuracy, and clear diagnostics offers a practical blueprint for valuation teams, planners, and lenders who work with the city’s ledger every day.
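
As a rough illustration of the evaluation protocol described in the abstract, the sketch below builds a 70–15–15 split, fits the two linear baselines in closed form, and computes MAE, RMSE, and R² on the held-out test portion. It is not the thesis code: the data are synthetic placeholders, the helper names (`split_70_15_15`, `fit_ridge`, `metrics`) and the ridge penalty grid are assumptions made for illustration, and the MLP is omitted. The printed numbers will not match the results reported above.

```python
import numpy as np

def split_70_15_15(n, seed=0):
    """Shuffle row indices and cut them into train / validation / test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(0.70 * n)
    n_val = int(0.15 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def fit_ridge(X, y, alpha=0.0):
    """Closed-form ridge fit; alpha=0 reduces to OLS. The intercept is not penalized."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict(w, X):
    return np.column_stack([np.ones(len(X)), X]) @ w

def metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for a set of predictions."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mae, rmse, r2

# Synthetic stand-in for the 414 transactions (NOT the thesis data):
# three standardized features and a noisy linear price signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(414, 3))
y = 38.0 + X @ np.array([-5.0, 2.0, 1.0]) + rng.normal(scale=6.0, size=414)

train, val, test = split_70_15_15(len(y))

# OLS baseline.
w_ols = fit_ridge(X[train], y[train], alpha=0.0)

# Ridge baseline: choose alpha by validation RMSE from a small, arbitrary grid.
alphas = [0.1, 1.0, 10.0, 100.0]
best_alpha = min(
    alphas,
    key=lambda a: metrics(y[val], predict(fit_ridge(X[train], y[train], a), X[val]))[1],
)
w_ridge = fit_ridge(X[train], y[train], alpha=best_alpha)

for name, w in [("OLS", w_ols), ("Ridge", w_ridge)]:
    mae, rmse, r2 = metrics(y[test], predict(w, X[test]))
    print(f"{name}: MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.3f}")
```

The validation portion is used here only to pick the ridge penalty; in a setup like the one the abstract describes, it would also be the natural place to monitor early stopping for the MLP, with the test portion touched only once for the final comparison.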

Keywords

socio-spatial data, real-estate valuation, deep learning, ridge regression, partial dependence, transit accessibility

Subject Categories

Applied Mathematics | Categorical Data Analysis | Numerical Analysis and Computation | Special Functions | Statistics and Probability

Copyright

© Gentle Engworo

Open Access
