In urban railway systems, if every commuter were perfectly rational, demand would concentrate on the optimal routes and undermine system performance. Understanding the actual degree of determinism in collective route choice, and whether it remains stable over time, is therefore essential for transportation policy. Smartphone GPS data offers both the scale and the complete trajectory coverage that survey data and smart card data individually lack, yet its coarse spatial accuracy has limited its adoption for railway route identification. To overcome this limitation, we develop a methodology to identify railway commuting routes from one year of GPS data covering over one million daily users in the Tokyo metropolitan area. We apply a multinomial logit (MNL) framework, using a standardization convention to extract a comparative measure of collective determinism alongside relative attribute preferences. We find that collective route choices in Tokyo are strongly cost-sensitive but non-deterministic. Both the determinism measure and the attribute preferences remain stable across all twelve months of 2023, confirming that this degree of determinism is a structural feature of the commuting system. Leveraging the dataset’s scale, we further estimate parameters for each origin–destination (OD) pair individually, revealing systematic heterogeneity driven by departure time and transport complexity that smaller datasets treat as unobserved.



