Machine learning is increasingly used to rank catalysts and accelerate discovery, but in energy catalysis the practical value of a model is rarely determined by ranking alone. Activity, selectivity and stability emerge from working catalysts and interfaces that depend on potential, solvent, temperature, pressure, transport and degradation, and these working states can differ substantially from the nominal materials used to build models. In this Review, we examine when screening-based workflows remain sufficient and when machine learning adds value beyond catalyst screening. We organize the field around three linked shifts: from static descriptors to operando-relevant representations, from offline prediction to closed-loop discovery, and from single-objective activity optimization to sustainability-aware multi-objective optimization. Using CO2 electroreduction, acidic oxygen evolution and higher alcohol synthesis as case studies, we show how these shifts can change catalyst ranking, experimental priorities and validation at the device or reactor level. Finally, we outline practical criteria for data reporting, extrapolative testing, uncertainty quantification and validation on the platform that defines success.




