GeoAnalystBench: A GeoAI benchmark for assessing large language models for spatial analysis workflow and code generation
Published on arXiv; under review at Transactions in GIS, 2025
We introduce GeoAnalystBench, a benchmark of 50 Python-based geoprocessing tasks for evaluating large language models (LLMs) in geospatial analysis and GIS workflow automation. Our results reveal a significant performance gap between proprietary and open-source models, highlighting both the promise and current limitations of LLMs for GeoAI.