Hongkong Music Atlas
Explore Hong Kong’s cultural geography through Cantonese pop music
This project visualizes the world's endangered languages interactively.
A visualization tool for results from the GNN FoodFlow model.
Explore the changing dynamics of queer space in the USA from 2000 to 2019.
Published in 12th International Conference on Geographic Information Science (GIScience 2023), Leibniz International Proceedings in Informatics (LIPIcs), Volume 277, pp. 93:1-93:6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023
This paper investigates the ethics of using artificial intelligence (AI) in cartography, focusing on the generation of maps using DALL·E 2. We created an open-source dataset of synthetic (AI-generated) and real-world (human-designed) maps; examined four ethical concerns associated with DALL·E 2-generated maps (inaccuracies, misleading information, unanticipated features, and irreproducibility); and developed a deep learning-based model to identify AI-generated maps. Our work emphasizes the importance of ethical considerations in AI-driven cartography and aims to raise public awareness and support the development of ethical guidelines for AI-generated maps.
Published in Proceedings of the 6th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, 2023
We propose FLEE-GNN, a novel Federated Learning System for Edge-Enhanced Graph Neural Network, to analyze the geospatial resilience of multicommodity food flow networks. FLEE-GNN addresses challenges in generalizability, scalability, and data privacy, combining the strengths of graph neural networks and federated learning for robust, privacy-preserving analysis of food supply network resilience across regions.
Published in Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems, 2024
This study investigates the use of ChatGPT-4 to automate geospatial analysis workflows in GIS by generating ArcPy functions from structured instructions. The approach achieves an 80.5% task success rate, demonstrating its effectiveness and accessibility for domain scientists seeking to automate GIS workflows.
Published in ICLR 2025, 2025
We present ScienceAgentBench, a new benchmark for evaluating language agents for data-driven scientific discovery. ScienceAgentBench consists of 102 tasks extracted from 44 peer-reviewed publications across four disciplines, validated by nine subject matter experts. Each task requires generating a self-contained Python program, and is evaluated using multiple metrics on program correctness, execution, and cost. We assess five open-weight and proprietary LLMs with three frameworks, finding that the best-performing agent solves only 32.4% of tasks independently and 34.3% with expert knowledge. Our results highlight the need for rigorous, task-level assessment before making claims about end-to-end scientific automation.
Published in M.S. Thesis, University of Wisconsin–Madison, 2025
This thesis investigates the spatial and temporal distributions of urban coyotes in Los Angeles County by integrating citizen science data from iNaturalist with environmental, socioeconomic, and human mobility datasets. Using Random Forest, Geographically Weighted Regression, and structural equation modeling, the study reveals how ecological and anthropogenic factors, including real-time human mobility during the COVID-19 pandemic, shape coyote occurrence and visibility across neighborhoods.
Published in arXiv, under review for Transactions in GIS, 2025
We introduce GeoAnalystBench, a benchmark of 50 Python-based geoprocessing tasks for evaluating large language models (LLMs) in geospatial analysis and GIS workflow automation. Our results reveal a significant performance gap between proprietary and open-source models, highlighting both the promise and current limitations of LLMs for GeoAI.
Published in EMNLP 2025, 2025
We present AutoSDT, an automatic pipeline for collecting high-quality coding tasks in real-world data-driven scientific discovery workflows. AutoSDT-5K, the resulting dataset, contains 5,404 coding tasks across four scientific disciplines and 756 unique Python packages, enabling the training of LLM-based co-scientists. Models trained on AutoSDT-5K, dubbed AutoSDT-Coder, achieve state-of-the-art results on ScienceAgentBench and DiscoveryBench, closing the gap with proprietary models.
This talk explores the ethical considerations surrounding the use of AI-generated imagery in cartography, with a focus on DALL·E 2 and its implications for map-making. Presented at the 12th International Conference on GIScience in Leeds, UK (09/2023).
This talk presents novel approaches to improving wildlife classification accuracy in camera trap images by leveraging GIS-enhanced federated machine learning techniques. We discuss the integration of spatial data with distributed learning frameworks to address data privacy, heterogeneity, and scalability challenges in ecological monitoring.
This talk explores how ChatGPT-4 can be leveraged to automate and streamline geospatial analysis workflows. We discuss practical applications, integration strategies, and the potential for large language models to enhance productivity and reproducibility in spatial data science.
This talk presents a scalable approach for predicting inter-county food flows using Graph Neural Networks (GNNs). We discuss model architecture, data integration strategies, and the implications for food supply chain optimization at regional and national scales.