Information on soils’ composition and physical, chemical and biological properties is paramount to
elucidate agroecosystem functioning in space and over time. For this purpose, we developed a national Swiss
soil spectral library (SSL; n = 4374) in the mid-infrared (mid-IR), calibrating 16 properties from legacy measurements
on soils from the Swiss Biodiversity Monitoring program (BDM; n = 3778; 1094 sites) and the Swiss
long-term Soil Monitoring Network (NABO; n = 596; 71 sites). General models were trained with the interpretable
rule-based learner CUBIST, testing combinations of {5, 10, 20, 50, and 100} ensembles of rules (committees)
and {2, 5, 7, and 9} nearest neighbors used for local averaging with repeated 10-fold cross-validation
grouped by location. To evaluate the information in spectra to facilitate long-term soil monitoring at a plot level,
we conducted 71 model transfers for the NABO sites to induce locally relevant information from the SSL, using
the data-driven sample selection method RS-LOCAL. In total, 10 soil properties were estimated with discrimination
capacity suitable for screening (R² ≥ 0.72; ratio of performance to interquartile distance (RPIQ) ≥ 2.0), out
of which total carbon (C), organic C (OC), total nitrogen (N), pH and clay showed accuracy eligible for accurate
diagnostics (R² > 0.8; RPIQ ≥ 3.0). CUBIST and the spectra estimated total C accurately with the root mean
square error (RMSE) = 8.4 g kg⁻¹ and the RPIQ = 4.3, while the measured range was 1–583 g kg⁻¹, and OC
with RMSE = 9.3 g kg⁻¹ and RPIQ = 3.4 (measured range 0–583 g kg⁻¹). Compared to the general statistical
learning approach, the local transfer approach – using two respective training samples – on average reduced the
RMSE of total C per site fourfold. We found that the selected SSL subsets were highly dissimilar compared to
validation samples, in terms of both their spectral input space and the measured values. This suggests that data-driven
selection with RS-LOCAL leverages chemical diversity in composition rather than similarity. Our results
suggest that mid-IR soil estimates were sufficiently accurate to support many soil applications that require a large
volume of input data, such as precision agriculture, soil C accounting and monitoring and digital soil mapping.
This SSL can be updated continuously, for example, with samples from deeper profiles and organic soils, so that
the measurement of key soil properties becomes even more accurate and efficient in the near future.
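
The accuracy measures and the tuning grid named above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that RPIQ is the interquartile range of the measured values divided by the RMSE, and that quartiles follow the "inclusive" convention of Python's standard library, since the abstract does not state the quartile method. The names `rmse`, `rpiq`, `COMMITTEES`, `NEIGHBORS` and `TUNING_GRID` are hypothetical.

```python
import math
from itertools import product
from statistics import quantiles


def rmse(observed, predicted):
    """Root mean square error between measured and estimated values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpiq(observed, predicted):
    """Ratio of performance to interquartile distance: IQR(observed) / RMSE.

    Assumption: quartiles use the 'inclusive' method; other conventions
    give slightly different values. Undefined when the RMSE is zero.
    """
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


# Tuning grid from the abstract: committees x nearest neighbors.
COMMITTEES = (5, 10, 20, 50, 100)
NEIGHBORS = (2, 5, 7, 9)
TUNING_GRID = list(product(COMMITTEES, NEIGHBORS))
```

With these definitions, the grid contains 5 × 4 = 20 candidate settings, and a larger RPIQ means that the spread of the measured values is large relative to the estimation error.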