{"id":4183,"date":"2026-05-21T03:04:09","date_gmt":"2026-05-20T20:04:09","guid":{"rendered":"https:\/\/daiilynews.cu.ma\/?p=4183"},"modified":"2026-05-21T03:04:09","modified_gmt":"2026-05-20T20:04:09","slug":"a-more-efficient-family-of-earth-observation-models","status":"publish","type":"post","link":"https:\/\/daiilynews.cu.ma\/?p=4183","title":{"rendered":"A more efficient family of Earth observation models"},"content":{"rendered":"<p>\ud83e\udde0 Models: https:\/\/huggingface.co\/collections\/allenai\/olmoearth | \ud83d\udcc4 Tech Report: https:\/\/allenai.org\/papers\/olmoearth_v1_1 | \ud83d\udcbb Code: https:\/\/github.com\/allenai\/olmoearth_pretrain<\/p>\n<p>We released OlmoEarth (v1) in November 2025. Since then, partners have applied it across a wide range of tasks, from tracking mangrove change to classifying drivers of forest loss to producing country-scale crop-type maps in days, and have scaled deployments to national, continental, and global areas. Every release moves us closer to our mission: bringing state-of-the-art AI to organizations and communities working to protect people and our planet.<br \/>\nWhen OlmoEarth processes satellite imagery to make predictions across tens to hundreds of thousands of square kilometers, efficiency shapes what\u2019s possible. Over the full lifecycle of running OlmoEarth \u2013 data export, preprocessing, inference, and post-processing \u2013 compute is by far the highest cost. 
A more efficient model means we can support more partners on the OlmoEarth Platform, and anyone running OlmoEarth on their own can do so faster and at lower cost.<br \/>\nThat\u2019s why we built OlmoEarth v1.1: a new family of models that cuts compute costs by up to 3x while maintaining OlmoEarth v1&#8217;s performance on a mix of research benchmarks and tasks we\u2019ve constructed with partners.<\/p>\n<p>\t\tIncreasing efficiency by decreasing sequence lengths<\/p>\n<p>The OlmoEarth models are transformers, one of the dominant architectures in machine learning today. To process remote sensing data, we first convert it into a sequence of tokens the model can ingest.<br \/>\nTwo important levers control efficiency in transformer-based models: model size (this is why we release a family of models, so users can pick the size that fits their compute budget) and token sequence length. Because the cost of self-attention scales quadratically with token sequence length, even small reductions can meaningfully cut the cost of running the model.<\/p>\n<p>MACs, or multiply-accumulate operations, estimate the computation needed for one model forward pass; lower MACs generally mean cheaper, faster inference. The y-axis is inverted because lower average rank is better. Labels show model family and size.<\/p>\n<p>\t\tDesigning the token<\/p>\n<p>This raises an important question for transformer-based remote sensing models: what should a token represent?<br \/>\nTake Sentinel-2 imagery, a common modality we process. A Sentinel-2 input is a tensor of shape (H, W, T, D=12): H and W are the height and width in pixels (the latitudinal and longitudinal extents), T is the number of timesteps, and D=12 is the number of Sentinel-2 channels.<\/p>\n<p>Currently, we split the data into resolution-based patches. 
Concretely, we pick a spatial patch size p and split the Sentinel-2 image into patches of size p x p.<\/p>\n<p>For each patch, we create one token per timestep per resolution. So a Sentinel-2 input with 2 timesteps yields 6 tokens per patch (2 timesteps x 3 resolutions: 10 m, 20 m, and 60 m).<br \/>\nIn total, a (H, W, T, D=12) Sentinel-2 input yields H\/p x W\/p x T x 3 tokens.<br \/>\nUsing a separate token per resolution is a common technique when processing Sentinel-2 data: Galileo and SatMAE both take this approach, and SatMAE reports significantly better results when doing so. However, it is not universal: CROMA uses a single token for all bands, regardless of resolution. Because token counts compound multiplicatively, collapsing the three resolutions into a single token produces one third as many tokens and material savings across pretraining, fine-tuning, and inference.<br \/>\nNaively combining the tokens in this way, however, leads to significant performance drops, including a 10 percentage point drop on m-eurosat kNN (a common benchmark task for remote sensing models). We hypothesize that separating Sentinel-2 bands into different tokens makes it easier for OlmoEarth to model important cross-band relationships.<br \/>\nMerging tokens without hurting performance required us to modify our pretraining regimen. We describe those changes in detail in our technical report.<\/p>\n<p>\t\tFor developers<\/p>\n<p>The result is a model family that does more with less. At every size, OlmoEarth v1.1 runs up to three times cheaper than OlmoEarth v1, making frequent, planet-scale map refreshes more affordable for every team running OlmoEarth. If you&#8217;re using a model from the original OlmoEarth family, try OlmoEarth v1.1. It provides similar performance to OlmoEarth v1 while requiring as little as one third of the compute, though we have seen some regressions (see our technical report for more details). 
If it works for your task, you should see a significant speedup during fine-tuning and inference.<\/p>\n<p>\t\tFor researchers<\/p>\n<p>Pretrained remote sensing models have many degrees of freedom, which makes them hard to study. When performance shifts, is the cause the architecture, the dataset, or the pretraining algorithm?<br \/>\nWe train OlmoEarth v1.1 on the same dataset as OlmoEarth v1, so any differences between the two can be attributed to methodological changes rather than to the data. We hope this advances the scientific understanding of how to pretrain models for remote sensing.<\/p>\n<p>\t\tGet started<\/p>\n<p>Check out the OlmoEarth v1.1 weights and training code, including the weights for our Base, Tiny, and Nano models.<br \/>\n<a href=\"https:\/\/huggingface.co\/blog\/allenai\/olmoearth-v1-1\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\ud83e\udde0 Models: https:\/\/huggingface.co\/collections\/allenai\/olmoearth | \ud83d\udcc4 Tech Report: https:\/\/allenai.org\/papers\/olmoearth_v1_1 | \ud83d\udcbb Code: https:\/\/github.com\/allenai\/olmoearth_pretrain We released OlmoEarth (v1) in November 2025. 
Since then, partners have applied it across a wide range of tasks, from tracking mangrove change to classifying drivers of forest loss to producing country-scale crop-type maps in days, scaling deployments to national, continental, and global [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4184,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[676],"tags":[],"class_list":["post-4183","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-ai"],"_links":{"self":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/4183","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4183"}],"version-history":[{"count":0,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/4183\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/media\/4184"}],"wp:attachment":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4183"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4183"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4183"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}