
Your Browser Can't Handle Real Geospatial Data. Here's Why.


The moment everything freezes

You upload a CSV. A million rows of parcel data, maybe GPS traces from a fleet of delivery trucks, or transaction records from every real estate sale in California over the past decade. The upload bar fills. The spinner spins. And then—nothing. Your browser tab goes white. Chrome's memory usage climbs past 2GB, then 3GB. Your laptop fan screams. Eventually, if you're lucky, you get an error message. More likely, the tab just crashes. This is the dirty secret of browser-based geospatial tools.

Tools like kepler.gl, Foursquare Studio, and SQLRooms are genuinely impressive. They've brought sophisticated geospatial visualization to anyone with a web browser—no ArcGIS license required, no server infrastructure to maintain. But they share a fundamental limitation that nobody likes to talk about: they weren't built for the datasets that real geospatial analysts actually work with.

The 250 megabyte ceiling (and other hard limits)

Let's start with the obvious constraint. Kepler.gl's own documentation states it plainly: "there is a 250mb limit on how much data Chrome allows you to upload into a browser." That sounds like a lot until you realize that a typical county-level parcel dataset can easily hit 500MB. A year's worth of ride-share trip data? Forget about it.

But the upload limit is just the first wall you hit. The deeper problem is WebAssembly memory. DuckDB WASM—the engine that powers SQL queries in many of these browser-based tools—runs in a sandbox that "limits the amount of available memory to 4 GB" according to DuckDB's official documentation. And browsers often impose even stricter limits in practice.

Here's the thing: that 4GB isn't just for your data. It's shared with everything else running in your browser tab—the JavaScript runtime, the WebGL rendering engine, the DOM, the visualization library itself. As one engineering team at Motif Analytics discovered, "memory available to WASM is limited by the browser... this can be learnt painfully when trying to load larger datasets or doing non-trivial joins."

DuckDB WASM memory often stays at ~2GB after query execution and isn't released until you kill the entire browser tab.

When a million rows is considered small

The gap between what these tools can handle and what analysts actually need is almost comical. Google BigQuery's browser visualization explicitly states that "performance is subject to browser capabilities" and renders only "up to approximately one million vertices, 20,000 rows, or 128 MB of results." Twenty thousand rows. That's not a dataset—that's a sample.

ESRI's engineers, writing about large-scale web visualization, put it bluntly: "4 GB of data is too large in any scenario." They had a 52-million point ocean dataset and had to reduce it by 98% just to get it to render. Kinetica's team noted that "web browsers struggle to handle more than a few thousand features." A few thousand. When real-world geospatial analysis routinely involves hundreds of millions of records.

Even Foursquare, the company that maintains kepler.gl and has invested heavily in browser-based geospatial tooling, acknowledges this reality. Their documentation notes that "real world datasets are often bigger. 100 million row datasets are no longer uncommon." They're building solutions, but they're also being honest about the problem.

The CPU is doing too much work

Memory isn't even the whole story. Kepler.gl's development team has been upfront about performance issues stemming from architecture choices. Their 2019 roadmap noted: "We see complaints about the app crashing, lags during filtering, aggregation and domain calculation. Most of it is because we are still processing large array of data in CPU." CPU-bound operations in a browser context mean the UI freezes. Every filter slider becomes a game of patience.

The fix—moving more computation to the GPU—helps with rendering. But it doesn't solve the fundamental issue: the data itself has to live somewhere. And in a browser, "somewhere" is an increasingly cramped space that you're sharing with every other tab, extension, and background process.

Why hexagons change everything

The solution isn't to build a bigger browser. It's to be smarter about what data you actually need at any given moment.

Enter H3—Uber's hexagonal hierarchical spatial indexing system. H3 divides the entire planet into a grid of hexagons at 16 different resolutions. The coarsest level (resolution 0) creates cells averaging about 4.3 million square kilometers. The finest (resolution 15) produces cells smaller than a square meter. And here's the clever part: the system is hierarchical. Every cell has exactly seven children at the next finer resolution.

Why hexagons instead of squares? As Uber's engineering team explains, "hexagons have only one distance between a hexagon centerpoint and its neighbors'," which "greatly simplifies performing analysis and smoothing over gradients." Hexagons tile more naturally, handle edge cases better, and produce cleaner aggregations. It's geometry that works with you instead of against you.

H3 cells at resolution 15 are smaller than one square meter—precise enough to index individual parking spots.
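To make the hierarchy concrete, here is a minimal sketch using the h3-py library (assuming its v4 API). The coordinates are arbitrary; the point is only to show how the same location indexes at different resolutions and how cells relate across levels.

```python
# pip install h3  (this sketch assumes the h3-py v4 API)
import h3

lat, lng = 37.7955, -122.3937  # an arbitrary point in San Francisco

# Index the same point at a coarse and a fine resolution.
coarse = h3.latlng_to_cell(lat, lng, 5)    # cells of roughly a few hundred km^2
fine = h3.latlng_to_cell(lat, lng, 12)     # cells of roughly a few hundred m^2

print(coarse, round(h3.cell_area(coarse, unit="km^2"), 1), "km^2")
print(fine, round(h3.cell_area(fine, unit="m^2"), 1), "m^2")

# The hierarchy: every cell has exactly seven children one resolution down.
print(len(h3.cell_to_children(coarse, 6)))   # 7

# Walking back up the parent chain usually recovers the coarse cell; it can
# differ for points near a cell edge, since H3 children don't nest perfectly.
print(h3.cell_to_parent(fine, 5) == coarse)
```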

Loading only what you can see

The real power of H3-based tiling isn't the hexagons themselves—it's what they enable. When your data is pre-indexed into hierarchical tiles, you don't need to load everything. You load what's visible in the current viewport at the appropriate resolution. Zoom out, and the system swaps in coarser aggregations. Zoom in, and finer-grained data streams in. Foursquare's Hex Tiles system demonstrates this approach: "The Hex Tile system intelligently loads and unloads data from your browser, allowing a user to fluidly visualize planetary-scale data at any level of granularity."

This isn't a hack or a workaround. It's a fundamentally different architecture. Instead of shoving a billion rows into browser memory and hoping for the best, you're maintaining a constant memory footprint regardless of total dataset size. The browser only ever holds what it needs for the current view.
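Here is a rough sketch of what viewport-driven loading can look like, again with h3-py. The zoom-to-resolution mapping and the idea of requesting one pre-built tile per cell are illustrative assumptions, not any particular product's API.

```python
import h3

def resolution_for_zoom(zoom: int) -> int:
    # Hypothetical heuristic: coarser hexagons when zoomed out, finer when zoomed in.
    return max(0, min(15, zoom - 3))

def cells_for_viewport(center_lat: float, center_lng: float, zoom: int, rings: int = 4):
    res = resolution_for_zoom(zoom)
    center = h3.latlng_to_cell(center_lat, center_lng, res)
    # grid_disk returns the center cell plus `rings` rings of neighbors:
    # a fixed-size working set, no matter how large the full dataset is.
    return h3.grid_disk(center, rings)

# A zoomed-out view over the Bay Area needs only a few dozen coarse tiles;
# zooming in swaps them for finer ones covering a smaller area.
print(len(cells_for_viewport(37.8, -122.3, zoom=8)))    # 61 cells: 1 + 3*4*(4+1)
print(len(cells_for_viewport(37.8, -122.3, zoom=14)))   # same count, finer resolution
```

The key property is that the number of cells requested depends on the viewport and zoom level, never on the total size of the dataset.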

TopHap's approach: 47 trillion records that actually work

TopHap, a real estate analytics platform, has taken this tiling approach to its logical extreme. Their platform serves 47 trillion records covering 150 million parcels across the United States—everything from property boundaries to environmental data to historical transaction records. And it runs in a browser without melting your laptop.

The secret isn't magic. It's efficient pre-processing. Data gets broken into geospatially-indexed H3 tiles before users ever touch it. When someone navigates the map, the system serves precisely the tiles needed for that view—aggregated at an appropriate resolution for the zoom level. You can analyze millions of data points because you're never actually loading millions of data points at once.
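What "efficient pre-processing" looks like in miniature: a generic sketch (not TopHap's actual pipeline) that rolls raw point records up into per-cell aggregates at several H3 resolutions, so the browser can later request small, pre-summarized tiles instead of raw rows.

```python
from collections import Counter
import h3

def build_h3_aggregates(points, resolutions=(5, 7, 9)):
    """points: iterable of (lat, lng, value) tuples -> {resolution: {cell: total}}."""
    totals = {res: Counter() for res in resolutions}
    for lat, lng, value in points:
        for res in resolutions:
            totals[res][h3.latlng_to_cell(lat, lng, res)] += value
    return totals

# A few synthetic records stand in for the raw CSV; a real pipeline streams
# millions of rows through this once, then writes each resolution out as tiles.
sample = [(37.77, -122.42, 1), (37.78, -122.41, 1), (40.71, -74.00, 1)]
aggregates = build_h3_aggregates(sample)
print({res: len(cells) for res, cells in aggregates.items()})
```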

"The platform integrates hundreds of data layers consisting of billions of data points into its visual analytics," as their team describes it. That's not hyperbole. It's the result of understanding that browser limitations aren't going away—so you architect around them.

Pre-tiled geospatial data can enable browser visualization of datasets 1000x larger than raw file uploads would allow.

The server-side compromise that isn't a compromise

Here's the counterintuitive truth: doing more work on the server makes the browser experience feel more powerful, not less. When heavy lifting happens at upload time—indexing data into efficient tile structures, pre-computing aggregations, building spatial hierarchies—the runtime experience becomes nearly instant.

Foursquare has recognized this with their new Spatial Desktop product, which "breaks kepler.gl free from browser limitations" by combining "GPU rendering with native DuckDB performance to handle tens of millions of points." The key phrase there: native DuckDB. Not WASM. Running locally, without the browser memory constraints.

But desktop apps aren't always practical. For many use cases, browser-based tools are the only option. The answer there is smart server-side preparation: pre-tile your data, use formats like PMTiles that support efficient partial reads, and let the browser do what browsers are good at—rendering—while keeping the heavy data processing elsewhere.
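A hedged sketch of that hybrid split, using DuckDB's native Python API: the heavy aggregation runs outside the browser, and only a small visualization-ready summary crosses the wire. The file name and columns are illustrative, and the binning uses a simple rounded lat/lng grid rather than H3 to keep the example to plain SQL.

```python
import duckdb

con = duckdb.connect()  # in-process, native engine: machine RAM, not the 4 GB WASM sandbox

# Bin millions of raw rows into a coarse grid on the server side; the browser
# only ever receives the few thousand aggregated rows this query returns.
summary = con.execute("""
    SELECT
        ROUND(lat, 2)   AS lat_bin,     -- ~1 km bins
        ROUND(lng, 2)   AS lng_bin,
        COUNT(*)        AS n,
        AVG(sale_price) AS avg_price
    FROM read_parquet('parcels.parquet')   -- illustrative file and columns
    GROUP BY lat_bin, lng_bin
""").fetchall()

# `summary` is the visualization-ready subset you serialize and ship to the browser.
print(len(summary), "aggregated rows instead of the full table")
```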

What this means for your workflow

If you're hitting walls with current geospatial visualization tools, here's the reality check: the tools aren't broken. They're working exactly as designed. They just weren't designed for your dataset.

The paths forward:

Pre-process your data. Convert raw CSVs into tiled formats before uploading. Tools exist for this—Mapbox's Tippecanoe, various H3 libraries, proprietary systems like TopHap's pipeline. The upfront investment pays off in usability.

Consider hybrid architectures. Run heavy queries on a proper database (DuckDB running natively, PostGIS, BigQuery) and send only visualization-ready subsets to the browser.

Look for platforms built for scale. TopHap, Kinetica, and others have solved these problems already. Sometimes the right answer is using infrastructure someone else built rather than fighting browser constraints yourself.

The ceiling is real, but it's not the end

Browser-based geospatial tools have democratized mapping and spatial analysis in genuinely important ways. A decade ago, rendering a million points interactively required expensive software and significant expertise. Now anyone can do it—until they can't.

The limitation at a million rows isn't a bug. It's physics. Or at least, it's the physics of WebAssembly memory allocation and Chrome's tab sandbox. But the solutions exist. H3 tiling, intelligent viewport-based loading, server-side pre-processing—these aren't future technologies. They're available now, in production systems handling datasets orders of magnitude larger than what browser-only tools support.

The question isn't whether you can work with massive geospatial datasets in browser-based tools. It's whether you're using tools that were architected for massive datasets from the start.

References

kepler.gl Documentation

DuckDB WASM Documentation

Motif Analytics

Google Cloud BigQuery

ESRI Blog

Kinetica

Foursquare Engineering

kepler.gl GitHub Roadmap

H3 Documentation

Uber Engineering Blog

Foursquare Hex Tiles

TopHap

TopHap Company

Foursquare Spatial Products