GeoJSON
| Input | Output | Alias |
|---|---|---|
| ✔ | ✗ | |
Description
Reads a GeoJSON FeatureCollection document and produces one row per feature. Each row has the following fixed schema:
| Column | Type | Description |
|---|---|---|
| id | String | The feature's id member (a JSON string or number), stored as text; an empty string if the id is absent or null. |
| geometry | Geometry | The feature's geometry, stored as a Geometry variant type. |
| properties | Nullable(JSON) | The feature's properties object, stored as a semi-structured JSON column. An explicit "properties": null is preserved as NULL. |
Each geometry is stored in ClickHouse's Geometry type (a Variant). The supported GeoJSON geometry types are Point, LineString, MultiLineString, Polygon, and MultiPolygon. The two remaining GeoJSON geometry types, GeometryCollection and MultiPoint, cannot be represented by the Geometry type. By default, reading one into the geometry column raises an exception; this can be changed to insert NULL instead (see Handling unsupported geometry types below). The geometry column is therefore NULL in two cases: when a feature's geometry is an explicit JSON null, and, under input_format_geojson_unsupported_geometry_handling = 'null', when the geometry type is unsupported.
The document's structure is validated: the top-level type must be FeatureCollection and every element of features must have type Feature. Coordinates must satisfy the GeoJSON shape invariants — a LineString (and each line of a MultiLineString) must have at least two positions, and a Polygon ring (and each ring of a MultiPolygon) must be closed and have at least four positions. Malformed documents are rejected rather than silently loaded.
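To make the shape invariants concrete, the sketch below passes an inline document through the format table function. The feature id and coordinates are made up for illustration. Because the LineString has a single position, the query is expected to fail under the rules above rather than return a row.

```sql
-- Hypothetical inline document: the LineString has only one position,
-- which violates the "at least two positions" rule described above.
SELECT id, geometry
FROM format(GeoJSON, $$
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "too-short",
      "geometry": {"type": "LineString", "coordinates": [[-0.1276, 51.5072]]},
      "properties": {}
    }
  ]
}
$$);
-- Expected to raise an exception instead of loading the row.
```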
Other keys in the FeatureCollection object (such as name or crs) and other keys inside each Feature object (such as bbox) are ignored.
Key ordering is flexible: the top-level type may appear before or after the features array, and within a geometry object coordinates may appear before or after type.
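As an illustration of ignored keys and flexible ordering, the following sketch uses a made-up document in which features precedes the top-level type, coordinates precedes the geometry's type, and the extra keys name and bbox are present. Based on the behavior described above, it should load as a single row.

```sql
-- Hypothetical document: "features" comes before the top-level "type",
-- "coordinates" comes before the geometry "type", and the extra keys
-- "name" and "bbox" are included only to show that they are ignored.
SELECT id, variantType(geometry) AS geometry_type, properties
FROM format(GeoJSON, $$
{
  "name": "sample",
  "features": [
    {
      "type": "Feature",
      "bbox": [-0.13, 51.50, -0.12, 51.51],
      "id": 1,
      "geometry": {"coordinates": [-0.1276, 51.5072], "type": "Point"},
      "properties": {"label": "example"}
    }
  ],
  "type": "FeatureCollection"
}
$$);
```

The numeric id in this document is read into the String id column as text.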
Schema inference returns the fixed schema above, so DESCRIBE and SELECT ... FROM format(...) work without a table definition.
Example usage
Given the following GeoJSON file london.geojson containing a mix of geometry types:
We can query the file and inspect geometry types:
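A minimal sketch of such a query, assuming london.geojson is readable through the file table function (for example from the server's user_files directory, or from the working directory when using clickhouse-local):

```sql
-- Read every feature as one row with the fixed three-column schema.
SELECT id, geometry, properties
FROM file('london.geojson', 'GeoJSON');
```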
The file extension .geojson is automatically detected, so the format argument can be omitted:
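A sketch of the shorter form, again assuming london.geojson is readable by the file table function:

```sql
-- No format argument: the format is picked up from the .geojson extension.
SELECT id, geometry, properties
FROM file('london.geojson');
```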
We can use variantType to check the underlying type of each Geometry object:
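For instance, the following sketch lists the geometry type of each feature and then counts features per type. The column aliases are illustrative, and the file is assumed to be readable by the file table function.

```sql
-- Name of the variant held by each row, for example Point or Polygon.
SELECT id, variantType(geometry) AS geometry_type
FROM file('london.geojson', 'GeoJSON');

-- Number of features per geometry type.
SELECT variantType(geometry) AS geometry_type, count() AS features
FROM file('london.geojson', 'GeoJSON')
GROUP BY geometry_type
ORDER BY geometry_type;
```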
And we can extract the underlying data like this:
Accessing a Geometry subcolumn returns the value when the row holds that type, and the type's default otherwise — (0,0) for Point and [] for the array-based types — so use variantType(geometry) to tell which one is set.
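The sketch below shows one way to combine subcolumn access with variantType. It assumes that Geometry subcolumns are addressed by the variant type name (geometry.Point, geometry.Polygon), as with other Variant columns, and that london.geojson is readable by the file table function.

```sql
SELECT
    id,
    variantType(geometry) AS geometry_type,
    geometry.Point   AS point,    -- (0,0) when the row does not hold a Point
    geometry.Polygon AS polygon   -- [] when the row does not hold a Polygon
FROM file('london.geojson', 'GeoJSON');

-- Restrict to rows that actually hold a Point, so the default value
-- is never mistaken for real coordinates.
SELECT id, geometry.Point AS point
FROM file('london.geojson', 'GeoJSON')
WHERE variantType(geometry) = 'Point';
```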
We can also ingest GeoJSON data into a table:
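One possible shape for this, using a hypothetical table name geo_features and an arbitrary choice of MergeTree ordered by id. The column types mirror the fixed schema above. Depending on the server version, Geometry or JSON columns may need extra settings, which this sketch does not cover.

```sql
-- Hypothetical target table whose columns match the format's fixed schema.
CREATE TABLE geo_features
(
    id String,
    geometry Geometry,
    properties Nullable(JSON)
)
ENGINE = MergeTree
ORDER BY id;

-- Load the file through the file table function.
INSERT INTO geo_features
SELECT id, geometry, properties
FROM file('london.geojson', 'GeoJSON');
```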
Then query by feature type:
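A sketch of such a filter, assuming a populated table with the fixed schema; the name geo_features is hypothetical:

```sql
-- Keep only the features whose geometry is a Polygon.
SELECT id, properties
FROM geo_features
WHERE variantType(geometry) = 'Polygon';
```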
We can also infer the schema of GeoJSON data without a table definition:
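For example, assuming london.geojson is readable by the file table function, a DESCRIBE along these lines should report the id, geometry, and properties columns of the fixed schema:

```sql
-- Schema inference: no table definition or structure argument is supplied.
DESCRIBE file('london.geojson', 'GeoJSON');
```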
Handling unsupported geometry types
Two valid GeoJSON geometry types, GeometryCollection and MultiPoint, can't be represented by ClickHouse's Geometry type. The input_format_geojson_unsupported_geometry_handling setting controls what happens when such a geometry must be stored in the geometry column. Possible values are:
- 'throw': throw an exception (default)
- 'null': insert a NULL value for the geometry column and continue parsing
This handling applies only when the geometry column is read. When geometry is not a requested output column (for example SELECT id FROM ...), an unsupported geometry is still validated for well-formedness but does not trigger the handling — it neither throws nor inserts NULL, because no geometry value is materialized.
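The two cases can be sketched with an inline, made-up document containing a single MultiPoint feature. The first query requests geometry with the setting switched to 'null', so the row is expected to load with a NULL geometry. The second query keeps the default 'throw' but does not request geometry, so according to the behavior described above it should not raise.

```sql
-- Hypothetical document with one MultiPoint feature (an unsupported type).
-- geometry is requested, so the handling setting applies.
SELECT id, geometry
FROM format(GeoJSON, $$
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "stops",
      "geometry": {"type": "MultiPoint", "coordinates": [[-0.13, 51.50], [-0.12, 51.51]]},
      "properties": {}
    }
  ]
}
$$)
SETTINGS input_format_geojson_unsupported_geometry_handling = 'null';

-- Same document, default handling, but geometry is not a requested column:
-- the geometry is still validated, yet no value is materialized.
SELECT id
FROM format(GeoJSON, $$
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "stops",
      "geometry": {"type": "MultiPoint", "coordinates": [[-0.13, 51.50], [-0.12, 51.51]]},
      "properties": {}
    }
  ]
}
$$);
```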