Rust visitor pattern and efficient DataFusion query federation (splitgraph.com)
2 points by gruuya on Dec 20, 2022 | 3 comments


For comparison, the implementation of the same feature in ClickHouse:

https://github.com/ClickHouse/ClickHouse/blob/master/src/Sto...

https://github.com/ClickHouse/ClickHouse/blob/master/src/Sto...

ClickHouse allows federated queries with MySQL, Postgres, ODBC and JDBC data sources.


Nice, looks familiar! Any plans for supporting aggregation pushdowns (we have had some experience with that in Postgres/Multicorn[1][2])?

Though, I imagine there's a region in the data size/network throughput/latency space where simply fetching the data and then doing analytics in ClickHouse is more performant than actually going for the pushdown.[3]

[1] https://www.splitgraph.com/blog/postgresql-fdw-aggregation-p...
[2] https://www.splitgraph.com/blog/postgresql-fdw-aggregation-p...
[3] https://duckdb.org/2022/09/30/postgres-scanner.html
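
To make the fetch-vs-pushdown trade-off above a bit more concrete, here is a rough back-of-envelope cost model. It is only a sketch: the function names, the linear cost assumptions, and every number in the usage example are hypothetical and were not measured on ClickHouse, Postgres, or MySQL.

```python
def transfer_seconds(rows, bytes_per_row, bandwidth_bytes_per_s, latency_s):
    """One round trip plus the time to move the payload over the network."""
    return latency_s + rows * bytes_per_row / bandwidth_bytes_per_s


def fetch_then_aggregate_seconds(rows, bytes_per_row, bandwidth_bytes_per_s,
                                 latency_s, remote_scan_rows_per_s,
                                 local_agg_rows_per_s):
    """Remote side only scans; every row crosses the network; we aggregate locally."""
    return (rows / remote_scan_rows_per_s
            + transfer_seconds(rows, bytes_per_row, bandwidth_bytes_per_s, latency_s)
            + rows / local_agg_rows_per_s)


def pushdown_seconds(rows, groups, bytes_per_row, bandwidth_bytes_per_s,
                     latency_s, remote_agg_rows_per_s):
    """Remote side scans and aggregates; only one row per group crosses the network."""
    return (rows / remote_agg_rows_per_s
            + transfer_seconds(groups, bytes_per_row, bandwidth_bytes_per_s, latency_s))


# Hypothetical workload: 100M rows of 16 bytes, 100 result groups.
# Assumed throughputs (made up): remote scan 50M rows/s, remote aggregation
# 10M rows/s, local aggregation 500M rows/s.
workload = dict(rows=100_000_000, bytes_per_row=16, latency_s=0.001)

for label, bandwidth in [("10 Gbit", 1.25e9), ("1 Gbit", 1.25e8)]:
    fetch = fetch_then_aggregate_seconds(
        bandwidth_bytes_per_s=bandwidth,
        remote_scan_rows_per_s=50e6,
        local_agg_rows_per_s=500e6,
        **workload,
    )
    push = pushdown_seconds(
        groups=100,
        bandwidth_bytes_per_s=bandwidth,
        remote_agg_rows_per_s=10e6,
        **workload,
    )
    winner = "fetch" if fetch < push else "pushdown"
    print(f"{label}: fetch={fetch:.2f}s pushdown={push:.2f}s -> {winner}")
```

With these assumed numbers, fetching wins on the fast link and pushdown wins on the slow one. Where the crossover actually sits depends on how fast the remote engine aggregates relative to the local one.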


We tested it on a few aggregation queries with low-cardinality results. For MySQL, fetching the raw data and aggregating in ClickHouse appeared to be faster. For PostgreSQL the two approaches performed about the same (at least, Postgres did not do significantly more work for aggregation than for plain data reading). It also depends on the network, but in our tests the 10 Gbit network was not the bottleneck.

Automatic pushdown of aggregations is not being considered at the moment, but we are looking at a syntax that would allow explicitly pushing down a whole subquery.



