ConnectorX: Accelerating Data Loading from Databases to Dataframes [pdf] (vldb.org)
3 points by gruuya on Oct 28, 2022 | 1 comment


What really struck me at first was how little time Pandas read_sql spends on actual query execution and data transfer, while client-side processing takes up the majority (~85%) of the time.

It makes more sense, though, once you realise that they're talking about unsaturated networks, so they can focus on relatively simple optimisation techniques (e.g. query partitioning and zero-copy) to bring about a significant speedup in data loading.
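
To make "query partitioning" concrete, here is a rough sketch of the general idea in Python: split one range scan into several smaller range queries and fetch them concurrently. This is not ConnectorX's actual implementation; the names partition_queries, load_partitioned and run_query are hypothetical, and the sketch assumes an integer column with known lower and upper bounds.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_queries(table, column, lo, hi, n):
    """Split the inclusive range [lo, hi] into at most n range queries.

    Hypothetical helper: assumes `column` is an integer column and that
    lo <= hi and n >= 1. Names are interpolated directly, so this is for
    illustration only and must not be fed untrusted input.
    """
    step = (hi - lo + n) // n  # ceil((hi - lo + 1) / n)
    queries = []
    for start in range(lo, hi + 1, step):
        end = min(start + step - 1, hi)
        queries.append(
            f"SELECT * FROM {table} WHERE {column} BETWEEN {start} AND {end}"
        )
    return queries


def load_partitioned(run_query, queries, max_workers=4):
    """Run each partition query in its own thread and concatenate the rows.

    `run_query` is an assumed caller-supplied function that takes a SQL
    string and returns a list of rows, typically over its own connection.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `queries`.
        parts = list(pool.map(run_query, queries))
    return [row for part in parts for row in part]
```

With `partition_queries("t", "id", 1, 10, 3)` the ranges come out as 1-4, 5-8 and 9-10. Each partition can then be fetched and converted on its own, which spreads the client-side work that dominates read_sql over several workers.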



