Conversation

@gingerwizard

@gingerwizard gingerwizard commented Dec 31, 2022

ClickHouse is a popular OSS OLAP db and a natural fit for this data

@dex-sv

dex-sv commented Feb 20, 2023

ClickHouse is not an OLTP db

@tbragin

tbragin commented Feb 21, 2023

I'm sure that's a typo. @gingerwizard let's fix :) ClickHouse is an OLAP database.

@gingerwizard
Author

Fixed 🤦

@dex-sv

dex-sv commented Mar 10, 2023

Hi, I ran a few experiments with the code from this PR and got errors like these:

OverflowError: int too big to convert
clickhouse_connect.driver.exceptions.ProgrammingError: Internal serialization error. This usually indicates invalid data types in an inserted row or column

The errors are caused by attempting to insert large integer values into numeric columns.

Could you please share some info about the decisions on the data types used in the tables?

  • Decimal(38, 0) for value - why 38, 0? Why Decimal at all? The input data seems to be of Python int type.
  • Int64 - why not UInt64?
  • Int64 for the timestamps - why not UInt32?
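
To make the range question concrete, here is a minimal sketch that checks which of the types mentioned above can hold a given Python int. The `RANGES` table and the `fitting_types` helper are illustrative names, not part of the PR code. The bounds are the standard ranges for these types, with Decimal(38, 0) taken as holding up to 38 decimal digits.

```python
# Inclusive value ranges for the column types discussed above.
RANGES = {
    "UInt32": (0, 2**32 - 1),
    "Int64": (-(2**63), 2**63 - 1),
    "UInt64": (0, 2**64 - 1),
    "Decimal(38, 0)": (-(10**38 - 1), 10**38 - 1),
}


def fitting_types(value: int) -> list:
    """Return the names of the types whose range contains `value`."""
    return [name for name, (lo, hi) in RANGES.items() if lo <= value <= hi]


# One past the Int64 maximum: too big for Int64, fine for the wider types.
print(fitting_types(2**63))  # ['UInt64', 'Decimal(38, 0)']
```

Running a check like this over the input before inserting would show which columns are actually too narrow for the data.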

@gingerwizard
Author

I will PR the fix for this and correct these types.

Decimal(38, 0) was selected because it is the equivalent of what BigQuery applied. I assumed the values could grow beyond the range of a UInt64. I'll check the data and update.

For timestamps, I'd like to move to the explicit DateTime type.
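
As a rough sketch of what that timestamp change could look like on the Python side, the snippet below converts an integer timestamp to a timezone-aware `datetime` and rejects values outside the 32-bit unsigned range. It assumes the timestamps are Unix epoch seconds, which is not confirmed in this thread, and `to_datetime` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_SECONDS = 2**32 - 1  # largest value an unsigned 32-bit second counter can hold


def to_datetime(ts: int) -> datetime:
    """Convert epoch seconds (assumed unit) to a UTC datetime, with a range check."""
    if not 0 <= ts <= MAX_SECONDS:
        raise ValueError(f"timestamp {ts} is outside the 32-bit unsigned range")
    return EPOCH + timedelta(seconds=ts)


print(to_datetime(MAX_SECONDS))  # 2106-02-07 06:28:15+00:00
```

Failing early with a `ValueError` gives a clearer message than a serialization error raised during the insert.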
