Under the Hood Cleanups
Sheetwork has a few cleanups baked in that it applies to column names as well as to some of the content. For now, the only way to prevent these cleanups from being performed is to run sheetwork in , with the exception of which you can control in your sheets.yml file.
If your Google Sheet contains columns that look like this: ColumnWithCamelCasing, chances are your database client will make them look like `COLUMNWITHCAMELCASING`, which isn't the prettiest thing...
Sheetwork will automatically reformat camel-cased columns to something that looks like this: column_with_camel_casing
if you have enabled snake_case_camel: True
in your .
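To make the conversion concrete, here is a minimal sketch of one common way to turn camel-cased names into snake case. The function name `camel_to_snake` and the two regular expressions are assumptions chosen for illustration; this is not Sheetwork's actual implementation, which may handle edge cases differently.

```python
import re


def camel_to_snake(name: str) -> str:
    """Hypothetical helper: convert a camel-cased name to snake_case."""
    # Put an underscore between a lowercase letter or digit and the
    # uppercase letter that follows it: "ColumnWith" -> "Column_With".
    separated = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalised word:
    # "HTTPResponse" -> "HTTP_Response".
    separated = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", separated)
    return separated.lower()


print(camel_to_snake("ColumnWithCamelCasing"))
```

Names that are already in snake case pass through unchanged, so applying such a helper to a mixed set of column names is safe.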
Another thing that generally upsets your database client is special characters such as /, ., etc. Sheetwork converts any character that is neither a word character nor a whitespace character (regex [^\w\s]) to an underscore _. It also removes any leading or trailing _ characters (at the beginning or end of a column name).
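The special-character handling described above can be sketched in a few lines. The function name `clean_column_name` is hypothetical, and the pattern `[^\w\s]` is a reading of the regex mentioned in the text rather than a quote from Sheetwork's source, so treat this as an approximation of the behaviour.

```python
import re


def clean_column_name(name: str) -> str:
    """Hypothetical helper approximating the special-character cleanup."""
    # Replace every character that is neither a word character nor
    # whitespace with an underscore.
    cleaned = re.sub(r"[^\w\s]", "_", name)
    # Drop underscores left at the beginning or end of the name.
    return cleaned.strip("_")


print(clean_column_name("price/unit."))
```

Whitespace is deliberately left untouched in this sketch, matching the "unless it's a whitespace character" exception.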
Those are simply dropped; ingesting them would make things quite unpredictable if you expect to refer to them later.
We lowercase the column names.
For now, we perform the following:
This one is a pretty nefarious thing to have in your data and we couldn't refrain from sanitising it. Any leading or trailing whitespace will be removed, that is, spaces before and after a string.
These are converted to your database's equivalent of NULL.
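As a rough illustration of the content cleanups, the sketch below strips surrounding whitespace from a cell and maps a missing value to `None`, which database drivers typically write as NULL. The function name `clean_cell` is hypothetical, and treating empty strings as the values that become NULL is an assumption made for this example, not something the text above confirms.

```python
from typing import Optional


def clean_cell(value: Optional[str]) -> Optional[str]:
    """Hypothetical helper approximating the content cleanups."""
    if value is None:
        return None
    # Remove whitespace before and after the string.
    stripped = value.strip()
    # Assumption: a cell that is empty after stripping counts as missing.
    return stripped if stripped else None


print(clean_cell("  hello "))
```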
We try not to mess with the content much; that's probably more of an ETL process that other tools (such as ) are best suited for.