sheetwork
Contribute on GitHubChat on Discord
v1.0.0 Nicolas Jaar
v1.0.0 Nicolas Jaar
  • Introduction
  • Installation & Configuration
    • Installation & Update
    • Configuration
      • Set up your sheetwork project
      • Connecting to Google Sheets
      • Set up your sheetwork profile
  • Usage
    • Quick CLI
    • Using sheets.yml configuration
      • Basic Configuration
      • Column Operations & Advanced Controls
  • Sheetwork Operations
    • sheetwork init
    • sheetwork upload
    • Under the Hood Cleanups
  • FAQ
    • Do I need to list ALL the columns in my sheets config?
    • Do the column names need to be in the format of the original sheet?
    • What kind of cleanups or reformating does sheetwork do?
Powered by GitBook
On this page
  • Column Transformations
  • Data type casting
  • Column Renaming
  • Column Exclusion
  • Column Name Formatting
  • Not a fan of CamelCase?

Was this helpful?

  1. Usage
  2. Using sheets.yml configuration

Column Operations & Advanced Controls

Column Transformations

Sheetwork can perform a series of transformation on your columns. You'll set these up by creating a new columns block in your yml as you'll see below

Data type casting

By default, everything that comes from a Google Sheet will be loaded as a string. But you might not want it to land in your database as a string. Here comes column-based casting! Building on the previous example here's what you could do:

sheets:
  - sheet_name: test_sheet
    sheet_key: 10JJdwwd_lm4556uwedu4zu5u6r0h2VIDTjRXg
    target_schema: sandbox
    target_table: my_google_sheet
    # create a new columns block
    columns:
      - name: column_a # name of the column in your sheet
        datatype: int

Supported Data Types

Currently supported data types are:

int: integer
numeric: decimals and other kinds of non-whole number (precision 38,18)
varchar: for strings
date: for dates
boolean: for True False
timestamp: for a datetime object

Column Renaming

This one is fun, it's not uncommon that your google sheet might contain a name that doesn't fit in your database or that is not up to your conventions. You can rename if easily as follows:

sheets:
  - sheet_name: test_sheet
    sheet_key: 10JJdwwd_lm4556uwedu4zu5u6r0h2VIDTjRXg
    target_schema: sandbox
    target_table: my_google_sheet
    # create a new columns block
    columns:
      - name: column_a # name of the column in your sheet
        datatype: int
        identifier: "An AwkwardlyNamed Column Gosh that is ugly"

The identifier: field should be the exact copy of the Google Sheet column you want to rename. That is why we store it between " the name: field becomes the way you'll refer to the new column from now on. When no identifier is provided name will need to be the exact column in the google sheet.

Column Exclusion

It's not uncommon for google sheets to contain a lot more columns than you'd want. That's where exclude_colums: comes in handy! You can list as a many columns as you want to exclude from the import by adding a list of columns as shown below.

sheets:
  - sheet_name: test_sheet
    sheet_key: 10JJdwwd_lm4556uwedu4zu5u6r0h2VIDTjRXg
    target_schema: sandbox
    target_table: my_google_sheet
    # create a new columns block
    columns:
      - name: column_a # name of the column in your sheet
        datatype: int
        identifier: "An AwkwardlyNamed Column Gosh that is ugly"
    excluded_columns: ['Bad COlumn', 'another bad one I do not want']

Column Name Formatting

Not a fan of CamelCase?

For whatever reason, you might want to convert those CamelCased columns into snake_cased columns, snake_case_camel got your back!

In fact Snowflake imposes an ALL_CAPS convention on its column name. This means that ingesting the following column "CamelCasesCol" would end up in your database as this ugly, unreadable thing: CAMELCASESCOL 🤮

sheets:
  - sheet_name: test_sheet
    sheet_key: 10JJdwwd_lm4556uwedu4zu5u6r0h2VIDTjRXg
    target_schema: sandbox
    target_table: my_google_sheet
    # create a new columns block
    columns:
      - name: column_a # name of the column in your sheet
        datatype: int
        identifier: "An AwkwardlyNamed Column Gosh that is ugly"
    excluded_columns: ['Bad COlumn', 'another bad one I do not want']
    snake_case_camel: True
PreviousBasic ConfigurationNextsheetwork init

Last updated 5 years ago

Was this helpful?