sheetwork
v1.0.0 Nicolas Jaar

Introduction

Last updated 4 years ago

What is sheetwork?

sheetwork is a handy open-source CLI tool that lets non-coders ingest Google Sheets spreadsheets directly into their databases, with control over data types, column renaming, and basic data sanitisation.

sheetwork is still at a very early stage. Do not use it for production jobs unless you have taken the time to test it thoroughly.

Compatibility: sheetwork is developed and tested on Python >= 3.6 and Mac OS >= 0.15. Integration tests pass on Mac OS and Ubuntu but fail on Windows, due to an issue with gspread (the library we use to connect to Google Sheets) accessing APPDATA. Beyond that, it has not been properly tested on Windows. Check how to contribute to the project if you would like to try it or help.

sheetwork currently only supports the Snowflake cloud database. However, its design follows an adapter pattern (currently in the making) and can be extended to interact with most databases. Feel free to check how you can contribute to the project.

How to use sheetwork?

Quick & "Dirty"

Using sheetwork is as quick and simple as writing the following line:

$ sheetwork upload --sheet_key 123afakekey11 --schema sandbox --table my_table

With this approach, control over the content of your Google Sheet and how it lands in the database is limited, and all columns are materialised in your database as strings.
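
Landing every column as a string matters more than it may seem. As a small illustration in plain Python (this is not sheetwork code, and the sample values are made up), string values sort lexicographically rather than numerically until they are cast:

```python
# Made-up cell values, as they would look if every cell were kept as a string.
raw_values = ["9", "10", "2"]

# Sorting strings compares them character by character.
as_strings = sorted(raw_values)

# Casting first gives the numeric order most people expect.
as_numbers = sorted(int(value) for value in raw_values)

print(as_strings)  # ['10', '2', '9']
print(as_numbers)  # [2, 9, 10]
```

The same kind of surprise can show up in comparisons and aggregations downstream, which is what the type casting described in the next section is meant to avoid.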

sheetwork project & YAML configuration

For maximum control over your content, sheetwork uses .yml files to let you configure the operations it performs on your Google Sheet, such as type casting, renaming, and sanitisation. Here is what a basic sheets.yml file looks like:

sheets.yml
sheets:
  - sheet_name: test_sheet
    sheet_key: 123afakekey
    target_schema: sandbox
    target_table: my_google_sheet
    snake_case_camel: True
    columns:
      - name: col_numeric
        datatype: numeric
      - name: col_b
        datatype: varchar
      - name: renamed_col
        identifier: "A long and dirty name"
    excluded_columns: ['to_exclude', 'col_not_in_df_for_fun']
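
To make these options concrete, here is a minimal Python sketch of the kind of transformation such a configuration describes: dropping excluded columns, renaming a column via its identifier, and casting types. It is an illustration only, not sheetwork's actual implementation. The function `apply_sheet_config`, the `CASTS` mapping, and the sample row are hypothetical, and the sketch assumes that unlisted columns are kept as strings, that excluded columns missing from the sheet are ignored, and that `numeric` maps to a Python float. The `snake_case_camel` option is left out.

```python
# Mirrors the columns and excluded_columns keys of the sheets.yml example,
# written as a dict so the sketch needs no YAML library.
CONFIG = {
    "columns": [
        {"name": "col_numeric", "datatype": "numeric"},
        {"name": "col_b", "datatype": "varchar"},
        {"name": "renamed_col", "identifier": "A long and dirty name"},
    ],
    "excluded_columns": ["to_exclude", "col_not_in_df_for_fun"],
}

# Assumed mapping from config datatypes to Python casts (illustrative only).
CASTS = {"numeric": float, "varchar": str}


def apply_sheet_config(rows, config):
    """Apply exclusions, renames and casts to rows given as dicts of strings."""
    excluded = set(config.get("excluded_columns", []))
    # Index column specs by the header they match in the original sheet:
    # the identifier when present, otherwise the name itself.
    specs = {c.get("identifier", c["name"]): c for c in config.get("columns", [])}
    result = []
    for row in rows:
        out = {}
        for header, value in row.items():
            if header in excluded:
                continue  # dropped entirely
            spec = specs.get(header)
            if spec is None:
                out[header] = value  # unlisted column: kept as a string
                continue
            cast = CASTS.get(spec.get("datatype"), str)  # no datatype: string
            out[spec["name"]] = cast(value)
        result.append(out)
    return result


# Hypothetical sheet content: every cell arrives as a string.
rows = [
    {
        "col_numeric": "1.5",
        "col_b": "hello",
        "A long and dirty name": "x",
        "to_exclude": "drop me",
        "other": "kept",
    }
]
cleaned = apply_sheet_config(rows, CONFIG)
print(cleaned)
```

In this sketch the row comes out with `col_numeric` as a number, the long header renamed to `renamed_col`, and `to_exclude` removed, while `other` passes through untouched.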

All it takes to ingest your newly configured sheet is to call it by name on the command line:

$ sheetwork upload --sheet_name test_sheet

Yep! It's that simple!

Now of course there's a lot more to it, so we'll see you in another section for sure!

how to contribute to the project
Snowflake
contribute to the project