Doing SQL Data Analysis As A PM

Any Product Manager should be able to do their own SQL-based Exploratory Data Analysis.

But the adhoc/exploratory tools aren't great.

What I've used

Problems with this approach...

  • some queries take a long time (e.g. multiple minutes), which kills your flow
  • by default, dBeaver overwrites your result-set-view each time, so if you want to "go back" you have to re-run the previous (slow) query
  • you're typically writing multiple queries in a single window, and executing just the selected query. But the result set isn't clearly associated with the specific query that generated it.
    • you might even start editing that query, so the query that generated the result set on the page doesn't exist anymore
  • at some point you probably copy a result set over to your spreadsheet, and do additional analysis (simple calculations, pivot tables, deleting garbage rows (the latest month-aggregate because it's a partial month), etc. Then do some graphing from part of a spreadsheet area
  • your spreadsheet doesn't link back to the query that made the result-set you pasted over

My hacked-up process

  • I make a (Jira) ticket for an analysis
  • I add a comment for each chunk of work
  • I paste my query into the comment
    • some time in 2021 a Jira update resulted in those pastes being double-spaced which is irritating
  • I paste a link to the gsheet tab into the comment
  • Sometimes I even find a good place to link from the gsheet tab back to the ticket
  • I summarize my conclusions back up in the ticket body itself
  • I share those conclusions by copy/pasting into Slack or Email, linking back to the ticket.
  • (I've also had a couple cases where I downloaded a result-set as CSV and then did some Python coding to summarize it... but that's probably an outlier...)
  • because of the manual steps, it's hard for anyone to re-run later
    • also because you often have a constant like a date-range in queries

What I want

  • something like Colab Notebook or nbdev? Glamorous Toolkit?
  • cloud-based so I can let anyone read or comment
  • process/structure that replaces Jira comments, ties narrative "here's what I'm going to check next" to query to result-set (auto-saving each one, though you can explicitly delete it)
  • way to flag a step as being correct/updated next, so people can easily skip that transient step
  • easy separation of parameters (date ranges) from queries using them
  • pivot-table feature
  • way to make the "wrap-around-bits" in a spreadsheet work as your replace the underlying result-set (which might result in a different number of rows) - does something like the old spreadsheet-script-records do that job?

Edited:    |       |    Search Twitter for discussion