select * from robinhood.stocks

Building a more privacy-centric personal finance dashboard

Sep 13, 2020

Over the years I've experienced varying levels of interest and success in monitoring my personal finances. I've played around with Mint, You Need a Budget, Clarity Money, Personal Capital, Tiller Money, and Copilot. There's no shortage of options when it comes to out-of-the-box dashboards, account monitoring, and tracking against a budget.

Despite the many options, I always found myself falling back to a spreadsheet, updated manually at the end of every month, to get a complete picture of my finances. Poking around, I learned I wasn't alone. Twitter, Reddit, and Hacker News are awash with recommendations for the best spreadsheet to switch to once you run into the limitations of personal finance apps.

So over the last few weeks I set out to build a more privacy-centric personal finance data platform — called Abstract Capital — that automates data collection and allows me to build financial dashboards in SQL.

And with a few hours of setup, you can too! I've open-sourced the code here and shared the learnings below.

Why Spreadsheets?

There were a few reasons why I always ended up opting for a spreadsheet over these finance apps...

Incomplete - My goal is to understand how my financial snapshot is evolving across savings, investments, and spending. I need to be able to review and manage all accounts. I found many apps didn't support all the sources of financial activity. These were typically modern or niche tools, like Venmo, Apple Card, and Coinbase.
Privacy Concerns - Using these apps to build a complete financial snapshot requires handing login credentials (generally username and password) to third-party tools. They mostly use Plaid under the hood to access account info and store additional data about you and your financial snapshot. It feels pretty icky to type a bank password into anything but your bank's login page. I don't necessarily want a startup with unknown privacy & security controls having a complete view of my finances.
Shallow Insights - At the end of all that setup, I didn't find the outputs super valuable. Maybe I'd get a cool visualization of my monthly spending, or an alert that I'd spent a lot on a particular category, but it was often wrong (e.g. I was getting paid back for a large group purchase that wasn't captured) or didn't provide me with the depth of understanding I was craving.

The challenge for these apps is that everyone has a different "financial stack" and use case. Some want to save, some want to forecast and inform large purchase decisions. Some have shared household accounts and some keep them separate. Some pay cash, some rely solely on credit cards, and some pay in Bitcoin. Catering to this diversity is hard!

"Gimme the Data!"

So in August 2020, I set out to prototype my own personal finance data platform. Rather than trying to build the next coolest personal finance app, I wanted to focus on getting myself a copy of my financial data. I called the service Abstract Capital 💸

My requirements for Abstract Capital:

Privacy-Preserving: needs to run in my own cloud account, be open-sourced, and provide me with access to the underlying data
Complete: needs to be able to build a complete snapshot, across all financial data sources
Flexible: needs to impose as few opinions about the data and final analysis as possible
Cheap: needs to be cheap to run...don't wanna spend $$$ to monitor $

I didn't have to look too far for inspiration. The original premise and environment for Segment (a customer data platform) is similar to the personal finance ecosystem of today. Marketing and analytics SaaS is highly fragmented, with many tools that focus on different use cases but all rely on the same underlying dataset.

Your financial budgeting, forecasting, monitoring, loan application, and dashboarding apps similarly rely on the same data: how much you've saved, how much you've invested, how much you owe, and how much you've spent.

My goal for Abstract Capital was to automate the extraction and storage of my own financial data, and control which apps and services could access that data.

How Abstract Capital Works

Abstract Capital turns out to be a pretty traditional Extract-Transform-Load (ETL) problem. We need to pull data from a variety of data sources ("financial accounts"), lightly massage it into a standard spec ("transactions", "investments"), and send it along to a central data bus that other apps can read from.
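To make the "lightly massage it into a standard spec" step concrete, here's a minimal sketch in Python. The `Transaction` dataclass, its field names, and the `normalize` helper are all hypothetical; they illustrate the idea and aren't the spec used in the open-sourced code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    """Hypothetical standard spec for one transaction (illustration only)."""
    transaction_id: str
    account_id: str
    posted_on: date
    merchant: str
    amount: float  # positive means money spent


def normalize(raw: dict) -> Transaction:
    """Lightly massage one raw source record into the standard spec."""
    return Transaction(
        transaction_id=raw["transaction_id"],
        account_id=raw["account_id"],
        posted_on=date.fromisoformat(raw["date"]),
        # Prefer a cleaned-up merchant name when the source provides one.
        merchant=(raw.get("merchant_name") or raw["name"]).strip(),
        amount=round(float(raw["amount"]), 2),
    )


# Made-up raw record, shaped like something a source might return.
raw_record = {
    "transaction_id": "txn_001",
    "account_id": "acct_001",
    "date": "2020-09-01",
    "name": " Taco Bell ",
    "merchant_name": None,
    "amount": "12.50",
}
record = normalize(raw_record)
```

Keeping the spec this thin is in the spirit of the "Flexible" requirement: the transform only fixes types and whitespace, and leaves any analysis to whatever reads the data later.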

The first (and biggest) problem I ran into was that the financial services I use aren't exactly developer friendly. Bank of America limits their API usage to certified partners and sends you your API credentials via Certified Mail™️. Many of the financial services don't have any public APIs whatsoever. It's a highly regulated industry, providing access to highly sensitive data, so I don't knock these institutions for not providing a Stripe-level developer experience.

Luckily, Plaid has solved the financial accounts access problem. Using Plaid cut against my goal of making Abstract Capital completely free of any third-party services, but Plaid is used by companies like Venmo, Coinbase, and Betterment, so I figured I had enough social proof for my MVP.

Here's how Abstract Capital works...

Auth with Plaid: Run a Plaid app to authenticate into your accounts
Sync: A scheduled sync periodically pulls data from your accounts (via Plaid) and writes collection records to topics in pubsub. I set the scheduler to run hourly to start.
Consumers: Consumers will subscribe to pubsub and write to destinations (e.g. BigQuery, Airtable)
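The reason for putting pubsub in the middle is fan-out: the sync publishes once, and every consumer gets its own copy. Here's a toy sketch of that pattern in Python. `InMemoryPubSub` is a hypothetical in-process stand-in I made up for illustration, not the Google Cloud client, and the two "destinations" are just lists.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryPubSub:
    """Toy in-process stand-in for a managed pub/sub service (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Every subscriber on the topic receives its own copy of the message.
        for handler in self._subscribers[topic]:
            handler(dict(message))


bus = InMemoryPubSub()

# Two pretend destinations, e.g. a warehouse table and a spreadsheet.
warehouse_rows = []
spreadsheet_rows = []
bus.subscribe("transactions", warehouse_rows.append)
bus.subscribe("transactions", spreadsheet_rows.append)

# The sync step only needs to know the topic name, not who is listening.
bus.publish("transactions", {"name": "Taco Bell", "amount": 12.5})
```

Adding a new destination then means adding a subscriber, with no changes to the sync.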

Example: Tracing a Transaction

Let's run through an example run for syncing a transaction. First, we need to create an "Item" (i.e. connect to an account) in Plaid to read credit card transactions. Plaid Link handles all the credential validation and multi-factor auth for us, and provides us an access token we can use to query for our data.

Once we've authenticated, we can query Plaid for Transaction objects. Each one contains all the transaction data (amount, date, merchant) and a join key to an account object. Plaid even enriches it with its own categorization and location of the transaction.
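For a rough idea of the shape, here's an illustrative transaction written as a Python dict. The field names are modeled on Plaid's transaction schema as I understand it, and every value is made up; check Plaid's API reference for the authoritative list of fields.

```python
# Illustrative only: values are invented, field names are an assumption
# based on Plaid's documented transaction schema.
account = {
    "account_id": "acct_credit_card_1",
    "name": "Credit Card",
    "type": "credit",
}

transaction = {
    "transaction_id": "txn_abc123",
    "account_id": "acct_credit_card_1",  # join key to the account object
    "amount": 12.5,
    "iso_currency_code": "USD",
    "date": "2020-09-01",
    "name": "Taco Bell",
    "pending": False,
    # Enrichment added by Plaid: categorization and location.
    "category": ["Food and Drink", "Restaurants", "Fast Food"],
    "location": {"city": "San Francisco", "region": "CA", "country": "US"},
}

# Resolving the join key back to the account it belongs to.
accounts_by_id = {account["account_id"]: account}
joined_account = accounts_by_id[transaction["account_id"]]
```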

The scheduler kicks off the syncTransactions function (hourly). It queries for the last 30 days of transactions and pushes each transaction into a transactions pubsub topic.
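Here's a hedged sketch of what that sync step might look like, written in Python for illustration. The `fetch` and `publish` callables are hypothetical stand-ins for the Plaid and pubsub clients, so the windowing logic can be shown without real credentials; the actual syncTransactions function in the repo may differ.

```python
from datetime import date, timedelta
from typing import Callable, Iterable

FetchFn = Callable[[date, date], Iterable[dict]]
PublishFn = Callable[[str, dict], None]


def sync_transactions(fetch: FetchFn, publish: PublishFn,
                      today: date, window_days: int = 30) -> int:
    """Fetch the trailing window of transactions and publish each one."""
    start = today - timedelta(days=window_days)
    count = 0
    for txn in fetch(start, today):
        publish("transactions", txn)
        count += 1
    return count


# Fake data source standing in for the Plaid call (assumption).
SAMPLE = [
    {"transaction_id": "t1", "date": "2020-09-10", "name": "Taco Bell", "amount": 12.5},
    {"transaction_id": "t2", "date": "2020-07-01", "name": "Burger King", "amount": 9.25},
]


def fake_fetch(start: date, end: date) -> list:
    return [t for t in SAMPLE if start <= date.fromisoformat(t["date"]) <= end]


published = []
sent = sync_transactions(
    fake_fetch,
    lambda topic, message: published.append((topic, message)),
    today=date(2020, 9, 13),
)
```

One consequence worth noting: an hourly run over a 30-day window republishes the same transactions many times, so whatever reads the topic needs to tolerate duplicates.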

I've set up a Transactions table in BigQuery as an initial Consumer. Our loadTransactions function subscribes to the transactions topic and writes to BigQuery.
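As a sketch of the consumer side, here's a loader in Python that uses SQLite as a local stand-in for BigQuery. The table layout, the JSON message format, and the upsert-by-`transaction_id` behavior are my assumptions for illustration; they are not taken from the actual loadTransactions function.

```python
import json
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS transactions (
            transaction_id TEXT PRIMARY KEY,
            name TEXT,
            date TEXT,
            amount REAL,
            category TEXT
        )
        """
    )


def load_transaction(conn: sqlite3.Connection, message: bytes) -> None:
    """Handle one message from the transactions topic."""
    txn = json.loads(message)
    # Upsert keyed on transaction_id, so a redelivered or republished
    # message overwrites the existing row instead of duplicating it.
    conn.execute(
        "INSERT OR REPLACE INTO transactions "
        "(transaction_id, name, date, amount, category) VALUES (?, ?, ?, ?, ?)",
        (
            txn["transaction_id"],
            txn["name"],
            txn["date"],
            txn["amount"],
            " > ".join(txn.get("category") or []),
        ),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_table(conn)

first = {"transaction_id": "t1", "name": "Taco Bell", "date": "2020-09-10",
         "amount": 12.5, "category": ["Food and Drink", "Restaurants"]}
updated = dict(first, amount=12.75)  # same transaction, seen again later

load_transaction(conn, json.dumps(first).encode())
load_transaction(conn, json.dumps(updated).encode())

row_count = conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
stored = conn.execute("SELECT name, amount, category FROM transactions").fetchone()
```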

Now we can query our credit card transactions in SQL!

select name, date, amount, category 
from transactions  
where name in ('Taco Bell', 'Burger King')

Getting to the Dashboard

Finally, I connected Mode Analytics and built out an interactive dashboard. I built widgets for Total Monthly Spend, Cumulative Spend, Spend Breakdown by Merchant, and Weekly Spend by Merchant.

There are also widgets to review Big Purchases and All Purchases to scan for anything out of the ordinary.
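To give a flavor of the queries behind widgets like these, here's a runnable sketch using Python's built-in sqlite3 module and made-up rows. The column names follow the query earlier in the post; the SQL is SQLite dialect (BigQuery's date functions differ), and the $500 "big purchase" threshold is an arbitrary assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (name TEXT, date TEXT, amount REAL, category TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("Taco Bell", "2020-08-03", 12.5, "Food and Drink"),
        ("Burger King", "2020-08-15", 9.25, "Food and Drink"),
        ("Airline", "2020-08-20", 640.0, "Travel"),
        ("Taco Bell", "2020-09-02", 8.75, "Food and Drink"),
        ("Grocery Store", "2020-09-05", 82.0, "Shops"),
    ],
)

# Total Monthly Spend: bucket by the YYYY-MM prefix of the date.
monthly_spend = conn.execute(
    """
    SELECT substr(date, 1, 7) AS month, ROUND(SUM(amount), 2) AS total
    FROM transactions
    GROUP BY month
    ORDER BY month
    """
).fetchall()

# Spend Breakdown by Merchant: biggest merchants first.
spend_by_merchant = conn.execute(
    """
    SELECT name, ROUND(SUM(amount), 2) AS total
    FROM transactions
    GROUP BY name
    ORDER BY total DESC
    """
).fetchall()

# Big Purchases: anything at or above the (assumed) threshold.
BIG_PURCHASE_THRESHOLD = 500
big_purchases = conn.execute(
    "SELECT name, date, amount FROM transactions "
    "WHERE amount >= ? ORDER BY amount DESC",
    (BIG_PURCHASE_THRESHOLD,),
).fetchall()
```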

Cost to Run Abstract Capital

Turns out that due to generous free plans on Google Cloud Platform, Abstract Capital is completely free to run.

Plaid: Free developer account allows you to link up to 100 credentials.
Cloud Functions: Free for 2 million invocations per month
Google PubSub: Free for the first 10 GB
BigQuery: Storage is free for the first 10 GB, querying is free for the first 1 TB
Mode: Studio plan allows you to use R, Python, and SQL for free
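A quick back-of-the-envelope check shows how much headroom those free tiers leave. The limits below are the ones listed above; everything else (200 transactions per 30-day window, about 1 KB per message, one loader invocation per published message) is an assumption I picked for illustration.

```python
# Free-tier limits as listed above.
FREE_FUNCTION_INVOCATIONS = 2_000_000
FREE_PUBSUB_GB = 10

# Assumptions for illustration only.
SYNCS_PER_MONTH = 24 * 30       # hourly scheduler over a 30-day month
TRANSACTIONS_PER_SYNC = 200     # transactions in the trailing 30-day window
BYTES_PER_MESSAGE = 1_000       # roughly 1 KB of JSON per transaction

messages_per_month = SYNCS_PER_MONTH * TRANSACTIONS_PER_SYNC
# One sync invocation per hour, plus one loader invocation per message.
total_invocations = SYNCS_PER_MONTH + messages_per_month
pubsub_gb = messages_per_month * BYTES_PER_MESSAGE / 1e9

invocation_share = total_invocations / FREE_FUNCTION_INVOCATIONS

print(f"{total_invocations:,} invocations ({invocation_share:.1%} of the free tier)")
print(f"{pubsub_gb:.3f} GB through pubsub (limit: {FREE_PUBSUB_GB} GB)")
```

Under these assumptions the monthly usage is a small fraction of both limits, even though every hourly sync republishes the whole 30-day window.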

I wasn't expecting this to be completely free, but most of these services are designed for businesses and much higher volumes of data. We'll take it!

Where to from here?

Overall I've learned a ton from the Abstract Capital prototype. I now own a copy of my financial data, have an up-to-date dashboard I can review, and can write a bit of SQL to dig into any changes or anomalies. It's empowering. There are additional destinations beyond BigQuery that could be helpful: Airtable, Google Sheets, Twilio (text alerts for large transactions), and TurboTax come to mind.

I'd love to see more companies take a data- and privacy-centric approach to helping consumers navigate their finances. I believe there's room in the FinTech ecosystem for this type of personal finance middleware to power new use cases. There are more products than ever providing consumer financial services, so a complete financial picture is becoming ever more difficult to assemble. And the need to give third-party services access to that data is also growing. Imagine being able to provide temporary access to your financial data to apply for a mortgage or try out the latest budgeting app.

Finally, there seems to be an emerging opportunity in helping consumers own their data and run apps in more privacy-preserving ways. Spinning up a BigQuery instance and writing ETL jobs isn't the way most people want to spend their nights and weekends. As deployment models in the enterprise shift from Cloud SaaS to in-VPC ("Cloud Prem Architecture"), there seems to be a need on the horizon for a Consumer Cloud that simplifies cloud offerings and allows consumers to deploy personal apps without ceding their privacy and data.

Thanks to Calvin, Nate, Jason, Osama, Tejas, Alexia, Tim, Niels and JJ for the feedback on this project and blog post.
