Building a more privacy-centric personal finance dashboard
Over the years I've experienced varying levels of interest and success in monitoring my personal finances. I've played around with Mint, You Need a Budget, Clarity Money, Personal Capital, Tiller Money, and Copilot. There's no shortage of options when it comes to out-of-the-box dashboards, account monitoring, and tracking against a budget.
Despite the many options, I always found myself falling back to a spreadsheet, updated manually at the end of every month, to get a complete picture of my finances. Poking around, I learned I wasn't alone. Twitter, Reddit, and Hacker News are awash with recommendations for the best spreadsheet to start using after you run into the limitations of personal finance apps.
So over the last few weeks I set out to build a more privacy-centric personal finance data platform — called Abstract Capital — that automates data collection and allows me to build financial dashboards in SQL.
And with a few hours of setup, you can too! I've open-sourced the code here and shared the learnings below.
There were a few reasons why I always ended up opting for a spreadsheet over these finance apps...
The challenge for these apps is that everyone has a different "financial stack" and use case. Some want to save, some want to forecast and inform large purchase decisions. Some have shared household accounts and some keep them separate. Some pay cash, some rely solely on credit cards, and some pay in Bitcoin. Catering to this diversity is hard!
So in August 2020, I set out to prototype my own personal finance data platform. Rather than trying to build the next coolest personal finance app, I wanted to focus on getting myself a copy of my financial data. I called the service Abstract Capital 💸
My requirements for Abstract Capital:
I didn't have to look too far for inspiration. The original premise and environment for Segment (a customer data platform) is similar to the personal finance ecosystem of today. Marketing and analytics SaaS is highly fragmented: there are many tools that focus on different use cases, but they all rely on the same underlying dataset.
Your financial budgeting, forecasting, monitoring, loan application, and dashboarding apps similarly rely on the same data: how much you've saved, how much you've invested, how much you owe, and how much you've spent.
My goal for Abstract Capital was to automate the extraction and storage of my own financial data, and control which apps and services could access that data.
Abstract Capital turns out to be a pretty traditional Extract-Transform-Load (ETL) problem. We need to pull data from a variety of data sources ("financial accounts"), lightly massage it into a standard spec ("transactions", "investments"), and send it along to a central data bus that other apps can read from.
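To make the shape of that pipeline concrete, here is a minimal in-memory sketch. It is not the code from the repo: `DataBus` and `run_pipeline` are hypothetical names, and a real deployment would use a hosted pub/sub service rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Message = dict
Consumer = Callable[[Message], None]


class DataBus:
    """Tiny in-memory stand-in for a pub/sub data bus (hypothetical)."""

    def __init__(self) -> None:
        self._consumers: Dict[str, List[Consumer]] = defaultdict(list)

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        # Register a consumer that wants every message on this topic.
        self._consumers[topic].append(consumer)

    def publish(self, topic: str, message: Message) -> int:
        # Deliver the message to every subscriber; return how many got it.
        for consumer in self._consumers[topic]:
            consumer(message)
        return len(self._consumers[topic])


def run_pipeline(
    extract: Callable[[], Iterable[dict]],
    transform: Callable[[dict], Message],
    bus: DataBus,
    topic: str,
) -> int:
    """Extract raw records, transform each one, and publish it to the bus."""
    count = 0
    for raw in extract():
        bus.publish(topic, transform(raw))
        count += 1
    return count
```

The point of the bus is that sources never need to know which apps read the data: adding a new destination is just another `subscribe` call.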
The first (and biggest) problem I ran into was that the financial services I use aren't exactly developer friendly. Bank of America limits its API usage to certified partners and sends you your API credentials via Certified Mail™️. Many of the financial services don't have any public APIs whatsoever. It's a highly regulated industry, providing access to highly sensitive data, so I don't knock these institutions for not providing a Stripe-level developer experience.
Luckily, Plaid has solved the financial accounts access problem. Using Plaid cut against my goal of making Abstract Capital completely free of any third-party services, but Visa has announced plans to acquire Plaid and it's used by companies like Venmo, Coinbase, and Betterment, so I figured I had enough social proof for my MVP.
Here's how Abstract Capital works...
Let's walk through an example: syncing a credit card transaction. First, we need to create an "Item" (i.e. connect to an account) in Plaid to read credit card transactions. Plaid Link handles all the credential validation and multi-factor auth for us, and provides us with an access token we can use to query for our data.
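For the curious, the last step of that handshake can be sketched as follows. In Plaid's documented flow, Link hands the client a short-lived public token, and the server exchanges it for the long-lived access token. This is a rough stdlib-only illustration, not the code from the repo: the function names are hypothetical and the sandbox host is assumed for testing.

```python
import json
import urllib.request

# Assumption: Plaid's sandbox environment, intended for testing only.
PLAID_HOST = "https://sandbox.plaid.com"


def build_exchange_request(
    client_id: str, secret: str, public_token: str
) -> urllib.request.Request:
    """Build (but do not send) the public-token exchange request."""
    body = json.dumps(
        {"client_id": client_id, "secret": secret, "public_token": public_token}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{PLAID_HOST}/item/public_token/exchange",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def exchange_public_token(client_id: str, secret: str, public_token: str) -> str:
    """Send the request and return the access token from the response."""
    request = build_exchange_request(client_id, secret, public_token)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["access_token"]
```

The access token is the sensitive part: it should live in a secret store on the server and never be sent back to the browser.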
Once we've authenticated, we can see an example Transaction object returned from Plaid (below). It contains all the transaction data (amount, date, merchant) and a join key to an account object. Plaid even enriches it with its own categorization and location of the transaction.
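As an illustration of the "lightly massage" step, here is a sketch that maps a Plaid-style transaction into a small standard record. The field names are assumed from Plaid's public transaction schema, the values in `EXAMPLE_RAW` are made up (apart from the Taco Bell amount mentioned later in this post), and `Transaction` and `normalize` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Illustrative payload only: IDs, date, and categories are invented.
EXAMPLE_RAW = {
    "transaction_id": "tx-123",
    "account_id": "acct-456",
    "name": "TACO BELL #1234",
    "merchant_name": "Taco Bell",
    "date": "2020-08-01",
    "amount": 33.16,
    "category": ["Food and Drink", "Restaurants", "Fast Food"],
}


@dataclass(frozen=True)
class Transaction:
    """Hypothetical standard spec for a transaction."""

    transaction_id: str
    account_id: str  # join key to the account object
    name: str
    date: date
    amount: float
    category: Optional[str]


def normalize(raw: dict) -> Transaction:
    """Map a raw Plaid-style dict into the standard Transaction record."""
    categories = raw.get("category") or []
    return Transaction(
        transaction_id=raw["transaction_id"],
        account_id=raw["account_id"],
        # Prefer the cleaned-up merchant name when the source provides one.
        name=raw.get("merchant_name") or raw["name"],
        date=date.fromisoformat(raw["date"]),
        amount=float(raw["amount"]),
        # Keep only the top-level category for simple grouping.
        category=categories[0] if categories else None,
    )
```

Keeping the spec this small is a deliberate choice in the sketch: anything a downstream app might need later can be added as a column, but every source has to be able to fill the required fields.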
The scheduler kicks off the syncTransactions function hourly. It queries for the last 30 days of transactions and pushes each transaction into a transactions Pub/Sub topic.
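The logic of that sync step might look roughly like this. It is a sketch under stated assumptions rather than the actual function: `fetch` and `publish` are hypothetical callables standing in for the Plaid client and the Pub/Sub publisher, so the example runs without either service.

```python
import json
from datetime import date, timedelta
from typing import Callable, Iterable, Tuple


def sync_window(today: date, days: int = 30) -> Tuple[str, str]:
    """Return ISO start and end dates covering the last `days` days."""
    start = today - timedelta(days=days)
    return start.isoformat(), today.isoformat()


def sync_transactions(
    fetch: Callable[[str, str], Iterable[dict]],
    publish: Callable[[str, bytes], None],
    today: date,
    topic: str = "transactions",
) -> int:
    """Fetch the recent window and publish one message per transaction."""
    start_date, end_date = sync_window(today)
    published = 0
    for tx in fetch(start_date, end_date):
        # Pub/Sub payloads are bytes, so serialize each transaction to JSON.
        publish(topic, json.dumps(tx, sort_keys=True).encode("utf-8"))
        published += 1
    return published
```

One consequence of re-reading a 30-day window every hour is that the same transaction gets published many times, so consumers need a way to deduplicate, for example by keying on the transaction ID.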
I've set up a Transactions table in BigQuery as an initial Consumer. Our loadTransactions function subscribes to the transactions topic and writes to BigQuery.
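A consumer along those lines could be sketched like this. The assumptions: the function is triggered by a Pub/Sub message whose payload arrives base64-encoded under a `data` key, and `insert_rows` is a hypothetical callable that wraps whatever BigQuery client is in use. The column names match the query shown later in this post.

```python
import base64
import json
from typing import Callable, List


def decode_message(event: dict) -> dict:
    """Decode the base64-encoded JSON payload of a Pub/Sub-style event."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def to_row(tx: dict) -> dict:
    """Pick the columns the Transactions table is assumed to have."""
    categories = tx.get("category") or []
    return {
        "transaction_id": tx["transaction_id"],
        "name": tx["name"],
        "date": tx["date"],
        "amount": float(tx["amount"]),
        "category": categories[0] if categories else None,
    }


def load_transactions(
    event: dict, insert_rows: Callable[[List[dict], List[str]], None]
) -> dict:
    """Decode one message and hand the row to the warehouse writer."""
    row = to_row(decode_message(event))
    # Pass the transaction ID as a row ID so the writer can deduplicate.
    insert_rows([row], [row["transaction_id"]])
    return row
```

Passing the transaction ID alongside each row is one way to keep repeated syncs from piling up duplicates, provided the writer honors it.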
Now we can query our credit card transactions in SQL!
select name, date, amount, category
from transactions
where name in ('Taco Bell', 'Burger King')
If only that $33.16 at Taco Bell was a data issue 😱🌮
Finally, I connected Mode Analytics and built out an interactive dashboard. I built widgets for Total Monthly Spend, Cumulative Spend, Spend Breakdown by Merchant, and Weekly Spend by Merchant.
There are also widgets to review Big Purchases and All Purchases to scan for anything out of the ordinary.
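The dashboard itself is driven by SQL, but the aggregations behind a few of those widgets are simple enough to sketch in plain Python. The function names are hypothetical, and the sketch assumes each row has `name`, `date` (an ISO string), and `amount`, with positive amounts meaning money spent.

```python
from collections import defaultdict
from itertools import accumulate
from typing import Dict, List, Tuple


def monthly_spend(rows: List[dict]) -> Dict[str, float]:
    """Total spend per calendar month, keyed by 'YYYY-MM'."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["date"][:7]] += row["amount"]
    return {month: round(totals[month], 2) for month in sorted(totals)}


def cumulative_spend(rows: List[dict]) -> List[Tuple[str, float]]:
    """Running total of spend, ordered by transaction date."""
    ordered = sorted(rows, key=lambda row: row["date"])
    running = accumulate(row["amount"] for row in ordered)
    return [(row["date"], round(total, 2)) for row, total in zip(ordered, running)]


def spend_by_merchant(rows: List[dict]) -> List[Tuple[str, float]]:
    """Total spend per merchant, largest first."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["name"]] += row["amount"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(total, 2)) for name, total in ranked]
```

Each function corresponds to a single grouped query, which is why a SQL-based dashboard tool is a natural fit for this kind of data.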
Turns out that due to generous free plans on Google Cloud Platform, Abstract Capital is completely free to run.
I wasn't expecting this to be completely free, but most of these services are designed for businesses and much higher volumes of data. We'll take it!
Overall I've learned a ton from the Abstract Capital prototype. I now own a copy of my financial data, have an up-to-date dashboard I can review, and can write a bit of SQL to dig into any changes or anomalies. It's empowering. There are additional destinations beyond BigQuery that could be helpful: Airtable, Google Sheets, Twilio (text alerts for large transactions), and TurboTax come to mind.
I'd love to see more companies take a data- and privacy-centric approach to helping consumers navigate their finances. I believe there's room in the FinTech ecosystem for this type of personal finance middleware to power new use cases. More products than ever provide consumer financial services, so a complete financial picture is becoming ever more difficult to assemble. The need to give third-party services access to that data is also growing: imagine being able to provide temporary access to your financial data to apply for a mortgage or try out the latest budgeting app.
Finally, there seems to be an emerging opportunity in helping consumers own their data and run apps in more privacy-preserving ways. Spinning up a BigQuery instance and writing ETL jobs isn't the way most people want to spend their nights and weekends. As deployment models in the enterprise shift from Cloud SaaS to in-VPC ("Cloud Prem Architecture"), there seems to be a need on the horizon for a Consumer Cloud that simplifies cloud offerings and allows consumers to deploy personal apps without ceding their privacy and data.
Feel free to share any comments or feedback on Twitter. Thanks to Calvin, Nate, Jason, Osama, Tejas, Alexia, Tim, Niels and JJ for the feedback on this project and blog post.