diff options
author | Christian Cleberg <hello@cleberg.net> | 2024-04-27 17:01:13 -0500 |
---|---|---|
committer | Christian Cleberg <hello@cleberg.net> | 2024-04-27 17:01:13 -0500 |
commit | 74992aaa27eb384128924c4a3b93052961a3eaab (patch) | |
tree | d5193997d72a52f7a6d6338ea5da8a6c80b4eddc /content/blog/2022-03-03-financial-database.org | |
parent | 3def68d80edf87e28473609c31970507d9f03467 (diff) | |
download | cleberg.net-74992aaa27eb384128924c4a3b93052961a3eaab.tar.gz cleberg.net-74992aaa27eb384128924c4a3b93052961a3eaab.tar.bz2 cleberg.net-74992aaa27eb384128924c4a3b93052961a3eaab.zip |
test conversion back to markdown
Diffstat (limited to 'content/blog/2022-03-03-financial-database.org')
-rw-r--r-- | content/blog/2022-03-03-financial-database.org | 256 |
1 files changed, 0 insertions, 256 deletions
diff --git a/content/blog/2022-03-03-financial-database.org b/content/blog/2022-03-03-financial-database.org deleted file mode 100644 index 55a6473..0000000 --- a/content/blog/2022-03-03-financial-database.org +++ /dev/null @@ -1,256 +0,0 @@ -#+title: Maintaining a Personal Financial Database -#+date: 2022-03-03 -#+description: An example project showing to build and maintain a simple financial database. -#+filetags: :personal: - -* Personal Financial Tracking -For the last 6-ish years, I've tracked my finances in a spreadsheet. -This is common practice in the business world, but any good dev will -cringe at the thought of storing long-term data in a spreadsheet. A -spreadsheet is not for long-term storage or as a source of data to pull -data/reports. - -As I wanted to expand the functionality of my financial data (e.g., -adding more reports), I decided to migrate the data into a database. To -run reports, I would query the database and use a language like Python -or Javascript to process the data, perform calculations, and visualize -the data. - -* SQLite -When choosing the type of database I wanted to use for this project, I -was split between three options: - -1. MySQL: The database I have the most experience with and have used for - years. -2. PostgreSQL: A database I'm new to, but want to learn. -3. SQLite: A database that I've used for a couple projects and have - moderate experience. - -I ended up choosing SQLite since it can be maintained within a single -=.sqlite= file, which allows me more flexibility for storage and backup. -I keep this file in my cloud storage and pull it up whenever needed. - -** GUI Editing -Since I didn't want to try and import 1000--1500 records into my new -database via the command line, I opted to use -[[https://sqlitebrowser.org/][DB Browser for SQLite (DB4S)]] as a GUI -tool. This application is excellent, and I don't see myself going back -to the CLI when working in this database. - -DB4S allows you to copy a range of cells from a spreadsheet and paste it -straight into the SQL table. I used this process for all 36 accounts, -1290 account statements, and 126 pay statements. Overall, I'm guessing -this took anywhere between 4--8 hours. In comparison, it probably took -me 2-3 days to initially create the spreadsheet. - -#+caption: DB4S -[[https://img.cleberg.net/blog/20220303-maintaining-a-personal-financial-database/db4s.png]] - -** Schema -The schema for this database is actually extremely simple and involves -only three tables (for now): - -1. Accounts -2. Statements -3. Payroll - -*Accounts* - -The Accounts table contains summary information about an account, such -as a car loan or a credit card. By viewing this table, you can find -high-level data, such as interest rate, credit line, or owner. - -#+begin_src sql -CREATE TABLE "Accounts" ( - "AccountID" INTEGER NOT NULL UNIQUE, - "AccountType" TEXT, - "AccountName" TEXT, - "InterestRate" NUMERIC, - "CreditLine" NUMERIC, - "State" TEXT, - "Owner" TEXT, - "Co-Owner" TEXT, - PRIMARY KEY("AccountID" AUTOINCREMENT) -) -#+end_src - -*Statements* - -The Statements table uses the same unique identifier as the Accounts -table, meaning you can join the tables to find a monthly statement for -any of the accounts listed in the Accounts table. Each statement has an -account ID, statement date, and total balance. - -#+begin_src sql -CREATE TABLE "Statements" ( - "StatementID" INTEGER NOT NULL UNIQUE, - "AccountID" INTEGER, - "StatementDate" INTEGER, - "Balance" NUMERIC, - PRIMARY KEY("StatementID" AUTOINCREMENT), - FOREIGN KEY("AccountID") REFERENCES "Accounts"("AccountID") -) -#+end_src - -*Payroll* - -The Payroll table is a separate entity, unrelated to the Accounts or -Statements tables. This table contains all information you would find on -a pay statement from an employer. As you change employers or obtain new -perks/benefits, just add new columns to adapt to the new data. - -#+begin_src sql -CREATE TABLE "Payroll" ( - "PaycheckID" INTEGER NOT NULL UNIQUE, - "PayDate" TEXT, - "Payee" TEXT, - "Employer" TEXT, - "JobTitle" TEXT, - "IncomeRegular" NUMERIC, - "IncomePTO" NUMERIC, - "IncomeHoliday" NUMERIC, - "IncomeBonus" NUMERIC, - "IncomePTOPayout" NUMERIC, - "IncomeReimbursements" NUMERIC, - "FringeHSA" NUMERIC, - "FringeStudentLoan" NUMERIC, - "Fringe401k" NUMERIC, - "PreTaxMedical" NUMERIC, - "PreTaxDental" NUMERIC, - "PreTaxVision" NUMERIC, - "PreTaxLifeInsurance" NUMERIC, - "PreTax401k" NUMERIC, - "PreTaxParking" NUMERIC, - "PreTaxStudentLoan" NUMERIC, - "PreTaxOther" NUMERIC, - "TaxFederal" NUMERIC, - "TaxSocial" NUMERIC, - "TaxMedicare" NUMERIC, - "TaxState" NUMERIC, - PRIMARY KEY("PaycheckID" AUTOINCREMENT) -) -#+end_src - -** Python Reporting -Once I created the database tables and imported all my data, the only -step left was to create a process to report and visualize on various -aspects of the data. - -In order to explore and create the reports I'm interested in, I utilized -a two-part process involving Jupyter Notebooks and Python scripts. - -1. Step 1: Jupyter Notebooks - - When I need to explore data, try different things, and re-run my code - cell-by-cell, I use Jupyter Notebooks. For example, I explored the - =Accounts= table until I found the following useful information: - - #+begin_src python - import sqlite3 - import pandas as pd - import matplotlib - - # Set up database filename and connect - db = "finances.sqlite" - connection = sqlite3.connect(db) - df = pd.read_sql_query("SELECT ** FROM Accounts", connection) - - # Set global matplotlib variables - %matplotlib inline - matplotlib.rcParams['text.color'] = 'white' - matplotlib.rcParams['axes.labelcolor'] = 'white' - matplotlib.rcParams['xtick.color'] = 'white' - matplotlib.rcParams['ytick.color'] = 'white' - matplotlib.rcParams['legend.labelcolor'] = 'black' - - # Display graph - df.groupby(['AccountType']).sum().plot.pie(title='Credit Line by Account Type', y='CreditLine', figsize=(5,5), autopct='%1.1f%%') - #+end_src - -2. Step 2: Python Scripts - - Once I explored enough through the notebooks and had a list of - reports I wanted, I moved on to create a Python project with the - following structure: - - #+begin_src txt - finance/ - ├── notebooks/ - │ │ ├── account_summary.ipynb - │ │ ├── account_details.ipynb - │ │ └── payroll.ipynb - ├── public/ - │ │ ├── image-01.png - │ │ └── image-0X.png - ├── src/ - │ └── finance.sqlite - ├── venv/ - ├── _init.py - ├── database.py - ├── process.py - ├── requirements.txt - └── README.md - #+end_src - - This structure allows me to: - - 1. Compile all required python packages into =requirements.txt= for - easy installation if I move to a new machine. - 2. Activate a virtual environment in =venv/= so I don't need to - maintain a system-wide Python environment just for this project. - 3. Keep my =notebooks/= folder to continuously explore the data as I - see fit. - 4. Maintain a local copy of the database in =src/= for easy access. - 5. Export reports, images, HTML files, etc. to =public/=. - - Now, onto the differences between the code in a Jupyter Notebook and - the actual Python files. To create the report in the Notebook snippet - above, I created the following function inside =process.py=: - - #+begin_src python - # Create summary pie chart - def summary_data(accounts: pandas.DataFrame) -> None: - accounts_01 = accounts[accounts["Owner"] == "Person01"] - accounts_02 = accounts[accounts["Owner"] == "Person02"] - for x in range(1, 4): - if x == 1: - df = accounts - account_string = "All Accounts" - elif x == 2: - df = accounts_01 - account_string = "Person01's Accounts" - elif x == 3: - df = accounts_02 - account_string = "Person02's Accounts" - print(f"Generating pie chart summary image for {account_string}...") - summary_chart = ( - df.groupby(["AccountType"]) - .sum() - .plot.pie( - title=f"Credit Line by Type for {account_string}", - y="CreditLine", - autopct="%1.1f%%", - ) - ) - summary_chart.figure.savefig(f"public/summary_chart_{x}.png", dpi=1200) - #+end_src - - The result? A high-quality pie chart that is read directly by the - =public/index.html= template I use. - - #+caption: Summary Pie Chart - [[https://img.cleberg.net/blog/20220303-maintaining-a-personal-financial-database/summary_chart.png]] - - Other charts generated by this project include: - - - Charts of account balances over time. - - Line chart of effective tax rate (taxes divided by taxable income). - - Salary projections and error limits using past income and inflation - rates. - - Multi-line chart of gross income, taxable income, and net income. - - The best thing about this project? I can improve it at any given - time, shaping it into whatever helps me the most for that time. I - imagine that I will be introducing an asset tracking table soon to - track the depreciating value of cars, houses, etc. Who knows what's - next? |