UPDATE: Test.xml file is added.
KMyMoney so far is the best GUI application for personal accounting that I've tried to use. It is not perfect, but using it in conjunction with beancount and hledger addresses my needs in financial report generation. In order to export transactions stored in KMyMoney's XML file to hledger/beancount journal file, one needs to know its XML structure.
A typical KMyMoney file is a gzip-compressed XML file with *.kmy extension. Firstly, we need to decompress it:
1 |
[johndoe@ArchLinux]% cat Test.kmy > Test.xml |
You can download the decompressed Test.xml file here.
Elements
The resulting file contains the following elements:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
[johndoe@ArchLinux]% awk '/^(<|<\/| <)[A-Z].*/{print $0}' Test.xml <KMYMONEY-FILE> <FILEINFO> <USER name="John Doe" email=""> <INSTITUTIONS count="0"/> <PAYEES count="5"> <COSTCENTERS count="0"/> <TAGS count="0"/> <ACCOUNTS count="21"> <TRANSACTIONS count="6"> <KEYVALUEPAIRS> <SCHEDULES count="0"/> <SECURITIES count="0"/> <CURRENCIES count="2"> <PRICES count="2"> <REPORTS count="0"/> <BUDGETS count="0"/> <ONLINEJOBS count="0"/> </KMYMONEY-FILE> |
If we don't need any extra information, we can only print the elements:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
[johndoe@ArchLinux]% awk 'match($0, /^ <([A-Z]+) /, arr) { print arr[1]}' Test.xml USER INSTITUTIONS PAYEES COSTCENTERS TAGS ACCOUNTS TRANSACTIONS SCHEDULES SECURITIES CURRENCIES PRICES REPORTS BUDGETS ONLINEJOBS |
Accounts
Accounts represents assets, liabilities and equities and also expense and income categories. Accounts may be formatted in two ways. The full format looks like this:
1 2 3 4 5 6 7 8 |
[johndoe@ArchLinux]% awk '/<ACCOUNT.*id="A000003"/,/<\/ACCOUNT>/' Test.xml <ACCOUNT id="A000003" currency="EUR" lastreconciled="" lastmodified="" description="" name="Food" parentaccount="AStd::Expense" opened="1900-01-01" type="13" institution="" number=""> <SUBACCOUNTS> <SUBACCOUNT id="A000014"/> <SUBACCOUNT id="A000015"/> <SUBACCOUNT id="A000016"/> </SUBACCOUNTS> </ACCOUNT> |
Note that '/<ACCOUNT.*id="A000003"/,/<\/ACCOUNT>/' is a range pattern in AWK, meaning "print" every line starting with a match '/<ACCOUNT.*id="A000003"/' and ending with a line matching '/<\/ACCOUNT>/'.
The short format looks like this:
1 2 |
[johndoe@ArchLinux]% awk '/<ACCOUNT.*id="A000014"/,/\/>/' Test.xml <ACCOUNT id="A000014" currency="EUR" lastreconciled="" lastmodified="2021-05-30" description="" name="Bread" parentaccount="A000003" opened="1900-01-01" type="13" institution="" number=""/> |
A single AWK scripts which will retrieve both formats:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
[johndoe@ArchLinux]% awk 'BEGIN{f=0} { \ if (f==1) {print $0} \ if ($0 ~ /<ACCOUNT.* id="A0000(03|14)"/){ \ print $0; \ if ($0 ~ /\/>/) \ {exit 0} \ else {f=1;} \ }; \ if ($0 ~ /<\/ACCOUNT>/){f=0}}' Test.xml <ACCOUNT id="A000003" currency="EUR" lastreconciled="" lastmodified="" description="" name="Food" parentaccount="AStd::Expense" opened="1900-01-01" type="13" institution="" number=""> <SUBACCOUNTS> <SUBACCOUNT id="A000014"/> <SUBACCOUNT id="A000015"/> <SUBACCOUNT id="A000016"/> </SUBACCOUNTS> </ACCOUNT> <ACCOUNT id="A000014" currency="EUR" lastreconciled="" lastmodified="2021-05-30" description="" name="Bread" parentaccount="A000003" opened="1900-01-01" type="13" institution="" number=""/> |
Or using getline function inside AWK to get the same output as in the previous command:
1 2 3 4 5 6 7 8 9 10 11 |
[johndoe@ArchLinux]% awk '/<ACCOUNT.* id="A0000(03|14)"/ { \ i=1; \ f[i=1] = $0; \ if ($0 !~ /\/>$/){ \ while (getline && $0 !~ /<\/ACCOUNT/) \ f[++i] = $0; \ f[++i] = $0; \ }; \ for (j=1; j<=i; j++){ print f[j]}; \ print "\n"\ }' Test.xml |
Payees
A payee is a person or organization to whom money is paid in exchange for a certain good or a service. It can some general description like "Groceries", "Supermarket" or "Restaurant" or specific product you bought, e.g. "Wireless router Linksys".
Payees can be retrieved using the following command:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
[johndoe@ArchLinux]% awk '/<PAYEE.*id="P.*"/,/<\/PAYEE>/' Test.xml <PAYEE id="P000001" reference="" name="Salary" email="" matchingenabled="0"> <ADDRESS street="" postcode="" city="" telephone="" state=""/> </PAYEE> <PAYEE id="P000002" reference="" name="Gasoline" email="" matchingenabled="0"> <ADDRESS street="" postcode="" city="" telephone="" state=""/> </PAYEE> <PAYEE id="P000003" reference="" name="Conversion: EUR -> USD" email="" matchingenabled="0"> <ADDRESS street="" postcode="" city="" telephone="" state=""/> </PAYEE> <PAYEE id="P000004" reference="" name="Groceries" email="" matchingenabled="0"> <ADDRESS street="" postcode="" city="" telephone="" state=""/> </PAYEE> <PAYEE id="P000005" reference="" name="Vacation" email="" matchingenabled="0"> <ADDRESS street="" postcode="" city="" telephone="" state=""/> </PAYEE> |
A unique payee ID start with letter "P" followed by digits, for example "P000004". To search for a specific payee, execute:
1 2 3 4 |
[johndoe@ArchLinux]% awk '/<PAYEE.*id="P000004"/,/<\/PAYEE>/' Test.xml <PAYEE id="P000004" reference="" name="Groceries" email="" matchingenabled="0"> <ADDRESS street="" postcode="" city="" telephone="" state=""/> </PAYEE> |
Similarly, IDs for transactions, accounts and splits follow the same naming convention: T000000000000000005, A000015, S0001. Payees, transactions and accounts have global IDs, whereas splits have IDs which are unique only inside a transaction.
Transactions
Transactions represent money flow from accounts to accounts when goods and services are bought or sold or when money is transferred from one account to another.
1 2 3 4 5 6 7 8 9 10 |
[johndoe@ArchLinux]% awk 'BEGIN{f=0} {if (f==1){print} if ($0 ~ /<TRANSACTION .*id="T000000000000000004"/){f=1;print} if ($0 ~ /<\/TRANSACTION>/){f=0}}' Test.xml <TRANSACTION id="T000000000000000004" memo="" commodity="EUR" entrydate="2021-05-24" postdate="2021-02-19"> <SPLITS> <SPLIT id="S0001" memo="Many items in this transaction" value="-19/4" action="" reconcileflag="0" shares="-19/4" price="1/1" payee="P000004" reconciledate="" account="A000001" number="" bankid=""/> <SPLIT id="S0002" memo="Baguette" value="3/2" action="" reconcileflag="0" shares="3/2" price="1/1" payee="P000004" reconciledate="" account="A000014" number="" bankid=""/> <SPLIT id="S0003" memo="10 Eggs" value="2/1" action="" reconcileflag="0" shares="2/1" price="1/1" payee="P000004" reconciledate="" account="A000015" number="" bankid=""/> <SPLIT id="S0004" memo="Earl Grey" value="7/4" action="" reconcileflag="0" shares="7/4" price="1/1" payee="P000004" reconciledate="" account="A000016" number="" bankid=""/> <SPLIT id="S0005" memo="Reimbursement" value="-1/2" action="" reconcileflag="0" shares="-1/2" price="1/1" payee="P000004" reconciledate="" account="A000005" number="" bankid=""/> </SPLITS> </TRANSACTION> |
Exactly the same ouput can be obtained using a range pattern:
1 |
[johndoe@ArchLinux]% awk '/<TRANSACTION .*id="T000000000000000004"/,/<\/TRANSACTION>/' Test.xml |
This transaction's commodity is EUR, which means that all accounts in this transaction have also commodity EUR. There are no multi-currency accounts in KMyMoney! This is very important, since other plain text accounting software do support multi-currency accounts.
The transaction above corresponds to the following Beancount's ledger transaction:
1 2 3 4 5 6 7 |
; T000000000000000004 2021-02-19 * "Groceries" ; P000004 Assets:Cash-EUR -4.7500 EUR ; =-19/4 EUR, Many items in this transaction Expenses:Food:Bread 1.5000 EUR ; =3/2 EUR, Baguette Expenses:Food:Eggs 2.0000 EUR ; =2/1 EUR, 10 Eggs Expenses:Food:Tea 1.7500 EUR ; =7/4 EUR, Earl Grey Expenses:Miscellaneous-Other -0.5000 EUR ; =-1/2 EUR, Reimbursement |
KMyMoney uses fractions instead of decimal numbers. The tea (Earl Grey) costs "7/4" EUR (1.75 EUR), it is not a number yet, it needs to be interpreted (evaluated) to be represented in floating point or decimal format. This method of representing numbers has an advantage: rounding errors can be mitigated, e.g. if you buy 3 units of commodity "A" for 10 units of commodity "B", then the price of one unit of commodity "A" will be equal to "10/3" and not 3.3333 units of commodity "B".
The amount transferred to/from an account is called "value" and it is defined as:
value = shares * price
All values in a transaction have to sum up to zero as in any double-entry bookkeeping system. So if we check: -19/4 + 3/2 + 2/1 + 7/4 - 1/2 = 0. Indeed, they do.
The first split S0001 usually corresponds to the source account, in which transaction was recorded. The currency of this account will be set to the commodity of the transaction.
Currency Conversion
Now if you have an account in foreign currency, let's say USD, and all your expense categories are in the main currency (EUR), then KMyMoney will translate amounts in USD to amounts in EUR, because expense accounts do not have multi-currency mode. For example, let's say you bought groceries with USD:
they will be displayed in USD in the interface, but recorded in EUR for the expense categories using the conversion rate you provide in the pop-up window. This transaction will be recorded as:
1 2 3 4 5 6 7 8 9 |
[johndoe@ArchLinux]% awk '/<TRANSACTION .*id="T000000000000000005"/,/<\/TRANSACTION>/' Test.xml <TRANSACTION id="T000000000000000005" memo="" commodity="USD" entrydate="2021-05-24" postdate="2021-03-10"> <SPLITS> <SPLIT id="S0001" memo="" value="-33/10" action="" reconcileflag="0" shares="-33/10" price="1/1" payee="P000004" reconciledate="" account="A000013" number="" bankid=""/> <SPLIT id="S0002" memo="Rye bread" value="19/10" action="" reconcileflag="0" shares="31/20" price="38/31" payee="P000004" reconciledate="" account="A000014" number="" bankid=""/> <SPLIT id="S0003" memo="10 Eggs" value="21/10" action="" reconcileflag="0" shares="171/100" price="70/57" payee="P000004" reconciledate="" account="A000015" number="" bankid=""/> <SPLIT id="S0004" memo="Reimbursement" value="-7/10" action="" reconcileflag="0" shares="-57/100" price="70/57" payee="P000004" reconciledate="" account="A000005" number="" bankid=""/> </SPLITS> </TRANSACTION> |
The corresponding Beancount transaction is given below:
1 2 3 4 5 6 |
; T000000000000000005 2021-03-10 * "Groceries" ; P000004 Assets:Cash-USD -3.3000 USD ; -33/10 USD Expenses:Food:Bread 1.5500 EUR @@ 1.9000 USD ; 19/10 USD = (31/20 shares) * (price of 38/31 [USD/EUR]) ; Rye bread Expenses:Food:Eggs 1.7100 EUR @@ 2.1000 USD ; 21/10 USD = (171/100 shares) * (price of 70/57 [USD/EUR]) ; 10 Eggs Expenses:Miscellaneous-Other -0.5700 EUR @@ 0.7000 USD ; -57/100 USD = (-57/100 shares) * (price of 70/57 [USD/EUR]) ; Reimbursement |
Account A000014 (Expenses:Food:Bread) will be augmented by the amount of "shares" Euros (1.55 EUR) and not by "value". However, account A000013 (Assets:Cash-USD) will be decreased by the amount of "value" US Dollars (-3.3 USD).
So in total, the balance of A000014 (Expenses:Food:Bread) will be +3.05 EUR = 1.50 EUR (from transaction T000000000000000004) + 1.55 EUR (from transaction T000000000000000005). This is the amount of money we've spent on bread in two transactions.
If you want the transaction to be recorded in USD as is, then you need to create a new account for every expense category and use it instead, e.g. "Expenses:Food:Bread-USD". Obviously, this approach is impractical. Report generation involving financial flow, income and expenses can be handled using Beancount or hledger. I wrote a custom kmymoney2hledgers converter to handle multi-currency problem. The beforementioned transaction can be changed to:
1 2 3 4 5 6 |
; T000000000000000005 2021-03-10 * "Groceries" ; P000004 Assets:Cash-USD -3.3000 USD Expenses:Food:Bread 1.9000 USD ; Rye bread Expenses:Food:Eggs 2.1000 USD ; 10 Eggs Expenses:Miscellaneous-Other -0.7000 USD ; Reimbursement |
where unnecessary conversion from USD to EUR is omitted. Beancount and hledger are able to perform this conversion using implicit conversion rates (if main currency is EUR, then these USDs were exchanged for some amount of EUR, and hence the conversion rate is known) or rates specified explicitly.
KMyMoney may stay as a user-friendly GUI application for entering transactions. Auto-completion for payees, ability to easily change the account hierarchy and move transactions from one account to another are the advantages of KMyMoney.