Words Written This Year

The year is coming to an end. I promised myself I would write more, so now is a good time to count things. That is: words. Let’s count words!

I tried a top-down approach a while ago, but it did not work out because I got sidetracked by technicalities. Let’s try a bottom-up approach this time.

This blag’s repo is a good starting point.

Can git count words? Of course it can.

First, I need all the commits I made this year:

from subprocess import check_output

# splitlines() yields an empty list when no commit matches, where split('\n') would yield ['']
commits = check_output("git log --since=2023-01-01 --author=felix --pretty=format:'%H' --reverse", shell=True, text=True).splitlines()

Let’s have a look at a single commit and get its word count. I’m only interested in markdown, so *.md is the glob to use.

def wc(commit):
    return int(check_output(f"git show --word-diff=porcelain {commit} '*.md' | rg '^[+]' | wc -w", shell=True, text=True))
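To make it clearer what that pipeline counts, here is a rough pure-Python equivalent. It assumes the usual `--word-diff=porcelain` layout, where each run of added words sits on its own line prefixed with `+`. The helper name `count_added_words` and the sample diff are made up for this sketch. Unlike the shell pipeline, it ignores the `+++ b/file.md` file header and the `+` prefix itself, so its totals can come out slightly lower.

```python
def count_added_words(porcelain_diff: str) -> int:
    """Count words on added ('+') lines of `--word-diff=porcelain` output."""
    total = 0
    in_hunk = False
    for line in porcelain_diff.splitlines():
        if line.startswith("diff --git"):
            # A new file header follows, including the '+++ b/...' line.
            in_hunk = False
        elif line.startswith("@@"):
            # Hunk body starts after this line.
            in_hunk = True
        elif in_hunk and line.startswith("+"):
            # Drop the '+' marker, then count whitespace-separated words.
            total += len(line[1:].split())
    return total


# Hypothetical sample in porcelain word-diff format.
sample = """diff --git a/post.md b/post.md
--- a/post.md
+++ b/post.md
@@ -1 +1 @@
 Let's count
-things
+words and more words
~
"""

print(count_added_words(sample))  # 4
```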

Let’s do this for all the commits and sum the results:

wcs = [wc(c) for c in commits]
sum(wcs)

That’s 2427 words for this blog. Nice! 😎

Let’s create a script that does this for any repo.
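The script itself is not reproduced here, so the following is only a minimal sketch of what such a script could look like, not the actual implementation. It assumes the same date and author filter as above, uses `git -C` to run inside each repo, and counts exactly what the pipeline counts: every whitespace-separated token on lines starting with `+`. The names `git` and `words_added` are invented for the example.

```python
#!/usr/bin/env python3
"""Sketch: count words added to Markdown files in one or more git repos."""
import subprocess
import sys


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout as text."""
    return subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True,
        encoding="utf-8", errors="replace",  # tolerate odd bytes in diffs
    ).stdout


def words_added(repo, since="2023-01-01", author="felix"):
    """Sum the words added to *.md files by `author` since `since`."""
    commits = git(repo, "log", f"--since={since}", f"--author={author}",
                  "--pretty=format:%H").splitlines()
    total = 0
    for commit in commits:
        diff = git(repo, "show", "--word-diff=porcelain", commit, "--", "*.md")
        # Same filter as the shell pipeline: every line starting with '+'.
        total += sum(len(line.split()) for line in diff.splitlines()
                     if line.startswith("+"))
    return total


if __name__ == "__main__":
    # Print "<count> <repo>" per line so the output can be piped to `sort -n`.
    for repo in sys.argv[1:]:
        print(words_added(repo), repo)
```

Saved under a name of your choice, it would be called with one or more repo paths as arguments.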

Running this over the first few repos that came to mind, I got a word count of 37260. It’s something. 🙂

Count Words in All the Repos

Note: I had to revise the script to handle repos without a HEAD.
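How the revised script detects such repos is not shown here. One way to do it, as a sketch, is to ask git whether `HEAD` resolves to a commit and skip the repo if it does not. The function name `has_head` is an assumption made for this example.

```python
import subprocess


def has_head(repo):
    """Return True if HEAD resolves to a commit in `repo`."""
    result = subprocess.run(
        # --verify --quiet: exit non-zero without printing an error
        # when HEAD does not point at a valid object.
        ["git", "-C", repo, "rev-parse", "--verify", "--quiet", "HEAD"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A freshly initialised repo without commits returns `False` here, so it can be skipped before running `git log` on it.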

First, find all the repos that are somewhere in $HOME. Next, count over all of them.

pip install -U puddl

IFS=$'\n' repos=$(puddl-git-repo-find $HOME)

felix-git-word-count $repos > /tmp/result
cat /tmp/result | sort -n
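For readers without `puddl` installed, the discovery step can be approximated in plain Python. This is a stand-in sketch, not how `puddl-git-repo-find` works internally: it treats any directory containing a `.git` entry as a repo and does not descend further, so nested repos and submodules are skipped. The name `find_repos` is hypothetical.

```python
import os


def find_repos(root):
    """Yield directories under `root` that contain a `.git` entry."""
    for dirpath, dirnames, filenames in os.walk(root):
        # `.git` is a directory in normal clones and a file in
        # worktrees and submodules.
        if ".git" in dirnames or ".git" in filenames:
            yield dirpath
            # Prune the walk: do not descend into the repo itself.
            dirnames[:] = []
```

Calling `find_repos(os.path.expanduser("~"))` walks the whole home directory, which can take a while.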

There was one repo containing a dump of my HedgeDoc notes, which doesn’t count, as it contains notes from the Beginning of Time (TM) [1].

The updated score is: 46266.

I know that I wrote more: in wikis, tickets, and git commits. But that’s for another day. 🤓
