Jason Becker
January 28, 2023

Hi Jason,

Another later-in-the-week reply for our last week of this project. To what you said about taking stock, I think a break or a big life change is an excellent time to think about these things. We found ourselves doing that when we moved into this house and once again when we knew we were having a child.

On to which, the final preparations are now taking place: washing the clothes, organising the nursery, and prepping the hospital bag (along with many pregnancy and post-pregnancy products I’d never realised even existed). I’ve gone past the worrying stage now for the most part and I’m focusing on things I can control.

Mexico sounds wonderful and I hope you’ve been able to relax and enjoy it - spending an extended time away from home somewhere so different sounds lovely.

Now to get a little bit meta about this project of yours. Having done this for four weeks now, I’m struck by how difficult I’ve found being committed to writing something every week - it’s certainly a good job I suggested early in the year, pre-baby, else I’m not sure it would have gone quite as well. Despite having ideas here and there for little projects or blog posts, something about the somewhat stricter schedule has meant I’ve struggled to do it “on time” (despite the loose rules).

I have, however, really enjoyed being part of this project and I’m looking forward to reading in the coming months.

Speak soon,
Robb

Hi Robb,

It’s funny, because here at the beginning of this project, while taking stock, I’ve had two contrary reactions. First, it does take a surprising amount of discipline to sit down and write to someone. It’s certainly harder than just shooting off whatever is at the top of my mind. Second, it feels so feeble to just write a letter once a week while I see all the progression you’ve been making on several side projects, while preparing for the baby, during this same month.

You’ve been automating your now page, released a widely celebrated set of icons, built a CLI for omg.lol, and a host of other small projects. Based on chaosweb.space, I think you’d like mavica’s work.

It’s generating that itch in me again to figure out how to leave some energy at the end of the day to do the things I love on the computer after doing those things at work all day on the computer. Part of my taking stock is realizing that I have to find a way to push over that activation energy hump so that I can just work on small tools for myself all the time.

I’m glad this first month felt like just writing letters to a friend about what’s happening— it feels like an easy introduction. Maybe they’ll all go this way, but maybe some folks will want to really dig into a specific topic. I’m glad that I am not responsible for writing the first letter, because I think that makes it more likely that each month will be a bit different based on who is participating.

Thanks for helping me kick off this project.

Jason

January 27, 2023

The only “innovations” in school vouchers over the last two decades are legal/technical, entirely focused on maintaining private schools’ ability to discriminate and teach anything to kids.

There has been no fundamental change to the gap between vouchers and quality school tuition. No improvement for students with special education needs. No protections against radical religious indoctrination. No movement on the underfunding of traditional schools where vouchers are paid for. No improvement on the lack of actual options within a reasonable distance of most students’ homes. No meaningful improvement in delivering instruction in a non-traditional setting or online such that it becomes a major provider of choice for families.

ESAs and vouchers are not supported on some idea of choice. They are explicitly about supporting religious schools that can teach religious beliefs without limit. They’re about richer, wealthier people who can already afford private schools getting a huge check back from the government to support their opting out of the public system. It’s the richer and wealthier folks who hate taxes and the idea of supporting a public good making sure they get theirs back rather than having sufficient taxes to support the system everyone accesses.

This is entirely distinct from other possible mechanisms of choice, including well-regulated charters and inter-district choice. These mechanisms, which could have (and have had) bipartisan consensus, offer the potential to improve school operations and choices for families. But for voucher advocates, that’s solving the wrong problem. Charters and inter-district choice solve the problems choice advocates say they care about, but not the interests those advocates actually serve. Choice is about tax breaks for the rich, excuses to not increase funding for schools, and having state-supported religious indoctrination.

January 22, 2023

Here’s a real live example of why I am stuck with R and stuck with data.table. In my work, I often receive various delimited files from customers. Mostly, these delimited files are created from Oracle or MS SQL and have all kinds of gnarly things going on. Without sharing too much, here’s a partial example of the last few fields of one line in one of those files:

|25-AUG-22|"SAN81803 EXPO® White Board CARE Dry Erase Surface Cleaner 8 oz Spray Bottle|54

Do you see the problem? I have a " character in a field, but the field itself is not quoted and the quote is not escaped.

Let’s compare how different systems handle this file. Before we do so, it’s important to know how many rows are in this data set:

wc -l my_file.txt
  239167 my_file.txt

DuckDB

The new hot thing, DuckDB is an in-process database like SQLite, but optimized for analytics and reporting. It has support to automatically create a table from a CSV file. What happens when I try to do this?

D create table df as select * from read_csv_auto('my_file.txt');
Error: Invalid Input Error: Error in file "my_file.txt" on line 50614: quote should be followed by end of value, end of row or another quote. (DELIMITER='|' (auto detected), QUOTE='"' (auto detected), ESCAPE='' (auto detected), HEADER=1 (auto detected), SAMPLE_SIZE=20480, IGNORE_ERRORS=0, ALL_VARCHAR=0)

Yup, you’re right, DuckDB. That’s absolutely the line with the problem. Guess I’m stuck.

Python/Pandas

How about Python and pandas, somehow the king of modern data science and data frames?

>>> import pandas as pd
>>> df = pd.read_csv('my_file.txt', sep = '|', low_memory=False)

Hey, so far so good! This file gets read without any messages, warnings, or errors.

>>> df
...
[238701 rows x 22 columns]

Uh oh. The data has only 238,701 rows, which is quite a bit less than 239,167 (well, 239,166 since this file does have a header row). This may not be a problem, because a delimited text file can contain newlines that do not start a new record (if properly quoted). At least now that I have the data loaded, I can check the sum of a column called “Amount”, because this is financial data. We can compare this to other methods later, in addition to the row count, so we can be sure the full data set was read by pandas.

>>> sum(df.Amount)
196848446.45999622

R - readr

I am a full on tidyverse apologist. So of course I’m going to reach for readr::read_delim to get this file loaded.

> library(readr)
> df <- read_delim('my_file.txt', delim = '|', show_col_types = FALSE)

Awesome. Like pandas, readr had no messages, warnings, or errors. Let’s see how many rows there are and the sum of that amount column.

> df |> nrow()
[1] 238609
> sum(df$Amount)
[1] 196828725

Uh oh again. It seems that readr::read_delim also doesn’t reach 239,166 rows, but instead has only 238,609 rows. That’s almost 100 fewer than pandas, and the sum is off by almost $20,000. I don’t know at this stage if pandas is right, but it seems pretty likely that readr just silently gave up on some lines that it shouldn’t have.

R - data.table

Let’s try the package data.table which has a function fread to read delimited files.

> df <- fread("my_file.txt")
Warning message:
In fread("my_file.txt") :
  Found and resolved improper quoting out-of-sample. First healed line 50614: <<25-AUG-22|"SAN81803 EXPO® White Board CARE Dry Erase Surface Cleaner 8 oz Spray Bottle|54>>. If the fields are not quoted (e.g. field separator does not appear within any field), try quote="" to avoid this warning.

That’s interesting! When using fread, I get a warning that points out the very line I mentioned above. It even prints the line (I removed most of it) and says that it resolved the issue. It also recommends that I try specifying quote = "" if, as in my file, fields are not quoted. We’ll come back to that.

> nrow(df)
[1] 239166

Well, would you look at that? Exactly the number of rows I’d expect if there is one header row and no embedded newlines. Let’s check that amount column.

> sum(df$Amount)
[1] 196926161

That’s almost $80,000 more than pandas. That’s a lot of money to have missing.

Just for fun, instead of expecting fread to figure everything out about my file on its own, what if I follow its suggestion and tell fread that there is no quote character and no fields are quoted?

> df <- fread('my_file.txt',  quote = "")
> nrow(df)
[1] 239166
> sum(df$Amount)
[1] 196926161

By giving fread just a little information about the file, I get no warning about resolved lines; it just reads the file correctly with the same results.

So now, because of fread, I have some idea of what the problem was. Almost everything that reads a delimited file expects fields to be quoted, at least optionally, at least some of the time. They hate a quote at the start of a field that is actually in the raw data (understandably). Maybe if I tell these other tools that the quote character is nothing they’ll work better.
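The same failure mode can be reproduced in miniature with Python’s standard-library csv module. This is only a sketch: the three-line sample and the `count_rows` helper are made up for illustration and are not taken from the real file. With the default quote character, the stray `"` opens a quoted field that never closes, so the reader folds the following line into that field. With `csv.QUOTE_NONE`, quotes are ordinary characters and every line comes back as its own row.

```python
import csv
import io

# Hypothetical sample: a header plus two records, with a stray quote
# at the start of an unquoted field in the first record.
SAMPLE = (
    "id|date|description|qty\n"
    '1|25-AUG-22|"SAN81803 Dry Erase Surface Cleaner 8 oz Spray Bottle|54\n'
    "2|26-AUG-22|Plain item|10\n"
)


def count_rows(text, **reader_kwargs):
    """Count the rows csv.reader yields for pipe-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter="|", **reader_kwargs)
    return sum(1 for _ in reader)


# Default dialect: '"' is the quote character, so the stray quote starts a
# quoted field and the following line is absorbed into it.
default_rows = count_rows(SAMPLE)

# QUOTE_NONE: quotes are ordinary characters and every line is a row.
literal_rows = count_rows(SAMPLE, quoting=csv.QUOTE_NONE)

print(default_rows, literal_rows)
```

The default reader reports fewer rows than there are lines and raises nothing, which is the same silent shortfall seen above.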

PostgreSQL

Let’s try PostgreSQL using COPY. Note, I’m not including the create table statement to avoid revealing more about the data.

jason=# \copy df from 'my_file.txt' CSV HEADER DELIMITER '|';
COPY 130426
jason=# select sum(amount) from df;
     sum
-------------
 97411927.64

Well, that’s not right. I should have nearly twice the number of lines. And sure enough, I’ve got only half the dollars. No warnings, no errors, no messages. Nothing. Silent failure.

Can I fix it?

Now that I know the issue is the quote character, let’s see if I can fix all the methods that failed to load this file.

readr

> df <- read_delim('my_file.txt', delim = '|', show_col_types = FALSE, quote = '')
> nrow(df)
[1] 239166
> sum(df$Amount)
[1] 196926161

pandas

>>> df = pd.read_csv('my_file.txt', sep = '|', low_memory=False, quoting=3)  # 3 is csv.QUOTE_NONE
>>> df
[239166 rows x 22 columns]
>>> sum(df.Amount)
196926160.57999647

duckdb

D create table df as select * from read_csv_auto('my_file.txt', quote='');
D select count(*) from df;
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│       239166 │
└──────────────┘
D select sum(amount) from df;
┌───────────────────┐
│    sum(amount)    │
│      double       │
├───────────────────┤
│ 196926160.5800201 │
└───────────────────┘

postgresql

jason=# \copy df from 'my_file.txt' CSV HEADER DELIMITER '|' QUOTE '';
ERROR:  COPY quote must be a single one-byte character
jason=# \copy df from 'my_file.txt' DELIMITER '|';
ERROR:  invalid input syntax for type integer: "col1"
CONTEXT:  COPY df, line 1, column col1: "col1"
jason=# \copy df from program 'tail -n +2 my_file.txt' DELIMITER '|';
COPY 239166
jason=# select sum(amount) from df;
     sum
--------------
 196926160.58
(1 row)

Would you look at that? Everyone does just fine when you tell them there is no quote character or quoted fields. 1

All of these methods are essentially instant on my computer. Performance is not the issue. What’s incredible is that data.table::fread can identify file issues and resolve them. In this case, it turns out that data.table::fread was also able to describe the problem well enough that I could fix every other method of reading the file successfully. I will say, going back and reading the error for duckdb may have given me some hints, but pandas, readr, and PostgreSQL completely failed to even notify me something was wrong. In an automated pipeline, I would have no indication that hundreds of rows, or in PostgreSQL’s case more than a hundred thousand rows, were just gone.

I was able to fix pandas, readr, duckdb, and PostgreSQL, but I have run into many scenarios where this is not the case. For example, what if I had this same row in a file that sometimes did quote certain fields? The fread function can handle this no problem, resolving the issue, warning me, and moving on. Every other method just wouldn’t work.
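One cheap pre-check for files like this, sketched in Python, is to flag any line with an odd number of quote characters before handing the file to a parser. This is a heuristic and not what fread does internally: it assumes records never legitimately span lines, and the function name `lines_with_unbalanced_quotes` and the sample lines are made up here.

```python
def lines_with_unbalanced_quotes(lines, quote='"'):
    """Return (line_number, line) pairs whose quote count is odd.

    Assumes one record per physical line; a properly quoted field that
    contains a newline would be flagged as a false positive.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        if line.count(quote) % 2 == 1:
            flagged.append((number, line.rstrip("\n")))
    return flagged


# Hypothetical sample: line 2 has a stray quote, line 3 is properly quoted.
sample = [
    "id|date|description|qty\n",
    '1|25-AUG-22|"SAN81803 Dry Erase Surface Cleaner|54\n',
    '2|26-AUG-22|"Properly quoted | field"|10\n',
]
suspects = lines_with_unbalanced_quotes(sample)
print(suspects)
```

An open file handle can be passed in place of the list, since the function only iterates over lines. The flagged line numbers are a starting point for inspection, not a repair.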

I don’t control this data. I don’t control producing it. It comes in on a scheduled basis, often hourly, and I need to accurately process it.

Silent failures are terrible.
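In an automated pipeline, one way to turn a silent shortfall into a loud one is to compare the parsed row count against the physical line count, which is the same comparison done by hand with wc -l. The sketch below uses Python’s standard-library csv module; `read_rows_checked` and the sample text are hypothetical, and the check assumes one record per physical line plus a single header row.

```python
import csv
import io

# Hypothetical sample with a stray quote in the first record.
SAMPLE = (
    "id|date|description|qty\n"
    '1|25-AUG-22|"SAN81803 Dry Erase Surface Cleaner|54\n'
    "2|26-AUG-22|Plain item|10\n"
)


def read_rows_checked(text, delimiter="|", **reader_kwargs):
    """Parse delimited text with a header row; raise if any line went missing.

    Assumes one record per physical line, so the number of parsed data rows
    must equal the number of non-empty lines minus the header.
    """
    expected = sum(1 for line in text.split("\n") if line.strip()) - 1
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, **reader_kwargs)
    rows = [row for row in reader if row][1:]  # drop blank lines and the header
    if len(rows) != expected:
        raise ValueError(f"parsed {len(rows)} data rows, expected {expected}")
    return rows


# With quoting disabled the counts agree and the rows come back.
rows = read_rows_checked(SAMPLE, quoting=csv.QUOTE_NONE)

# With the default quote character the same call raises instead of
# quietly returning a short result.
try:
    read_rows_checked(SAMPLE)
    failed_loudly = False
except ValueError:
    failed_loudly = True
```

The check would reject a file that legitimately contains quoted newlines, so it only fits feeds where that never happens.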

The only way out is to embrace open, documented, binary data formats. But until all major RDBMS (and probably Excel) have native export and import, flat files will continue to be the de facto standard. In the meantime, it would be nice if PostgreSQL’s COPY, bcp, and spool could at least try and do things like quote fields and escape characters by default when targeting delimited files.

Some additional testing

I was asked to check vroom and arrow in R as well as polars via Explorer in Elixir. The results were, not good.

vroom

> library(vroom)
> df <- vroom('my_file.txt', delim = '|')
Rows: 238609 Columns: 22
> sum(df$Amount)
[1] 196828725

arrow

> df <- arrow::read_delim_arrow('my_file.txt', delim = '|')
> sum(df$Amount)
[1] 196848446
> nrow(df)
[1] 238701

explorer / polars

iex(1)> Mix.install([
...(1)>   {:explorer, "~> 0.5.0"}
...(1)> ])
iex(2)> df = Explorer.DataFrame.from_csv(filename = "my_file.txt",
...(2)> delimiter: "|", infer_schema_length: nil)
{:error,
 {:polars,
  "Could not parse `OTHER` as dtype Int64 at column 3.\nThe current offset in the file is 4447442 bytes.\n\nConsider specifying the correct dtype, increasing\nthe number of records used to infer the schema,\nrunning the parser with `ignore_parser_errors=true`\nor  adding `OTHER` to the `null_values` list."}}

  1. I wanted to show the pain of doing this in PostgreSQL. Only CSVs skip the first row with the HEADER option. But QUOTE can’t be set to blank/none/null. And using the default TEXT format, PostgreSQL can’t deal with the header row. So instead I had to use the PROGRAM option, which lets me run a bash script as the input and skip the first row, which then succeeds. ↩︎

January 19, 2023

Hi Jason,
A late reply this week - I completely forgot about this until very late last night.

Your TV setup looks very similar to ours, but we’re lucky enough to have two wall lights behind it so the wall looks much less bare. You’re right, though: it’s hard to put anything too garish there, otherwise it’s distracting.

With 10 weeks to go, I’ve been thinking a lot about technology and how that will affect my daughter. This post in particular made me think about how much I’m going to share about her online once she’s here. I don’t think there’s any right answer but it has occupied my mind the past few days. Come to think of it, the impending birth is basically the only thing I can think about at the moment. I’m sure that we’ll be fine but I can’t help but worry that we won’t have enough clothes or nappies, or something I haven’t even thought of will go wrong.
As for non-baby things, I’ve been having fun messing around with the omg.lol API building a CLI to interact with the service and I’m working on adding a /now page to my website (as well as the new omg.lol now pages). How has your week been this week?

Speak soon,
Robb

Hi Robb,

Easily excusing the “late” reply1, with a later reply of my own. This week has been incredibly busy at work as the post-holiday-break, post-three-day-weekend feeling of “we’re really back at it and in it” started to kick in. It’s been long but rewarding– one of those weeks where I’m exhausted, but I’m doing the kind of work I do well and bringing the energy and attention I need to.

In particular, I’ve recently reorganized our team so that I have a slightly smaller set of direct reports that are more “coherent” structurally– I am managing directly one person who leads each function below me. It’s too early to say if this is working better for the whole team, but this week made me feel confident that it works better for me, which is really important for avoiding burnout.

On the TV side, wall lights were another thing we considered– a sconce on each side just to give it something. It just feels strange to have so much blank space above the TV as well. I need someone to, I don’t know, share a Pinterest board or something with me so I can figure out what people actually do. The entire dilemma of what to do behind the TV reinforces a personal frustration of mine. It feels wrong that our “living room” is oriented toward a television. I would like for things to be different, but I don’t think my partner or her mother would be sufficiently on board to make that change. It’s more aspirational, really, to make sure that all television time is appointment time and not casual watching.

I cannot imagine the stack of worry that comes with being just weeks away from being a dad. It’s good to work with fun new tools right now while you can– a good distraction before side projects get put aside for a while. My gut is that it’s not worth worrying too much about online presence. I’m not a parent, and I’m not facing that decision, but my gut is that it’s easy to overthink the consequences (or lack thereof). Short of straight up exploitation, which is rare, these things seem to work out ok for parents and kids regardless of the choices they make. That’s not to say the choices don’t matter, but it seems like there aren’t wrong choices.

I’m a big fan of /now pages (I really need to update mine). I really value the narrative of a Now page. For me, it’s a time I get to think about what matters that gets lost in the series of smaller posts or dripped out updates. I have resisted adding any “automated” elements– it’d be easy to add the book I’m currently reading, for example, or maybe something like starred articles from my RSS reader. Something to think about.

We’re coming to the end of our time in Mexico. I’m thinking a lot about what makes home, well, home, and what I’ve learned about where I want to live and what I want my life to be like from 2+ months away. It’s a different kind of taking stock than becoming a parent, but I find myself taking stock nonetheless.

Looking forward to next week,

Jason


  1. Rules are once per week, doesn’t have to be right at the start. ↩︎

January 16, 2023

Here’s a thought.

Right now, I have an M1 Mac mini on my desk. It is great, and handles almost everything I throw at it. But there are some times my work MacBook Pro is noticeably better. And it should be– it has an M1 Pro with 32 GB of RAM.

I’ve been hoping for an M1 Pro (or by now, M2 Pro) Mac mini for a while. The Mac Studio is not too expensive– but I don’t need a fan on my desk 1. It sounds like it’s possible that device is being released tomorrow.

My Synology is old. I bought it and most of its drives on April 25, 2015. I’m getting worried about an important piece of equipment hitting 8 years of running mostly continuously. Amazon used to have some incredibly generous backup solutions which meant my NAS was backed up, but now it’s not. I’m currently using about 10TB of capacity.

What if I bought an OWC drive enclosure and a couple of 12TB hard drives for about $650 total? I could then connect the enclosure over USB-C to my current Mac mini and put it in the same closet as the Synology, running headless. I could then use Backblaze to back up my drives, and still have two open enclosure spots to use in the future. I would be giving up RAID (for now, I suppose I could always use software-based RAID in the future), but I’d gain offsite backups.

It seems Synology has abandoned making products with good transcoding CPUs, so maybe this is a better choice? I’m also pretty confident that the Mac mini will draw way less power and generate less heat, while being far more capable.


  1. I could get over this, I’m sure, but I don’t really need the power. The ports would be nice, but I’ve already solved that problem at my desk. A Studio that was less loud or more judicious about its fan might convince me, but that product doesn’t exist. ↩︎

January 10, 2023

Last week’s letter

Good Morning Jason,

what room or project are you most proud of?

The office was my top priority (my partner had different ideas) as I spend 3-4 days a week working in there and I’m very proud of how that turned out. I built the desktop and matching shelves myself from scaffold boards because finding something in the exact size I wanted turned out to be fairly difficult. This was a project that took a few weekends of lots of sanding, glueing, and staining but the final result is something I’m very proud of. Here’s an in-progress shot and the final result in situ. I also did the faux wood-panelling in our bedroom which we’re both very pleased with.

The work I do is primarily focused on property reports for tenants (inventories, fire risk assessments, etc) so there isn’t much crossover with renovating the house, but what I did learn is that planning is key. We wish we had spent a few weeks planning what we wanted to achieve before jumping into the renovation. There were definitely things that made our life a bit more difficult because we did some work when we should have waited for another job to be finished first.

That sounds like an interesting job but it must be difficult to work with organisations like schools that can be slow and unwieldy to get new tech implemented. How long have you been doing that?

I saw you posted yesterday about being ill, hope you’re feeling a bit better today?

Speak soon,
Robb

Hi Robb,

Luckily, I am feeling better. Note to self, when you order a steak medium and it comes out just barely rare just send the damn thing back. The day of suffering that followed was not worth it.

I’ve done some more work in my office since this last photo, but this is a not-terribly-inaccurate representation of where things are. I also use the IKEA pegboard. I did not quite get as fancy on the desk itself– which is an IKEA Karlby 98" top that I had a friend cut to 80" and then added some really cool metal legs from an Etsy shop. When the pandemic hit we went 100% remote, which meant that this room got transformed into an office. I probably have 6-10 scattered blog posts about the process that landed on the setup linked above– most of the changes by now are additional plants and things hung on the wall (plus some equipment changes).

I think it’s pretty natural for the office to be the place you’re most proud of– it’s one you get to call your own and the spot you’re probably stuck spending the most time in.

We’ve been thinking about doing a similar paneling look either behind our bed or possibly behind our TV. Maybe that’ll be a project for when we return home. It’s hard to have a big wall behind a TV– it looks bare without anything, but most things we could put there would be distracting.

A living room with a TV on a walnut stand with gray doors and two black floor standing speakers.

I’ve been working at my current company nearly 9 years. Before that, I worked at a university research center working with school districts on early warning systems, and before that, I worked for the state department of education. I think what’s most challenging is that everyone is well-established. There aren’t new school districts popping up building their systems and processes from scratch. The people, organizations, culture, and work processes are all fairly fixed. So we have to do things much more completely and better than most companies to even get in the door. Then we have to get a large set of folks on board so that we can deliver on our promise. We’re a small team and we’re supporting billions of dollars of budgeting and monitoring. There’s a lot of technical/systems and cultural debt that we have to work with to succeed.

That said, the opportunity for improvement is huge, and it’s very satisfying when someone gets it and we can make their work so much easier and more effective.

Looking forward to next week,

Jason

January 7, 2023

Last year’s theme was fun. I didn’t write a ton about it. My “announcement” post simply said,

I’ve decided to focus on Fun in 2022. I just haven’t had enough of that these last few years.

I didn’t quite know what I would seek out for fun, but it turns out, it took me just 8 days.

I signed up for a volleyball league, and by May I escalated from one night a week to 3-4 nights a week. By September, I was joining more intermediate play. Returning to volleyball after 17 years was a tremendous amount of fun. And although due to surgery and travel I haven’t played since early October, it’s one of the things I’m most looking forward to when we return to Baltimore in February.

In the summer of 2021, we went to Mexico and the most fun I had was our day of biking, hiking, and swimming through the jungle and in caves. In fact, all of the highlights of the last few years for me were days with strenuous physical activity. It’s not the only thing that brings me joy, but these days are sharper and clearer in my memory than any other. They’re sharper than the other good times, and they’re sharper than the other bad times. I have to keep reminding myself of this, because my base motivation is still to remain stationary. It’s hard for me to motivate myself to get up early on the weekend and go for a hike. I never regret when I do.

Volleyball was great because I had to schedule it and put it on my calendar. I built a small community of friends and people I wanted to see. I hoped they were happy on the days I could make it. And because signing up was a promise of a full court, or at least enough people to play, there was just enough guilt to mean that signing up meant going. Scheduling my physical activity with limited slots and friends who are relying on me seems to lower the activation energy just enough to make it happen. I knew this about myself– I still go to a gym that is entirely based on small group training, and my consistency there is entirely due to the same factors that lead me to showing up for volleyball. It’s scheduled, choosing a session means locking someone else out, and there’s a community there I look forward to spending time with.

Volleyball wasn’t the only source of fun. I took a desperately needed trip to Puebla and Mexico City in early March. Personally and professionally, 2021 was a rough year. And although 2022 was a year full of healing, growth, and fun, 2021 was not quite done with me those first two months. I’m glad we had that trip planned, but I’m also proud that I used that trip to restore myself. I set solid boundaries with work before, during, and upon my return. And I don’t think it is an exaggeration to say that I came home a healthier person, more capable of moving forward than I had been in a long time.

That trip rolled into a fun weekend in Chicago in May. It was the perfect bite-sized vacation that just wasn’t possible during the peaks of COVID. It felt a lot like our trip to New Orleans in December 2019– fast, fun, restorative, and mostly, normal. It was around this time that we started to take more seriously an idea that we had while in Mexico– maybe we should spend a good chunk of winter in Mexico City.

Baltimore is dark and cold in the winter. Mexico City stays mild (50s at night, 70s during the day) pretty much year round. Because it’s further south, there’s significantly more sunlight during winter. Because both Elsa and I get time off from work for the holidays and work remotely, good wifi is pretty much all we need. Looking back, I never had work trips in December and January.

Although I had tons of anxieties about booking a long time away from home, I said yes in the interest of fun. Today, I’m writing from Mexico City, about halfway through our stay. I’m glad I said yes, and I’m glad to have had fun guide me.

All of my concerns and anxieties stemmed from an idea of what the best use of our time and money was. I am a person who has often let worry, planning, optimizing, and a host of other anxieties paralyze me into inaction. I want to do these things, but because I perceive these opportunities as rare and limited, I allow myself to be frozen, or I allow the expectations to swamp any possible reality, zapping the fun from existence.

In order to have fun, I have to find ways of letting go of these anxieties and just do.

This extended to food. I have been generally eating healthier– my body is keeping score and it’s clear this year was a strong year for my health. At the same time, I had some of the best food of my life this year. I’m doing a better job of allowing myself to make food something I can celebrate. I make better choices for the every day mundane meals and find ways to make that still filled with joy. I know how to cook healthy food I love. I know how to get food quickly that’s still healthy when convenience is more important. But I’ve also sought out great food, sometimes expensive, often not, and let myself enjoy great meals. I’ve eaten healthier and better in every way.

But having fun wasn’t just about saying yes, it was also about boundaries and saying no. It was about doing a better job of turning off when I needed emergency surgery and not working and trusting my team. It was about going to Cuba without connectivity and being ok. It was about taking those trips and being present where I was. It was about separating the personal and professional relationships I had, even with the same person, so that each can be more healthy. It was about letting some things take longer at work so that other parts of me had time to thrive. It was about being more aggressive about putting books down I was not enjoying. Stopping things I thought would be fun but weren’t. Making easy commitments when they felt right and avoiding commitments that didn’t.

Was 2022 the most fun I’ve ever had? No. But it was a successful return to fun, or at least a year where I built better tools to find fun and to nurture the things that are fun.

It’s disappointing that MRAN and checkpoint are being shut down. They were incredibly simple ways to move toward more consistent environments.

We’ve moved on to the Posit package manager because of binary availability.

But! The right move for the R community would be to rally behind renv and lock files in general. This is much more in line with how the broader development community ensures reproducible software builds.

Things are still a little clunky in renv land, but I’m confident with increased adoption we’d see rapidly improved ergonomics.

January 5, 2023

This actually isn’t surprising at all, but it still needs to said over & over — the biggest barrier to more urban biking in cities is the fear of cars. “A study confirms that if we are serious about getting people on bikes, they need a safe place to ride.”

Brent Toderian, linking to Biggest Barrier to Biking Is a Fear of Cars

We don’t need a $7,500 tax credit for electric cars. We need to spend money on safe, separated bike infrastructure and e-bikes.

DayOne has turned out to be the perfect travel journal. There’s not a lot I want to write about while in Mexico, but Elsa and I did want to keep track of where we ate.

I thought about using various geotagging services, but very few are private. Those that are, well, kind of stink. But I’ve been making entries with pictures and taking advantage of DayOne’s great support for geotagging to record most of our meals here in Mexico City. When making DayOne entries, your location is recorded. This way you know where you are, the weather, and other facts while writing an entry. One killer feature is that when I add photos to my entries, DayOne will prompt to ask if I want to change the entry date and location to match the date and location of the photo. This means I can take pictures inside a restaurant, museum, park, or store I like and not worry about making a journal entry in the moment. I can come back days later and still get the correct date, time, and location for my entry.

I have a private map of all the places I’ve been. Since CDMX is likely going to continue to be a fairly regular destination, it’s easy to keep track of favorites and make sure we try new things.

When I started using DayOne I wasn’t sure what it would be for. Over the years I just keep finding new ways to use it. It’s not just one thing for me. None of my favorite tools are.

It feels like in the past few years there’s been a growing wave of people talking about the power of adult friendships (and frankly, the crisis of adult friendships, at least in the United States).

Friendship Forever is another article in that vein. It’s filled with powerful quotes. This one is my favorite:

But no matter the medicinal virtues of being a true friend or sustaining a long, close relationship with another, the ultimate touchstone of friendship is not improvement, neither of the other nor of the self: the ultimate touchstone is witness, the privilege of having been seen by someone and the equal privilege of being granted the sight of the essence of another, to have walked with them and to have believed in them, and sometimes just to have accompanied them for however brief a span, on a journey impossible to accomplish alone. —David Whyte

January 4, 2023

I was reminded of Noah Smith’s great The internet wants to be fragmented post from a couple of weeks ago when Matt Bircher linked to it.

Matt pulled out the same quote that resonated most strongly with me:

It started with the Facebook feed. On the old internet, you could show a different side of yourself in every forum or chat room; but on your Facebook feed, you had to be the same person to everyone you knew.

I didn’t always believe this. I changed all of my user names to be something that was both consistent and identifiable around 2006. One reason1 I did this was that I came to believe that being identifiable meant I could be held accountable. Pseudonymity and anonymity in most contexts felt like avoiding standing behind your statements.2 This was wrong.

At this stage, I kind of think that Reddit is closest to an ideal centralized system. It’s a place that aggregates many individual communities around one login. Each community can be moderated based on its norms, with voting distributing norm enforcement in a fairly easy way. I think the one thing that’s missing is you still have to be the same person across Reddit. Imagine if Reddit allowed a unique pseudonym for each community you post in. One login, but your identity does not persist across communities. Finding that I wrote something in r/Urbanism about my politics doesn’t invite you to come after me in r/ProductManagement when I’m discussing something professional. Reddit the service can know I’m the same person, and this way truly egregious behavior can lead to more global banning, but otherwise, each identity can be separated to be policed separately within each separate community. No more clicking on my user name and finding me anywhere across the site. Of course, Reddit doesn’t even let you change your username, so there’s little hope in this feature appearing.
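
As a rough sketch of how a service could do this: each community identity could be derived from a keyed hash of the account and the community name. Nothing here is a real Reddit feature, and the function name, secret, and account ID below are all made up for illustration. The service, which holds the secret, can always recompute the link between identities, but other users only ever see the per-community name.

```python
import hashlib
import hmac


def community_pseudonym(service_secret: bytes, account_id: str, community: str) -> str:
    """Derive a stable, per-community pseudonym for one account.

    Hypothetical helper: only the holder of service_secret can link
    the pseudonyms of one account across communities.
    """
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    message = f"{account_id}\x00{community}".encode("utf-8")
    digest = hmac.new(service_secret, message, hashlib.sha256).hexdigest()
    # Truncate the digest to keep the displayed name short.
    return f"user-{digest[:12]}"


if __name__ == "__main__":
    secret = b"example-secret-held-by-the-service"  # assumption: never exposed to users
    print(community_pseudonym(secret, "account-123", "r/Urbanism"))
    print(community_pseudonym(secret, "account-123", "r/ProductManagement"))
```

The same account always gets the same name inside one community, so moderation and reputation still work locally, while the names in two different communities look unrelated to everyone but the service.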

Centralization on the web is valuable to the extent that it permits me to have one login that aggregates multiple communities/people I’ve curated. That’s what my Twitter feed was/Mastodon feed is. That’s what my list of RSS subscriptions is. That’s what the Reddit communities I’ve joined are.


  1. The other reason was that after more than a decade online, from prior to puberty through college, I realized my name was the one thing I would not grow out of as an identifier. ↩︎

  2. To be clear, I always understood there were legitimate reasons to not use your real identity or to have an identity that was not easily tied to your “actual” (in real life) identity. ↩︎

January 2, 2023

In all the blogs I have ever written, I have had analytics that tells me how many visitors came to various pages. What posts were popular? Tap tap tap 🎤 is this thing on? When I moved my blog to Micro.blog after years of self-hosting, I removed my Google Analytics snippet. What little proof I have that anyone is “here” comes infrequently and primarily from strangers.

I don’t know who has subscribed via RSS.

I don’t know who is following @jsonbecker@json.blog via ActivityPub.

I don’t know who subscribed to the newsletter I briefly turned on and paid for and then turned off, though it still appears to get sent.

I don’t use Conversation.js to view WebMentions or replies of any kind.

All other forms of social media tell me not just how many people follow me, but who has followed me. Most provide me with stats on individual posts, including both views and various forms of interactions. I don’t know how many people read this site. I don’t know who reads this site. Even if I wanted to know, I’d have to somehow collect RSS subscriptions, site page hits, Micro.blog views and interactions, Twitter views and interactions, and Mastodon views and interactions– at a minimum– to get any kind of picture of “reach”.

I used to like knowing that a particular page about how I solved a problem in R continued to get a lot of search traffic. As a result, I was motivated to keep that post reasonably up to date. On social media, I liked knowing that certain friends were reading— it made it possible to make a knowing joke or let me assume that they knew about something that was going on with me because I knew they read it. I guess I don’t agree that likes, follows, replies, or audience metrics are distorting popularity contests. Not all feedback is toxic.1

Maybe it’s easier for me to absorb the various metrics about posts because I’ve never had meaningful internet popularity, nor was that ever my goal. I don’t like blog comments— this site is for my words, not everyone else’s— but I do enjoy replies, which remain significantly easier on social platforms than anywhere else. I only rarely receive replies, and I get them entirely through social-like systems where I crosspost like Micro.blog, Twitter, and Mastodon. I like getting a like, because it says, “I was here, and what I found resonated with me.” I’ve had my email address on this site for years and received one email in all of that time. I can’t help but feel like there are better solutions than stripping it all away.

Some people use their blogs as a personal repository of knowledge. They talk about how their site is like a public version of their outsourced brain, letting them search for answers they already have. That’s not why I write. These are my thoughts, sometimes personal and revealing, often not. They always start as something private, but they become something I choose to make public. I want someone to read what I write. I want it to make them laugh, or smile, or think, or get angry, or just get to know who I am a little better.

Why do I write anything in public? Mostly because I would drive my friends crazy with emails and text messages if I shared each thing I thought they might like with them. I kind of already do. I would drive them crazy if I shared all the thoughts I have that I’d love a reaction to. Writing in public is an easy way for me to broadcast to a self-selected group of folks and have them grapple with and engage with me. It helps to maintain many social and para-social relationships without the pressures of direct, synchronous communication.

If a friend leaves me “on read” when I sent them an article directly with my thoughts, I’m going to feel bad. Did I interrupt them? Am I annoying? Are they interested in this conversation? Are they interested in me?

If I write 10 blog posts and I find out they read just one of them, however they let me know, on their own time, I feel great.

I have been thinking about all of this since reading Monique Judge call for a return to personal blogging. I agree with so much of that article, which is why I’ve been semi-consistently blogging for years. But there’s one thing that struck me as, if not wrong, challenging:

People built entire communities around their favorite blogs, and it was a good thing. You could find your people, build your tribe, and discuss the things your collective found important.

Creating communities around blogs remains hard. Very popular sites with authors that focus on very specific topics who also spend significant time moderating their comments sometimes ended up with an entire community. Most blogs just got loads of spam and a drive-by comment from someone who landed on your page via Google and decided to be a jerk.

One of the triumphs of social media over blogs was how quickly and easily you could join or bootstrap a community. Are these communities as great as the niche internet of 2001? No. But so many more people were able to find community on social media. Web 2.0 was meant to make the niche web that felt like a community accessible to everyone. It succeeded.

Social media’s success at bringing community to everyone on the internet is mirrored in its failure to ensure those communities were healthy and safe. The real Web 3.0 shouldn’t retreat from some of the goals of Web 2.0 – replies, likes, reposts, follows, and views are all native parts of how communities are built on the web today. I don’t think they are the problem. I just don’t think they are the end point.


  1. In truth, I think the feedback should impact what I write. If I knew that writing some R code on here got 10x the views and that they came almost entirely from people not following me, it’d be a pretty good sign that it would be worth making it easier to just follow that content from me. Not everyone needs to read the “personal” part of this blog, and I often want to “follow/subscribe” to an intersection of a person and some topics they care about and not have to read everything someone writes. That has been the best and worst part of social media consumption– you’re stuck with the whole person, every time. ↩︎

January 1, 2023

This month, I will be corresponding with Robb Knight. He can be found on Micro.blog at @rknightuk.

Hi Jason,

We have only interacted briefly on Micro.blog so I figured I should start by introducing myself. I’m a 30-something developer working on software for the property industry. I live with my partner, Jess, and two cats in Portsmouth on the south coast of the UK.

We have spent the past 12 months decorating and redoing every room in our house - the previous owners lived here since it was built in 1971 and hadn’t done any work to it since then. This involved me learning a whole set of new skills like floor laying, wallpapering, and fitting new skirting boards (baseboards for Americans).

In July we found out my partner was pregnant with a girl and she is due in March 2023. This accelerated the timeline of getting the house finished but we are now ready for her arrival at least in terms of furniture and the nursery. Mentally ready? I’m not so sure.

Look forward to hearing from you, Robb

Hi Robb,

First of all, congratulations on pending fatherhood! I’m glad we were able to slip in our month of correspondence before the pending sleep deprivation.

What an exciting and busy year. Even though we moved into our home 5 years ago (and it was new construction), I still feel like we need to keep decorating and redoing. Our work has been less skills-based and more “accumulating more stuff than I am comfortable owning”-based, since our new(ish) home is much larger than the 700 square feet we lived in previously. I have always found that I have ambitions of being handy in theory, but mostly fail when it comes to applying that ambition. At this stage, my partner Elsa just pays people to do things before telling me they’ve gone wrong or haven’t happened.

I am curious, what room or project are you most proud of? I’m not quite “done”, but pretty close to having my office set up how I’d like. It was a big pandemic project since we got rid of the company office right away. Having my own space has changed my whole relationship with my home.

I took a peek at the work you do and it’s fascinating. I have actually discussed this area (home management, focused on home inspections in the US followed by “asset management” and warranty support and the like) with my work partner multiple times as an idea to pursue. The intersection of home-renovation and your work must have been an interesting exercise. I’d be curious what you’ve learned managing your house that surprised you or changed your perspective on the work you do day to day.

Thanks for your participation in Letters. I’m already enjoying this project, and I hope others will as well.

Jason

December 31, 2022

I read fewer books and fewer pages this year than last year. That’s ok– 2022 saw the conclusion of the Scholomance Trilogy and The Founders Trilogy, both of which ended in deeply emotionally satisfying ways. It also saw the continuation of the Checquy novels, Dan Moren’s Galactic Cold War, the Nsibidi Scripts, and more, which all had strong entries.

I enjoyed everything I read this year, but I’m not sure that anything was truly a standout. Robert Jackson Bennett has now had two trilogies in a row that I adore and felt stuck the landing. Naomi Novik was already a favorite with both Uprooted and Spinning Silver, but the Scholomance books have cemented her alongside Robert Jackson Bennett, NK Jemisin, Adrian Tchaikovsky and Becky Chambers as “writers I will buy sight unseen until they prove otherwise.”

I still have a lot of sequels to catch up on, including by some authors in my “must read” list, so I expect 2023 to be off to a quick start. I’ll stick with my goal of 40 books, because that seems to be about right in terms of level of “challenge”, though I still wish I could ramp up to 52 a year.

I listened to a few audiobooks this year (non-fiction) that I continue not to track (and a couple of non-fiction books). I really miss iTunes University lectures, so I think I’m going to try and find more lectures to listen to in place of podcasts next year.

This Year in Reading

Upgrade: A Novel by Blake Crouch
Binti (Binti, 1) by Nnedi Okorafor
Lies Sleeping by Ben Aaronovitch
Station Eternity by Mur Lafferty
Amongst Our Weapons by Ben Aaronovitch
A Prayer for the Crown-Shy (Monk & Robot, 2) by Becky Chambers
Midnight Riot by Ben Aaronovitch
Akata Woman (The Nsibidi Scripts) by Nnedi Okorafor
Hunter's Trail (Scarlett Bernard) by Melissa F. Olson
Trail of Dead (Scarlett Bernard) by Melissa F. Olson
Dead Spots (Scarlett Bernard) by Melissa F. Olson
Blitz: A Novel (The Rook Files, 3) by Daniel O'Malley
Gallant by Victoria Schwab
The City of Dusk by Tara Sim
The Golden Enclaves: A Novel (The Scholomance) by Naomi Novik
City of Bones by Martha Wells
Plague Birds by Jason Sanford
The Nova Incident: The Galactic Cold War Book III by Dan Moren
Light From Uncommon Stars by Ryka Aoki
The Immortal King Rao: A Novel by Vauhini Vara
Locklands: A Novel (The Founders Trilogy) by Robert Jackson Bennett
Machinehood by S.B. Divya
False Value (Rivers of London) by Ben Aaronovitch
The Hanging Tree by Ben Aaronovitch
Whispers Underground by Ben Aaronovitch
Foxglove Summer by Ben Aaronovitch
Broken Homes by Ben Aaronovitch
Moon Over Soho by Ben Aaronovitch
Fevered Star by Rebecca Roanhorse
Beautiful Country: A Memoir by Qian Julie Wang
House of Sky and Breath by Sarah J. Maas
Hollow Kingdom by Kira Jane Buxton
The Language of Power (Steerswoman Series) (Volume 4) by Rosemary Kirstein
The Untold Story (The Invisible Library Novel) by Genevieve Cogman
Two Serpents Rise (Craft Sequence, 2) by Max Gladstone
Elder Race by Adrian Tchaikovsky

December 29, 2022

If I took a picture of every sign I loved in CDMX, I would never get to my destination walking. Here are a few I stopped and snapped, but far from the “best” or only ones that I liked.

December 19, 2022

Sometimes it pays to stare, to sit with ideas, to think, to see all the cards laid out in front of you for days until a coherent narrative appears. For me, it was looking at a pile of disconnected features and ideas and being unable to figure out how they did or did not fit together.

After over a week of coming back to various lists and filling in the details, a bigger thematic picture emerged.

I now know what stays and what goes and how to help my team make decisions along the way without me.

The work that’s left to do is all about being Consistent and Complete.

Over time our product has left behind a bunch of small improvements that are obvious to us and our users, but for various reasons were cut from scope with the initial feature implementation. And as our application has grown and matured, we’ve sometimes developed better user experiences and/or better components and patterns in our implementations to achieve similar ends. We need to pick up after ourselves, finish our rough edges, and use our best implementation everywhere, every time.

Now to convince the leadership team, and prove that these guiding principles can be used to say “no” to some things and “yes” to others.

December 17, 2022

The social web really started as an easy, consistent way for someone without technical skills to build a homepage about themselves.

As blogging took over, the social web became an easy way to read and write in one place with a consistent experience.

I think we over consider the social/follow/respond/boost elements of these products. What these services and their client applications got right was where we read is where we write. Maybe NetNewsWire had it right when its early iterations had blogging features, versus later splitting MarsEdit into its own application.

What Google Reader1 had right was commenting and stars, adding a conversation and sharing layer to the web. It lacked the blogging tools for a more complete “write” layer, but it added an element of discovery/serendipity and a chance that what you write will be seen.

That’s why reading and writing in the same interface is so powerful— there’s a sense that everyone else participating in reading and writing on the web might interact with what you have to say. Google Reader created a global comment section, filtered to the people you chose. Twitter functioned much the same way.

All of this could live via protocols and without centralization, whether through Webmentions + RSS + Micropub/MetaWeblog or ActivityPub. But easy reading and writing in one place, with a strong product vision, and just the right amount of serendipity is the magic formula.

The power of RSS (and feeds in general) is you can build a reading product on top of the whole web, and anything you write is also available to the rest of the web. The power of domains is owning your identity (or identities, real or pseudonymous). The power of webmentions and ActivityPub Notifications is visibility, with your response and links communicating with the original content that someone has read it and responded.
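
To make the “reading product on top of the whole web” point concrete, here is a minimal sketch of reading items out of an RSS 2.0 feed with Python’s standard library. The sample feed, its URLs, and the `read_items` helper are invented for illustration; a real reader would fetch each subscribed feed over HTTP and handle Atom, malformed XML, and much more.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed contents, inlined so the example runs without a network.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def read_items(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in read_items(SAMPLE_FEED):
        print(f"{title}: {link}")
```

Because the format is the same for every site that publishes a feed, the same few lines work against any of them, which is what lets a reading product sit on top of the open web.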

Maybe the one missing piece from the existing set of web protocols is discovery. Domain as identity sidesteps discovery, and I think this is where services have excelled. They create a single, standardized index for finding people who share your interests (or who you already know). I wonder if standard About or Now pages with appropriate metadata could make indexing and discovery more consistent.


  1. You knew I’d bring this up, right? ↩︎

November 28, 2022

I have decided to work on a new project on this blog in 2023. This is risky, because I need other people to participate and I came up with the idea this morning. So this serves as both an announcement and a call for participants.

Letters will involve me corresponding with someone else on the internet over the course of a month. Each week, we will each write a letter to each other. There are no set topics. The rules will be:

  1. The person I’m corresponding with will write the first letter.
  2. I will respond during the same week. They do not have to write again until the next week.
  3. Each letter will be at least 250 words.
  4. I will post the correspondent’s letter followed by my response on my blog. If they have a blog, they can do the same and I will gladly link to them.

If you’d be interested in participating in Letters, email me at hello@jbecker.co and provide me with two months in 2023 that could work for you. Feel free to link me to your personal webpage or provide a short bit about yourself (but no pressure on any idea of topics).

Why am I interested in this project? I was thinking about how much of our history (in the West at least) comes from important figures having extensive private correspondences that were saved, catalogued, and released after their deaths. And while I’d love some private pen pals, it just got me thinking that public letters are a rich way to discuss complex issues. What’s the point of having a blog and not restricting my online presence and interactions to 280 characters if not engaging in richer, complex conversation?

My favorite world online was that of personal blogs and journals in conversation with each other. I’m hoping Letters can jump start that for me. I’m also hoping to build some deeper relationships online with folks who are in my orbit enough to see this post, but I don’t really feel like I know them in a meaningful way.

So that’s the idea for 2023. I hope some of you will be interested in participating so that I don’t have to spend late December scrambling to convince folks to write a letter to me each week or cancel the project altogether.

Let’s make something fun on the internet.

I already have one confirmed, and three “maybes” for my Letters project.

Although I was not inspired by Substack, who it turns out I was ripping off, I thought I should share what set me on this train of thought.

Jess shared a tweet thread by Michelle Huang about training GPT-3 on her diary. The result was the ability to have what Michelle felt were uncanny conversations with her younger self. It was an interesting example of using what I normally deem to be pretty creepy machine learning in a therapeutic context. As per usual for me, instead of coming away with thinking about all the things I bet I was supposed to think about, my takeaway was this– I don’t have much, if any, writing from my younger self. What journaling or writing I did is all lost to deleted LiveJournals, bulletin board forums, blogs, or hard drives erased and dumped. I don’t often mourn for this material, but this was the first time in a long while I thought, “If I knew about this possible future application, I might have saved more of what I wrote, even if it was just somewhere for me.”

A few hours later, I read about yet another famous, respected actor defending their colleagues who hold disgusting views. “We can’t ban people!” or some such “anti-woke” nonsense was their response, I believe. And my thought was, “One of the ways that our modern media climate is a mistake has been the direct access to artists.” Of course, this isn’t really true; direct access has enabled entirely new ways to get paid to do art and build a large audience. But my thought stemmed from the idea that it is so much harder to separate the art from the artists these days. We have an all too direct line to the thoughts and feelings of famous people we admire, whether through social media or the expectations that they will speak to traditional media. And I wondered, would we be better off if all most of us ever knew of the artist was their work and how we understand it, at least until their death when their records and letters are released revealing all of their horrifying and deplorable beliefs? The thought was that I am almost more comfortable letting history understand people posthumously, while letting those of us who are the artist’s contemporaries experience only their output.

It is so much harder to believe the artist is dead when they won’t shut up.

It was these two thoughts, disconnected by several hours, that had me thinking about letters. I had the distinct idea of correspondence as this important way that people are revealed to us when they are gone. This also reminded me of the so-called Republic of Letters, which seemed to have been en vogue to reference during the “Will he or won’t he?” period of Elon Musk’s Twitter purchase.

And so I thought, “In some ways, my blog represents my correspondence.” I publish on my own site to be in control of my content. This is the largest and most personal repository of my writing1. It would be better if this blog was in conversation with others. That’s a part of the early web that I miss. We still link to other people, but rarely do I find blogs in conversation with each other.

Letters come from this.


  1. Ok, to be honest, I did start journaling a few years ago, although a lot of the content is on this site as well. But I’m not sure I’d ever want someone to have access to the unpublished entries in my Day One. That’s a level of intimacy that feels almost profane. ↩︎

November 14, 2022

Caveat— I’ve been outside of thinking about the big education philosophy stuff for some time and this may well be a repackaging of a core debate I read tons about and forgot.

This morning I was thinking about whether we need the schools that prepare kids for the world we have or the world we want to create. I think a fair amount of disagreement lives between these two positions. Then there’s the secondary conflict among those who want schools that prepare kids for the world we want to create, because they disagree with what that future world should be.

This came to mind reading an expert who felt it was important to look past whether a program is labeled as bilingual, English transition, or many incarnations in between when considering if an instructional program is empowering.

** This post included a twitter embed that is no longer available, because Elon Musk is a shit **

November 12, 2022

Artists have periods. During these periods, they create a certain kind of project and it runs its course. Then after a time of exploration, they often begin creating again, producing a very different kind of art. Sometimes, they never make art again.

That seems almost impossible as a “creator” online today. The business model is fundamentally about broadcasting, in volume, with consistency.

Maybe this is more specific to “influencers”, but I don’t think so. Think of educational video makers on YouTube. Think of podcasters on episode 500. Think of paid newsletter writers. Think of most blogs.

I think these projects need to have an end, or at least ending needs to be an easier option. There needs to be a healthy way to transition away when a project is done and the creativity has been thoroughly wrung out. I am not even sure we know how to talk about these things ending and let folks move on in a healthy way.

I can think of some prominent projects (in my corner) that came to an end. Every Frame a Painting. Minimal Mac. Hypercritical. But ending is the exception, and not the rule.

Being able to mark something as complete and move on, much like creating something ephemeral, feels counter to our current cultural technology on the web.

November 11, 2022

For years, large crypto-based companies have been living on a huge hype machine. Whenever folks tried to understand where exactly any value was being created it always came down to one thing.

You reach the point of saying, “That sounds like a total scam in these ways!” The response to this was always, “You just don’t understand this, it’s very complicated and incredibly smart people can see what’s going on. Just look at how much this coin1 is trading for!” Then you have to decide for yourself:

  • Am I smart enough to understand this?
  • The social proof is so significant, I may not understand this, or;
  • I believe the social proof above my own understanding, and I’m getting on board.

I am here to tell you that the first choice was always right, regardless of the incredibly wealthy people who decided to put significant, personal financial interest into convincing you that the second or third options were correct.

It’s just that now, it seems everyone is finally coming around to seeing they were duped.

Unbelievable that FTX and Binance are valued on the basis of crypto assets they created with absolutely no use other than a speculative circle jerk.


  1. There’s always some *coin, and it is always traded, and its value is always supposed to be something other than trading for the coin, but no one can ever really tell you what that is. ↩︎

November 9, 2022

I don’t really follow 538 for my mental health, but I just saw that Nate Silver wrote some pieces as Nathan Redd and Nathan Bleu to provide two different perspectives for how things might go. I think that’s a great way to help people understand the probabilistic nature of modeling.

Consumers of popular media still struggle with 538 or with NYTimes and their needles on election night. It’s not because this form of journalism is not valuable or capturing important information. It’s because thinking in models and probabilities is still deeply unnatural for most folks.

In the practice of data visualization, there’s been all manner of attempts to visualize uncertainty. We add error bars. We add shaded regions. We play with jittering, simulations, alpha channels, and drawing curves. Folks have built simulators to demonstrate drawing from distributions or the structure of joint-probabilities and such.
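
A tiny simulation shows the gap between a point estimate and the distribution behind it. This is only a sketch with made-up numbers: it assumes a candidate’s forecast vote share is normally distributed around 52% with a 3-point standard error, which is not how 538 or any real model works, and `simulate_win_probability` is a name I invented for the example.

```python
import random


def simulate_win_probability(
    mean_share: float,
    std_error: float,
    n_sims: int = 10_000,
    seed: int = 42,
) -> float:
    """Estimate how often a candidate clears 50% of the vote.

    Each simulated election draws one vote share from a normal
    distribution centered on the point estimate (an assumption
    made purely for illustration).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    wins = sum(
        1 for _ in range(n_sims) if rng.gauss(mean_share, std_error) > 0.5
    )
    return wins / n_sims


if __name__ == "__main__":
    probability = simulate_win_probability(mean_share=0.52, std_error=0.03)
    print(f"Point estimate: 52% of the vote; simulated chance of winning: {probability:.0%}")
```

Under these assumptions the “leader” still loses in a meaningful share of the simulated elections, which is exactly the part a single headline number hides.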

But as in the world of policy research, even the most sophisticated folks are drawn strongly to point estimates.

Unfortunately, the idea of “getting it right” based on the point estimate remains a strong measure of success in the eyes of many. And deviations are not met with academic curiosity, examination of the data, or consideration of our own failing heuristics. People are angry because they rely on their perceptions, even when their perceptions are filled with unearned certainty built on the back of common failed heuristics.

When the point estimates have disappointed, too many folks have run back to journalism and commentary built entirely on vibes or declared polling and data journalism a failure. The post-mortem analysis on election modeling, which has been fascinating, thorough, and revealing over the last few years, was fully recast (unnecessarily) as an apology tour and still, we cling to the certainty of point estimates.

So Redd versus Bleu is a great idea. All of the visualizations, numbers, and technical explanations in the world have not provided the average media consumer with strong enough skills to interpret data with uncertainty. Instead of going back to the vibes op-ed to understand what’s going on, experts with the right skills for statistical inference can model how to understand the data from differing perspectives. Sure, it might trigger feelings of Lies, Damn Lies, and Statistics, but uncertainty is the reality we live in and the muck we have to understand.

I don’t always love Nate Silver, and I don’t always love 538, and I don’t always love how data journalism has shaped coverage of elections. But I think modeling statistical inference and interpretation of uncertain data from multiple perspectives in narrative form is an important tool. I hope to see this expand into more policy discussion space.

I think we’ve taken visualization and simulation as far as we can go for helping people to understand data. The next frontier is making the narrative steel man argument, from data, for different possible interpretations.

November 6, 2022

Eight years ago today, 6 months into my Allovue journey, I came down to Baltimore for a party in our small office above a bar. The party started at 6, and Ted and I still hovered over a laptop at 6:15pm. We excitedly called Jess over to show her– we just fully loaded our first set of general ledger accounts and transactions into Balance. It was, I hope, her favorite birthday present that year.

I’d be lying if I said that I knew that day was a key milestone in a life-defining adventure and partnership. It just seemed like a cool problem we solved.