Jason Becker
April 16, 2015

An initial proposal has been made to the city of Providence and state of Rhode Island to keep the PawSox in Rhode Island and move them to a new stadium along the river in Providence.

The team is proposing that they privately finance all of the construction costs of the stadium while the land remains state (or city? I am not clear) owned. The state will lease the land underneath the stadium (the real value) with an option to buy for 30 years at $1 a year. The state will also pay $5,000,000 rent for the stadium itself annually for 30 years. The PawSox will then lease back the stadium at $1,000,000 per year. The net result will be the stadium is built and Rhode Island pays the PawSox owners $4,000,000 a year for 30 years.

The Good

Privately financing the upfront cost of the stadium puts the risks of construction delays and cost overruns on the PawSox. Already they are underestimating the cost of moving a gas line below the park grounds. Whatever the cost of construction, whatever the impact on the team of a late opening, the costs to the state are fixed. There is essentially no risk in this plan for taxpayers, using risk in the technical sense of uncertainty. We know what this deal means: $120,000,000 over 30 years.

The interest rate is pretty low. Basically, although the risk is privatized, we should view this stadium as the PawSox providing the state of Rhode Island a loan of $85,000,000, which we will pay back at a rate of approximately 1.15%. 1 Now just because the interest is low doesn’t mean we should buy…

The stadium design is largely attractive, even if the incorporated lighthouse is drawing ire. I don’t mind it, but I do like the idea of replacing it with an anchor, as some Greater City Providence commenters have recommended. Overall, I think the design fits with the neighborhood. Then again, it’s easy to get caught up in pretty renderings.

The pedestrian bridge remains and is accessible. As someone who lives in Downcity, I am very much looking forward to this dramatic improvement to my personal transit. I think the bridge’s importance for transit is underrated, although admittedly we could make Point Street Bridge friendlier to pedestrians and bike riders instead.

Brown University seems interested in hosting events, like football games, at the stadium. The plan also seems to give the state a lot of leeway in holding various events in the space when it’s not used for the baseball season. It could really be a great event space from mid-April until early November each year.

The Bad

Even the team’s own economic estimates only foresee $2,000,000 in increased tax revenues. Although they claim this estimate is conservative, I would take that with a huge grain of salt. You do not lead with a plan that straight up says the taxpayers will be out $60,000,000 over 30 years unless you don’t have a better foot to put forward. I am going to go ahead and assume this estimate is about right. It’s certainly in the ballpark. (Ugh.)

But what that means is that Rhode Islanders should understand this is not an investment. This is not like building transit infrastructure or tax stabilization agreements to spur private construction. This deal is more akin to building schools. We do not, and in fact cannot, expect the economic impact to make this project a net positive for revenues. With $12,000,000 expected in direct spending, the project could be net positive for GDP, but even then it is obvious this is not the best annual investment to grow the economy. It is easy to come up with a laundry list of projects that cost less than this and could create more economic activity and/or more revenue for the state and city.

Therefore, the project should be viewed primarily on use value. Will Rhode Islanders get $4,000,000 a year in value from the pleasure of using (and seeing) this stadium and its surrounding grounds? In school construction, we expect the benefits to be short-term job creation, long-term impacts on student health and well-being, ability to learn, and our ability to attract quality teachers. But most of those benefits are diffuse and hard to capture. Ultimately, we mostly support school construction because of the use benefits the kids and teachers see each year.
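To spell out where that $60,000,000 figure comes from: the state pays a net $4,000,000 a year for 30 years while taking in roughly $2,000,000 a year in new revenue, so $(\$4{,}000{,}000 - \$2{,}000{,}000) \times 30 = \$60{,}000{,}000$.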

The timeline is crazy. If they’re serious about a June decision, they’re nuts. We have a complicated budget process ongoing right now. We have a teacher contract in Providence to negotiate. We have a brand new I-195 Commission trying to make its mark and get cranes in the sky. There’s no way a negotiation in good faith can be completed in 60 days unless one side simply agrees to every counter. If the timeline effectively makes this a “final best offer,” then it is disingenuous.

What happens in 30 years? We don’t have any guarantees of being whole in 30 years, and the same threats and challenges posed by the PawSox today will come up again in 30 years. Are we committed to a series of handouts until the team is of no monetary or cultural value?

Other cities are likely going to come into play. The PawSox don’t have to negotiate a deal that’s fair for Rhode Island. They just have to negotiate a deal that’s comparable to an offer they think someone else will make. Rhode Island’s position is weak, provided that anyone else is willing to make a deal.

The Strange

The PawSox are asking for a 30-year property tax exemption. There’s a lot to think through here. First, there are at least two parcels that were meant to be tax-generating that are a part of this plan– the land Brown’s Continuing Education building currently sits on and the small developable parcel that was cut out from the park for a high value hotel or similar use. The stadium wants both of these parcels in addition to the park. I think City Council President Aponte is being a bit silly talking about being “made whole” over this deal, unless he’s talking about those two parcels. The park land was never going to generate city tax revenue and was actually going to cost the city money to maintain. Part of my openness to any proposal on this park land is my lack of confidence that the city will invest appropriately to maintain a world-class park space along the waterfront. There’s very little “whole” to be made.

It is also possible that Providence will have to designate additional park space if the stadium is built. If that’s true and it’s coming off the tax rolls, then the PawSox absolutely should have to pay property taxes, period. There’s one possible exception I’ll address below…

I also feel very strongly about having a single process for tax stabilization across all I-195 land that is not politically driven but instead a matter of administrative decision. Exceptions for a big project break the major benefit of a single tax stabilization agreement ruling all the I-195 land, which is our need to send a signal that all players are equal, all developers are welcome, and political cronyism is not the path required to build. While some of those $2,000,000 in tax benefits will accrue to Providence through increased surrounding land value, many costs associated with the stadium will as well. There are police details, road wear and tear, fire and emergency services, and more to consider.

My Counter

I don’t think this deal is dead, but I am not sure that the PawSox, city, or state would accept my counter. I have struggled with whether I should share what I want to happen versus what I think a deal that would happen looks like. I would be tempted to personally just let the PawSox walk. But if Rhode Island really wants them to stay, here’s a plausible counter:

  1. The PawSox receive the same tax stabilization agreement all other developers get from the city of Providence. Terms for a fair valuation of the property, derived from some portion of an average of annual revenues, are agreed upon up front.
  2. The lease terms should be constructed such that the net cost (excluding the anticipated increase in tax receipts) is equal to the tax dollars owed to the city of Providence. Therefore, the state essentially pays for the $85,000,000 of principal and the city taxes. This could be through a PILOT, but I’d prefer that amount go to the PawSox and the PawSox transfer the dollars to the city. It’s just accounting, but I prefer the symbol of them paying property taxes. I don’t think it’s a terrible precedent for the state to offer PILOT payments to cover a gap between the city TSA in I-195 and a developer’s ask, if the state sees there is substantial public interest in that investment, but it is still better to actually get developers used to writing a check to the city.
  3. If the city has to make additional green space equivalent to the park we are losing, I foresee two options. The first is the PawSox paying the full load on whatever that land value is. The second is probably better, but harder to make happen. Brown should give up the Brown Stadium land to the city. They can make it into a park without reducing the footprint of taxable property in the city. If they did this, Brown should essentially get free use of the stadium with no fees (except police detail or similar that they would pay for their games on the East Side) in perpetuity. They should get first rights after the PawSox games themselves.
  4. The stadium itself will be reverted to ownership by the Rhode Island Convention Center Authority if the option to buy the land is not exercised in 30 years. This way the whole stadium and its land are state owned, since the state paid for it. The possible exception would be if Brown has to give up its stadium to park land, in which case I might prefer some arrangement be made with them.
  5. The PawSox ownership agrees to pay a large penalty to the state and the city if they move the team out of Rhode Island in the next 99 years.
  6. PawSox maintenance staff will be responsible for maintaining the Riverwalk park, stadium grounds, and the greenway that has been proposed for the I-195 district. Possibly we could expand this to something like the Downcity Improvement District (or perhaps just have them pay to expand the DID into the Knowledge District). This will help ensure the project creates more permanent jobs and reduces costs to the city for maintaining its public spaces that contribute to the broader attractiveness of the stadium.
  7. There should be a revenue share deal for any non-PawSox game events with the city and/or state for concession purchases and parking receipts.
  8. The stadium should not be exempt from future TIF assessments for infrastructure in the area.

I am not sure that I would pay even that much for the stadium, but this would be a far better deal overall. I can absolutely think of better ways to spend state dollars, but I also realize that the trade-off is not that simple. Rhode Island is not facing a windfall of $85,000,000 and trying to decide what to do with it. A stadium that keeps the PawSox in Rhode Island inspires emotion. The willingness to create these dollars for this purpose may be far higher than for alternative uses.

The correct counterfactual is not necessarily supporting 111 Westminster (a better plan for less). It is not necessarily better school buildings. It is not necessarily meaningful tax benefits for rooftop solar power. It is not lowering taxes, building a fund to provide seed capital to local startups, a streetcar, dedicated bus and/or bike lanes, or tax benefits to fill vacant properties and homes. The correct counterfactual could be nothing. It could be all of these things, but in much smaller measure. It is very hard to fully evaluate this proposal because we are not rational actors with a fixed budget line making marginal investment decisions.

Ultimately, with big flashy projects like this, I lean toward evaluating them on their own merits. Typically, and I think this case is no exception, even evaluating a stadium plan on its own merits without considering alternative investments makes it clear these projects are bad deals. Yet cities and states make them over and over again. We would be wise to look at this gap in dollars and cents and our collective, repeated actions not as fits of insanity but instead as stark reminders of our inability to simply calculate the total benefits that all people receive.

In my day job, I get to speak to early stage investors. There I learned an important tidbit– a company can name whatever valuation they want if an investor can control the terms. That’s my feeling with the PawSox. The cash is important, it’s not nothing. But any potential plan should be judged by the terms.

Here’s hoping Rhode Island isn’t willing to accept bad terms at a high cost.


  1. $A = P\left(1 + \frac{r}{n}\right)^{nt}$ where $A = \$120{,}000{,}000$, $P = \$85{,}000{,}000$, $n = 1$, and $t = 30$. I’ll leave you to the algebra. ↩︎
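For the impatient, solving that expression for $r$ with $n = 1$ gives $r = (A/P)^{1/t} - 1 = (120{,}000{,}000 / 85{,}000{,}000)^{1/30} - 1 \approx 0.0116$, in the neighborhood of the 1.15% figure above.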

January 22, 2015

I keep this on my desktop.

Install:

brew install postgresql
initdb /usr/local/var/postgres -E utf8
gem install lunchy
### Start postgres with lunchy
mkdir -p ~/Library/LaunchAgents
cp /usr/local/Cellar/postgresql/9.3.3/homebrew.mxcl.postgresql.plist ~/Library/LaunchAgents/

Setup DB from SQL file:

### Setup DB
lunchy start postgres
createdb $DBNAME
psql -d $DBNAME -f '/path/to/file.sql'
lunchy stop postgres

Starting and Stopping PostgreSQL

lunchy start postgres
lunchy stop postgres

may run into trouble with local socket… try this:

rm /usr/local/var/postgres/postmaster.pid

Connecting with R

# make sure to run `lunchy start postgres` in a terminal first
require(dplyr)
db <- src_postgres(dbname = "DBNAME")  # replace DBNAME with your database name
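Once db exists, dplyr can reference tables in the database lazily. As a minimal sketch (the table name here is hypothetical):

students <- tbl(db, "students")  # lazy reference, nothing pulled yet
collect(students)                # pull the rows into a local data frame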

Inspired by seeing this post, I thought I should toss out what I do.

January 10, 2015

Severing My Daemon

When I was in high school, I piggy-backed on a friend’s website to host a page for my band. We could post pictures, show locations and dates, lyrics, and pretend like we produced music people cared about. It was mostly a fun way for me to play with the web and something to show folks when I said I played guitar and sang in a band. One day, my friend canceled his hosting. He wasn’t using his site for anything and he forgot that I had been using the site. I was 18, I never thought about backups, and I had long deleted all those pesky photos taking up space on my memory cards and small local hard drive.

Four years of photos from some of the best experiences of my life are gone. No one had copies. Everyone was using the site. In the decade since, no set of pictures has ever been as valuable as the ones I lost that day.

Who controls the past…

As you can imagine, this loss has had a profound effect on how I think about both my data and the permanence of the internet. Today, I have a deep system of backups for any digital data I produce, and I am far more likely to err on keeping data than discarding it. Things still sometimes go missing. 1

Perhaps the more lasting impact is my desire to maintain some control over all of my data. I use Fastmail for my email, even after over 10 years of Gmail use. 2 I like knowing that I am storing some of my most important data in a standard way that easily syncs locally and backs up. I like that I pay directly for such an important service so that all of the incentive for my email provider is around making email work better for me. I am the customer. I use Bittorrent Sync for a good chunk of my data. I want redundancy across multiple machines and syncing, but I don’t want all of my work and all of my data to depend on being on a third party server like it is with Dropbox. 3 I also use a Transporter so that some of my files are stored on a local hard drive.

Raison D’être

Why does this blog exist? I have played with Tumblr in the past and I like its social and discovery tools, but I do not like the idea of pouring my thoughts into someone else’s service with no guarantee of easy or clean exit. I tried using Wordpress on a self-hosted blog for a while, but I took one look at the way my blog posts were being stored in the Wordpress database and kind of freaked out. All those convenient plugins and short codes were transforming the way my actual text was stored in a hard-to-recover way. Plus, I didn’t really understand how my data was stored well enough to be comfortable I had solid backups. I don’t want to lose my writing like I lost those pictures.

This blog exists, built on Pelican, because I needed a place to write my thoughts in plain text that was as easy to back up as it was to share with the world. I don’t write often, and I feel I rarely write the “best” of my thoughts, but if I am going to take the time to put something out in the world I want to be damn sure that I control it.

Bag End

I recently began a journey that I thought was about simplifying tools. I began using vim a lot more for text editing, including writing prose like this post. But I quickly found that my grasping for new ways to do work was less about simplifying and more about better control. I want to be able to work well, with little interruption, on just about any computer. I don’t want to use anything that’s overly expensive or available only on one platform if I can avoid it. I want to strip away dependencies as much as possible. And while much of what I already use is free software, I didn’t feel like I was in control.

For example, git has been an amazing change for how I do all my work since about 2011. Github is a major part of my daily work and has saved me a bunch of money by allowing me to host this site for free. But I started getting frustrated with limitations of not having an actual server and not really having access to the power and control that a real server provides. So I recently moved this site off of Github and on to a Digital Ocean droplet. This is my first experiment with running a Linux VPS. Despite using desktop Linux for four years full time, I have never administered a server. It feels like a skill I should have and I really like the control.

Quentin’s Land

This whole blog is about having a place I control where I can write things. I am getting better at the control part, but I really need to work on the writing things part.

Here’s what I hope to do in the next few months. I am going to choose (or write) a new theme for the site that’s responsive and has a bit more detail. I am probably going to write a little bit about the cool, simple things I learned about nginx and how moving to my own server is helping me run this page (and other experiments) with a lot more flexibility. I am also going to try and shift some of my writing from tweetstorms to short blog posts. If I am truly trying to control my writing, I need to do a better job of thinking out loud in this space versus treating those thoughts as disposable and packing them onto Twitter. I will also be sharing more code snippets and ideas and fewer thoughts on policy and local (Rhode Island) politics. The code/statistics/data stuff feels easier to write and has always gotten more views and comments.

That’s the plan for 2015. Time to execute.


  1. I recently found some rare music missing that I had to retrieve through some heroic efforts that included Archive.org and stalking someone from an online forum that no longer exists (successfully). ↩︎

  2. I was a very early adopter of Gmail. ↩︎

  3. I still use Dropbox. I’m not an animal. But I like having an alternative. ↩︎

November 27, 2014

A few thoughts:

  1. This is a very interesting way to take advantage of a number of existing Amazon technologies–primarily their payment processing and review system.
  2. Services are an increasingly important part of the economy and are less subject to commoditization. This is Amazon dipping into a massive growth area by commoditizing discovery and payment. It also offloads some of the risk from both sides of the transaction. It’s very bold, possibly brilliant.
  3. If you have tried to find a reliable carpenter, electrician, plumber, house cleaning service, etc. lately, it should be obvious what value Amazon can provide. Even as a subscriber to Angie’s List, which has been invaluable, finding reliable, affordable, and quality services is still a frustrating experience.
  4. This is why technology companies get huge valuations. It is hard to anticipate just how the technologies built to become the first online bookseller would lead to a massive number of accounts with credit cards and a strongly trusted brand. It is hard to anticipate how book reviews and powerful search and filtering become the way you find people to come into your home and fix a toilet. But truly, it’s hard to anticipate the limits of a company with massive reach into people’s wallets that scales.

It has been said a thousand times before, but I feel the need to say it again. So much of what Star Wars got right was creating a fully realized, fascinating world. As much as the stunning visual effects that have largely stood the test of time were a part of that story, it is how Star Wars sounded that is most remarkable.

Watch that trailer. It has moments that look an awful lot like Star Wars– vast dunes in the middle of the desert, the Millennium Falcon speeding along, flipping at odd angles emphasizing its unique flat structure. But it also has a lot of elements that are decidedly modern and not Star Wars-like. 1 I think what’s most remarkable is I can close my eyes and just listen. Immediately I can hear Star Wars. The sounds of Star Wars are not just iconic, they are deeply embedded in my psyche and imbued with profound meaning.

The first time I had the opportunity to see Star Wars on the big screen was during the release of the “Special Editions”. There is nothing like hearing Star Wars in a theater.


  1. Shaky-cam is the primary culprit. ↩︎

November 18, 2014

Because of the primacy of equity as a goal in school finance system design, the formulas disproportionately benefit less wealthy districts and those with high concentrations of needier students. … because of the universal impact on communities, school finance legislation requires broad political buy-in.

I think it is worth contrasting the political realities of constructing school finance law with the need and justification for state funding of education in the first place.

The state is in the business of funding schools for redistributive purposes. If that weren’t required, there would be little reason not to replace an inefficient pass-through of state sales and income tax dollars to communities with lower state sales and income taxes and, in their place, local sales, income, and property taxes and fees. We come together as states to solve problems that extend beyond parochial boundaries, and our political unions exist to tackle problems we’re not better off tackling alone.

There are limits to redistributive policy. Support for the needs of other communities might wane, leading to challenges to and reductions in the rights of children through new laws or legal battles, serious political consequences for supporters of redistribution, and decreases in economic activity (in education, property values). These are real pressures that need to be combatted both by convincing voters and through policy success 1. There are also considerations around the ethics of “bailing out” communities that made costly mistakes like constructing too many buildings or offering far too generous rights to staff in contracts that they cannot afford to maintain. We struggle as policy experts not to create the opportunity for moral hazards as we push to support children who need our help today.

Policy experts and legal experts cannot set aside the needs of children today, nor can they fail to face the limits of support for redistribution or the risk of incentivizing bad adult behavior.


  1. I don’t doubt that support for redistributive policy goes south when it appears that our efforts to combat poverty and provide equal opportunities appear to fail, over and over again, and in many cases may actually make things worse. ↩︎

November 12, 2014

There are some basic facts about the teacher labor market that are inconvenient for many folks working to improve education. I am going to go through a few premises that I think should be broadly accepted and several lemmas and contentions that I hope clarify my own view on education resources and human capital management.

Teaching in low performing schools is challenging.

If I am looking for a job, all else being equal, I will generally not choose the more challenging one.

Some may object to the idea that teachers would not accept a position that offers a greater opportunity to make a difference, for example, teaching at an inner city school, over one that was less likely to have an impact, like teaching in a posh, suburban neighborhood. It is certainly true that some teachers, if not most, place value on making a greater impact. However, the question is how great is that preference? How much less compensation (not just wage) would the median teacher be willing to take to work in a more challenging environment?

I contend that it is atypical for teachers to accept lower compensation for a more challenging job. I would further suggest that even if there were a sufficient number of teachers to staff all urban schools with those that would accept lower compensation for a position in those schools, the gap in compensation that they would accept is low.

There are large gaps in non-pecuniary compensation between high performing schools and low performing schools that are difficult to overcome.

Let us suppose that it’s true there are large parts of the teacher workforce that would accept lower compensation (wage and non-wage) to teach in urban schools. There are real benefits to taking on a role where the potential for impact is great.

However, we can consider this benefit as part of the hedonic wages supplied by a teaching role. Other forms of non-monetary compensation that teachers may experience include: a comfortable physical work environment with sufficient space, lighting, and climate control; sufficient supplies to teach effectively; support and acceptance of their students, their families, and the broader school communities; a safe work environment; job security; alignment to a strong, unified school culture; and strong self-efficacy.

Some of these features could be easily replicated in many low performing schools. It is possible to have better quality physical schools and sufficient funding for supplies. Other features can be replicated, but not nearly as easily. Low performing schools where students have complex challenges inside and outside of the classroom are not environments where everyone has a strong sense of self-efficacy. Even the initial sense that making a difference is within reach erodes for many after facing a challenging environment day after day, year after year. A safe environment and a strong school culture are well within reach, but hardly easy and hardly universal. These things should be universal. They require funding, leadership, and broadly successful organizations.

The key is not that all high performing schools always have these features and no low performing schools can or do have these features. What is important is that many of these features are less often found in low performing, particularly urban schools.

I contend that the typical gap in non-pecuniary compensation between high and low performing schools is large enough to wipe out any negative compensating wage differential that may exist due to a desire for greater impact.

The primary mechanism to get “more” education is increasing the quality or quantity of teaching.

Let us take the leap of suggesting that teaching is a key part of the production of education. If we want to improve educational equity and address the needs of low performing schools, we need some combination of more and higher quality teaching. This is a key driver of policies like extended learning time (more), smaller class sizes (more), professional development (better), and teacher evaluation and support systems (better). It is what is behind improving teacher preparation programs (better), alternative certification (better), and progressive support programs like RTI (more and better).

November 1, 2014

November marks the start of National Novel Writing Month (NaNoWriMo). The quick version is folks band together and support each other to write 50,000 words in November.

I would love to write a novel one day. I am not sure I could do it well, but I am pretty sure I could hit 50,000-80,000 words if I dedicated time to tell a story.

I don’t have a story to tell.

So this year, I have decided to not feel guilty about skipping out on another NaNoWriMo (always the reader, never the author), and instead I am modifying it to meet my needs. With no story to tell and no experience tackling a single project the size of a novel, I am going to tackle a smaller problem– this blog.

Instead of 50,000 words in 30 days, I am going to try and write 1000 words a day for the next four weeks. I will not hold myself to a topic. I will not even hold myself to non-fiction. I will not hold myself to a number of posts or the size of the posts I write. I will not even hold myself to a true daily count, instead reviewing where I stand at the end of each week.

I am hoping that the practice of simply writing will grease my knuckles and start the avalanche that leads to writing more. A small confession– I write two or three blog posts every week that never leave my drafts. I find myself unable to hit publish because the ideas tend to be far larger or far smaller than I anticipate when I set out to write and share my frustrations. I also get nervous, particularly when writing about things I do professionally, about not writing the perfect post that’s clear, heavily researched, and expresses my views definitively and completely. This month, I say goodbye to that anxiety and start simply hitting publish.

I will leave you with several warnings.

  1. Things might get topically wacky. I might suddenly become a food blogger, or write about more personal issues, or write a short story and suddenly whiplash to talking about programming, education policy, or the upcoming election. If high volume, random topics aren’t your thing, you should probably unsubscribe from my RSS feed and check back in a month.
  2. I might write terrible arguments that are poorly supported and don’t reflect my views. This month, I will not accept my most common excuses for not publishing, which boil down to fear people will hold me to the views I express in my first-draft thinking. I am going to make mistakes this month in public and print the dialog I am having with myself. The voices I allow room to speak as I struggle with values, beliefs, and opinions may shock and offend. This month, this blog is my internal dialog. Please read it as a struggle, however definitive the tone.
  3. I am often disappointed that the only things I publish are smaller ideas written hastily with poor editing. Again, this month I embrace the reality that almost everything I write that ends up published is the result of 20 minutes of furious typing with no looking back, rather than trying to be a strong writer with a strong viewpoint and strong support.

I hope that by the end of this month I will have written at least a couple of pieces I feel proud of, and hopefully, I will have a little less fear of hitting publish in the future.

October 5, 2014

A terrible thing is happening this year. Women all across the internet are finding themselves the target of violence, simply for existing. Women are being harassed for talking about video games, women are being harassed for talking about the technology industry, women are being harassed for talking, women are being harassed.

A terrible thing is happening. Women are finding themselves the target of violence.

A terrible thing has always happened.


I remember being a 16 year old posting frequently on internet forums. One in particular focused on guitar equipment. I loved playing in a band, and I loved the technology of making guitar sounds. Many people on the forum were between 16 and 24, although it was frequented by quite a few “adults” in their 30s, 40s, and 50s. It was a wonderful opportunity to interact as an adult, with adults.

Every week members created a new thread where they posted hundreds of photos of women. Most of them were professional photographs taken at various night clubs as patrons entered. Some were magazine clippings or fashion modeling. I remember taking part, both in gazing and supplying the occasional photograph from the internet. We were far from the early days of the world wide web, this being around 2003, but this was also before social media matured and online identity was well understood by the general public.

This thread became controversial. A change from private to corporate ownership of this forum led to increased moderation, and the weekly post with photos of women was one of the targets.

I did not understand.

In the debates about the appropriateness of the content and its place within our online community, I took the side of those who wanted the post to remain alive. I was not its most ardent supporter, nor was I moved to some of the extremes in language and entitlement that typically surround these conversations. However, my views were clear and easy. These were public photographs, largely taken with permission (often for compensation). And, of course, none of the pictures were pornographic.

Appropriateness for me at 16 was defined by pornography. I did not understand.


My parents did not raise me to be misogynist. One of the most influential moments in my life came on a car ride to the dentist. I was also around 16 or 17. I think it was on my way to get my wisdom teeth removed. I had been dating the same girl for a while, and it was time for my father to give me the talk. All he said to me was, “Women deserve your respect.”

That was it.


We were in college, and my friends and I were all internet natives. We had used the web for over ten years. We grew up in AOL chatrooms and forums. The backwaters of the internet at this time shifted from Something Awful to 4Chan. This was the height of some of the most prolific and hilarious memes: lolcats, Xzibit, advice dogs (a favorite was bachelor frog, which seemed to understand our worst impulses expressed in only modest exaggeration).

There was also violence.

It was not uncommon to see names, phone numbers, and addresses that 4chan was supposed to harass because someone said so. Various subcultures seemed to be alternately mocked and harassed endlessly in the very place that had first embraced, supported, and connected people under the guise of radical anonymity. The most famous of the “Rules of the Internet” was Rule 34 – if you can think of it, there is porn of it – and its follow-up, Rule 35 – if you cannot find porn of it, you should make it. 4chan seemed determined to make this a reality. But really the most troublesome thing was the attitude toward women. Nothing was as unacceptable to 4chan as suggesting that women are anything but objects for the male gaze. In a place sometimes filled with radically liberal (if more left-libertarian than left-progressive) politics that would spawn groups like Anonymous, nothing brought out as much criticism as suggesting our culture has a problem with women.

My response was largely to fade from this part of the internet. I had only reached the point of being uncomfortable with this behavior. It would take more time for me to understand. It still felt like this was a problem of ignorant people.


I am rarely jealous of intelligence. I am rarely jealous of wealth. I am rarely jealous of experiences. What I am most often jealous of is what seems to me to be a preternatural maturity of others, particularly around issues of ethics and human rights.

Fully grappling with privilege is not something that happens over a moment, it is a sensitivity to be developed over a lifetime. We are confronted with media that builds and reinforces a culture that is fundamentally intolerant and conservative. There are countless microaggressions that are modeled everywhere for our acceptance as normal. It has taken me a decade of maturation, hard conversations, and self-examination to only begin to grow from fully complicit and participating in objectification of women to what I would now consider to be the most basic level of human decency.

The internet has gone from enabling my own aggression toward women to exposing me to a level of misogyny and violence that deeply disturbs and disgusts me, shattering any notion that my past offenses were harmless or victimless. The ugly underside of our culture is constantly on display, making it all the more obvious how what felt like isolated events on the “ok” side of the line were actually creating a space that supported and nurtured the worst compulsions of men.


I often think about my own journey when I see disgusting behavior on the internet. I wonder whether I am facing a deeply, ugly person or myself at 16. I try to parse the difference between naïvety, ignorance, and hate and to understand if they require a unique response.

Mostly, I struggle with what would happen if Jason Today spoke to Jason 16.

Jason 16 could not skip over a decade of growth simply for having met Jason Today. It took me conversations with various folks playing the role of Jason Today over and over again, year after year. I wish I believed there was another way to reach the Jason 16s out there. I wish I knew how to help them become preternaturally aware of their actions. All I know how to do is try to be compassionate to those who hate while firmly correcting, try to meet the heightened expectations I place on myself, try to apologize when I need to, and try to support those that seem more equipped to push the conversation forward.

Along this path, I never leapt to agreement so much as paused. Each time I heard a convincing point, I paused and considered. Growth came in a series of all too brief pauses.

Pauses are often private and quiet, their discoveries never on direct display.

If pauses are the best anyone can expect, then working to change our culture of violence toward women will rarely feel like much more than shouting at the void.

June 12, 2014

The Vergara v. California case has everyone in education talking. Key teacher tenure provisions in California are on the ropes, presumably because of the disparate impact on teacher, and therefore education, quality for students who are less fortunate.

I have fairly loosely held views about the practice of tenure itself and the hiring and firing of teachers. However, I have strongly held views that unions made a mistake with their efforts to move a lot of rules about the teaching labor market into state laws across the country. Deep rules and restrictions are better left to contracts, even from a union perspective. At worst, these things should be a part of regulation, which can be more easily adapted and waived.

That said, here are a collection of interesting thoughts on tenure post-Vergara:

John Merrow, reacting to Vergara:

Tenure and due process are essential, in my view, but excessive protectionism (70+ steps to remove a teacher?) alienates the general public and the majority of effective teachers, particularly young teachers who are still full of idealism and resent seeing their union spend so much money defending teachers who probably should have been counseled out of the profession years ago.

With the modal ‘years of experience’ of teachers dropping dramatically, from 15 years in 1987 to 1 or 2 years today, young teachers are a force to be reckoned with. If a significant number of them abandon the familiar NEA/AFT model, or if they develop and adopt a new form of teacher unionism, public education and the teaching profession will be forever changed.

San Jose Mercury News reporting on the state thwarting a locally negotiated change to tenure:

With little discussion, the board rejected the request, 7 to 2. The California Teachers Association, one of the most powerful lobbies in Sacramento, had opposed granting a two-year waiver from the state Education Code – even though one of the CTA’s locals had sought the exemption… …San Jose Teachers Association President Jennifer Thomas, whose union had tediously negotiated with the district an agreement to improve teacher evaluations and teaching quality, called the vote frustrating… San Jose Unified and the local teachers association sought flexibility to grant teachers tenure after one year or to keep a teacher on probation for three years.

The district argued that 18 months – the point in a teacher’s career at which districts must make a tenure decision – sometimes doesn’t allow time to fairly evaluate a candidate for what can be a lifetime job.

Now, Thomas said, when faced with uncertainty over tenure candidates, administrators will err on the side of releasing them, which then leaves a stain on their records.

Kevin Welner summarizing some of the legal implications of Vergara:

Although I can’t help but feel troubled by the attack on teachers and their hard-won rights, and although I think the court’s opinion is quite weak, legally as well as logically, my intent here is not to disagree with that decision. In fact, as I explain below, the decision gives real teeth to the state’s Constitution, and that could be a very good thing. It’s those teeth that I find fascinating, since an approach like that used by the Vergara judge could put California courts in a very different role —as a guarantor of educational equality—than we have thus far seen in the United States… …To see why this is important, consider an area of education policy that I have researched a great deal over the years: tracking (aka “ability grouping”). There are likely hundreds of thousands of children in California who are enrolled in low-track classes, where the expectations, curricula and instruction are all watered down. These children are denied equal educational opportunities; the research regarding the harms of these low-track classes is much stronger and deeper than the research about teachers Judge Treu found persuasive in the Vergara case. That is, plaintiffs’ attorneys would easily be able to show a “real and appreciable impact” on students’ fundamental right to equality of education. Further, the harm from enrollment in low-track classes falls disproportionately on lower-income students and students of color. (I’ll include some citations to tracking research from myself and others at the end of this post.)

Welner also repeats a common refrain from the education-left that tenure and insulating teachers from evaluations is critical for attracting quality people into the teaching profession. This is an argument that the general equilibrium impact on the broader labor market is both larger in magnitude and in the opposite direction of any assumed positive impacts from easier dismissal of poor performing teachers:

This more holistic view is important because the statutes are central to the larger system of teacher employment. That is, one would expect that a LIFO statute or a due process statute or tenure statute would shape who decides to become a teacher and to stay in the profession. These laws, in short, influence the nature of teaching as a profession. The judge here omits any discussion of the value of stability and experience in teaching that tenure laws, however imperfectly, were designed to promote in order to attract and retain good teachers. By declining to consider the complexity of the system, the judge has started to pave a path that looks more narrowly at defined, selected, and immediate impact—which could potentially be of great benefit to future education rights plaintiffs.

Adam Ozimek of Modeled Behavior:

I can certainly imagine it is possible in some school districts they will find it optimal to fire very few teachers. But why isn’t it enough for administrators to simply rarely fire people, and for districts to cultivate reputations as places of stable employment? One could argue that administrators can’t be trusted to actually do this, but such distrust of administrators brings back a fundamental problem with this model of public education: if your administrators are too incompetent to cultivate a reputation that is optimal for student outcomes then banning tenure is hardly the problem, and imposing tenure is hardly a solution. This is closely related to a point I made yesterday: are we supposed to believe administrators fire sub-optimally but hire optimally

His piece from today (and this one from yesterday) argues that Welner’s take could be applied to just about any profession, and furthermore, requires accepting a far deeper, more fundamental structural problem in education that should be unacceptable. If administrators would broadly act so foolishly as to decimate the market for quality teaching talent and be wholly unable to successfully staff their schools, we have far bigger problems. And, says Ozimek, there is no reason to believe that tenure is at all a response to this issue.

Dana Goldstein would likely take a more historical view on the usefulness of tenure against administrator abuse.

But, writing for The Atlantic, she focuses instead on tenure as a red herring:

The lesson here is that California’s tenure policies may be insensible, but they aren’t the only, or even the primary, driver of the teacher-quality gap between the state’s middle-class and low-income schools. The larger problem is that too few of the best teachers are willing to work long-term in the country’s most racially isolated and poorest neighborhoods. There are lots of reasons why, ranging from plain old racism and classism to the higher principal turnover that turns poor schools into chaotic workplaces that mature teachers avoid. The schools with the most poverty are also more likely to focus on standardized test prep, which teachers dislike. Plus, teachers tend to live in middle-class neighborhoods and may not want a long commute.

May 19, 2014

I have never found dictionaries or even a thesaurus particularly useful as part of the writing process. I like to blame this on my lack of careful, creative writing.

But just maybe, I have simply been using the wrong dictionaries. It is hard not to be seduced by the seeming superiority of Webster’s original style. A dictionary that is one-part explanatory and one-part exploratory provides a much richer experience of English as an enabler of ideas that transcend meager vocabulary.

May 12, 2014

I had never thought of a use for Brett Terpstra’s Marky the Markdownifier before listening to today’s Systematic. Why would I want to turn a webpage into Markdown?

When I heard that Marky has an API, I was inspired. Pinboard has a “description” field that allows up to 65,000 characters. I never know what to put in this box. Wouldn’t it be great to put the full content of the page in Markdown into this field?

I set out to write a quick Python script to:

  1. Grab recent Pinboard links.
  2. Check to see if the URLs still resolve.
  3. Send the link to Marky and collect a Markdown version of the content.
  4. Post an updated link to Pinboard with the Markdown in the description field.

If all went well, I would release this script on Github as Pindown, a great way to put Markdown page content into your Pinboard links.

The script below is far from well-constructed. I would have spent more time cleaning it up with things like better error handling and a more complete CLI to give more granular control over which links receive Markdown content.

Unfortunately, I found that Pinboard consistently returns a 414 error code because the URLs are too long. Why is this a problem? Pinboard, in an attempt to maintain compatibility with the del.icio.us API, uses only GET requests, whereas this kind of request would typically use a POST endpoint. As a result, I cannot send along a data payload.

So I’m sharing this just for folks who are interested in playing with Python, RESTful APIs, and Pinboard. I’m also posting for my own posterity since a non-del.icio.us-compatible version 2 of the Pinboard API is coming.

import requests
import json
import yaml


def getDataSet(call):
  r = requests.get('https://api.pinboard.in/v1/posts/recent' + call)
  data_set = json.loads(r._content)
  return data_set

def checkURL(url=""):
  newurl = requests.get(url)
  if newurl.status_code==200:
    return newurl.url
  else:
    raise ValueError('your message', newurl.status_code)

def markyCall(url=""):
  r = requests.get('http://heckyesmarkdown.com/go/?u=' + url)
  return r._content

def process_site(call):
  data_set = getDataSet(call)
  processed_site = []
  errors = []
  for site in data_set['posts']:
    try:
      url = checkURL(site['href'])
    except ValueError:
      errors.append(site['href'])
      continue  # skip links that no longer resolve
    description = markyCall(url)
    site['extended'] = description
    processed_site.append(site)
  print errors
  return processed_site

def write_pinboard(site, auth_token):
  stem = 'https://api.pinboard.in/v1/posts/add?format=json&auth_token='
  payload = {}
  payload['url'] = site.get('href')
  payload['description'] = site.get('description', '')
  payload['extended'] = site.get('extended', '')
  payload['tags'] = site.get('tags', '')
  payload['shared'] = site.get('shared', 'no')
  payload['toread'] = site.get('toread', 'no')           
  r = requests.get(stem + auth_token, params = payload)
  print(site['href'] + '\t\t' + str(r.status_code))

def main():
  settings = open('AUTH.yaml', 'r')
  identity = yaml.load(settings)
  auth_token = identity['user_name'] + ':' + identity['token']
  valid_sites = process_site('?format=json&auth_token=' + auth_token)
  for site in valid_sites:
    write_pinboard(site, auth_token)

if __name__ == '__main__':
  main()
April 1, 2014

I frequently work with private data. Sometimes, it lives on my personal machine rather than on a database server. Sometimes, even if it lives on a remote database server, it is better that I use locally cached data than query the database each time I want to do analysis on the data set. I have always dealt with this by creating encrypted disk images with secure passwords (stored in 1Password). This is a nice extra layer of protection for private data served on a laptop, and it adds little complication to my workflow. I just have to remember to mount and unmount the disk images.

However, it can be inconvenient from a project perspective to refer to data in a distant location like /Volumes/ClientData/Entity/facttable.csv. In most cases, I would prefer the data “reside” in data/ or cache/ “inside” of my project directory.

Luckily, there is a great way that allows me to point to data/facttable.csv in my R code without actually having facttable.csv reside there: symlinking.

A symlink is a symbolic link file that sits in the preferred location and references the file path to the actual file. This way, when I refer to data/facttable.csv the file system knows to direct all of that activity to the actual file in /Volumes/ClientData/Entity/facttable.csv.

From the command line, a symlink can be generated with a simple command:

ln -s target_path link_path

R offers a function that does the same thing:

file.symlink(target_path, link_path)

where target_path and link_path are both strings surrounded by quotation marks.
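For example, using the paths from above (substitute your own mount point):

file.symlink("/Volumes/ClientData/Entity/facttable.csv", "data/facttable.csv")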

One of the first things I do when setting up a new analysis is add common data storage file extensions like .csv and .xls to my .gitignore file so that I do not mistakenly put any data in a remote repository. The second thing I do is set up symlinks to the mount location of the encrypted data.
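As a sketch, the relevant .gitignore entries are just a couple of patterns:

*.csv
*.xls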

March 9, 2014

Education data often come in annual snapshots. Each year, students are able to identify anew, and while student identification numbers may stay the same, names, race, and gender can often change. Sometimes, even data that probably should not change, like a date of birth, is altered at some point. While I could spend all day talking about data collection processes and automated validation that should assist with maintaining clean data, most researchers face multiple characteristics per student, unsure of which one is accurate.

While it is true that identity is fluid, and sex/gender or race identifications are not inherently stable over time, it is often necessary to “choose” a single value for each student when presenting data. The Strategic Data Project does a great job of defining the business rules for these cases in its diagnostic toolkits.

If more than one [attribute value is] observed, report the modal [attribute value]. If multiple modes are observed, report the most recent [attribute value] recorded.

This is their rule for all attributes considered time-invariant for analysis purposes. I think it is a pretty good one.

Implementing this rule turned out to be more complex than it appeared using R, especially with performant code. In fact, it was this business rule that led me to learn how to use the data.table package.

First, I developed a small test set of data to help me make sure my code accurately reflected the expected results based on the business rule:

# Generate test data for modal_attribute().
modal_test <- data.frame(sasid = c('1000', '1001', '1000', '1000', '1005',
                                   '1005', rep('1006', 4)),
                         race = c('Black', 'White', 'Black', 'Hispanic',
                                  'White', 'White', rep('Black', 2),
                                  rep('Hispanic', 2)),
                         schoolyear = c(2006, 2006, 2007, 2008,
                                        2010, 2011, 2007, 2008,
                                        2010, 2011))

The test data generated by that code looks like this:

sasid race schoolyear
1000 Black 2006
1001 White 2006
1000 Black 2007
1000 Hispanic 2008
1005 White 2010
1005 White 2011
1006 Black 2007
1006 Black 2008
1006 Hispanic 2010
1006 Hispanic 2011

And the results should be:

sasid race
1000 Black
1001 White
1005 White
1006 Hispanic

My first attempts at solving this problem using data.table resulted in a pretty complex set of code.

# Calculate the modal attribute using data.table
modal_person_attribute_dt <- function(df, attribute){
  # df: rbind of all person tables from all years
  # attribute: vector name to calculate the modal value
  # Calculate the number of instances an attributed is associated with an id
  dt <- data.table(df, key='sasid')
  mode <- dt[, rle(as.character(.SD[[attribute]])), by=sasid]
  setnames(mode, c('sasid', 'counts', as.character(attribute)))
  setkeyv(mode, c('sasid', 'counts'))
  # Only include attributes with the maximum values. This is equivalent to the
  # mode with two records when there is a tie.
  mode <- mode[,subset(.SD, counts==max(counts)), by=sasid]
  mode[,counts:=NULL]
  setnames(mode, c('sasid', attribute))
  setkeyv(mode, c('sasid',attribute))
  # Produce the maximum year value associated with each ID-attribute 
  # pairing    
  setkeyv(dt, c('sasid',attribute))
  mode <- dt[,list(schoolyear=max(schoolyear)), by=c("sasid", attribute)][mode]
  setkeyv(mode, c('sasid', 'schoolyear'))
  # Select the last observation for each ID, which is equivalent to the highest
  # schoolyear value associated with the most frequent attribute.
  result <- mode[,lapply(.SD, tail, 1), by=sasid]
  # Remove the schoolyear to clean up the result
  result <- result[,schoolyear:=NULL]
  return(as.data.frame(result))
}

This approach seemed “natural” in data.table, although it took me a while to refine and debug since it was my first time using the package 1. Essentially, I use rle, a nifty function I used in the past for my Net-Stacked Likert code, to count the number of instances of an attribute each student had in their record. I then subset the data to only the max count value for each student and merge these values back to the original data set. Then I order the data by student id and year in order to select only the last observation per student.
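For anyone who has not used it, rle summarizes consecutive runs of identical values, which is what the counting step above relies on:

rle(c("Black", "Black", "Hispanic"))
# Run Length Encoding
#   lengths: int [1:2] 2 1
#   values : chr [1:2] "Black" "Hispanic"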

I get a quick, accurate answer when I run the test data through this function. Unfortunately, when I ran the same code on approximately 57,000 unique student IDs and 211,000 total records, the results were less inspiring. My MacBook Air’s fans spun up to full speed and the timings were terrible:

> system.time(modal_person_attribute_dt(all_years, 'sex'))
 user  system elapsed 
 40.452   0.246  41.346 

Data cleaning tasks like this one are often run only a few times. Once I have the attributes I need for my analysis, I can save them to a new table in a database, a CSV, or similar and never run the cleaning again. But ideally, I would like to be able to build a document that presents my analysis accurately and completely from the raw delivered data, including all cleaning steps. So while I may use a cached, clean data set for some of the more sophisticated analysis while I am building up a report, in the final stages I begin running the entire analysis process, including data cleaning, each time I produce the report.

With the release of dplyr, I wanted to reexamine this particular function because it is one of the slowest steps in my analysis. I thought that with fresh eyes and a new way of expressing R code, I might be able to improve on the original function. Even if its performance ended up being fairly similar, I hoped the dplyr code would be easier to maintain, since I frequently use dplyr and only turn to data.table in specific, sticky situations where performance matters.

In about a tenth the time it took to develop the original code, I came up with this new function:

modal_person_attribute <- function(x, sid, attribute, year){
  grouping <- lapply(list(sid, attribute), as.symbol)
  original <- x
  max_attributes <- x %.% 
                    regroup(grouping) %.%
                    summarize(count = n()) %.%
                    filter(count == max(count))
  recent_max <- left_join(original, max_attributes) %.%
                regroup(list(grouping[[1]])) %.%
                filter(!is.na(count) & count == max(count))
  results <- recent_max %.% 
             regroup(list(grouping[[1]])) %.%
             filter(year == max(year))
  return(results[,c(sid, attribute)])
}

At least to my eyes, this code is far more expressive and elegant. First, I generate a data.frame with only the rows that have the most common attribute per student by grouping on student and attribute, counting the size of those groups, and filtering to the most common group for each student. Then I join back to the original data, remove any records without a count from the previous step, and find the maximum count per student ID. This recovers the year value for each student so that in the next step I can simply choose the rows with the highest year.

There are a few funky things (note the use of regroup and grouping, which work around dplyr’s poor handling of strings as arguments), but for the most part I have shorter, clearer code that closely resembles the business rule as stated in plain English.
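
Later versions of dplyr handle column names passed as strings much more gracefully, so the regroup workaround is no longer needed. As a rough, untested sketch (not from the original post; the name modal_person_attribute2 is just a placeholder), the same rule might be expressed in current dplyr with the .data pronoun and the %>% pipe:

# Hypothetical re-expression of the same rule with current dplyr (untested sketch)
library(dplyr)

modal_person_attribute2 <- function(x, sid, attribute, year){
  # Most frequent value(s) of the attribute for each person
  max_attributes <- x %>%
    group_by(.data[[sid]], .data[[attribute]]) %>%
    summarize(count = n(), .groups = "drop_last") %>%
    filter(count == max(count)) %>%
    select(-count)
  # Among those rows, keep the value observed in the most recent year
  x %>%
    semi_join(max_attributes, by = c(sid, attribute)) %>%
    group_by(.data[[sid]]) %>%
    filter(.data[[year]] == max(.data[[year]])) %>%
    ungroup() %>%
    select(all_of(c(sid, attribute))) %>%
    distinct()
}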

But was this code more performant? Imagine my glee when this happened:

> system.time(modal_person_attribute(all_years, sid='sasid', 
+             attribute='sex', year='schoolyear'))
Joining by: c("sasid", "sex")
   user  system elapsed 
  1.657   0.087   1.852 

That is a remarkable increase in performance!

Now, I realize that I may have cheated. My data.table code isn’t very good and could probably follow a pattern closer to what I did in dplyr. The results might be much closer in the hands of a more adept developer. But the take-home message for me was that dplyr enabled me to write more performant code naturally because of its expressiveness. Not only is my code faster and easier to understand, it is also simpler and took far less time to write.

It is not every day that a tool provides powerful expressiveness and yields greater performance.

Update

I have made some improvements to this function to simplify things. I will be maintaining this code in my PPSDCollegeReadiness repository.

modal_person_attribute <- function(x, sid, attribute, year){
  # Select only the important columns
  x <- x[,c(sid, attribute, year)]
  names(x) <- c('sid', 'attribute', 'year')
  # Clean up school years stored in the form 2012_2013 by keeping the ending year
  if(TRUE %in% grepl('_', x$year)){
    x$year <- gsub(pattern='[0-9]{4}_([0-9]{4})', '\\1', x$year)
  }  
  # Calculate the count for each person-attribute combo and select max
  max_attributes <- x %.% 
                    group_by(sid, attribute) %.%
                    summarize(count = n()) %.%
                    filter(count == max(count)) %.%
                    select(sid, attribute)
  # Of the modal attribute rows, keep the most recent year for each person
  results <- max_attributes %.% 
             left_join(x) %.%
             group_by(sid) %.%
             filter(year == max(year)) %.%
             select(sid, attribute)
  names(results) <- c(sid, attribute)
  return(results)
}
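
As a sanity check, calling the updated function on the modal_test data frame from the top of this post (with its student identifier column named sasid) should reproduce the expected results shown earlier:

modal_person_attribute(modal_test, sid = 'sasid', attribute = 'race', year = 'year')
# Expected, one row per student:
#   1000 Black, 1001 White, 1005 White, 1006 Hispanic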

  1. It was over a year ago that I first wrote this code. ↩︎

February 26, 2014

We burden Latinos (and other traditionally underserved communities) with expensive housing because of the widespread practice of using homestead exemptions in Rhode Island. By lowering the real estate tax rate, typically by 50%, for owner occupied housing, we dramatically inflate the tax rate paid by Rhode Islanders who are renting.

Echoing a newly filed lawsuit in New York City over discriminatory real estate tax regimes, this new report emphasizes the racist incentives built into our property tax.

Homestead exemptions are built on the belief that renters are non-permanent residents of communities, care less for the properties they occupy and the neighborhoods they live in, and are worse additions than homeowners. Frankly, it is an anti-White flight measure meant to assure people that only those with the means to purchase and the intent to stay will join their neighborhoods. Wealthy, largely White property owners see homestead exemptions as fighting an influx of “slum lords”, which is basically how they perceive anyone who purchases a home or builds apartments and rents them out.

Rather than encouraging denser communities with higher land utilization and more housing to reduce the cost of living in dignity, we subsidize low-value (per acre) construction that maintains inflated housing costs.

Full disclosure: I own a condo in Providence and receive a 50% discount on my taxes. In fact, because I live in a condo Downcity, my home’s value is depressed by the limited ways I can use it. I could rent my current condo at market rate and lose money because of the doubling of taxes I would endure, versus turning a small monthly profit at the same rent with higher taxes. The flexibility to use my property as my own residence or as a rental unit more than pays for the higher taxes.

So while I do have personal reasons to support removing the homestead exemption, even if I lived in a single-family home on the East Side that was not attractive as a rental property, I would still think this situation is absurd. Homeowners’ taxes should easily be 20% higher to tax renters 30% less. Maybe some of our hulking, vacant infrastructure could be more viably converted into housing stock and lower the cost for all residents. Maybe we could even see denser development because there would actually be a market for renters at the monthly rates that would need to be charged to recoup expenses. At least the rent wouldn’t be so damn high for too many people of color and people living in or near poverty.

February 17, 2014

Hadley Wickham has once again1 made R ridiculously better. Not only is dplyr incredibly fast, but the new syntax allows for some really complex operations to be expressed in a ridiculously beautiful way.

Consider a data set, course, with a student identifier, sid, a course identifier, courseno, a quarter, quarter, and a grade on a scale of 0 to 4, gpa. What if I wanted to know the number of courses a student has failed over the entire year, as defined by having an overall grade of 1.0 or lower?

In dplyr:

course %.% 
group_by(sid, courseno) %.%
summarise(gpa = mean(gpa)) %.%
filter(gpa <= 1.0) %.%
summarise(fails = n())

I refuse to even sully this post with the way I would have solved this problem in the past.


  1. Seriously, how many of the packages he has managed/written are indispensable to using R today? It is no exaggeration to say that the world would have many more Stata, SPSS, and SAS users if not for the Hadleyverse. ↩︎

February 9, 2014

These quotes are absolutely striking, in that they give a clear glimpse into the ideological commitments of the Republican Party. From Sen. Blunt and Rep. Cole, we get the revelation that—for conservatives—the only “work” worth acknowledging is wage labor. To myself, and many others, someone who retires early to volunteer—or leaves a job to care for their children—is still working, they’re just outside the formal labor market. And indeed, their labor is still valuable—it just isn’t compensated with cash.

One of the greatest benefits of wealth is that it can liberate people to pursue happiness. When we tie a basic need for living a complete life of dignity to full-time employment, people will find themselves willing to make many sacrifices to secure that need. In our nation of great wealth, with liberty and freedom as core values, it is hard to believe that the GOP would decry the liberating effect of ending health care’s contingency on work.

There is no work rule, regulation, or union that empowers workers more in their relationship with their employers than taking the threat of losing health care off the table. An increasingly libertarian right should be celebrating this as a key victory, rather than celebrating the coercive grip that health care currently has on our lives.

Republicans aren’t as worried about the idle rich, who—I suppose—have earned the right to avoid a life of endless toil. Otherwise—if Republicans really wanted everyone to work as much as possible—they’d support confiscatory tax rates. After all, nothing will drive an investment banker back to the office like the threat of losing 70 percent of her income to Uncle Sam.

Oh yeah, I forgot. For all their claims to loving liberty and freedom, what the GOP really stands for is protecting liberty and freedom for the existing “deserving” wealthy. They will fight tooth and nail to remove estate taxes because inheritance is a legitimate source of liberty. Removing the fear of entering a hospital uninsured after being unable to access preventive care is what deprives folks of “dignity”.

February 5, 2014

My Democracy Prep colleague Lindsay Malanga and I often say we should start an organization called the Coalition of Pretty Good Schools. We’d start with the following principles.

  1. Every child must have a safe, warm, disruption-free classroom as a non-negotiable, fundamental right.
  2. All children should be taught to read using phonics-based instruction.
  3. All children must master basic computational skills with automaticity before moving on to higher mathematics.
  4. Every child must be given a well-rounded education that includes science, civics, history, geography, music, the arts, and physical education.
  5. Accountability is an important safeguard of public funds, but must not drive or dominate a child’s education. Class time must not be used for standardized test preparation.

We have no end of people ready to tell you about their paradigmatic shift that will fix education overnight. There has been plenty of philosophizing about the goals, purpose, and means of education. Everyone is ready to pull out tropes about the “factory model” of education our system is built on.

The reality is that the education system too often fails at very basic delivery, period. I would love to see more folks draw a line in the sand of their minimum basic requirements, and not in an outrageous, political winky-wink where they wrap their ideal in the language of the minimum. Let’s have a deep discussion right now about the minimum basic requirements, and let’s get relentless about making that happen without the distraction of the dream. Frankly, whatever your dream is, so long as it involves kids going somewhere to learn 1, if we can’t deliver on the basics it will be dead on arrival.


  1. Of course, for a group of folks who are engaged in Dreamschooling, we cannot take for granted that schools will be places or that children will be students in any traditional sense of the word. However, if we have a frank conversation about the minimum expectations for education, I suspect this will not be a particularly widely held sentiment. If our technofuturism does complete its mindmeld with the anarcho-____ movements on the left and right and leads to a dramatically different conceptualization of childhood in the developed world in my lifetime… ↩︎

January 6, 2014

James over at TransportPVD has a great post today talking about a Salt Lake City ordinance that makes property owners responsible for providing a bond that funds the landscaping and maintenance of vacant lots left after demolition. I love this as much as he does and would probably add several other provisions (like forfeiting any tax breaks on that property or any other property in the city and potentially forfeiture of the property itself if a demolition was approved based on site plans that are not adhered to within a given time frame). Ultimately, I do think the best solution to surface parking where it doesn’t belong, of either the temporary or permanent (and isn’t it all actually permanent?) kind, is a land value tax.

James goes one step further and suggests that we should adopt some similar rules around ALL parking developments, and he proposes a few. His hope was that a mayoral candidate would chime in. For now, he will have to make do with me.

His recommendations are somewhat specific to the commission looking at building a state-funded parking garage in front of the Garrahy Complex in Downcity, about which many urbanists and transit advocates have expressed reservations or outright rejection. They are:

  1. The garage is parking neutral. As many spots need to be removed from the downtown as are added.
  2. An added bonus would be if some of the spots removed were on-street ones, to create protected bike lanes or transit lanes with greenery separating them from car traffic.
  3. The garage has the proposed bus hub.
  4. There are ground-level shops.
  5. The garage is left open 24-hours so that it can limit the need for other lots (this happens when a garage is used only during the day, or only at night, instead of letting it serve both markets).
  6. Cars pay full market price to park.

(Note: I’ve numbered rather than kept the bullets of the original to make responding easier.)

I disagree with the first and second points, which are really one and the same. We are in a district that has tremendously underutilized land. We want that space to be developed, and as a result of that development we expect there to be a much increased need for transit capacity. The goal should be both to increase accessibility and to increase the share of transit capacity offered by walking, biking, or riding a bus or light rail. This does not require that we demand a spot-for-spot trade when building a public garage. I agree with the sentiment but disagree with the degree. Part of building rules and policies like this is to ensure comprehensive consideration of the transit context when developing parking. I see no reason to assume a priori that garages should only be permitted if they eliminate the same number of spaces they create.

The reason I combine these two points is that the city does not have the ability to remove off-street parking that is not publicly owned. Investing in garages with smaller footprints, which have to be built taller and provide no change in capacity, probably makes no sense at all. If we’re going to build any kind of public garage, it should be with the goal of consolidating parking into infrastructure with reasonable land utilization. We would rather have 3 or 4 large, properly located garages than all of the current lots. Limiting their size based on the flexibility gained from reducing on-street parking or shrinking the footprint of existing lots doesn’t achieve that, and it doesn’t account for the orders-of-magnitude changes in capacity we should expect to need across all transit modes in the next 20 years.

On point three, I am skeptical. I like the idea of improving bus infrastructure when building parking infrastructure in general. In fact, I voted against the $40M Providence road paving bond even though that was much-needed maintenance. My rationale was purely ideological: we should not use debt to pay for car maintenance without also investing in ways to reduce future maintenance costs through better utilization of those roads. However, I have a hard time believing that the Garrahy location is any good as a bus hub. If RIPTA had identified the need for an additional bus hub and the Garrahy location met the criteria, I would think it a reasonable idea. Short of that, it feels like throwing the transit community a wasteful bone.

I mostly agree on point four, though I doubt at the scale James would like to see. I think an appropriate level is probably not that different from the recently erected Johnson and Wales garage. The reality is that street-level retail is the right form, but there isn’t sufficient foot traffic to support it right now and won’t be for some time. There has to be street-level activation of any garage built in this area, but the square footage is likely to be fairly modest.

I absolutely agree with point five, without qualification. Not a dime should be spent on a public parking spot that is closed at any point in time, anywhere in the city. I would actually ditto this for surface parking lots on commercial properties of any kind after business hours. Not only should they have to be open, they should have to provide signs indicating the hours of commercial activity when parking is restricted and the hours when parking is available to the public. These hours of operation should require board approval. Owners could choose to charge during these off hours, but cars must be able to access the lot.

And point six should be a given for any public parking.

The real problem with Garrahy, in my opinion, is that the cost is absurd: likely at least $35,000 per space. There is plenty of existing parking, which suggests the demand right now is illusory, and the market rate for those spots means the investment is unlikely ever to be recovered. In a world with limited capacity for government spending on transit as a public good, I would rather subsidize transit infrastructure that benefits the poor and directly impacts the share of non-car transit as it increases capacity. Spending limited funds on parking infrastructure is ludicrous when demand isn’t sufficient to recover the investment. We already more than sufficiently subsidize parking in the area. And of course, the “study commission” is not really a study; it’s a meeting convened by those who want the project to happen, putting the required usual suspects in the room to tepidly rubber-stamp it. At least that’s my cynical take.

December 9, 2013

We find that public schools offered practically zero return to education on the margin, yet they did enjoy significant political and financial support from local political elites, if they taught in the “right” language of instruction.

One thing that both progressives and libertarians agree upon is that the social goals of education are woefully underappreciated and underconsidered in the current school reform discussion. Both school choice and local, democratic control of schools are reactions to centralization resulting in “elites… [selecting] the ‘right’ language of instruction.”

I am inclined to agree with neither.

December 3, 2013

Update

Turns out the original code below was pretty messed up, with all kinds of little errors I didn’t catch. I’ve updated it below. There are a lot of options for refactoring this further that I’m currently considering. Sometimes it is really hard to know just how flexible something this big should be. I think I am going to wait until I start developing tests to see where I land. I have a feeling that moving toward a more test-driven workflow is going to force me toward a different structure.

I recently updated the function I posted about back in June that calculates the difference between two dates in days, months, or years in R. It is still surprising to me that difftime can only return units from seconds up to weeks. I suspect this has to do with the challenge of properly defining a “month” or “year” as a unit of time, since these units are variable in length.

While there was nothing wrong with the original function, it did irk me that it always returned an integer. In other words, the function returned only complete months or years. If the start date was 2012-12-13 and the end date was 2013-12-03, the function would return 0 years. Most of the time, this is the behavior I expect when calculating age. But it is completely reasonable to want to include partial years or months, e.g. returning 0.9724605 in the aforementioned example.
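
For reference, difftime handles that same pair of dates only up through weeks; there is no 'months' or 'years' unit:

difftime(as.Date('2013-12-03'), as.Date('2012-12-13'), units = 'days')
# Time difference of 355 days
difftime(as.Date('2013-12-03'), as.Date('2012-12-13'), units = 'weeks')
# Time difference of 50.71429 weeks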

So after several failed attempts because of silly errors in my algorithm, here is the final code. It will be released as part of eeptools 0.3, which should be available on CRAN soon 1.

age_calc <- function(dob, enddate=Sys.Date(), units='months', precise=TRUE){
  if (!inherits(dob, "Date") | !inherits(enddate, "Date")){
    stop("Both dob and enddate must be Date class objects")
  }
  start <- as.POSIXlt(dob)
  end <- as.POSIXlt(enddate)
  if(precise){
    start_is_leap <- ifelse(start$year %% 400 == 0, TRUE, 
                        ifelse(start$year %% 100 == 0, FALSE,
                               ifelse(start$year %% 4 == 0, TRUE, FALSE)))
    end_is_leap <- ifelse(end$year %% 400 == 0, TRUE, 
                        ifelse(end$year %% 100 == 0, FALSE,
                               ifelse(end$year %% 4 == 0, TRUE, FALSE)))
  }
  if(units=='days'){
    result <- difftime(end, start, units='days')
  }else if(units=='months'){
    months <- sapply(mapply(seq, as.POSIXct(start), as.POSIXct(end), 
                            by='months', SIMPLIFY=FALSE), 
                     length) - 1
    # length(seq(start, end, by='month')) - 1
    if(precise){
      # February: check the leap-year case before the generic 28-day case so
      # the 29-day branch is reachable
      month_length_end <- ifelse(end$mon==1 & end_is_leap, 29,
                                 ifelse(end$mon==1, 28,
                                        ifelse(end$mon %in% c(3, 5, 8, 10), 
                                               30, 31)))
      month_length_prior <- ifelse((end$mon-1)==1 & start_is_leap, 29,
                                   ifelse((end$mon-1)==1, 28,
                                          ifelse((end$mon-1) %in% c(3, 5, 8, 10), 
                                                 30, 31)))
      month_frac <- ifelse(end$mday > start$mday,
                           (end$mday-start$mday)/month_length_end,
                           ifelse(end$mday < start$mday, 
                            (month_length_prior - start$mday) / 
                                month_length_prior + 
                                end$mday/month_length_end, 0.0))
      result <- months + month_frac
    }else{
      result <- months
    }
  }else if(units=='years'){
    years <- sapply(mapply(seq, as.POSIXct(start), as.POSIXct(end), 
                            by='years', SIMPLIFY=FALSE), 
                     length) - 1
    if(precise){
      start_length <- ifelse(start_is_leap, 366, 365)
      end_length <- ifelse(end_is_leap, 366, 365)
      year_frac <- ifelse(start$yday < end$yday,
                          (end$yday - start$yday)/end_length,
                          ifelse(start$yday > end$yday, 
                                 (start_length-start$yday) / start_length +
                                end$yday / end_length, 0.0))
      result <- years + year_frac
    }else{
      result <- years
    }
  }else{
    stop("Unrecognized units. Please choose years, months, or days.")
  }
  return(result)
}
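
A quick spot check of the partial-year example above, as computed by the version of the function printed here:

age_calc(as.Date('2012-12-13'), as.Date('2013-12-03'), units = 'years', precise = TRUE)
# [1] 0.9724605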

  1. I should note that my mobility function will also be included in eeptools 0.3. I know I still owe a post on the actual code, but it is such a complex function I have been having a terrible time trying to write clearly about it. ↩︎

December 2, 2013

PISA Results

I wanted to call attention to these interesting PISA results. It turns out that student anxiety in the United States is lower than the OECD average and belief in ability is higher 1. I thought that all of the moves in education since the start of standards-based reform were supposed to be generating tremendous anxiety and failing to produce students with a high sense of self-efficacy?

It is also worth noting that students in the United States were more likely to skip out on school, and this had a higher-than-typical impact on student performance. One interpretation of this could be that students are less engaged, but it also suggests that schooling activities have a large impact on students, rather than schools being of lesser importance than student inputs.

I have always had a hard time reconciling the calls for higher teacher pay and better working conditions, and the evidence that missing even 10% of schooling has a huge impact on student outcomes, with the belief that addressing other social inequities is the key way to achieve better outcomes for kids.

This is all an exercise in nonsense. It is incredibly difficult to transfer findings from surveys across dramatic cultural differences. It is also hard to imagine what can be learned about the delivery of education across the dramatically different contexts that exist. The whole international comparison game seems like one big Rorschach test where the price of admission is leaving any understanding of culture, context, and external validity at the door.

P.S.: The use of color in this visualization is awful. There is a sense that they are trying to be “value neutral” with data that is ordinal in nature (above, same, or below), and in doing so they chose two colors that are very difficult to distinguish. Yuck.


  1. The site describes prevalence of anxiety as the “proportion of students who feel helpless when faced with math problems” and belief in ability as the “proportion of students who feel confident in their math abilities”. Note that, based on these definitions, one might also think that either curricula were not so misaligned with international benchmarks or that we are already seeing the fruits of a partial transition to Common Core. Not knowing the trend for this data, or some of the specifics about the collection instrument, makes that difficult to assess. ↩︎

November 22, 2013

Although it clocks in at 40+ pages, this is a worthwhile and relatively fast read on the future of assessment for anyone in education policy who is serious about college and career readiness. There is a ton to unpack, with a fair amount I agree with and a lot I am quite a bit less sure about.

I think this paper is meant for national and state-level policymakers, so my major quibble is that I think it is much more valuable for a district-level audience. I am less bullish on the state’s role in building comprehensive assessment systems. That’s just my initial reaction.

The accountability section is both less rich and less convincing than the assessment portion. I have long heard cries for so-called reciprocal accountability, but it is still entirely unclear to me what it means, what it looks like, and what its implications are for current systems.

November 20, 2013

“We are trying to work towards late-exit ELL programs so (students) can learn the concepts in (their) native language,” Lusi said. Administrative goals have recently shifted to a focus on proficiency in both languages because bilingual education is preferred, she added.

But instituting district-wide bilingual education would require funding to hire teachers certified in both languages and to buy dual-language materials, she said.

I am pretty sure this is new. I am surprised there has not been a stronger effort to pass a legislative package in Rhode Island that provides both the policy framework and the funding necessary to achieve universal bilingual education for English language learners in RI schools.

One of the great advantages of transitioning to common standards1 is that there should be greater availability of curricular materials in languages other than English. I suspect most of what is needed for bilingual education is start-up money for materials, curriculum supports and development, and assessment materials. There are a few policy pieces that need to be in place, possibly around state exams, but also rules that allow flexible teacher assignment, hiring, and dismissal as staffing needs dramatically change.

Someone should be putting this package together. I suspect there would be broad support.


  1. Note, this is not necessarily a feature of the Common Core State Standards, just having standards in common with many other states. ↩︎

November 19, 2013

De Blasio and his advisers are still figuring out how much rent to charge well-funded charter schools, his transition team told me. “It would depend on the resources of the charter school or charter network,” he told WNYC, in early October. “Some are clearly very, very well resourced and have incredible wealthy backers. Others don’t. So my simple point was that programs that can afford to pay rent should be paying rent.” (In an October debate with the Republican candidate Joseph Lhota, he put it more bluntly: “I simply wouldn’t favor charters the way Mayor Bloomberg did because, in the end, our city rises or falls on our traditional public schools.”)

My impression of de Blasio was that he went around collecting every plausible complaint from every interest group that was mad at Bloomberg and promised each of them whatever they wanted. There didn’t really seem to be a coherent theory or any depth whatsoever to his policy prescriptions.

He is already working hard to confirm this impression.