Bye-Grams
If I were to briefly summarize the writing in this corpus based on a lifelong close read, here is what I expect might make an appearance (with a quick explanation of each):
Basketball (my journalism career mainly focused on covering the WNBA and NCAA women’s basketball)
Family (they’ve kind of just always been around, you know?)
Music (as a musician since age 7, I should hope so)
Mental Health (as a mentally ill person for at least the last 20 years, I should hope so)
Gender Stuff (I came out as nonbinary in 2020, and it’s a running joke among my friends how many of us during this time were like “in light of the global pandemic, I have some personal news…”)
School (I’ve been in some higher education program or another for 10 of the last 16 years)
In 2023, I attended a Code4Lib pre-conference workshop led by Eric Lease Morgan about The Distant Reader, a command-line app that helps you absorb a bunch of text at once. I have a hard time articulating what I prefer about The Distant Reader to something like Voyant Tools, which is much flashier and more colorful. I think it might be mental — I feel like I’m doing more work when I’m engaging in text-based queries with the tool directly, rather than uploading a corpus and clicking around. This is not fair to Voyant. Anyway.
It feels odd to have waxed poetic about anything at all when I’m about to paste a bunch of word clouds here, so let me elegantly segue. Here are some keywords that The Distant Reader said were important in my corpus (redacted words are names of people in my life; public figures’ names are not redacted):
I think this is fair. It also introduces an issue I was worried about when I first committed to this project: What if my own subjectivity impacts how I interpret the results of my text analysis? The most pressing example: “Drinking” is in pretty big letters. I did quit drinking alcohol in late 2019. I have featured this decision and its impact in a couple of things I’ve written. But when I look back on my life, the absence of something doesn’t immediately figure into it in the same way that the decision does. When I’ve written about how I don’t drink, I’ve put words to feelings I don’t typically dwell on, experiences I haven’t had since deciding to quit. In the context of my entire life, this just registers as a relatively benign choice, though maybe one that was easier to write about?
Let’s look at some bigrams:
“High school” being in such huge letters is horrific and so, so true.
But at the end of the day, I’m more taken by how this demonstrates the limitations of bigrams. I wrote an entire piece about an experience I had at Challenge Day, so “Challenge Day” shows up much larger than “anxiety attack,” an experience that figures prominently in more pieces but isn’t referred to by that name at all times (I’ll go over this one in particular when I discuss sentiment analysis). And “even though”? “Every time”? “Every day”? Are these just bad writing habits, or could they mean something more that I’ll never uncover because they’re such common phrases?
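To see that limitation in miniature, here is a rough Python sketch. It is not how The Distant Reader computes anything. The three sample "pieces" are made up for illustration, and the `bigrams` helper is a hypothetical name of my own. The idea is to compare how often a bigram appears overall with how many separate pieces it appears in.

```python
from collections import Counter

# Made-up miniature "corpus": three short pieces, not my actual writing.
pieces = [
    "challenge day was hard and challenge day was loud and challenge day changed me",
    "i had an anxiety attack even though every day felt fine",
    "the panic came back even though every day i tried to breathe",
]

def bigrams(text):
    """Return adjacent word pairs from a lowercased, whitespace-split text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

# Raw count: how many times each bigram appears across the whole corpus.
raw_counts = Counter(bg for piece in pieces for bg in bigrams(piece))

# Document frequency: how many pieces contain each bigram at least once.
doc_counts = Counter(bg for piece in pieces for bg in set(bigrams(piece)))

for phrase in [("challenge", "day"), ("anxiety", "attack"), ("even", "though")]:
    print(phrase, "raw:", raw_counts[phrase], "pieces:", doc_counts[phrase])
```

In this toy example, a single piece that repeats "challenge day" gives it the highest raw count, while "anxiety attack" is counted only once because the third piece calls the same experience "panic." A filler phrase like "even though" ends up in more pieces than either. A word cloud sized by raw count would show none of that.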
I’m going to take a deep breath and move on before I let a bigram word cloud, of all things, ruin my life.