Discover more from Margins by Ranjan Roy and Can Duruk
Trust, OpenAI & The NYT Opinion Section
The coming flood of no-quality information
Ranjan here, talking about computationally generated text, trust and the NYT Opinion section.
Back in February, OpenAI announced the development of the GPT-2 language model. They claimed this technology, which artificially creates written text, was so good that it was too dangerous to release. They released it a few months later and you can see the model in action at TalkToTransformer.
As an example, if I enter the opening sentence of this newsletter, it returns:
It's funny and certainly feels uncanny valley. But, that's when you try to generate logical and legible text off of just 12 words.
This week I saw a circulating research project, where someone trained the GPT-2 model with tens of millions of public comments on the government's Medicaid website. The entire synopsis is worth reading (and done by a senior at Harvard; Gen-Z FTW), and if you have a few minutes to spare, I highly recommend you take his Deepfake Survey Test. He presents you with a comment and you judge whether it was human or bot-generated.
I got 13 out of 20 right. Can got 11/20. If you take the quiz, I’d love to know your score.
This stuff is scary. It makes me think we're not very far from a point where we will lose all faith in the veracity of most text, or at least in the assumption that a human wrote it.
But that's not the end of the world.
THIS IS RANJAN, A HUMAN.
Some of you know me IRL. For others, you could probably look me up on LinkedIn and feel a bit more comfortable I'm a real person. But your trust that I am real has more likely been established over weeks and weeks of reading this newsletter. We're a long way from robots being able to write in this style, length, and depth (pats myself on the back). You can have a general sense of safety that these missives are coming from Ranjan, and I am a real person and I mean the things I write.
That trust has been built over time, with great effort, and in the future, trust is going to become all we have in media.
The volume of bot-generated content, both the nefarious kind and innocent things like automated sportswriting, will be enormous because it will be so cheap to make. If you don't have a concrete sense of where the thing you're looking at is from, who created it, and “what it exactly is”, the default reaction will be to doubt and discard it.
But again, you "know" me, and can at least take solace that these Margins newsletters are from me, Ranjan, a human.
Which brings me to, possibly, the best piece of journalism I've read all year. Twelve Million Phones, One Dataset, Zero Privacy by Charlie Warzel and Stuart Thompson of the NYT is, well, I know I'm exactly the target audience, but it’s so damn powerful. If you've enjoyed even one Margins email, go read that piece right now. It hits on so many of the themes Can and I are consumed by.
But then, the most depressing thing happened. I sent it to a friend, who's somewhat of a techbro, and has been on a bit of a "journalists are being unfair to tech companies" kick, but generally listens to me on data and privacy issues. His knee-jerk reaction was, "That's the Opinion section. It's not their actual journalism. That's the salacious commentary."
I wanted to debate but was just dejected. Because I didn't really have an argument to make. Right there in big letters up top was the word "Opinion." I wanted to argue how the Privacy Project has been one of the most important journalistic endeavors of the past year. But is it journalism or is it opinion?
It’s worse because I'm sure I mediasplained (I use that in a good way) to this friend how Fox News uses the distinction between journalism and opinion as an axe to wield against the truth. They blast us with Hannity and wiggle out of journalistic rigor by falling back to "that's just opinion."
The way the NYT separates opinion and journalism is a long-running media bugbear of mine. I’ve never understood it. My problem isn't even the conventional one, where liberal NYT readers threaten to cancel their subscriptions over some questionable Bari Weiss or Bret Stephens column. It's that I still have no understanding of what I am reading when I read an NYT Opinion piece.
For example, I can see Bret Stephens published a recent NYT Opinion piece about the youngs 😀. I see he's listed as a contributor alongside some external folks like Mimi Swartz (from the Texas Monthly) and Daniel McCarthy (from Modern Age: A Conservative Quarterly). Swartz and McCarthy fit what I’d imagine an Opinion section to be: something like the old-school Letters to the Editor. You get outside voices to contribute a multitude of perspectives to your readers.
Looking at the caliber and nature of the Contributor Opinion content, it's clear this isn't some Forbes-ian "let anyone publish for cheap pageviews" model. They find smart, vetted people from outside the NYT to add value to the conversation (disclaimer: My co-host Can is one of those people). This is great!
But, I thought Bret Stephens works for the NYTimes. Does he get paid per piece or get a salary? I'm even more confused because...he's also listed as a Columnist.
On the columnist page I can see all the longtime, heavy hitters: Paul Krugman, David Leonhardt, Nick Kristof, Michelle Goldberg, and other familiar names. These are people I fully associate with the NYT brand.
I think they are paid salaries by the NY Times and are held to the same journalistic standards as an NYT journalist. I think? It's not All the President's Men, meeting-a-source-in-a-trench-coat journalism, but I assume they’re subject to journalistic rigor. I wonder if they sit in the newsroom.
I'm still a bit unclear as to how Bret Stephens falls under both. Maybe that's just some odd quirk in the NYT content management system and more a technological glitch than a breakdown in the fourth estate. But, still, I have no idea what my expectations should be when reading him. Is the NYT staking their brand on his writing the way they would when publishing a Harvey Weinstein exposé?
I've read Farhad Manjoo for a while now, and nothing really feels different. But he switched over from "journalism" to "opinion" a while back. Is our expectation of his rigor different now? When Bret Stephens casts doubt on climate change, I’m okay with it as an alternative perspective, but is this an NYT journalist? I know it's nitpicking, but this stuff matters.
And clearly I was not the only person confused by all of this with regard to the major Privacy Project investigation:
Did a machine write those words?
I have no idea what they mean and I live and breathe this stuff. How could a normal consumer possibly make educated distinctions?
That's the thing. Most readers probably don't think about this, at least, until something is called into question.
I started this piece mentioning AI-generated text because we need to collectively prepare to be inundated by the stuff. Even as I started writing this, a huge story broke about a disinformation network of AI-generated content on Facebook.
As a society, we will by default doubt the provenance of every single thing we read. How can we possibly expect readers to distinguish between human and machine, if one of the most important journalistic institutions in the world can’t even provide a clear distinction between opinion and journalism? This might seem like insider media nitpicking, but in the coming flood of misinformation*, trusted guides will become all we have in media. This stuff matters.
*misinformation doesn’t quite capture it. It will be more an incomprehensible volume of noise rather than purposefully misleading or false information. Does anyone have a better term?
Note #1: Before hitting send, I just read Nick Kristof’s latest newsletter. He doesn’t refer to his writing as opinion journalism. He refers to it, counter to the NYT PR tweets, as simply journalism. I agree.
Speaking of journalism, after my last column about my pieces that no one read in 2019, I received many warm messages from readers asking me to keep writing such pieces. Thanks for the notes. Here’s the deal: I’ll keep writing about these topics, if you keep reading!
Note #2: Thinking about a dystopian communications future reminded me of an incredible book I read a few years ago, about life in Russia under Putin: Nothing Is True and Everything Is Possible by Peter Pomerantsev. The book captures the almost comically odd nature of a society where everyone has been worn down to not believe anything:
And on every channel is the President, who as a made-for-TV projection has fitted every Russian archetype into himself, so now he seems to burst with all of Russia, cutting ever quicker between gangster-statesman-conqueror-biker-believer-emperor, one moment diplomatically rational and the next frothing with conspiracies.
And on TV the President is chatting via live video-link to factory workers posing in overalls in front of a tank they’ve built, and the factory workers are promising the President that if protests against him continue, they will “come to Moscow and defend our stability.” But then it turns out the workers don’t actually exist; the whole thing is a piece of playacting organized by local political technologists (because everyone is a political technologist now), the TV spinning off to someplace where there is no reference point back to reality, where puppets talk to holograms when both are convinced they are real, where nothing is true and everything is possible.
And the result of all this delirium is a curious sense of weightlessness.
He’s got a new book out and it’s going on the early 2020 reading list.