
[WSS16] A Brief Exploration of Reddit

Posted 3 years ago

I had the opportunity to attend the Wolfram Science Summer Program in Waltham, MA over a three-week period. It was amazing to be around such a diverse group of scholars who study the world. It was as if the disciplinary boundaries that sometimes define academia had dissolved, and as a result, free discussion was encouraged rather than stymied.

This is the now of the title; the then comes a little later.

Those on the science track were first assigned homework: find a two-dimensional, three-color totalistic cellular automaton (CA) that was "interesting." Given that there are billions of such CA rules, finding something interesting became complicated. I worked through some prime numbers, birthdays, and other trivial number combinations as rule codes and arrived at a few interesting CA, interesting in that their edges and growth patterns were irregular. This is not necessarily groundbreaking, but it at least qualifies as interesting.

CA Code 144305877 at Step 500


CA Code 22675100 at Step 300


While these are beautiful patterns, the essential beauty lies in the fact that they emerged from simple initial conditions and from basic instructions for how neighboring cells behave. I will save the metaphysics for another time, but this is indeed interesting.
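For readers who want to see what "basic instructions for how neighboring cells behave" can look like in practice, here is a minimal Python sketch of a two-dimensional, three-color totalistic CA. It is not the code used at the summer program. Several details are assumptions on my part: a nine-cell neighborhood (the cell plus its eight neighbors), the convention that the base-3 digits of the code give the new color for each possible neighborhood total, a single nonzero seed cell, and wraparound edges on a small grid. The function names are hypothetical.

```python
def decode_totalistic(code, colors=3, neighbors=9):
    """Return a lookup table: neighborhood total -> new color.

    Assumed convention: digit s (base `colors`, least significant first)
    of `code` is the new color when the neighborhood total equals s.
    """
    max_total = (colors - 1) * neighbors
    return [(code // colors**s) % colors for s in range(max_total + 1)]


def step(grid, table):
    """Apply one update to a square grid with wraparound edges (assumed)."""
    n = len(grid)
    new = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Total of the cell and its eight neighbors.
            total = sum(grid[(i + di) % n][(j + dj) % n]
                        for di in (-1, 0, 1) for dj in (-1, 0, 1))
            new[i][j] = table[total]
    return new


def run(code, size=41, steps=10):
    """Evolve a single seed cell of color 1 for `steps` updates."""
    table = decode_totalistic(code)
    grid = [[0] * size for _ in range(size)]
    grid[size // 2][size // 2] = 1  # assumed initial condition
    for _ in range(steps):
        grid = step(grid, table)
    return grid


if __name__ == "__main__":
    # Crude text rendering: space, dot, and hash for colors 0, 1, 2.
    for row in run(144305877, size=41, steps=10):
        print("".join(" .#"[c] for c in row))
```

Because the grid here is small and wraps around, the output will not match the images above once the pattern reaches the edges. The point is only that the whole rule fits in a 19-entry lookup table.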

I also came across something interesting that connects a recently solved problem about Pythagorean triplets with a weird coincidence involving a prime number. That is a separate blog post, forthcoming, maybe: the research has probably stalled because of a bug in the computer program, and it is likely not a discovery at all. Such is curiosity.

Part of the summer program involved a project selected by Stephen Wolfram for each student. This was quite exciting. I was asked to explore the "sociology of Reddit." I had expressed interest in analyzing online social network data, though my limited network analysis experience is with survey data. The sheer amount of Reddit data was a welcome challenge.

As a reflexive sociologist, I would be remiss not to mention what was occurring at my home in Baton Rouge. Alton Sterling had been shot, and I had the sense that events were going to unfold in a way that would be at once predictable and abhorrent. I will discuss this in a later epilogue.

That said, I had tasks, and the first was to figure out where to begin with such a large amount of data from Reddit. I decided that tackling discrete question/answer pairs and/or comments with a defined start and end point would be best. To that end, r/IamA, the home of Ask Me Anything (AMA) threads, was a good starting point. My mentor helpfully provided some starter code to identify similarities in language structure within particular AMAs. Here is the result from President Obama's AMA.

Language Structure of President Barack Obama's AMA


Grouping the comments into communities with an algorithm based on those similarities yields the following graph.

President Obama AMA Community Graph of Language Structure

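The post does not record which similarity measure or community function produced these graphs, so the following Python sketch is only a stand-in to make the idea concrete. It assumes Jaccard similarity on word sets, an arbitrary threshold of 0.3 for drawing an edge, and connected components as the "communities." The example comments are made up and are not taken from the AMA.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased set of words in a comment."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_graph(comments, threshold=0.3):
    """Adjacency sets linking comments whose similarity meets the threshold."""
    sets = [tokens(c) for c in comments]
    edges = {i: set() for i in range(len(comments))}
    for i, j in combinations(range(len(comments)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            edges[i].add(j)
            edges[j].add(i)
    return edges


def communities(edges):
    """Connected components, used here as a simple stand-in for communities."""
    seen, groups = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Made-up comments for illustration only.
comments = [
    "What is your favorite basketball team?",
    "Which basketball team is your favorite?",
    "How do you plan to reduce student debt?",
    "What is your plan to reduce student loan debt?",
]
print(communities(similarity_graph(comments)))  # [[0, 1], [2, 3]]
```

A real analysis would likely swap in a richer similarity measure and a proper community-detection method, but the pipeline has the same shape: score pairs, keep the strong links, then group.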

These are basic graphs that reveal a lot about comment structure. Further analysis, however, will use a sentiment algorithm that goes beyond "positive," "negative," or "neutral," drawing on sociological theory to better define natural language processing goals. It is one thing to classify a term, such as an expression of benevolence, as neutral. It is quite another to recognize that the same "neutral" term is nested in benevolence and in the construction of sexism.

Just some neat things here to gawk at. Updates soon on progress with these data.

Epilogue 1: On a more personal note, events around the world, including in my hometown of Baton Rouge, have given this work some urgency and importance. While studying Reddit might not seem important on its face, the detection of harmful rhetoric and the causal mechanisms behind that rhetoric are worthy of exploration, especially as the growing volume of disclosure online challenges present-day values of free speech. To that end, authoritative monitoring of social media speech is a fuzzy area for decision making. However, it is ripe ground for testing taken-for-granted theories and assumptions about the social world and, perhaps, for learning how to better detect extremists who move from speech to physical action.

Epilogue 2: I lived in Boston in 2003-2004, where I met my partner. Then and now: it was at once the same and different. My apartment had turned into a bed and breakfast, beggars filled the once posh Newbury Street, and there was an air of despair. Depressing, to say the least, though the spirit of community in Boston is at first standoffish but then endearing. People in Boston care, but they don't express it the way some do in the South. An expression of "baby" at the grocery store in south Louisiana is perhaps translated into a grunt in New England. Just an observation and nothing to generalize about.
