Some of you may have been following along with the rather contentious and certainly confusing conversation over at #DHpoco started by the ever-awesome Roopika Risam and Adeline Koh.
I ran the comments (as of May 14) through a concordancer (AntConc) and tweeted the results. Adeline asked about topic modeling, so I did that as well (using David Newman's MALLET-based tool).
0 5 identity post power institutions things building computer human today kinds
1 5 people gender issues projects don english research computing project hard
2 5 dh race history class field sexuality color intellectual disability examples
3 5 pm reply critique critical good working white whiteness women comments
4 5 digital humanities social technology technologies open identities making access issue
5 5 studies thinking point ways david question adeline comment queer historical
6 5 cultural discourse kind concerns ve book view agree level humanists
7 5 work questions scholars important http politics culture theory discussion postcolonial
8 5 make time institutional world language code practice experience ll center
9 5 reply tools find refuge lot golumbia thread specific privilege bit
I will leave it to others to interpret the topic modeling. My whole problem with topic modeling is the interpretation of it, but others are quite skilled at it.
You can see the most common word list here
I also ran collocates ("In corpus linguistics, a collocation is a sequence of words or terms that co-occur more often than would be expected by chance") and clusters (words that occur together at the highest frequency) for race, identity, and culture, words that popped up in my meeting with Adeline yesterday, which we had serendipitously scheduled before the blog discussion emerged. (Note: Heather Froelich would and will do this totally differently, as is the constant pattern in our collaboration. You'll have to wait for our guest post over at #DHPoco next week to see how that turns out.)
Collocates
Identity - of, an, politics, post, social
Race - class, of, an, gender, critical
Culture - and, digital, in, of, wars
Clusters
Identity
1 17 of identity
2 9 identity and
3 9 identity politics
4 5 about identity
5 5 identity in
Race
1 11 of race
2 7 race, class
3 5 and race
4 4 race and
5 4 race, gender
6 4 race/class
Culture
1 5 culture and
2 4 digital culture
3 3 and culture
4 2 culture, and
5 2 in culture
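For anyone who wants to try counts like these outside AntConc, here is a minimal Python sketch. It is not AntConc's algorithm, and everything in it is an assumption made for illustration: the function names (`tokenize`, `clusters`, `window_cooccurrences`), the sample sentence, and the crude tokenizer, which drops punctuation and so would not distinguish "race, class" from "race/class" the way the lists above do. The window counts are raw co-occurrence counts, not a test of whether words co-occur more often than chance, which is what a proper collocation measure would add.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def clusters(tokens, term, size=2):
    """Count every n-gram of the given size that contains the search term."""
    counts = Counter()
    for i in range(len(tokens) - size + 1):
        gram = tokens[i:i + size]
        if term in gram:
            counts[" ".join(gram)] += 1
    return counts


def window_cooccurrences(tokens, term, span=5):
    """Count words appearing within `span` tokens on either side of the term.

    These are raw counts only; no significance statistic is applied.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        start = max(0, i - span)
        end = min(len(tokens), i + span + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Hypothetical sample text, not taken from the #DHpoco comments.
sample = "Questions of identity and identity politics shape how we talk about identity in DH."
tokens = tokenize(sample)

print(clusters(tokens, "identity").most_common(5))
print(window_cooccurrences(tokens, "identity", span=2).most_common(5))
```

Swapping `sample` for the full comment thread and changing the search term to "race" or "culture" would give lists in the same shape as the ones above, though the numbers will differ from AntConc's because of the tokenizer.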
Most fascinating so far is that identity kicks up way more clusters and collocates than race or culture, which surprised me. I would have anticipated culture, which as Raymond Williams once noted is one of the two or three most complicated words in English, to be highest.
I'm also surprised, but not surprised, to see the adjunctive discourse of race, class, and gender poking through here (a discourse I found in late 1970s feminist periodicals as well). I'm also surprised to see "culture wars," which seems so last century, peeking through, but I've not read in depth to see where that is occurring.
From Twitter, after I asked whether he might take a stab at interpreting the models, Jonathan Goodwin (@joncgoodwin) replied:
@ProfessMoravec My feeling is that there aren't enough words here for topic modeling to be the best method. Antconc/voyant prob. better.