Visualizing the metadata reveals the characteristics of this corpus, which is not "balanced."
Dates of publication (known for all but 15 of the items) range from 1831 to 2003, with items published in 118 of those years.
The items also vary widely in page count. Five complete texts run to over 300 pages, while 479 items are only a single page long. Similarly, some authors or editors are over-represented compared to others by item count (and there are issues with the controlled naming practices used in the database). For example, Mary Eliza Church Terrell is credited with 6 items, whereas 59 of the 91 unique authors/editors appear only once.
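Counts like these can be tallied directly from a metadata export. The following is a minimal sketch, not the procedure actually used here: the record layout, the `summarize` helper, and every value in `items` are invented for illustration and do not come from the BWSD.

```python
from collections import Counter

# Hypothetical metadata records as (author_or_editor, page_count) pairs.
# These values are invented placeholders, not BWSD data.
items = [
    ("Author A", 1),
    ("Author A", 12),
    ("Author B", 1),
    ("Author C", 320),
    ("Author A", 1),
]


def summarize(records):
    """Tally items per author/editor and count single-page items."""
    per_author = Counter(author for author, _ in records)
    return {
        # Number of distinct author/editor names as they are spelled in the data.
        "unique_authors": len(per_author),
        # Authors/editors credited with exactly one item.
        "single_item_authors": sum(1 for n in per_author.values() if n == 1),
        # Items whose page count is exactly one.
        "single_page_items": sum(1 for _, pages in records if pages == 1),
        # The most heavily represented name and its item count.
        "most_common": per_author.most_common(1)[0],
    }


summary = summarize(items)
print(summary)
```

Because the tally keys on the name string exactly as recorded, variant spellings of one person would be counted as separate authors, which is where inconsistent naming practices can distort the picture.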
My first question concerned the embodied discourse that appeared in the first iteration of the BWSD relative to the History of Woman Suffrage. With this larger corpus, I was able to compare BWSD files against the Corpus of Historical American English (COHA) for the years 1830-1930.
As seen above, the embodied discourse is not particularly distinctive compared to the COHA. However, what does appear (soul, mind, manner, presence) fits in some ways with my earlier conclusions.
Because the corpus is unbalanced, I wanted to explore which authors or editors might be responsible for the high use of these words. "Her soul" appears in about 34% of the corpus, with Frances Ellen Watkins Harper using it most frequently. Compared to other authors or editors, works attributed to Harper are significantly more likely to contain "soul" (LL 21.525; the transitional probability of "her soul" is 0.012 in Harper's writing versus 0.004 in all other authors/editors).
However, mind, manner and presence do not appear to be skewed.
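For readers unfamiliar with these two measures, here is a minimal sketch of how they are commonly computed. It assumes the two-corpus log-likelihood formulation widely used in corpus linguistics (observed versus expected frequencies) and treats transitional probability as the share of occurrences of the first word that are followed by the second. The tool behind the figures reported here may calculate them differently, and the function names and example counts below are invented for illustration.

```python
import math


def log_likelihood(a, b, c, d):
    """Two-corpus log-likelihood.

    a: frequency of the word in corpus 1 (e.g. one author's texts)
    b: frequency of the word in corpus 2 (e.g. everyone else's texts)
    c: total word count of corpus 1
    d: total word count of corpus 2
    """
    # Expected frequencies if the word were spread evenly by corpus size.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def transitional_probability(bigram_count, first_word_count):
    """Share of occurrences of the first word that are followed by the second."""
    if first_word_count == 0:
        return 0.0
    return bigram_count / first_word_count


# Invented counts, for illustration only.
ll_score = log_likelihood(a=12, b=4, c=1000, d=1000)
tp_score = transitional_probability(bigram_count=12, first_word_count=1000)
print(ll_score, tp_score)
```

A log-likelihood score above 3.84 corresponds to p < 0.05 with one degree of freedom, so a value such as 21.525 sits well beyond that threshold.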