
Exploring the Collocates of the Do + Negator + Lexical Verb Construction in English
Bryoney Hayes and Reem Alzaeem

Introduction: Our curiosity about the relationship of don't to its most frequent left and right collocates was piqued by reading "The effect of usage on degrees of constituency: the reduction of don't in English" by Joan Bybee and Joanne Scheibman, which analyzes the phonology of don't for evidence of the meanings that don't and its constituents take when tightly collocated in three phrases: I don't know, I don't think, and why don't you. After searching the COCA corpus for collocates of don't, we found that its most frequent collocates on the left are personal pronouns and that its most frequent collocates on the right belong to the lexical class of mental verbs, as categorized in Chapter 5 (Verbs) of the Longman Student Grammar of Spoken and Written English. To expand our search and allow comparison, we decided to explore the collocates of do not to see whether the collocates change when the auxiliary and its negator are uncontracted. We then wanted to see whether the collocates had changed over time.

Corpora choice: Because COCA contains more than 410 million words compiled from texts taken from popular sources, we decided that it would provide a comprehensive data source for examining the behavior of don't in American English. We originally chose to compare the BNC with COCA because we assumed that the two would be comparable. However, as we researched, the question arose of how our results

would compare given the different sizes of the two corpora. We asked ourselves, "How can we compare our data if we do not know the size of the two corpora?" That led us to look up the size of the BNC, which we found to be 100 million words. The difference in size is not in itself an obstacle, since we could compare results at a ratio of roughly 4:1. However, our research also turned up a comparison of the two corpora indicating that results for low-frequency words are difficult to compare, because the BNC's coverage of such words is less extensive. At the same time, we noticed that the Mark Davies/Brigham Young University corpora include the Corpus of Historical American English (COHA), with 400 million words. In discussing our next research move, we decided to run a few quick searches in COHA to satisfy our curiosity about how the collocates of don't and do not developed over time. In our initial search, we noticed that, according to COHA, the use of don't and the use of do not moved in opposite directions: as do not decreased over time, don't increased. This sparked our curiosity about the verbs and their collocates and the changes they have undergone over time, and it led us to narrow our search to the COCA and COHA corpora.

Data Analysis: To narrow our analysis, we searched the corpora for only the first preceding collocate and the first following collocate. After analyzing our results, we found that personal pronouns tend to be among the top five preceding collocates of don't and do not in both contemporary and historical American English. After personal pronouns, the most frequent

collocates were common nouns, with adverbs rounding out the top lexical word classes among the collocates. In the COCA (Corpus of Contemporary American English) data, personal pronouns are the lexical category that most frequently precedes don't. These data are shown below, in Table 1.

Table 1: COCA (most frequent personal pronouns that collocate most tightly)
Personal Pronoun: I, You, We, They
Don't / Do Not

192590

5.16 4.05 4.11 3.9

9450 -6126 6782

3.79 --

35,665

4.48

The results for the don't construction differ from those for the do not construction in COCA: far fewer pronouns collocate with do not than with don't, and some personal pronouns, for example you, are absent from the do not collocate list entirely. We then applied the same method to the Corpus of Historical American English (COHA) to examine the preceding and following collocates of don't and do not in historical American English. In this corpus too, personal pronouns were the grammatical class that most frequently preceded don't and do not. With respect to change over time, however, the COHA list of collocates includes some personal pronouns not found in the COCA list, for example ye, and lacks others. These results are shown in Table 2.

Table 2: COHA (most frequent personal pronouns that collocate most tightly)

Personal Pronoun: I, We, You, Ye, They, Yuh, Ay, Ya, Yu, You-All
Don't / Do Not

144,114

5.09 3.44 4.13 3.27 -4.71 3.04 3.36 5.18 4.97

30,085 11,038 7159 -8689 ------

4.74 5.03

38285 -76 53 45 22 20

-4.35 ------

As in the COCA corpus, the frequency of some personal pronouns differs between the don't and do not constructions, and some pronouns are absent from one of the lists altogether. In both constructions and both corpora, the lexical category that most frequently precedes the construction is the closed class of personal pronouns. The second most frequent category is interrogative determiners, which is also a closed class. Of the possible interrogative determiners, why and who each collocate with only one of the constructions: why with don't and who with do not, in both contemporary and historical American English. These results are shown in Tables 3 and 4.

Table 3: COCA, why versus who as the most frequently collocated interrogative determiners in don't/do not constructions

Interrogative Determiner   Don't (freq.)   Don't (M.I.)   Do Not (freq.)   Do Not (M.I.)
Why                        9832            5.02           -                -
Who                        -               -              2858             4.1

Table 4: COHA, why versus who as the most frequently collocated interrogative determiners in don't/do not constructions (with vy as a variant of why in the don't construction)

Interrogative Determiner   Don't (freq.)   Don't (M.I.)   Do Not (freq.)   Do Not (M.I.)
Why                        10,055          5.49           -                -
Who                        -               -              2573             3.41
Vy                         15              6.68           -                -

As for the lexical category that most frequently follows the don't and do not constructions in the corpora, the open class of verbs collocates most tightly with them. Because a very large number of verbs collocate with these constructions, we restricted our analysis to the ten most frequent verbs in each construction, that is, those with the highest frequencies and M.I. scores. These verbs are shown in the COCA and COHA tables below.

Table 5: COCA
Lexical Verb: Know, Think, Have, Want, Get, Need, See, Believe, Care, Understand, Seem, Appear, Feel
Don't: 78,300 5.6 5.12 3.26 5.59 3.06 4.08 3.08 4.6 4.46 4.94 3.6 -3.69
Do Not: 3692 1506 5739 2795 -973 -1756 -953 506 486 474 4.17 3.05

40,418 10,849 9380 9153 7657 6918 6606 1920 -4493

4.72 -3.79 -5.45 -5.12 4.65 5.51 3.42

Table 5: COHA
Lexical Verb: Know, Want, Think, Believe, See, Care, Get, Mind, Understand, Need, Wish, Mean
Don't: 58,955 6.03 5.95 5.11 5.28 3.29 5.51 3.17 4.04 5.25 4.36 -4.53
Do Not: 9949 2525 4113 3023 1955 1133 --1897 1039 1855 1722 5.38 4.57

8540 8132 7524 5881 5387 5185 4695 -4624

5.69 3.15 4.69 --5.71 4.09 5.88 5.02

Based on the COCA and COHA verb tables (Table 5), in both contemporary and historical American English, mental verbs tend to occur most frequently and bind most tightly (based on the M.I. score) with the don't construction. For example, the verb know occurs with don't in COHA with a frequency of 58,955 and an M.I. score of 6.03, and in COCA with a frequency of 78,300 and an M.I. score of 5.6. In the do not construction, know is still listed as the top collocate in both corpora, but with a frequency of 9949 and an M.I. score of 5.38 in COHA and a frequency of 3692 and an M.I. score of 4.17 in COCA. Additionally, all of the verbs in the tables fall into the semantic category of mental verbs as listed in the Longman Student Grammar of Spoken and Written English, except for get, which is considered an activity verb and is one of the most common lexical verbs in English, and have, which is considered a primary verb. The activity verb get collocates only with the don't construction in both historical and contemporary American English. This collocation is consistent over time, but the data also show changes in some

of the mental verbs over time. The verb see appears in historical American English in both the don't and do not constructions, but only in the don't construction in contemporary American English. Similarly, care is a collocate of both constructions in historical American English but drops out as a collocate of do not in contemporary American English. The verb wish appears as a collocate of do not in historical American English but is a collocate of neither construction in contemporary American English. Although collocates tend to drop out of these constructions over time, the verb feel, another mental verb, actually becomes slightly more tightly bound to the don't construction, with its M.I. score rising from 3.58 in COHA to 3.69 in COCA. Furthermore, the primary verb have does not appear at all as a collocate of either construction in historical American English but appears as a top collocate of both in contemporary American English.

After examining our data, we realized that we would have to go into the corpora and examine individual entries to learn more about the don't and do not constructions and their constituents. To build a more comprehensive view of the behavior of the do + negator constructions, we decided to include the simple past forms didn't and did not and the third person singular simple present forms doesn't and does not in our search for data that could support a meaningful analysis of don't as an auxiliary and its constituents. We went back to our original data, pulled the top four verb collocates of the don't construction (know, think, have, want) from our COCA search, and began researching how the different forms of the do + negator construction behave with those verbs.
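The M.I. scores compared above express how much more often two words co-occur than their individual frequencies would predict. The following is a minimal sketch of one common way to compute such a score, pointwise mutual information with a span correction. The paper does not state the exact formula or span used by the corpus interface, so the formula, the function name, and all counts in the example are assumptions for illustration only.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size, span=1):
    """Return a pointwise mutual information score for a node/collocate pair.

    Assumed formula: log2((pair_freq * corpus_size) /
                          (node_freq * collocate_freq * span)).
    All arguments must be positive counts.
    """
    values = (pair_freq, node_freq, collocate_freq, corpus_size, span)
    if min(values) <= 0:
        raise ValueError("all counts and the span must be positive")
    expected_scale = node_freq * collocate_freq * span
    return math.log2((pair_freq * corpus_size) / expected_scale)


if __name__ == "__main__":
    # Hypothetical counts, not taken from COCA or COHA.
    score = mutual_information(
        pair_freq=8, node_freq=16, collocate_freq=16, corpus_size=1024
    )
    print(f"M.I. = {score:.2f}")
```

Under this formulation a higher score means a tighter bond relative to chance, which is why a frequent pair can still score lower than a rarer pair whose members seldom occur apart.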
In our search, we sought to identify auxiliary + verb patterns that appeared regularly and were syntactically bound tightly enough for us to tentatively call them phrasals. We hypothesized that the contracted constructions would have a significantly higher number of phrasals than the uncontracted constructions and that they would appear most often in the spoken register. To find meaningful data, we decided to narrow our research focus so that we could approach the topic from several angles and probe deeply into the data for patterns. Our first decision was to set aside the COHA corpus, because what we had found in it to that point did not seem significant in terms of collocate binding. We also thought that, in a language teaching environment, what we uncovered in COHA would not necessarily be relevant. We then searched for the collocates of didn't, did not, doesn't, and does not, hypothesizing that they would share the collocates of don't and do not and be bound similarly. The results are shown below, in Tables 6 and 7.

Table 6 Lexical Verb: Know, Think, Have, Want
Table 7 Lexical Verb: Know, Think, Have, Want
Doesn't: 3.65 -3.17 4.70
Does Not: --
Didn't: 4.91 3.17 3.10 5.53
Did Not: 3.86 --

4.72

3.84

It should be noted that these are not the top collocates of each form; they were selected for comparison with the don't construction. Even so, the data suggest that one factor in collocation among the different do + negator forms is whether the form is contracted: the contracted forms tended to share more collocates with one another than with the uncontracted forms. We hypothesized that this has to do with the level of formality of these forms, and we posited that the contracted forms would have a significantly higher number of collocates bound tightly enough to form a phrasal pattern than the uncontracted forms, and that they would occur most frequently in the informal spoken register. To begin looking for a connection between register and the verbs' collocates, we searched each form and each verb individually to see in which register it occurred most frequently. The results are shown in Table 8.

Table 8: Frequency per million words by register

Form        Spoken      Fiction     Newspaper   Academic
Don't       2,313.47    950.89      1,720.23    164.64
Doesn't     459.31      357.39      333.99      51.34
Didn't      822.01      1,355.77    469.02      69.42
Do Not      118.30      96.28       123.19      315.29
Does Not    95.89       60.73       127.94      304.42
Did Not     179.29      307.35      200.69      379.79
Know        4,096.92    1,903.10    664.11      315.84
Think       4,096.49    1,112.06    731.89      259.22
Have        6975.78     3608.47     4362.40     3634.02
Want        1,573.19    969.82      648.17      194.51
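The register comparison in Table 8 amounts to normalizing raw counts to a per-million rate and then asking which register has the highest rate for each form. The sketch below illustrates that step for a few rows of the table. The function and variable names are hypothetical, and the assignment of each value to a register follows a reading of the table as extracted, so it should be checked against the corpus before being relied on.

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


# Per-million frequencies for selected forms, as read from Table 8.
PER_MILLION = {
    "doesn't": {"spoken": 459.31, "fiction": 357.39, "newspaper": 333.99, "academic": 51.34},
    "didn't": {"spoken": 822.01, "fiction": 1355.77, "newspaper": 469.02, "academic": 69.42},
    "do not": {"spoken": 118.30, "fiction": 96.28, "newspaper": 123.19, "academic": 315.29},
    "does not": {"spoken": 95.89, "fiction": 60.73, "newspaper": 127.94, "academic": 304.42},
    "did not": {"spoken": 179.29, "fiction": 307.35, "newspaper": 200.69, "academic": 379.79},
}


def top_register(form):
    """Return the register in which the given form is most frequent."""
    rates = PER_MILLION[form]
    return max(rates, key=rates.get)


if __name__ == "__main__":
    for form in PER_MILLION:
        print(f"{form}: most frequent in the {top_register(form)} register")
```

Working in per-million rates rather than raw counts is what makes registers of different sizes comparable, and the same normalization would apply when comparing corpora of different sizes.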

As we had hypothesized, the uncontracted forms of the do + negator construction occurred most frequently in the academic register, and the contracted forms occurred most frequently in the spoken register, with one exception: didn't occurred most frequently in the fiction register and also had a comparatively high frequency in the newspaper register. This made sense to us, since in our reading experience printed storytelling tends to be written in the simple past tense. The collocate verbs also occurred most frequently in the spoken register, as we had hypothesized, which supported a link between their bonds to the contracted forms and the level of formality of their use.

Having found that the contracted forms and their collocates appear most often in the same registers, we next set out to determine whether they were bound tightly enough to form a phrasal unit pattern. To do this, we examined samples of individual entries in the COCA corpus to assess how tightly the contracted forms and the lexical verbs were bound syntactically. To our surprise, we did not find the strong collocations that we expected. We expected at least to see examples in which don't + know were bound tightly enough to create a phrasal meaning, but we found surprisingly few (apart from cases in which I don't know or don't know was used as an insert). The exception was do + negator (contracted) + think, where don't + think more often has the phrasal meaning of expressing doubt than the literal meaning of a lack of cognitive process, for example, "I honestly don't think he needed me. I was like a comfort blanket" (COCA 2008), or, "It undermines their political clout. I don't think anybody really values being connected to Goldman at this point."

Because we did not see as many phrasal meaning patterns among the don't collocates as we had hypothesized, we decided to search the collocates of the contracted forms didn't and doesn't for phrasal meanings. Here we found several collocate pairings that seemed to create a non-literal meaning, including doesn't matter (is not important), doesn't get (does not understand), and doesn't look. In examining the individual corpus entries for these pairings, we noticed that doesn't look not only has a non-literal meaning but has several meanings that depend on its syntactic use, as well as collocates of its own that contribute to these non-literal, or phrasal, meanings: doesn't look + like, good, much, so, and too. When speakers or writers say that a subject doesn't look + [adverb/adjective], they are not saying that the subject keeps its eyes closed or will not turn its head. In these cases look does not refer to seeing at all, and the subject is not the one performing the action of the verb. Instead, doesn't look + collocate conveys nuanced meanings that we divided into the following categories: visual resemblance; appearance (figurative or literal); projection about future circumstances; and the subject's mode, or the way it seems. To determine whether a specific collocate paired with doesn't look to convey one of these meanings, we went through the first hundred entries for each collocate pairing and categorized the meaning of each phrasal unit as visual resemblance, appearance, future projection, or mode (seem). Because appearance and seem were difficult to distinguish (the categories often seemed to overlap), we combined the results for those two.
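The categorization step described above can be sketched as a simple tally over hand-assigned labels, with the appearance and seem categories merged before counting. The label names, the helper function, and the sample labels below are hypothetical; they are not the coded entries from the study.

```python
from collections import Counter

# Categories that were hard to tell apart are merged before counting.
MERGED = {"appearance": "appearance/seem", "seem": "appearance/seem"}


def tally_categories(labels):
    """Count hand-assigned meaning labels and return (count, share) per category."""
    merged = [MERGED.get(label, label) for label in labels]
    total = len(merged)
    if total == 0:
        return {}
    counts = Counter(merged)
    return {category: (n, n / total) for category, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical labels for ten concordance lines of one collocate pairing.
    sample = [
        "projection", "projection", "projection",
        "appearance", "appearance", "seem", "seem", "seem",
        "resemblance", "resemblance",
    ]
    for category, (n, share) in tally_categories(sample).items():
        print(f"{category}: {n} of {len(sample)} ({share:.0%})")
```

With one hundred coded entries per pairing, the counts read directly as percentages, which is how figures such as "27 out of 100" can be compared across collocates.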
What we found was that doesn't look good was the pairing most likely to be used to make a projection about the future, for example, "You look at it and it doesn't look good for him. At the same time, these are Democratic and primary voters" (COCA 2010); doesn't look good fell into the projection category in approximately 27 of 100 entries. The doesn't look like construction, with like as a preposition, was the one most likely to convey visual resemblance, for example, "I have no idea. It doesn't look like my children" (COCA 2010), with 39 of 100 corpus examples falling into the visual resemblance category. The other collocates, much, so, and too, were frequently used to describe the way a subject appears or seems in a figurative sense, for example, "he doesn't look much like a celebrity, either" (COCA 2010). We hypothesized that this is because these words belong to the lexical category of intensifiers.

Conclusion: After researching our do + negator + constituent structures, we have arrived at several conclusions. Although the use of these structures has changed over time, it has not changed in a way that is exceptional compared with the changes that any living language undergoes. The contracted forms tend to share more collocates with one another than with the uncontracted forms, and they tend to appear most frequently in the spoken register (except for didn't, which tends to appear in the fiction and newspaper registers). The contracted do + negator variants are also more likely than the uncontracted variants to collocate tightly enough with other words to form a phrasal unit pattern. In addition, on close examination, the phrasal unit patterns do not create just one non-literal meaning but several separate meanings, each with its own collocates that contribute to it.

References
Biber, D., Conrad, S., & Leech, G. (2002). Longman student grammar of spoken and written English. Essex, England: Longman.
Bybee, J., & Scheibman, J. (2010). The effect of usage on degrees of constituency: the reduction of don't in English.
Davies, M. (2010). The Corpus of Contemporary American English (COCA): 410+ million words, 1990-present.
Davies, M. (2010). Corpus of Historical American English. Corpus.byu.edu/coha
