Presentation about CL

1.

APPLICATION OF THE CORPUS OF
CONTEMPORARY AMERICAN ENGLISH
Lecturer:
Dr. Ataboev N.B.

2.

WHY?
WHAT?
HOW?

3.

1. WHAT IS A CORPUS?
“a collection of naturally occurring language texts,
chosen to characterize a state or variety of a
language” (Sinclair, 1991)
“a collection of linguistic data, either written texts
or a transcription of recorded speech, which can be
used as a starting-point of linguistic description or
as a means of verifying hypotheses about a
language” (Crystal, 1991)
“a collection of actually occurring texts (either
spoken or written), stored and accessed by means of
computers, and useful for investigating language
use” (Thornbury, 2006)
can be analysed by computer software
Like a library, but in which you know not only
where each book is, but where every word in each
book is!
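
The library analogy can be made concrete with a small sketch. It assumes a toy corpus of two invented "books" and a hypothetical helper named build_positional_index; it is not how any particular corpus tool is implemented, only an illustration of knowing where every word is.

```python
from collections import defaultdict

def build_positional_index(texts):
    """Map each lower-cased word to a list of (text_id, token_position) pairs."""
    index = defaultdict(list)
    for text_id, text in texts.items():
        for position, token in enumerate(text.lower().split()):
            # Strip surrounding punctuation so "family." and "family" match.
            index[token.strip(".,;:!?\"'")].append((text_id, position))
    return index

# Two toy "books" (invented for illustration); real corpora hold millions of texts.
texts = {
    "book_a": "The family is a circle of friends.",
    "book_b": "A good family is a happy family.",
}
index = build_positional_index(texts)
family_hits = index["family"]
print(family_hits)
```

Looking up a word returns every book and every position in which it occurs, which is what makes frequency counts and concordance lines cheap to produce.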

4.

THE PLACE OF CORPUS LINGUISTICS

5.

Corpus | # words | Dialect | Time period | Genre(s)
News on the Web (NOW) | 16.0 billion+ | 20 countries | 2010-yesterday | Web: News
iWeb: The Intelligent Web-based Corpus | 14 billion | 6 countries | 2017 | Web
Global Web-Based English (GloWbE) | 1.9 billion | 20 countries | 2012-13 | Web (incl blogs)
Wikipedia Corpus | 1.9 billion | (Various) | 2014 | Wikipedia
Coronavirus Corpus | 1.5 billion+ | 20 countries | Jan 2020-yesterday | Web: News
Corpus of Contemporary American English (COCA) | 1.0 billion | American | 1990-2019 | Balanced
Corpus of Historical American English (COHA) | 475 million | American | 1820-2019 | Balanced
The TV Corpus | 325 million | 6 countries | 1950-2018 | TV shows
The Movie Corpus | 200 million | 6 countries | 1930-2018 | Movies
Corpus of ... | 100 million | American | 2001-2012 | TV shows

6.

Corpus | # words | Dialect | Time period | Genre(s)
Hansard Corpus | 1.6 billion | British | 1803-2005 | Parliament
Early English Books Online | 755 million | British | 1470s-1690s | (Various)
Corpus of US Supreme Court Opinions | 130 million | American | 1790s-present | Legal opinions
TIME Magazine Corpus | 100 million | American | 1923-2006 | Magazine
British National Corpus (BNC) | 100 million | British | 1980s-1993 | Balanced
Strathy Corpus (Canada) | 50 million | Canadian | 1920s-2000s | Balanced
CORE Corpus | 50 million | 6 countries | 2014 | Web
Google Books ngrams: American English | 155 billion | American | 1500s-2000s | (Various)
Google Books ngrams: British English | 34 billion | British | 1500s-2000 | (Various)

7.

The most used balanced corpus in the field:

8.

•However, a few months ago Agnelli was given a lifetime appointment
to...; However, a holding call on the Seahawks pushes it back to the...
•As he made his fitful way after high school, however, basketball abided.;
The line's bright colors, however, are just the opposite of simple,
allowing gamers to express....
•Conston said he had seen eeves' dark side, however.;
Nothing satisfied the fundamentalists, however.
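
The three groups of examples on this slide differ in where "however" sits in the sentence. A minimal sketch of how such positions could be labelled automatically is given below; the function name however_position and the simple tokenization rule are assumptions made for illustration, not a feature of COCA itself.

```python
import re

def however_position(sentence):
    """Return 'initial', 'medial', 'final', or None for the first 'however'."""
    # Keep only word-like tokens so punctuation does not affect the position.
    tokens = re.findall(r"[A-Za-z']+", sentence.lower())
    if "however" not in tokens:
        return None
    i = tokens.index("however")
    if i == 0:
        return "initial"
    if i == len(tokens) - 1:
        return "final"
    return "medial"

# One example per group, taken from the slide.
examples = [
    "However, a holding call on the Seahawks pushes it back to the...",
    "The line's bright colors, however, are just the opposite of simple.",
    "Nothing satisfied the fundamentalists, however.",
]
labels = [however_position(s) for s in examples]
print(labels)
```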

9.

COMPARATIVE ANALYSIS OF THE CONCEPT OF
FAMILY IN BNC AND COCA

10.

FREQUENCY
Raw frequency:
BNC – 33369
COCA - 272121
Normalized frequency (per million words):
BNC – 333.69
COCA – 471.12
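
The normalization step can be sketched as follows. The sketch assumes the rates are per million words, which matches the BNC figure if the BNC is taken as 100 million words. The slide does not state the COCA size used, so the code only derives the size implied by the two COCA figures; per_million is a hypothetical helper name.

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return raw_count / corpus_size * 1_000_000

# BNC: 33,369 hits; corpus size assumed to be 100 million words.
bnc_rate = per_million(33_369, 100_000_000)

# COCA: the slide gives the raw count and the rate but not the corpus size,
# so this is only the size that those two figures imply.
implied_coca_size = 272_121 / 471.12 * 1_000_000

print(f"BNC: {bnc_rate:.2f} per million")
print(f"Implied COCA size: {implied_coca_size / 1e6:.1f} million words")
```

Normalizing matters because the corpora differ in size: the raw COCA count is about eight times the BNC count, while the normalized rate is only about 1.4 times higher.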

11.

SIZE:
BNC: large (1.44), small (0.72), average (0.60), big (0.55), normal (0.37),
little (0.26);
COCA: large (0.86), big (0.85), average (0.56), small (0.52), little (0.40),
median (0.36), normal (0.24), stable (0.10);
BACKGROUND:
BNC: royal (7.14), local (0.59), traditional (0.40), Jewish (0.35),
holy (0.35), noble (0.31), black (0.26), British (0.26), Christian (0.26),
human (0.25), poor (0.25), wealthy (0.21), working-class (0.19),
farming (0.15), middle-class (0.13), English (0.18), French (0.11),
Indian (0.11), Asian (0.10);
COCA: royal (3.28), American (2.01), black (0.97), traditional (0.66),
holy (0.55), dysfunctional (0.46), human (0.46), Kennedy (0.62), Bush (0.45),
Jewish (0.44), poor (0.34), white (0.31), middle-class (0.30), wealthy (0.30),
military (0.28), Christian (0.22), immigrant (0.19), African-American (0.14),
working-class (0.13), suburban (0.10), religious (0.10), Muslim (0.09),
farming (0.08), free (0.08);
FAMILY MEMBERS:
BNC: extended (1.61), nuclear (1.36), immediate (0.89), private (0.23),
individual (0.15), natural (0.18), birth (0.14), one-parent (0.11),
single-parent (0.10), two-child (0.9);
COCA: extended (3.37), immediate (1.08), nuclear (0.91), host (0.21),
adoptive (0.18), private (0.17), biological (0.12), two-parent (0.12),
personal (0.09), single-parent (0.09), broken (0.08), surrogate (0.07);

12.

RELATIONSHIP:
BNC: happy (0.69), close (0.68), joint (0.25), loving (0.15), ordinary (0.15),
respectable (0.10), lovely (0.10), friendly (0.09), close-knit (0.09),
ideal (0.06);
COCA: happy (0.61), close (0.12), great (0.41), loving (0.40), perfect (0.24),
wonderful (0.23), close-knit (0.23), joint (0.20), typical (0.18),
supportive (0.15), individual (0.15), devoted (0.10), ideal (0.09),
tight-knit (0.07), lovely (0.06), beloved (0.06);
AGE:
BNC: new (1.10), old (0.70), young (0.60), growing (0.25), modern (0.15),
contemporary (0.06).
COCA: new (1.19), old (0.76), modern (0.58), young (0.49), growing (0.25),
younger (0.07).

13.

KWIC RESULTS IN BNC
1) 1900 had an average family size of 4 children…;
2) …family is renewed and starts again…;
3) …the single-parent family becomes a dominant feature of modernity.;
4) he was an actual product of the family experience…;
5) …the family’s role in sports is redundant …;
6) the family fulfills fewer of the tasks of the
socialization …;
7) ..the family is a circle of friends…;
8) …youngsters should enjoy the family upbringing…;
9) experience of the family life is a greater mutual
understanding…
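
For readers unfamiliar with KWIC (key word in context), the lines above are the kind of output a concordancer produces. The sketch below shows the idea on a two-sentence sample adapted from lines 3 and 8; the function kwic and its width parameter are illustrative assumptions, not the BNC interface.

```python
def kwic(text, keyword, width=3):
    """Return (left, keyword, right) context triples for each hit."""
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        # Compare without trailing punctuation and without regard to case.
        if token.strip(".,;:!?").lower() == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Sample text adapted from the BNC lines on this slide.
sample = ("The single-parent family becomes a dominant feature of modernity. "
          "Youngsters should enjoy the family upbringing.")
hits = kwic(sample, "family")
for left, key, right in hits:
    # Right-align the left context so the keyword forms a column.
    print(f"{left:>30}  {key}  {right}")
```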

14.

KWIC RESULTS IN COCA
1) ...a good family is mother, father and a lot of money;
2) ...it is a family matter involving a father to love his
son;
3) ...parental care and family structure are critical factors
in educational success;
4) …family ties are important to …;
5) …family’s background provides a strong basis for
prediction;
6) …not tell the family secrets to strangers;
7) good life is shared in the circle of the family…;
8) the most important thing is that my family is alive
and happy;

15.

FAMILY + ABSTRACT NOUNS
COCA: family loyalty, family status, family honor,
family value,
family environment, family
concerns, family secrets, family circle, family
gathering, family planning, family breakdown,
family involvement, family love.
BNC: family rumour, family value, family crest,
family chronicles, family background, family
loyalty, family obligation, family welfare, family
priorities, family tradition, family reasons, family
interest, family rules, family responsibility.
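
One simple way to compare the two lists is to treat them as sets. The sketch below uses only the nouns listed on this slide and shows which "family + noun" combinations appear in both lists and which appear in only one; the variable names are arbitrary.

```python
# Nouns following "family" in each list, copied from the slide.
coca = {"loyalty", "status", "honor", "value", "environment", "concerns",
        "secrets", "circle", "gathering", "planning", "breakdown",
        "involvement", "love"}
bnc = {"rumour", "value", "crest", "chronicles", "background", "loyalty",
       "obligation", "welfare", "priorities", "tradition", "reasons",
       "interest", "rules", "responsibility"}

shared = sorted(coca & bnc)      # in both lists
coca_only = sorted(coca - bnc)   # listed for COCA only
bnc_only = sorted(bnc - coca)    # listed for BNC only

print("Shared:", shared)
print("COCA only:", coca_only)
print("BNC only:", bnc_only)
```

Absence from one list only means the combination was not selected for the slide, not that it never occurs in that corpus.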

16.

CONCLUSION
In the examples and analyses above, based on
collocates, metaphors, and ideas derived from the
corpora, the perception of the concept “Family” is
compared across corpora that represent two
different nations speaking the same language. This
once again demonstrates the importance of
linguistic corpora and their analysis as a necessary
source for understanding linguocultural units. Just
as language cannot exist outside of culture, culture
cannot be formed without the participation of
language. It is therefore reasonable to regard a
language corpus, which is capable of reflecting a
language in full, as a source of texts on cultural
concepts.

17.

REFERENCES:
Davies M. Semantically-based, learner-oriented queries with the 400+ million
word Corpus of Contemporary American English // Explorations across Languages
and Corpora, edited by Stanislaw Gozdz-Roszkowski. Berlin, 2009. 622 p.
P. 13-28 (p. 16).
Sinclair J. Corpus, Concordance, Collocation. Oxford University Press, 1991. 179 pp.
McEnery Ch., Wilson A. Corpus Linguistics. Edinburgh University Press,
second edition, 2001. 205 p. P. 1-2.
Maslova V.A. Lingvokulturologiya: uchebnoye posobiye (Linguoculturology: study
guide). Academy of Social Sciences, Moscow, 2004. 208 p.
Gries S.Th., Stefanowitsch A. (eds.). Corpora in Cognitive Linguistics:
Corpus-Based Approaches to Syntax and Lexis. Trends in Linguistics: Studies and
Monographs 172. New York: Mouton de Gruyter, 2006. 352 pp.
Meyer C.F. English Corpus Linguistics. Cambridge: Cambridge University Press,
2002.

URL: https://pdfs.semanticscholar.org/c775/4bfab1d0f770e26c0fc7c603e8cf38793be8.pdf