Canon Rock

Using Corpus Analysis Software to Analyse Specialised Texts


(Reference: https://jaltranslation.com/)

 1. What is a corpus?

 In corpus linguistics, a corpus can be generally defined as ‘a collection of naturally-occurring texts in a computer-readable format which can be retrieved and analyzed using corpus analysis software’ (Kennedy, 1998; McEnery & Wilson, 2001; O’Keeffe, McCarthy, & Carter, 2007; Teubert & Cermakova, 2007).
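To make "retrieved and analyzed" concrete, here is a minimal Python sketch of a keyword-in-context (KWIC) search, the kind of query that corpus analysis software runs over a corpus. The function name `kwic`, the `width` parameter, and the two sample sentences are illustrative assumptions, not part of any particular tool.

```python
import re

def kwic(texts, keyword, width=30):
    """Return one keyword-in-context line per match of `keyword` in `texts`."""
    # Whole-word, case-insensitive match of the search term.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            # Take up to `width` characters of context on each side.
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the keywords line up.
            lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Hypothetical two-text corpus, for illustration only.
sample_corpus = [
    "A corpus is a collection of naturally-occurring texts.",
    "Each corpus can be searched with corpus analysis software.",
]

for line in kwic(sample_corpus, "corpus"):
    print(line)
```

Each printed line shows one occurrence of the search term with its surrounding context, which is roughly what a concordance view displays.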


           Corpus size

  •    There are no fixed rules; size depends on research purposes, the availability of data, and time.
  •    Large, general corpora may be less useful than small, focused corpora if searches are made on context-specific terms.
  •    ‘Too small’ corpora have limitations, e.g. they may not contain enough of the concepts, terms, or patterns under investigation.
  •    It is preferable to create a ‘monitor’ or ‘open’ corpus, one that can be added to over time, because specialized words and usage are dynamic.
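One rough way to judge whether a corpus is "too small" for a given query is to count its tokens, its distinct word forms (types), and the hits for the terms of interest. The sketch below assumes a very simple tokenizer and invented sample sentences; `tokenize` and `corpus_profile` are hypothetical helper names.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())

def corpus_profile(texts, terms):
    """Report corpus size and how often each search term occurs."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    freq = Counter(tokens)
    return {
        "tokens": len(tokens),   # running words
        "types": len(freq),      # distinct word forms
        "term_hits": {term: freq[term.lower()] for term in terms},
    }

# Hypothetical tiny corpus, for illustration only.
tiny_corpus = [
    "The concordance lists every occurrence of a search term.",
    "A search term may be a word or a phrase.",
]

profile = corpus_profile(tiny_corpus, ["term", "collocate"])
print(profile)
```

A term with zero or very few hits suggests the corpus needs more texts before that term can be studied.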

Text extracts vs. full texts

  •    The choice depends on the aim of corpus compilation.
  •    Whole texts offer more coverage because the words or terms to be examined may be randomly distributed throughout the text.
  •    Specific sections may be helpful if we are looking for words or phrases in particular content areas or want to create purposeful sub-corpora.
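The point about terms being spread throughout a text can be checked with a simple dispersion count: split the text into equal segments and count the hits in each. This is only a sketch; the function `dispersion`, the segment count, and the sample text are assumptions made for illustration.

```python
import re

def dispersion(text, term, segments=4):
    """Count occurrences of `term` in each of `segments` equal slices of the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = [0] * segments
    for i, tok in enumerate(tokens):
        if tok == term.lower():
            # Map the token position to the index of its segment.
            counts[i * segments // len(tokens)] += 1
    return counts

# Hypothetical short text, for illustration only.
sample_text = (
    "A corpus stores texts. Software counts words in texts. "
    "Patterns appear across the whole corpus."
)

print(dispersion(sample_text, "corpus", segments=3))
print(dispersion(sample_text, "texts", segments=3))
```

If the hits fall in segments that an extract would leave out, sampling only that extract would miss them, which is the argument for keeping whole texts.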


Number of texts

  •    A choice can be made between collecting a few large texts or a greater number of smaller texts.
  •    A choice can also be made between selecting texts written by one or two key writers or sources, or texts retrieved from different sources or written by different authors.
  •    The choice depends on your research focus, e.g. whether you want to study overall language use or the idiosyncrasies and linguistic choices preferred by particular writers.

Medium

  •    The corpus can consist of spoken texts, written texts, or a mixture of both.
  •    The choice depends on the research questions.
  •    Some practical factors should also be considered, e.g. compiling spoken corpora can be time-consuming and needs special types of tagging.


Subject and text type

  •    The corpus should mainly focus on the specialized text under investigation, although this is less clear-cut in multidisciplinary subjects.
  •    Texts may come from different subjects if the research focus is on the study of particular language features rather than term extraction.
  •    Text types within a specialized subject field may vary from ‘expert-to-expert’ texts to ‘expert-to-non-expert’ texts, or in other words, from technical to popular texts.


Other considerations

  •    Authorship: Texts written by experts in a field tend to present more reliable and authentic examples of specialized language.
  •    Language: Specialized texts can be stored and retrieved in the form of monolingual, comparable, or parallel corpora.
  •    Publication date: Texts should come from recent publications unless queries are made in relation to particular periods of time.
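Design criteria such as authorship, language, and publication date are easier to apply if each text is stored with its metadata. The following sketch shows one possible way to do that in Python; the `CorpusText` fields, the `select` function, and all sample records are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CorpusText:
    title: str
    author_is_expert: bool
    language: str   # e.g. "en" or "th"
    year: int       # publication year
    body: str

def select(texts, language=None, since=None, experts_only=False):
    """Keep only the texts that meet the given design criteria."""
    chosen = []
    for t in texts:
        if language is not None and t.language != language:
            continue
        if since is not None and t.year < since:
            continue
        if experts_only and not t.author_is_expert:
            continue
        chosen.append(t)
    return chosen

# Invented records, for illustration only.
texts = [
    CorpusText("Journal article", True, "en", 2021, "..."),
    CorpusText("Popular science blog post", False, "en", 2022, "..."),
    CorpusText("Older textbook chapter", True, "en", 1995, "..."),
    CorpusText("Thai journal article", True, "th", 2020, "..."),
]

recent_expert_english = select(texts, language="en", since=2015, experts_only=True)
print([t.title for t in recent_expert_english])
```

The same filter can be reused to build sub-corpora, for example expert-to-expert texts only, or texts from a particular period.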

 4. Sources of specialized texts

  •    Printed materials
  •    Word documents
  •    CD-ROMs
  •    Texts on the Web
  •    Online databases


 5. Getting started with AntConc

 Download the latest version of AntConc from http://www.laurenceanthony.net/software.html

(Reference: http://www.laurenceanthony.net/software.html)
