Naming or shaming? Presentations of the self in specialised weblog discourse

Richard Chapman
2017

Abstract

This paper explores methodological approaches to the examination of corpus linguistic data using tools derived from discourse analysis. During the examination of data collected from a small, specialised corpus, interesting and unexpected elements were identified which invited further analysis. A corpus of around 100,000 tokens was assembled over a single day in 2014 by collecting contributions to a weblog discussion of the Israel-Palestine conflict. The initial aim of the research was to observe linguistic behaviour in English over a very limited timescale and in relation to a specific and highly controversial topic; a secondary aim was to examine methodological issues concerning the creation of small corpora and how they should be interrogated. Data were analysed both quantitatively and qualitatively, but there was a stated intention from the outset to employ a hands-on approach as much as possible. Various problems emerged during the study, perhaps the most salient being that of attribution and tagging, and the most suggestive being the employment of names and highly pragmatic discourse features in curious and perhaps unexpected ways. A rereading of the corpus concentrating on the names used and other forms of self-presentation, along with observations of attempts to impose identities on others, suggested a need for discourse-level analysis of linguistic behaviour in weblogs and discussion pages, since the naming devices often appear as phrases rather than individual words (even if they present as a single token with no spaces) and they presuppose a textual environment and a form of dialogue or interaction. Prevalent instances of metaphor and the repeated use of highly varied informal discourse markers (again often with apparently self-identifying pragmatic purposes) encouraged examination of cohesion over significant textual distances (cf. progressive relatedness), while issues surrounding coherence again challenged the potentially limited quantitative notions of what corpus linguistic analysis entails. The paper questions some fundamental assumptions in corpus linguistics, including the concept of repetition and hence the seemingly obvious binary contrast of token and type, the parsing of items, the use and interpretation of metaphor, metonymy and intertextuality, and the sociolinguistic and pragmatic elements inherent in all utterances. The complexity and richness of corpus linguistic data are seen to render qualitative analysis very demanding, but of unquestionable potential significance. Discourse-level analysis is deemed a necessary tool, and the paper concludes with the suggestion that the future of corpus linguistic studies should be two-fold, with constant comparison and triangulation of data from large-scale general language corpora and small-scale, specialised ones.
File: Chapman_20_2017.pdf (main article; full text, publisher's version; open access; Creative Commons licence; Adobe PDF, 349.02 kB)


Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2382744