Literary detective work on the computer
Contributor: | Oakes, Michael P. (author) |
---|---|
Format: | Book |
Language: | English |
Published: | Amsterdam [etc.], Benjamins, 2014 |
Series: | Natural language processing 12 |
Subjects: | Computerlinguistik, Plagiat, Literatur, Sprachanalyse, Autorschaft |
Links: | http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
Extent: | X, 283 pp., ill., graphs |
ISBN: | 9027249997 9789027249999 9789027270139 |
Internal format
MARC
LEADER | 00000nam a2200000zcb4500 | ||
---|---|---|---|
001 | BV041913981 | ||
003 | DE-604 | ||
005 | 20171221 | ||
007 | t| | ||
008 | 140612s2014 xx ad|| |||| 00||| eng d | ||
010 | |a 2014007366 | ||
020 | |a 9027249997 |9 90-272-4999-7 | ||
020 | |a 9789027249999 |9 978-90-272-4999-9 | ||
020 | |a 9789027270139 |9 978-90-272-7013-9 | ||
035 | |a (OCoLC)884916017 | ||
035 | |a (DE-599)HEB340840838 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-12 |a DE-20 | ||
084 | |a ES 945 |0 (DE-625)27935: |2 rvk | ||
084 | |a 24,1 |2 ssgn | ||
100 | 1 | |a Oakes, Michael P. |e Verfasser |0 (DE-588)1057567302 |4 aut | |
245 | 1 | 0 | |a Literary detective work on the computer |c Michael P. Oakes |
264 | 1 | |a Amsterdam [u.a.] |b Benjamins |c 2014 | |
300 | |a X, 283 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Natural language processing |v 12 | |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Plagiat |0 (DE-588)4046196-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Literatur |0 (DE-588)4035964-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Sprachanalyse |0 (DE-588)4129916-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Autorschaft |0 (DE-588)4130545-0 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Literatur |0 (DE-588)4035964-5 |D s |
689 | 0 | 1 | |a Autorschaft |0 (DE-588)4130545-0 |D s |
689 | 0 | 2 | |a Plagiat |0 (DE-588)4046196-8 |D s |
689 | 0 | 3 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | 4 | |a Sprachanalyse |0 (DE-588)4129916-4 |D s |
689 | 0 | |5 DE-604 | |
830 | 0 | |a Natural language processing |v 12 |w (DE-604)BV013516598 |9 12 | |
856 | 4 | 2 | |m V:DE-603;B:DE-30 |q application/pdf |u http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
942 | 1 | 1 | |c 002 |e 22/bsb |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-027357628 |
Record in the search index
_version_ | 1819382409332785152 |
---|---|
adam_text | Literary Detective Work on the Computer
Michael P. Oakes
University of Wolverhampton
John Benjamins Publishing Company
Amsterdam / Philadelphia
Table of contents
Preface
CHAPTER 1
Author identification
1 Introduction 1
2 Feature selection 5
2.1 Evaluation of feature sets for authorship attribution 8
3 Inter-textual distances 11
3.1 Manhattan distance and Euclidean distance 12
3.2 Labbé and Labbé's measure 14
3.3 Chi-squared distance 15
3.4 The cosine similarity measure 16
3.5 Kullback-Leibler Divergence (KLD) 18
3.6 Burrows' Delta 18
3.7 Evaluation of feature-based measures for inter-textual distance
3.8 Inter-textual distance by semantic similarity 26
3.9 Stemmatology as a measure of inter-textual distance 28
4 Clustering techniques 30
4.1 Introduction to factor analysis 31
4.2 Matrix algebra 35
4.3 Use of matrix algebra for PCA 38
4.4 PCA case studies 44
4.5 Correspondence analysis 45
5 Comparisons of classifiers 47
6 Other tasks related to authorship 50
6.1 Stylochronometry 50
6.2 Affect dictionaries and psychological profiling 53
6.3 Evaluation of author profiling 58
7 Conclusion 58
CHAPTER 2
Plagiarism and spam filtering 59
1 Introduction 59
2 Plagiarism detection software 62
2.1 Collusion and plagiarism, external and intrinsic 63
2.2 Preprocessing of corpora and feature extraction 63
2.3 Sequence comparison and exact match 64
2.4 Source-suspicious document similarity measures 65
2.5 Fingerprinting 66
2.6 Language models 67
2.7 Natural language processing 68
2.8 Intrinsic plagiarism detection 70
2.9 Plagiarism of program code 73
2.10 Distance between translated and original text 74
2.11 Direction of plagiarism 76
2.12 The search engine-based approach used at PAN-13 78
2.13 Case study 1: Hidden influences from printed sources in the Gaelic tales of Duncan and Neil MacDonald 81
2.14 Case study 2: General George Pickett and related writings 83
2.15 Evaluation methods 84
2.16 Conclusion 85
3 Spam filters 86
3.1 Content-based techniques 87
3.2 Building a labeled corpus for training 87
3.3 Exact matching techniques 88
3.4 Rule-based methods 89
3.5 Machine learning 90
3.6 Unsupervised machine learning approaches 92
3.7 Other spam-filtering problems 93
3.8 Evaluation of spam filters 94
3.9 Non-linguistic techniques 94
3.10 Conclusion 97
4 Recommendations for further reading 98
CHAPTER 3
Computer studies of Shakespearean authorship 99
1 Introduction 99
2 Shakespeare, Wilkins and Pericles 101
2.1 Correspondence analysis for Pericles and related texts 105
3 Shakespeare, Fletcher and The Two Noble Kinsmen 108
4 King John 110
5 The Raigne of King Edward III 111
5.1 Neural networks in stylometry 111
5.2 Cusum charts in stylometry 113
5.3 Burrows' Zeta and Iota 116
6 Hand D in Sir Thomas More 118
6.1 Elliott, Valenza and the Earl of Oxford 118
6.2 Elliott and Valenza: Hand D 121
6.3 Bayesian approach to questions of Shakespearian authorship 122
6.4 Bayesian analysis of Shakespeare's second person pronouns 127
6.5 Vocabulary differences, LDA and the authorship of Hand D 130
6.6 Hand D: Conclusions 131
7 The three parts of Henry VI 132
8 Timon of Athens 132
9 The Puritan and A Yorkshire Tragedy 133
10 Arden of Faversham 134
11 Estimation of the extent of Shakespeare's vocabulary and the authorship of the Taylor poem 136
12 The chronology of Shakespeare 141
13 Conclusion 147
CHAPTER 4
Stylometric analysis of religious texts 149
1 Introduction 149
1.1 Overview of the New Testament by correspondence analysis 151
1.2 Q 153
1.3 Luke and Acts 169
1.4 Recent approaches to New Testament stylometry 171
1.5 The Pauline Epistles 175
1.6 Hebrews 188
1.7 The Signs Gospel 188
2 Stylometric analysis of the Book of Mormon 190
3 Stylometric studies of the Qu'ran 198
4 Conclusion 206
CHAPTER 5
Computers and decipherment 207
1 Introduction 207
1.1 Differences between cryptography and decipherment 208
1.2 Cryptological techniques for automatic language recognition 209
1.3 Dictionary approaches to language recognition 212
1.4 Sinkov's test 212
1.5 Index of coincidence 213
1.6 The log-likelihood ratio 214
1.7 The chi-squared test statistic 215
1.8 Entropy of language 215
1.9 Zipf's Law and Heaps' Law coefficients 218
1.10 Modal token length 219
1.11 Autocorrelation analysis 220
1.12 Vowel identification 221
2 Rongorongo 224
2.1 History of Rongorongo 224
2.2 Characteristics of Rongorongo 226
2.3 Obstacles to decipherment 227
2.4 Encoding of Rongorongo symbols 227
2.5 The Mamari lunar calendar 228
2.6 Basic statistics of the Rongorongo corpus 228
2.7 Alignment of the Rongorongo corpus 229
2.8 A concordance for Rongorongo 231
2.9 Collocations and collostructions 233
2.10 Classification by genre 234
2.11 Vocabulary richness 237
2.12 Podzniakov's approach to matching frequency curves 241
3 The Indus Valley texts 243
3.1 Why decipherment of the Indus texts is difficult 243
3.2 Are the Indus texts writing? 244
3.3 Other evidence for the Indus Script being writing 248
3.4 Determining the order of the Markov model 248
3.5 Missing symbols 249
3.6 Text segmentation and the log-likelihood measure 249
3.7 Network analysis of the Indus Signs 251
4 Linear A 252
5 The Phaistos disk 255
6 Iron Age Pictish symbols 256
7 Mayan glyphs 256
8 Conclusion 257
References 259
Index 281
|
any_adam_object | 1 |
author | Oakes, Michael P. |
author_GND | (DE-588)1057567302 |
author_facet | Oakes, Michael P. |
author_role | aut |
author_sort | Oakes, Michael P. |
author_variant | m p o mp mpo |
building | Verbundindex |
bvnumber | BV041913981 |
classification_rvk | ES 945 |
ctrlnum | (OCoLC)884916017 (DE-599)HEB340840838 |
discipline | Sprachwissenschaft Literaturwissenschaft |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02026nam a2200505zcb4500</leader><controlfield tag="001">BV041913981</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20171221 </controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">140612s2014 xx ad|| |||| 00||| eng d</controlfield><datafield tag="010" ind1=" " ind2=" "><subfield code="a">2014007366</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9027249997</subfield><subfield code="9">90-272-4999-7</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789027249999</subfield><subfield code="9">978-90-272-4999-9</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789027270139</subfield><subfield code="9">978-90-272-7013-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)884916017</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)HEB340840838</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield><subfield code="a">DE-20</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 945</subfield><subfield code="0">(DE-625)27935:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Oakes, Michael P.</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1057567302</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield 
code="a">Literary detective work on the computer</subfield><subfield code="c">Michael P. Oakes</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam [u.a.]</subfield><subfield code="b">Benjamins</subfield><subfield code="c">2014</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">X, 283 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Natural language processing</subfield><subfield code="v">12</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Plagiat</subfield><subfield code="0">(DE-588)4046196-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Literatur</subfield><subfield code="0">(DE-588)4035964-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachanalyse</subfield><subfield code="0">(DE-588)4129916-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Autorschaft</subfield><subfield code="0">(DE-588)4130545-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield 
tag="689" ind1="0" ind2="0"><subfield code="a">Literatur</subfield><subfield code="0">(DE-588)4035964-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Autorschaft</subfield><subfield code="0">(DE-588)4130545-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Plagiat</subfield><subfield code="0">(DE-588)4046196-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="4"><subfield code="a">Sprachanalyse</subfield><subfield code="0">(DE-588)4129916-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Natural language processing</subfield><subfield code="v">12</subfield><subfield code="w">(DE-604)BV013516598</subfield><subfield code="9">12</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">V:DE-603;B:DE-30</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="942" ind1="1" ind2="1"><subfield code="c">002</subfield><subfield code="e">22/bsb</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-027357628</subfield></datafield></record></collection> |
id | DE-604.BV041913981 |
illustrated | Illustrated |
indexdate | 2024-12-20T16:57:42Z |
institution | BVB |
isbn | 9027249997 9789027249999 9789027270139 |
language | English |
lccn | 2014007366 |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027357628 |
oclc_num | 884916017 |
open_access_boolean | |
owner | DE-12 DE-20 |
owner_facet | DE-12 DE-20 |
physical | X, 283 S. Ill., graph. Darst. |
publishDate | 2014 |
publishDateSearch | 2014 |
publishDateSort | 2014 |
publisher | Benjamins |
record_format | marc |
series | Natural language processing |
series2 | Natural language processing |
spellingShingle | Oakes, Michael P. Literary detective work on the computer Natural language processing Computerlinguistik (DE-588)4035843-4 gnd Plagiat (DE-588)4046196-8 gnd Literatur (DE-588)4035964-5 gnd Sprachanalyse (DE-588)4129916-4 gnd Autorschaft (DE-588)4130545-0 gnd |
subject_GND | (DE-588)4035843-4 (DE-588)4046196-8 (DE-588)4035964-5 (DE-588)4129916-4 (DE-588)4130545-0 |
title | Literary detective work on the computer |
title_auth | Literary detective work on the computer |
title_exact_search | Literary detective work on the computer |
title_full | Literary detective work on the computer Michael P. Oakes |
title_fullStr | Literary detective work on the computer Michael P. Oakes |
title_full_unstemmed | Literary detective work on the computer Michael P. Oakes |
title_short | Literary detective work on the computer |
title_sort | literary detective work on the computer |
topic | Computerlinguistik (DE-588)4035843-4 gnd Plagiat (DE-588)4046196-8 gnd Literatur (DE-588)4035964-5 gnd Sprachanalyse (DE-588)4129916-4 gnd Autorschaft (DE-588)4130545-0 gnd |
topic_facet | Computerlinguistik Plagiat Literatur Sprachanalyse Autorschaft |
url | http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV013516598 |
work_keys_str_mv | AT oakesmichaelp literarydetectiveworkonthecomputer |