Text mining: a guidebook for the social sciences
Contributors: | Ignatow, Gabe ; Mihalcea, Rada F. |
---|---|
Format: | Book |
Language: | English |
Published: |
Los Angeles ; London ; New Delhi ; Singapore ; Washington DC ; Melbourne
SAGE
[2017]
|
Subjects: | |
Links: | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075357&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075357&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
Description: | Includes bibliographical references |
Extent: | xvi, 188 pages, illustrations, diagrams |
ISBN: | 9781483369341 148336934X |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV043662021 | ||
003 | DE-604 | ||
005 | 20180921 | ||
007 | t| | ||
008 | 160708s2017 xx a||| |||| 00||| eng d | ||
010 | |a 2015044977 | ||
020 | |a 9781483369341 |c Print |9 978-1-4833-6934-1 | ||
020 | |a 148336934X |9 1-4833-6934-X | ||
035 | |a (OCoLC)946605149 | ||
035 | |a (DE-599)BSZ473668610 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-20 |a DE-739 |a DE-473 |a DE-824 |a DE-384 |a DE-355 |a DE-M347 |a DE-634 |a DE-523 |a DE-19 |a DE-11 |a DE-N2 |a DE-92 |a DE-29 | ||
050 | 0 | |a H61.3 | |
082 | 0 | |a 300.721 | |
084 | |a MR 2800 |0 (DE-625)123496: |2 rvk | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
084 | |a ST 515 |0 (DE-625)143677: |2 rvk | ||
100 | 1 | |a Ignatow, Gabe |d ca. 20./21. Jh. |e Verfasser |0 (DE-588)1106461576 |4 aut | |
245 | 1 | 0 | |a Text mining |b a guidebook for the social sciences |c Gabe Ignatow (University of North Texas) Rada Mihalcea (University of Michigan) |
264 | 1 | |a Los Angeles ; London ; New Delhi ; Singapore ; Washington DC ; Melbourne |b SAGE |c [2017] | |
264 | 4 | |c © 2017 | |
300 | |a xvi, 188 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Literaturangaben | ||
650 | 4 | |a Datenverarbeitung | |
650 | 4 | |a Sozialwissenschaften | |
650 | 4 | |a Social sciences / Research / Methodology | |
650 | 4 | |a Discourse analysis / Data processing | |
650 | 4 | |a Communication / Network analysis | |
650 | 4 | |a Natural language processing (Computer science) | |
650 | 4 | |a Data mining | |
650 | 0 | 7 | |a Text Mining |0 (DE-588)4728093-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Sozialwissenschaften |0 (DE-588)4055916-6 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Text Mining |0 (DE-588)4728093-1 |D s |
689 | 0 | 1 | |a Sozialwissenschaften |0 (DE-588)4055916-6 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Mihalcea, Rada F. |d 1974- |e Verfasser |0 (DE-588)101335429X |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075357&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Augsburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075357&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-029075357 |
Record in the search index
_version_ | 1819247638453682176 |
---|---|
adam_text | • Brief Contents •
Preface xii
Acknowledgments xv
About the Authors xvi
PART I • DIGITAL TEXTS, DIGITAL SOCIAL SCIENCE 1
Chapter 1 • Social Science and the Digital Text Revolution 2
Chapter 2 • Research Design Strategies 16
PART II • TEXT MINING FUNDAMENTALS 33
Chapter 3 • Web Crawling and Scraping 34
Chapter 4 • Lexical Resources 42
Chapter 5 • Basic Text Processing 52
Chapter 6 • Supervised Learning 62
PART III • TEXT ANALYSIS METHODS FROM THE
HUMANITIES AND SOCIAL SCIENCES 73
Chapter 7 • Thematic Analysis, Qualitative Data Analysis
Software, and Visualization 74
Chapter 8 • Narrative Analysis 88
Chapter 9 • Metaphor Analysis 96
PART IV • TEXT MINING METHODS
FROM COMPUTER SCIENCE 105
Chapter 10 • Word and Text Relatedness 106
Chapter 11 • Text Classification 116
Chapter 12 • Information Extraction 130
Chapter 13 • Information Retrieval 136
Chapter 14 • Sentiment Analysis 148
Chapter 15 • Topic Models 156
PART V • CONCLUSIONS 163
Chapter 16 • Text Mining, Text Analysis,
and the Future of Social Science 164
References 168
Index 183
• Detailed Contents •
Preface xii
Acknowledgments xv
About the Authors xvi
PART I • DIGITAL TEXTS, DIGITAL SOCIAL SCIENCE 1
1. Social Science and the Digital Text Revolution 2
History of Text Analysis 3
Risks and Rewards of Text Mining for the Social Sciences 5
Social Data From Digital Environments 6
Theory and Metatheory 10
Ethics of Text Mining 12
Participant Consent, Privacy, and Anonymity 12
Prompted and Unprompted Data 13
Organization of This Volume 13
2. Research Design Strategies 16
Levels of Analysis 18
The Textual Level 18
The Contextual Level 18
The Sociological Level 18
Strategies for Document Selection and Sampling 19
Case Selection 19
Text Sampling 20
Types of Inferential Logic 22
Inductive Logic 23
Deductive Logic 24
Abductive Logic 25
Approaches to Research Design 27
Analysis of Discourse Positions 27
Conversation Analysis 28
Critical Discourse Analysis 28
Content Analysis 29
Foucauldian Intertextuality 30
Analysis of Texts as Social Information 31
PART II • TEXT MINING FUNDAMENTALS
3. Web Crawling and Scraping 34
Web Statistics 36
Web Crawling 37
Process Steps in Crawling 37
Traversal Strategies 38
Crawler Politeness 38
Web Scraping 39
Software for Web Crawling and Scraping 41
4. Lexical Resources 42
WordNet 43
WordNet-Affect 45
Roget’s Thesaurus 46
Linguistic Inquiry and Word Count 46
General Inquirer 48
Wikipedia 48
Wiktionary 51
Downloadable Lexical Resources
and Application Program Interfaces 51
5. Basic Text Processing 52
Tokenization 54
Stop Word Removal 55
Stemming and Lemmatization 55
Text Statistics 56
Language Models 59
Other Text Processing 60
Part of Speech Tagging 60
Collocation Identification 60
Syntactic Parsing 61
Named Entity Tagging 61
Word Sense Disambiguation 61
Software for Text Processing 61
6. Supervised Learning 62
Feature Representation and Weighting 65
Feature Weighting 65
Supervised Learning Algorithms 66
Decision Trees 67
Instance-Based Learning 68
Support Vector Machines 69
Evaluation of Supervised Learning 71
Software for Supervised Learning 71
PART III • TEXT ANALYSIS METHODS FROM
THE HUMANITIES AND SOCIAL SCIENCES
7. Thematic Analysis, Qualitative
Data Analysis Software, and Visualization 74
Thematic Analysis 75
Qualitative Data Analysis Software 77
Visualization Tools 83
Word Clouds 84
Word Trees and Phrase Nets 84
Matrices and Maps 85
Key Word in Context 86
Software for Thematic Analysis, Qualitative Data Analysis,
and Visualization 86
8. Narrative Analysis 88
Conceptual Foundations 90
Structural Approaches to Narrative 90
Functionalist Approaches to Narrative 91
Sociological Approaches to Narrative 92
Mixed Methods of Narrative Analysis 92
Automated Methods of Narrative Analysis 93
Future Directions 93
Software for Narrative Analysis 94
9. Metaphor Analysis 96
Theoretical Foundations 98
Qualitative Metaphor Analysis 99
Anthropology 99
Educational Research 99
Political Science 100
Psychology 100
Sociology 101
Mixed Methods of Metaphor Analysis 101
Management Research 101
Psychology 102
Sociology 102
Automated Metaphor Identification Methods 103
Software for Metaphor Analysis 103
PART IV • TEXT MINING METHODS
FROM COMPUTER SCIENCE 105
10. Word and Text Relatedness 106
Theoretical Foundations 107
Corpus-Based and Knowledge-Based Measures of Relatedness 108
Corpus-Based Measures of Word Relatedness 108
Knowledge-Based Measures of Word Relatedness 110
Measures of Text Relatedness 112
Software and Data Sets for Word and Text Relatedness 114
11. Text Classification 116
A Brief History of Text Classification 118
Applications of Text Classification 119
Topic Classification 119
E-Mail Spam Detection 120
Sentiment Analysis/Opinion Mining 120
Gender Classification 120
Deception Detection 122
Other Applications 122
Representing Texts for Supervised Text Classification 122
Feature Weighting and Selection 123
Text Classification Algorithms 124
Naive Bayes 124
Rocchio Classifier 125
Bootstrapping in Text Classification 126
Evaluation of Text Classification 127
Software and Data Sets for Text Classification 127
12. Information Extraction 130
Entity Extraction 132
Relation Extraction 133
Web Information Extraction 134
Template Filling 135
Software and Data Sets for Information Extraction and Text Mining 135
13. Information Retrieval 136
Theoretical Foundations 138
Components of an Information Retrieval System 138
Information Retrieval Models 140
The Vector Space Model 142
Evaluation of Information Retrieval Models 144
Web-Based Information Retrieval 145
Software and Data Sets for Information Retrieval 147
14. Sentiment Analysis 148
Theoretical Foundations 150
Lexicons 151
Corpora 152
Tools 153
Software and Data Sets for Sentiment Analysis 154
15. Topic Models 156
Digital Humanities 160
Political Science 160
Sociology 161
Software for Topic Modeling 161
PART V • CONCLUSIONS 163
16. Text Mining, Text Analysis,
and the Future of Social Science 164
Social and Computer Science Collaboration 166
References 168
Index 183
Online communities generate massive volumes of
natural language data, and the social sciences continue
to learn how to best make use of this new information
and the technology available for analyzing it. Text Mining
brings together a broad range of contemporary
qualitative and quantitative methods to provide strategic
and practical guidance on analyzing large text collections.
This accessible book, written by a sociologist and a
computer scientist, surveys the fast-changing landscape
of data sources, programming languages, software
packages, and methods of analysis available today.
Suitable for novice and experienced researchers alike,
this book helps readers use text mining techniques
more efficiently and productively.
|
any_adam_object | 1 |
author | Ignatow, Gabe ca. 20./21. Jh Mihalcea, Rada F. 1974- |
author_GND | (DE-588)1106461576 (DE-588)101335429X |
author_facet | Ignatow, Gabe ca. 20./21. Jh Mihalcea, Rada F. 1974- |
author_role | aut aut |
author_sort | Ignatow, Gabe ca. 20./21. Jh |
author_variant | g i gi r f m rf rfm |
building | Verbundindex |
bvnumber | BV043662021 |
callnumber-first | H - Social Science |
callnumber-label | H61 |
callnumber-raw | H61.3 |
callnumber-search | H61.3 |
callnumber-sort | H 261.3 |
callnumber-subject | H - Social Science |
classification_rvk | MR 2800 ST 306 ST 515 |
ctrlnum | (OCoLC)946605149 (DE-599)BSZ473668610 |
dewey-full | 300.721 |
dewey-hundreds | 300 - Social sciences |
dewey-ones | 300 - Social sciences |
dewey-raw | 300.721 |
dewey-search | 300.721 |
dewey-sort | 3300.721 |
dewey-tens | 300 - Social sciences |
discipline | Computer science, Sociology |
format | Book |
id | DE-604.BV043662021 |
illustrated | Illustrated |
indexdate | 2024-12-20T17:41:55Z |
institution | BVB |
isbn | 9781483369341 148336934X |
language | English |
lccn | 2015044977 |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-029075357 |
oclc_num | 946605149 |
open_access_boolean | |
owner | DE-20 DE-739 DE-473 DE-BY-UBG DE-824 DE-384 DE-355 DE-BY-UBR DE-M347 DE-634 DE-523 DE-19 DE-BY-UBM DE-11 DE-N2 DE-92 DE-29 |
owner_facet | DE-20 DE-739 DE-473 DE-BY-UBG DE-824 DE-384 DE-355 DE-BY-UBR DE-M347 DE-634 DE-523 DE-19 DE-BY-UBM DE-11 DE-N2 DE-92 DE-29 |
physical | xvi, 188 Seiten Illustrationen, Diagramme |
publishDate | 2017 |
publishDateSearch | 2017 |
publishDateSort | 2017 |
publisher | SAGE |
record_format | marc |
spellingShingle | Ignatow, Gabe ca. 20./21. Jh Mihalcea, Rada F. 1974- Text mining a guidebook for the social sciences Datenverarbeitung Sozialwissenschaften Social sciences / Research / Methodology Discourse analysis / Data processing Communication / Network analysis Natural language processing (Computer science) Data mining Text Mining (DE-588)4728093-1 gnd Sozialwissenschaften (DE-588)4055916-6 gnd |
subject_GND | (DE-588)4728093-1 (DE-588)4055916-6 |
title | Text mining a guidebook for the social sciences |
title_auth | Text mining a guidebook for the social sciences |
title_exact_search | Text mining a guidebook for the social sciences |
title_full | Text mining a guidebook for the social sciences Gabe Ignatow (University of North Texas) Rada Mihalcea (University of Michigan) |
title_fullStr | Text mining a guidebook for the social sciences Gabe Ignatow (University of North Texas) Rada Mihalcea (University of Michigan) |
title_full_unstemmed | Text mining a guidebook for the social sciences Gabe Ignatow (University of North Texas) Rada Mihalcea (University of Michigan) |
title_short | Text mining |
title_sort | text mining a guidebook for the social sciences |
title_sub | a guidebook for the social sciences |
topic | Datenverarbeitung Sozialwissenschaften Social sciences / Research / Methodology Discourse analysis / Data processing Communication / Network analysis Natural language processing (Computer science) Data mining Text Mining (DE-588)4728093-1 gnd Sozialwissenschaften (DE-588)4055916-6 gnd |
topic_facet | Datenverarbeitung Sozialwissenschaften Social sciences / Research / Methodology Discourse analysis / Data processing Communication / Network analysis Natural language processing (Computer science) Data mining Text Mining |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075357&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075357&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT ignatowgabe textminingaguidebookforthesocialsciences AT mihalcearadaf textminingaguidebookforthesocialsciences |