Laboratory experiments in information retrieval: sample sizes, effect sizes, and statistical power
Contributor: | Sakai, Tetsuya (author) |
---|---|
Format: | Book |
Language: | English |
Published: | Singapore: Springer, [2018] |
Series: | The information retrieval series, volume 40 |
Subjects: | Information Retrieval |
Links: | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
Extent: | ix, 150 pages, diagrams |
ISBN: | 9789811311987 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV045295985 | ||
003 | DE-604 | ||
005 | 20181206 | ||
007 | t| | ||
008 | 181119s2018 xx |||| |||| 00||| eng d | ||
020 | |a 9789811311987 |9 978-981-1311-98-7 | ||
035 | |a (OCoLC)1054367518 | ||
035 | |a (DE-599)BVBBV045295985 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-355 |a DE-12 |a DE-11 | ||
084 | |a ST 270 |0 (DE-625)143638: |2 rvk | ||
100 | 1 | |a Sakai, Tetsuya |e Verfasser |0 (DE-588)1014956080 |4 aut | |
245 | 1 | 0 | |a Laboratory experiments in information retrieval |b sample sizes, effect sizes, and statistical power |c Tetsuya Sakai |
264 | 1 | |a Singapore |b Springer |c [2018] | |
300 | |a ix, 150 Seiten |b Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a The information retrieval series |v volume 40 | |
650 | 0 | 7 | |a Information Retrieval |0 (DE-588)4072803-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Information Retrieval |0 (DE-588)4072803-1 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-981-131-199-4 |
830 | 0 | |a The information retrieval series |v volume 40 |w (DE-604)BV024629453 |9 40 | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
942 | 1 | 1 | |c 025.04 |e 22/bsb |
942 | 1 | 1 | |c 004 |e 22/bsb |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-030683241 |
Record in search index
_version_ | 1819382831646769152 |
---|---|
adam_text | Contents
1 Preliminaries .... 1
1.1 Principles of Significance Testing .... 2
1.1.1 Sample Means and Population Means .... 2
1.1.2 Hypotheses, Test Statistics, and P-Values .... 3
1.1.3 α, β, and Statistical Power .... 5
1.2 Well-Known Probability Distributions .... 6
1.2.1 Law of Large Numbers .... 6
1.2.2 Normal Distribution and the Central Limit Theorem .... 8
1.2.3 χ² Distribution .... 11
1.2.4 t Distribution .... 13
1.2.5 F Distribution .... 16
1.3 Less Well-Known Probability Distributions .... 17
1.3.1 Noncentral t Distribution .... 17
1.3.2 Noncentral χ² Distribution .... 22
1.3.3 Noncentral F Distributions .... 22
References .... 24
2 t-Tests .... 27
2.1 Introduction .... 27
2.2 Paired t-Test .... 29
2.3 Two-Sample t-Test .... 30
2.4 Welch’s Two-Sample t-Test .... 31
2.5 Which Two-Sample t-Test? .... 32
2.6 Conducting a t-Test with Excel .... 33
2.7 Conducting a t-Test with R .... 36
2.8 Confidence Intervals for the Difference in Population Means .... 39
References .... 41
3 Analysis of Variance .... 43
3.1 One-Way ANOVA .... 44
3.1.1 One-Way ANOVA with Equal Group Sizes .... 44
3.1.2 One-Way ANOVA with Unequal Group Sizes .... 47
3.2 Two-Way ANOVA Without Replication .... 49
3.3 Two-Way ANOVA with Replication .... 51
3.4 Conducting an ANOVA with Excel .... 53
3.5 Conducting an ANOVA with R .... 54
3.6 Confidence Intervals for System Means .... 56
References .... 57
4 Multiple Comparison Procedures .... 59
4.1 Introduction .... 59
4.2 Familywise Error Rate .... 60
4.3 Bonferroni Correction .... 61
4.3.1 Principles and Limitations of the Bonferroni Correction .... 61
4.3.2 Bonferroni Correction with R .... 62
4.4 Tukey HSD Test .... 67
4.4.1 Tukey HSD with Unequal Group Sizes .... 68
4.4.2 Tukey HSD with Equal Group Sizes .... 69
4.4.3 Tukey HSD with Paired Observations .... 70
4.4.4 Simultaneous Confidence Intervals .... 71
4.4.5 Tukey HSD with R .... 71
4.5 Randomisation Test and Its Tukey HSD Version .... 73
4.5.1 Randomisation Test for Two Systems .... 74
4.5.2 Randomised Tukey HSD Test .... 77
References .... 80
5 The Correct Ways to Use Significance Tests .... 81
5.1 Limitations of Significance Tests .... 81
5.1.1 Criticisms from the Literature .... 81
5.1.2 Three Problems, Among Others .... 83
5.2 Effect Sizes .... 85
5.2.1 Effect Sizes for t-Tests .... 85
5.2.2 Effect Sizes for Tukey HSD and RTHSD .... 88
5.2.3 Effect Sizes for ANOVA .... 89
5.3 How to Report Your Results .... 92
5.3.1 Comparing Two Systems .... 93
5.3.2 Comparing More Than Two Systems .... 95
References .... 97
6 Topic Set Size Design Using Excel .... 99
6.1 Overview of Topic Set Size Design .... 99
6.2 Topic Set Size Design with the Paired t-Test .... 102
6.2.1 How to Use the Paired t-Test-Based Tool .... 102
6.2.2 How the Paired t-Test-Based Topic Set Size Design Works .... 103
6.3 Topic Set Size Design with the Two-Sample t-Test .... 108
6.3.1 How to Use the Two-Sample t-Test-Based Tool .... 108
6.3.2 How the Two-Sample t-Test-Based Topic Set Size Design Works .... 109
6.4 Topic Set Size Design with One-Way ANOVA .... 111
6.4.1 How to Use the ANOVA-Based Tool .... 111
6.4.2 How the ANOVA-Based Topic Set Size Design Works .... 112
6.5 Topic Set Size Design with Confidence Intervals for Paired Data .... 115
6.5.1 How to Use the Paired-Data CI-Based Tool .... 115
6.5.2 How the Paired-Data CI-Based Tool Works .... 116
6.6 Topic Set Size Design with Confidence Intervals for Unpaired Data .... 118
6.6.1 How to Use the Two-Sample CI-Based Tool .... 118
6.6.2 How the Two-Sample CI-Based Tool Works .... 119
6.7 Estimating Population Variances .... 120
6.8 Comparing the Different Topic Set Size Design Methods .... 122
6.8.1 Paired and Two-Sample t-Tests vs. One-Way ANOVA .... 124
6.8.2 CI-Based Topic Set Size Design: Paired vs. Unpaired Data .... 128
6.8.3 One-Way ANOVA vs. Confidence Intervals .... 129
References .... 131
7 Power Analysis Using R .... 133
7.1 Introduction .... 133
7.2 Overview of the R Scripts for Power Analysis .... 134
7.3 Power Analysis with the Paired t-Test .... 134
7.4 Power Analysis with the Two-Sample t-Test .... 136
7.5 Power Analysis with One-Way ANOVA .... 137
7.6 Power Analysis with Two-Way ANOVA Without Replication .... 139
7.7 Power Analysis with Two-Way ANOVA with Replication .... 140
7.8 Summary .... 143
References .... 145
8 Conclusions .... 147
8.1 A Quick Summary of the Book .... 147
8.2 Statistical Reform in IR? .... 148
References .... 148
Index .... 149
The Information Retrieval Series
ChengXiang Zhai • Maarten de Rijke, Series Editors
Tetsuya Sakai
Laboratory Experiments in Information Retrieval
Sample Sizes, Effect Sizes, and Statistical Power
Covering aspects from principles and limitations of statistical significance tests to topic
set size design and power analysis, this book guides readers to statistically well-designed
experiments. Although classical statistical significance tests are to some extent useful
in information retrieval (IR) evaluation, they can harm research unless they are used
appropriately with the right sample sizes and statistical power and unless the test results
are reported properly. The first half of the book is mainly targeted at undergraduate
students, and the second half is suitable for graduate students and researchers who
regularly conduct laboratory experiments in IR, natural language processing,
recommendations, and related fields.
Chapters 1-5 review parametric significance tests for comparing system means, namely,
t-tests and ANOVAs, and show how easily they can be conducted using Microsoft
Excel or R. These chapters also discuss a few multiple comparison procedures for
researchers who are interested in comparing every system pair, including a
randomised version of Tukey’s Honestly Significant Difference test. The chapters then
deal with known limitations of classical significance testing and provide practical
guidelines for reporting research results regarding comparison of means.
Chapters 6 and 7 discuss statistical power. Chapter 6 introduces topic set size design to
enable test collection builders to determine an appropriate number of topics to create.
Readers can easily use the author’s Excel tools for topic set size design based on the
paired and two-sample t-tests, one-way ANOVA, and confidence intervals. Chapter 7
describes power-analysis-based methods for determining an appropriate sample size
for a new experiment based on a similar experiment done in the past, detailing how
to utilize the author’s R tools for power analysis and how to interpret the results. Case
studies from IR for both Excel-based topic set size design and R-based power analysis
are also provided.
|
any_adam_object | 1 |
author | Sakai, Tetsuya |
author_GND | (DE-588)1014956080 |
author_facet | Sakai, Tetsuya |
author_role | aut |
author_sort | Sakai, Tetsuya |
author_variant | t s ts |
building | Verbundindex |
bvnumber | BV045295985 |
classification_rvk | ST 270 |
ctrlnum | (OCoLC)1054367518 (DE-599)BVBBV045295985 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01848nam a2200385 cb4500</leader><controlfield tag="001">BV045295985</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20181206 </controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">181119s2018 xx |||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789811311987</subfield><subfield code="9">978-981-1311-98-7</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1054367518</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV045295985</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-355</subfield><subfield code="a">DE-12</subfield><subfield code="a">DE-11</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 270</subfield><subfield code="0">(DE-625)143638:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Sakai, Tetsuya</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1014956080</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Laboratory experiments in information retrieval</subfield><subfield code="b">sample sizes, effect sizes, and statistical power</subfield><subfield code="c">Tetsuya Sakai</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Singapore</subfield><subfield code="b">Springer</subfield><subfield code="c">[2018]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield 
code="a">ix, 150 Seiten</subfield><subfield code="b">Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">The information retrieval series</subfield><subfield code="v">volume 40</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Information Retrieval</subfield><subfield code="0">(DE-588)4072803-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Information Retrieval</subfield><subfield code="0">(DE-588)4072803-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-981-131-199-4</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">The information retrieval series</subfield><subfield code="v">volume 40</subfield><subfield code="w">(DE-604)BV024629453</subfield><subfield code="9">40</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield 
code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="942" ind1="1" ind2="1"><subfield code="c">025.04</subfield><subfield code="e">22/bsb</subfield></datafield><datafield tag="942" ind1="1" ind2="1"><subfield code="c">004</subfield><subfield code="e">22/bsb</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-030683241</subfield></datafield></record></collection> |
id | DE-604.BV045295985 |
illustrated | Not Illustrated |
indexdate | 2024-12-20T18:23:17Z |
institution | BVB |
isbn | 9789811311987 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030683241 |
oclc_num | 1054367518 |
open_access_boolean | |
owner | DE-355 DE-BY-UBR DE-12 DE-11 |
owner_facet | DE-355 DE-BY-UBR DE-12 DE-11 |
physical | ix, 150 Seiten Diagramme |
publishDate | 2018 |
publishDateSearch | 2018 |
publishDateSort | 2018 |
publisher | Springer |
record_format | marc |
series | The information retrieval series |
series2 | The information retrieval series |
spellingShingle | Sakai, Tetsuya Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power The information retrieval series Information Retrieval (DE-588)4072803-1 gnd |
subject_GND | (DE-588)4072803-1 |
title | Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power |
title_auth | Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power |
title_exact_search | Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power |
title_full | Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power Tetsuya Sakai |
title_fullStr | Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power Tetsuya Sakai |
title_full_unstemmed | Laboratory experiments in information retrieval sample sizes, effect sizes, and statistical power Tetsuya Sakai |
title_short | Laboratory experiments in information retrieval |
title_sort | laboratory experiments in information retrieval sample sizes effect sizes and statistical power |
title_sub | sample sizes, effect sizes, and statistical power |
topic | Information Retrieval (DE-588)4072803-1 gnd |
topic_facet | Information Retrieval |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030683241&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV024629453 |
work_keys_str_mv | AT sakaitetsuya laboratoryexperimentsininformationretrievalsamplesizeseffectsizesandstatisticalpower |