
In statistical hypothesis testing,[1][2] a result has statistical significance when a result at least as "extreme" would be very infrequent if the null hypothesis were true.[3] More precisely, a study's defined significance level, denoted by α, is the probability of the study rejecting the null hypothesis, given that the null hypothesis is true;[4] and the p-value of a result, p, is the probability of obtaining a result at least as extreme, given that the null hypothesis is true.[5] The result is said to be statistically significant, by the standards of the study, when p ≤ α.[6][7][8][9][10][11][12] The significance level for a study is chosen before data collection, and is typically set to 5%[13] or much lower, depending on the field of study.[14]

In any experiment or observation that involves drawing a sample from a population, there is always the possibility that an observed effect would have occurred due to sampling error alone.[15][16] But if the p-value of an observed effect is less than (or equal to) the significance level, an investigator may conclude that the effect reflects the characteristics of the whole population,[1] thereby rejecting the null hypothesis.[17]

This technique for testing the statistical significance of results was developed in the early 20th century. The term significance does not imply importance here, and the term statistical significance is not the same as research significance, theoretical significance, or practical significance.[1][2][18][19] For example, the term clinical significance refers to the practical importance of a treatment effect.[20]

History


Statistical significance dates to the 18th century, in the work of John Arbuthnot and Pierre-Simon Laplace, who computed the p-value for the human sex ratio at birth, assuming a null hypothesis of equal probability of male and female births; see p-value § History for details.[21][22][23][24][25][26][27]

In 1925, Ronald Fisher advanced the idea of statistical hypothesis testing, which he called "tests of significance", in his publication Statistical Methods for Research Workers.[28][29][30] Fisher suggested a probability of one in twenty (0.05) as a convenient cutoff level to reject the null hypothesis.[31] In a 1933 paper, Jerzy Neyman and Egon Pearson called this cutoff the significance level, which they named α. They recommended that α be set ahead of time, prior to any data collection.[31][32]

Despite his initial suggestion of 0.05 as a significance level, Fisher did not intend this cutoff value to be fixed. In his 1956 publication Statistical Methods and Scientific Inference, he recommended that significance levels be set according to specific circumstances.[31]

Related concepts

The significance level α is the threshold for the p-value below which the null hypothesis is rejected, on the reasoning that a result this extreme would be unlikely if the null hypothesis were true. This means that α is also the probability of mistakenly rejecting the null hypothesis, if the null hypothesis is true.[4] Such a mistaken rejection is called a false positive or a type I error.
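As a rough illustration of α as the type I error rate, the following sketch simulates many experiments in which the null hypothesis is true by construction and counts how often a two-sided z-test rejects it. The function name, sample size, trial count, and seed are illustrative assumptions rather than anything taken from the cited sources.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Fraction of simulated true-null experiments that a z-test rejects."""
    rng = random.Random(seed)
    # Two-sided critical value: reject when |z| is at least this large.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (mean 0, known sigma 1) holds for every sample.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * sqrt(n)
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"observed false-positive rate: {rate:.3f}")
```

Under these assumptions the observed rate should land close to the chosen α, with some simulation noise.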

Sometimes researchers talk about the confidence level γ = 1 − α instead. This is the probability of not rejecting the null hypothesis given that it is true.[33][34] Confidence levels and confidence intervals were introduced by Neyman in 1937.[35]
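The link between a confidence level of 1 − α and a test at level α can be sketched for the simple case of a mean with known standard deviation: the null value falls outside the interval exactly when the two-sided p-value is below α. The numbers and function names below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """Interval with confidence level 1 - alpha for a mean, sigma known."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def two_sided_p_value(sample_mean, null_mean, sigma, n):
    """Two-sided z-test p-value for H0: population mean equals null_mean."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical study: n = 100, sample mean 103, sigma 15, null mean 100.
low, high = z_confidence_interval(103.0, 15.0, 100)
p = two_sided_p_value(103.0, 100.0, 15.0, 100)
null_outside_interval = not (low <= 100.0 <= high)
print(f"95% CI = ({low:.2f}, {high:.2f}), p = {p:.4f}, "
      f"null outside CI: {null_outside_interval}")
```

This equivalence is specific to matching the interval and the test (same statistic, same α); it is not a general statement about every interval construction.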

Role in statistical hypothesis testing

In a two-tailed test, the rejection region for a significance level of α = 0.05 is partitioned to both ends of the sampling distribution and makes up 5% of the area under the curve (white areas).

Statistical significance plays a pivotal role in statistical hypothesis testing. It is used to determine whether the null hypothesis should be rejected or retained. The null hypothesis is the hypothesis that no effect exists in the phenomenon being studied.[36] For the null hypothesis to be rejected, an observed result has to be statistically significant, i.e. the observed p-value is less than the pre-specified significance level α.

To determine whether a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect of the same magnitude or more extreme given that the null hypothesis is true.[5][12] The null hypothesis is rejected if the p-value is less than (or equal to) a predetermined level, α. α is also called the significance level, and is the probability of rejecting the null hypothesis given that it is true (a type I error). It is usually set at or below 5%.
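A minimal sketch of this procedure, assuming a coin-flip style experiment with a null hypothesis of success probability 0.5: the exact two-sided p-value sums the probabilities of all outcomes at least as far from the expected count as the observed one. The function name and the example counts are illustrative.

```python
from math import comb

def binomial_two_sided_p(successes, n):
    """Exact two-sided p-value for H0: success probability = 0.5."""
    distance = abs(successes - n / 2)
    # Count all outcomes at least as far from n/2 as the observed count.
    extreme = sum(comb(n, k) for k in range(n + 1)
                  if abs(k - n / 2) >= distance)
    return extreme / 2 ** n

alpha = 0.05
p_value = binomial_two_sided_p(successes=60, n=100)
reject = p_value <= alpha
print(f"p = {p_value:.4f}, reject H0 at alpha = {alpha}: {reject}")
```

With 60 successes in 100 trials the p-value comes out slightly above 0.05, so the result would not count as statistically significant at the 5% level in this sketch.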

For example, when α is set to 5%, the conditional probability of a type I error, given that the null hypothesis is true, is 5%,[37] and a statistically significant result is one where the observed p-value is less than (or equal to) 5%.[38] When drawing data from a sample, this means that the rejection region comprises 5% of the sampling distribution.[39] These 5% can be allocated to one side of the sampling distribution, as in a one-tailed test, or partitioned to both sides of the distribution, as in a two-tailed test, with each tail (or rejection region) containing 2.5% of the distribution.
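For a test statistic that follows a standard normal distribution under the null hypothesis (an assumption made here for illustration), the boundaries of these rejection regions can be computed directly from the inverse cumulative distribution function:

```python
from statistics import NormalDist

def critical_values(alpha=0.05):
    """Return (one_tailed, two_tailed) critical z-values for level alpha."""
    std_normal = NormalDist()
    # One-tailed: all of alpha sits in the upper tail.
    one_tailed = std_normal.inv_cdf(1 - alpha)
    # Two-tailed: alpha / 2 sits in each tail.
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)
    return one_tailed, two_tailed

one_tailed, two_tailed = critical_values(0.05)
print(f"one-tailed: z > {one_tailed:.3f}; two-tailed: |z| > {two_tailed:.3f}")
```

The one-tailed boundary is less extreme than the two-tailed one because the whole 5% is placed on a single side.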

The use of a one-tailed test is dependent on whether the research question or alternative hypothesis specifies a direction such as whether a group of objects is heavier or the performance of students on an assessment is better.[3] A two-tailed test may still be used but it will be less powerful than a one-tailed test, because the rejection region for a one-tailed test is concentrated on one end of the null distribution and is twice the size (5% vs. 2.5%) of each rejection region for a two-tailed test. As a result, the null hypothesis can be rejected with a less extreme result if a one-tailed test was used.[40] The one-tailed test is only more powerful than a two-tailed test if the specified direction of the alternative hypothesis is correct. If it is wrong, however, then the one-tailed test has no power.
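The power comparison can be sketched for a z-test, assuming the test statistic is normal with unit variance and mean equal to a standardized true effect `delta` (a hypothetical parameter). A negative `delta` stands for an effect in the direction opposite to the one the one-tailed test specifies.

```python
from statistics import NormalDist

STD = NormalDist()

def power_one_tailed(delta, alpha=0.05):
    """Power of an upper-tailed z-test for standardized true effect delta."""
    return 1 - STD.cdf(STD.inv_cdf(1 - alpha) - delta)

def power_two_tailed(delta, alpha=0.05):
    """Power of a two-tailed z-test for standardized true effect delta."""
    z_crit = STD.inv_cdf(1 - alpha / 2)
    # Rejections can occur in either tail.
    return (1 - STD.cdf(z_crit - delta)) + STD.cdf(-z_crit - delta)

right_direction = power_one_tailed(2.5)
two_sided = power_two_tailed(2.5)
wrong_direction = power_one_tailed(-2.5)
print(f"one-tailed, correct direction: {right_direction:.3f}")
print(f"two-tailed:                    {two_sided:.3f}")
print(f"one-tailed, wrong direction:   {wrong_direction:.6f}")
```

In this sketch the one-tailed test has the higher power when the direction is right, and power that is practically zero when the direction is wrong.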

Significance thresholds in specific fields


In specific fields such as particle physics and manufacturing, statistical significance is often expressed in multiples of the standard deviation or sigma (σ) of a normal distribution, with significance thresholds set at a much stricter level (for example 5σ).[41][42] For instance, the certainty of the Higgs boson particle's existence was based on the 5σ criterion, which corresponds to a p-value of about 1 in 3.5 million.[42][43]
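The correspondence between a sigma threshold and a p-value can be checked with a short calculation. The sketch below assumes a one-sided tail of the standard normal distribution, which is the convention that reproduces the "1 in 3.5 million" figure; the function name is illustrative.

```python
from math import erfc, sqrt

def one_sided_p_from_sigma(k):
    """Upper-tail probability of a standard normal beyond k standard deviations."""
    # erfc keeps precision for very small tail probabilities.
    return 0.5 * erfc(k / sqrt(2))

p_five_sigma = one_sided_p_from_sigma(5.0)
print(f"5 sigma: p = {p_five_sigma:.3e}, about 1 in {1 / p_five_sigma:,.0f}")
```

A two-sided convention would double the tail probability, so the stated odds depend on which convention a field uses.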

In other fields of scientific research such as genome-wide association studies, significance levels as low as 5×10⁻⁸ are not uncommon,[44][45] as the number of tests performed is extremely large.
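One common rationale for a threshold of this size, assumed here for illustration, is a Bonferroni correction: dividing an overall 5% error rate by roughly one million tests. The sketch below also estimates the chance of at least one false positive when every null hypothesis is true and the tests are independent; both function names are hypothetical.

```python
from math import expm1, log1p

def bonferroni_threshold(family_alpha, num_tests):
    """Per-test significance level under a Bonferroni correction."""
    return family_alpha / num_tests

def familywise_error_rate(per_test_alpha, num_tests):
    """P(at least one false positive) for independent tests, all nulls true."""
    # Computes 1 - (1 - alpha)^m in a numerically stable way.
    return -expm1(num_tests * log1p(-per_test_alpha))

num_tests = 1_000_000  # assumed number of independent tests
threshold = bonferroni_threshold(0.05, num_tests)
uncorrected = familywise_error_rate(0.05, num_tests)
corrected = familywise_error_rate(threshold, num_tests)
print(f"per-test threshold: {threshold:.1e}")
print(f"family-wise error rate, uncorrected: {uncorrected:.3f}")
print(f"family-wise error rate, corrected:   {corrected:.3f}")
```

Real genetic markers are correlated, so the independence assumption is a simplification.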

Limitations


Researchers focusing solely on whether their results are statistically significant might report findings that are not substantive[46] and not replicable.[47][48] There is also a difference between statistical significance and practical significance. A study that is found to be statistically significant may not necessarily be practically significant.[49][19]

Effect size


Effect size is a measure of a study's practical significance.[49] A statistically significant result may have a weak effect. To gauge the research significance of their result, researchers are encouraged to always report an effect size along with p-values. An effect size measure quantifies the strength of an effect, such as the distance between two means in units of standard deviation (cf. Cohen's d), the correlation coefficient between two variables or its square, and other measures.[50]
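The gap between statistical and practical significance can be illustrated with summary statistics from two hypothetical groups. The sketch computes Cohen's d with a pooled standard deviation and, as a large-sample approximation, a two-sided z-test p-value; all numbers and function names are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def pooled_sd(sd1, sd2, n1, n2):
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Difference between two means in units of the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(sd1, sd2, n1, n2)

def approx_two_sided_p(mean1, mean2, sd1, sd2, n1, n2):
    """Large-sample normal approximation to the two-sample test p-value."""
    se = pooled_sd(sd1, sd2, n1, n2) * sqrt(1 / n1 + 1 / n2)
    z = (mean1 - mean2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical groups: a half-point difference on a scale with SD 15.
args = (100.5, 100.0, 15.0, 15.0, 10_000, 10_000)
d = cohens_d(*args)
p = approx_two_sided_p(*args)
print(f"Cohen's d = {d:.3f}, p = {p:.4f}")
```

With ten thousand observations per group, the difference passes the 5% threshold even though the standardized effect is very small.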

Reproducibility


A statistically significant result may not be easy to reproduce.[48] In particular, some statistically significant results will in fact be false positives. Each failed attempt to reproduce a result increases the likelihood that the result was a false positive.[51]
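How failed replications shift the odds can be sketched with Bayes' rule. The calculation below is a simplified model, not the method of the cited source: it assumes a prior probability that the effect is real, a fixed power for every study, and independent studies.

```python
def false_positive_probability(prior_real, power, alpha, failed_replications=0):
    """P(no real effect | one significant result, then k non-significant ones)."""
    # Weight of the "null is true" explanation: a false positive, then
    # correct non-rejections in each replication.
    weight_null = (1 - prior_real) * alpha * (1 - alpha) ** failed_replications
    # Weight of the "effect is real" explanation: a true positive, then
    # a missed detection in each replication.
    weight_real = prior_real * power * (1 - power) ** failed_replications
    return weight_null / (weight_null + weight_real)

# Assumed values: 50% prior, 80% power, alpha = 0.05.
probabilities = [false_positive_probability(0.5, 0.8, 0.05, k) for k in range(3)]
for k, prob in enumerate(probabilities):
    print(f"{k} failed replications: P(false positive) = {prob:.3f}")
```

The specific numbers depend heavily on the assumed prior and power; only the upward trend is the point of the example.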

Challenges


Overuse in some journals


Starting in the 2010s, some journals began questioning whether significance testing, and particularly using a threshold of α=5%, was being relied on too heavily as the primary measure of validity of a hypothesis.[52] Some journals encouraged authors to do more detailed analysis than just a statistical significance test. In social psychology, the journal Basic and Applied Social Psychology banned the use of significance testing altogether from papers it published,[53] requiring authors to use other measures to evaluate hypotheses and impact.[54][55]

Other editors, commenting on this ban, have noted: "Banning the reporting of p-values, as Basic and Applied Social Psychology recently did, is not going to solve the problem because it is merely treating a symptom of the problem. There is nothing wrong with hypothesis testing and p-values per se as long as authors, reviewers, and action editors use them correctly."[56] Some statisticians prefer to use alternative measures of evidence, such as likelihood ratios or Bayes factors.[57] Using Bayesian statistics can avoid confidence levels, but also requires making additional assumptions,[57] and may not necessarily improve practice regarding statistical testing.[58]

The widespread abuse of statistical significance represents an important topic of research in metascience.[59]

Redefining significance


In 2016, the American Statistical Association (ASA) published a statement on p-values, saying that "the widespread use of 'statistical significance' (generally interpreted as 'p ≤ 0.05') as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process".[57] In 2017, a group of 72 authors proposed to enhance reproducibility by changing the p-value threshold for statistical significance from 0.05 to 0.005.[60] Other researchers responded that imposing a more stringent significance threshold would aggravate problems such as data dredging; alternative propositions are thus to select and justify flexible p-value thresholds before collecting data,[61] or to interpret p-values as continuous indices, thereby discarding thresholds and statistical significance.[62] Additionally, the change to 0.005 would increase the likelihood of false negatives, whereby the effect being studied is real, but the test fails to show it.[63]
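The false-negative concern can be made concrete with a power calculation for a two-sided z-test. The sketch assumes a study whose standardized true effect (`delta`, a hypothetical value) gives about 80% power at α = 0.05, and shows what happens to that same study at α = 0.005 with no increase in sample size.

```python
from statistics import NormalDist

STD = NormalDist()

def power_two_sided(delta, alpha):
    """Power of a two-sided z-test for standardized true effect delta."""
    z_crit = STD.inv_cdf(1 - alpha / 2)
    return (1 - STD.cdf(z_crit - delta)) + STD.cdf(-z_crit - delta)

delta = 2.8  # assumed effect, chosen to give roughly 80% power at alpha = 0.05
power_at_05 = power_two_sided(delta, 0.05)
power_at_005 = power_two_sided(delta, 0.005)
print(f"alpha = 0.05:  power = {power_at_05:.2f}, "
      f"false-negative rate = {1 - power_at_05:.2f}")
print(f"alpha = 0.005: power = {power_at_005:.2f}, "
      f"false-negative rate = {1 - power_at_005:.2f}")
```

Under these assumptions the false-negative rate rises from about one in five to about one in two, unless the sample size is increased to compensate.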

In 2019, over 800 statisticians and scientists signed a message calling for the abandonment of the term "statistical significance" in science,[64] and the ASA published a further official statement[65] declaring (page 2):

We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term "statistically significant" entirely. Nor should variants such as "significantly different," "p < 0.05," and "nonsignificant" survive, whether expressed in words, by asterisks in a table, or in some other way.


References

  1. Sirkin, R. Mark (2005). "Two-sample t tests". Statistics for the Social Sciences (3rd ed.). Thousand Oaks, CA: SAGE Publications, Inc. pp. 271–316. ISBN 978-1-4129-0546-6.
  2. Borror, Connie M. (2009). "Statistical decision making". The Certified Quality Engineer Handbook (3rd ed.). Milwaukee, WI: ASQ Quality Press. pp. 418–472. ISBN 978-0-87389-745-7.
  3. Myers, Jerome L.; Well, Arnold D.; Lorch, Robert F. Jr. (2010). "Developing fundamentals of hypothesis testing using the binomial distribution". Research Design and Statistical Analysis (3rd ed.). New York, NY: Routledge. pp. 65–90. ISBN 978-0-8058-6431-1.
  4. Dalgaard, Peter (2008). "Power and the computation of sample size". Introductory Statistics with R. Statistics and Computing. New York: Springer. pp. 155–56. doi:10.1007/978-0-387-79054-1_9. ISBN 978-0-387-79053-4.
  5. "Statistical Hypothesis Testing". www.dartmouth.edu. Archived from the original on 2025-08-14. Retrieved 2025-08-14.
  6. Johnson, Valen E. (October 9, 2013). "Revised standards for statistical evidence". Proceedings of the National Academy of Sciences. 110 (48): 19313–19317. Bibcode:2013PNAS..11019313J. doi:10.1073/pnas.1313476110. PMC 3845140. PMID 24218581.
  7. Redmond, Carol; Colton, Theodore (2001). "Clinical significance versus statistical significance". Biostatistics in Clinical Trials. Wiley Reference Series in Biostatistics (3rd ed.). West Sussex, United Kingdom: John Wiley & Sons Ltd. pp. 35–36. ISBN 978-0-471-82211-0.
  8. Cumming, Geoff (2012). Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. New York, USA: Routledge. pp. 27–28.
  9. Krzywinski, Martin; Altman, Naomi (30 October 2013). "Points of significance: Significance, P values and t-tests". Nature Methods. 10 (11): 1041–1042. doi:10.1038/nmeth.2698. PMID 24344377.
  10. Sham, Pak C.; Purcell, Shaun M (17 April 2014). "Statistical power and significance testing in large-scale genetic studies". Nature Reviews Genetics. 15 (5): 335–346. doi:10.1038/nrg3706. PMID 24739678. S2CID 10961123.
  11. Altman, Douglas G. (1999). Practical Statistics for Medical Research. New York, USA: Chapman & Hall/CRC. p. 167. ISBN 978-0-412-27630-9.
  12. Devore, Jay L. (2011). Probability and Statistics for Engineering and the Sciences (8th ed.). Boston, MA: Cengage Learning. pp. 300–344. ISBN 978-0-538-73352-6.
  13. Craparo, Robert M. (2007). "Significance level". In Salkind, Neil J. (ed.). Encyclopedia of Measurement and Statistics. Vol. 3. Thousand Oaks, CA: SAGE Publications. pp. 889–891. ISBN 978-1-4129-1611-0.
  14. Sproull, Natalie L. (2002). "Hypothesis testing". Handbook of Research Methods: A Guide for Practitioners and Students in the Social Science (2nd ed.). Lanham, MD: Scarecrow Press, Inc. pp. 49–64. ISBN 978-0-8108-4486-5.
  15. Babbie, Earl R. (2013). "The logic of sampling". The Practice of Social Research (13th ed.). Belmont, CA: Cengage Learning. pp. 185–226. ISBN 978-1-133-04979-1.
  16. Faherty, Vincent (2008). "Probability and statistical significance". Compassionate Statistics: Applied Quantitative Analysis for Social Services (With exercises and instructions in SPSS) (1st ed.). Thousand Oaks, CA: SAGE Publications, Inc. pp. 127–138. ISBN 978-1-4129-3982-9.
  17. McKillup, Steve (2006). "Probability helps you make a decision about your results". Statistics Explained: An Introductory Guide for Life Scientists (1st ed.). Cambridge, United Kingdom: Cambridge University Press. pp. 44–56. ISBN 978-0-521-54316-3.
  18. Myers, Jerome L.; Well, Arnold D.; Lorch, Robert F. Jr. (2010). "The t distribution and its applications". Research Design and Statistical Analysis (3rd ed.). New York, NY: Routledge. pp. 124–153. ISBN 978-0-8058-6431-1.
  19. Hooper, Peter. "What is P-value?" (PDF). University of Alberta, Department of Mathematical and Statistical Sciences. Archived from the original (PDF) on March 31, 2020. Retrieved November 10, 2019.
  20. Leung, W.-C. (2001). "Balancing statistical and clinical significance in evaluating treatment effects". Postgraduate Medical Journal. 77 (905): 201–204. doi:10.1136/pmj.77.905.201. ISSN 0032-5473. PMC 1741942. PMID 11222834.
  21. Brian, Éric; Jaisson, Marie (2007). "Physico-Theology and Mathematics (1710–1794)". The Descent of Human Sex Ratio at Birth. Springer Science & Business Media. pp. 1–25. ISBN 978-1-4020-6036-6.
  22. Arbuthnot, John (1710). "An argument for Divine Providence, taken from the constant regularity observed in the births of both sexes" (PDF). Philosophical Transactions of the Royal Society of London. 27 (325–336): 186–190. doi:10.1098/rstl.1710.0011.
  23. Conover, W.J. (1999). "Chapter 3.4: The Sign Test". Practical Nonparametric Statistics (3rd ed.). Wiley. pp. 157–176. ISBN 978-0-471-16068-7.
  24. Sprent, P. (1989). Applied Nonparametric Statistical Methods (2nd ed.). Chapman & Hall. ISBN 978-0-412-44980-2.
  25. Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty Before 1900. Harvard University Press. pp. 225–226. ISBN 978-0-674-40341-3.
  26. Bellhouse, David (2001). "John Arbuthnot". In C.C. Heyde; E. Seneta (eds.). Statisticians of the Centuries. Springer. pp. 39–42. ISBN 978-0-387-95329-8.
  27. Hald, Anders (1998). "Chapter 4. Chance or Design: Tests of Significance". A History of Mathematical Statistics from 1750 to 1930. Wiley. p. 65.
  28. Cumming, Geoff (2011). "From null hypothesis significance to testing effect sizes". Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. Multivariate Applications Series. East Sussex, United Kingdom: Routledge. pp. 21–52. ISBN 978-0-415-87968-2.
  29. Fisher, Ronald A. (1925). Statistical Methods for Research Workers. Edinburgh, UK: Oliver and Boyd. p. 43. ISBN 978-0-05-002170-5.
  30. Poletiek, Fenna H. (2001). "Formal theories of testing". Hypothesis-testing Behaviour. Essays in Cognitive Psychology (1st ed.). East Sussex, United Kingdom: Psychology Press. pp. 29–48. ISBN 978-1-84169-159-6.
  31. Quinn, Geoffrey R.; Keough, Michael J. (2002). Experimental Design and Data Analysis for Biologists (1st ed.). Cambridge, UK: Cambridge University Press. pp. 46–69. ISBN 978-0-521-00976-8.
  32. Neyman, J.; Pearson, E. S. (1933). "The testing of statistical hypotheses in relation to probabilities a priori". Mathematical Proceedings of the Cambridge Philosophical Society. 29 (4): 492–510. Bibcode:1933PCPS...29..492N. doi:10.1017/S030500410001152X. S2CID 119855116.
  33. "Conclusions about statistical significance are possible with the help of the confidence interval. If the confidence interval does not include the value of zero effect, it can be assumed that there is a statistically significant result." Prel, Jean-Baptist du; Hommel, Gerhard; Röhrig, Bernd; Blettner, Maria (2009). "Confidence Interval or P-Value?". Deutsches Ärzteblatt Online. 106 (19): 335–9. doi:10.3238/arztebl.2009.0335. PMC 2689604. PMID 19547734.
  34. StatNews #73: Overlapping Confidence Intervals and Statistical Significance.
  35. Neyman, J. (1937). "Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability". Philosophical Transactions of the Royal Society A. 236 (767): 333–380. Bibcode:1937RSPTA.236..333N. doi:10.1098/rsta.1937.0005. JSTOR 91337. S2CID 19584450.
  36. Meier, Kenneth J.; Brudney, Jeffrey L.; Bohte, John (2011). Applied Statistics for Public and Nonprofit Administration (3rd ed.). Boston, MA: Cengage Learning. pp. 189–209. ISBN 978-1-111-34280-7.
  37. Healy, Joseph F. (2009). The Essentials of Statistics: A Tool for Social Research (2nd ed.). Belmont, CA: Cengage Learning. pp. 177–205. ISBN 978-0-495-60143-2.
  38. McKillup, Steve (2006). Statistics Explained: An Introductory Guide for Life Scientists (1st ed.). Cambridge, UK: Cambridge University Press. pp. 32–38. ISBN 978-0-521-54316-3.
  39. Heath, David (1995). An Introduction To Experimental Design And Statistics For Biology (1st ed.). Boston, MA: CRC press. pp. 123–154. ISBN 978-1-85728-132-3.
  40. Hinton, Perry R. (2010). "Significance, error, and power". Statistics explained (3rd ed.). New York, NY: Routledge. pp. 79–90. ISBN 978-1-84872-312-2.
  41. Vaughan, Simon (2013). Scientific Inference: Learning from Data (1st ed.). Cambridge, UK: Cambridge University Press. pp. 146–152. ISBN 978-1-107-02482-3.
  42. Bracken, Michael B. (2013). Risk, Chance, and Causation: Investigating the Origins and Treatment of Disease (1st ed.). New Haven, CT: Yale University Press. pp. 260–276. ISBN 978-0-300-18884-4.
  43. Franklin, Allan (2013). "Prologue: The rise of the sigmas". Shifting Standards: Experiments in Particle Physics in the Twentieth Century (1st ed.). Pittsburgh, PA: University of Pittsburgh Press. pp. Ii–Iii. ISBN 978-0-8229-4430-0.
  44. Clarke, GM; Anderson, CA; Pettersson, FH; Cardon, LR; Morris, AP; Zondervan, KT (February 6, 2011). "Basic statistical analysis in genetic case-control studies". Nature Protocols. 6 (2): 121–33. doi:10.1038/nprot.2010.182. PMC 3154648. PMID 21293453.
  45. Barsh, GS; Copenhaver, GP; Gibson, G; Williams, SM (July 5, 2012). "Guidelines for Genome-Wide Association Studies". PLOS Genetics. 8 (7): e1002812. doi:10.1371/journal.pgen.1002812. PMC 3390399. PMID 22792080.
  46. Carver, Ronald P. (1978). "The Case Against Statistical Significance Testing". Harvard Educational Review. 48 (3): 378–399. doi:10.17763/haer.48.3.t490261645281841. S2CID 16355113.
  47. Ioannidis, John P. A. (2005). "Why most published research findings are false". PLOS Medicine. 2 (8): e124. doi:10.1371/journal.pmed.0020124. PMC 1182327. PMID 16060722.
  48. Amrhein, Valentin; Korner-Nievergelt, Fränzi; Roth, Tobias (2017). "The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research". PeerJ. 5: e3544. doi:10.7717/peerj.3544. PMC 5502092. PMID 28698825.
  49. Hojat, Mohammadreza; Xu, Gang (2004). "A Visitor's Guide to Effect Sizes". Advances in Health Sciences Education. 9 (3): 241–9. doi:10.1023/B:AHSE.0000038173.00909.f6. PMID 15316274. S2CID 8045624.
  50. Pedhazur, Elazar J.; Schmelkin, Liora P. (1991). Measurement, Design, and Analysis: An Integrated Approach (Student ed.). New York, NY: Psychology Press. pp. 180–210. ISBN 978-0-8058-1063-9.
  51. Stahel, Werner (2016). "Statistical Issue in Reproducibility". Reproducibility: Principles, Problems, Practices, and Prospects. pp. 87–114. doi:10.1002/9781118865064.ch5. ISBN 978-1-118-86497-5.
  52. "CSSME Seminar Series: The argument over p-values and the Null Hypothesis Significance Testing (NHST) paradigm". www.education.leeds.ac.uk. School of Education, University of Leeds. Retrieved 2025-08-14.
  53. Novella, Steven (February 25, 2015). "Psychology Journal Bans Significance Testing". Science-Based Medicine.
  54. Woolston, Chris (2015). "Psychology journal bans P values". Nature. 519 (7541): 9. Bibcode:2015Natur.519....9W. doi:10.1038/519009f.
  55. Siegfried, Tom (2015). "P value ban: small step for a journal, giant leap for science". Science News. Retrieved 2025-08-14.
  56. Antonakis, John (February 2017). "On doing better science: From thrill of discovery to policy implications" (PDF). The Leadership Quarterly. 28 (1): 5–21. doi:10.1016/j.leaqua.2017.01.006.
  57. Wasserstein, Ronald L.; Lazar, Nicole A. (2016). "The ASA's Statement on p-Values: Context, Process, and Purpose". The American Statistician. 70 (2): 129–133. doi:10.1080/00031305.2016.1154108.
  58. García-Pérez, Miguel A. (2017). "Thou Shalt Not Bear False Witness Against Null Hypothesis Significance Testing". Educational and Psychological Measurement. 77 (4): 631–662. doi:10.1177/0013164416668232. ISSN 0013-1644. PMC 5991793. PMID 30034024.
  59. Ioannidis, John P. A.; Ware, Jennifer J.; Wagenmakers, Eric-Jan; Simonsohn, Uri; Chambers, Christopher D.; Button, Katherine S.; Bishop, Dorothy V. M.; Nosek, Brian A.; Munafò, Marcus R. (January 2017). "A manifesto for reproducible science". Nature Human Behaviour. 1 (1): 0021. doi:10.1038/s41562-016-0021. PMC 7610724. PMID 33954258.
  60. Benjamin, Daniel; et al. (2018). "Redefine statistical significance". Nature Human Behaviour. 1 (1): 6–10. doi:10.1038/s41562-017-0189-z. hdl:10281/184094. PMID 30980045.
  61. Chawla, Dalmeet (2017). "'One-size-fits-all' threshold for P values under fire". Nature. doi:10.1038/nature.2017.22625.
  62. Amrhein, Valentin; Greenland, Sander (2017). "Remove, rather than redefine, statistical significance". Nature Human Behaviour. 2 (1): 0224. doi:10.1038/s41562-017-0224-0. PMID 30980046. S2CID 46814177.
  63. Vyse, Stuart (November 2017). "Moving Science's Statistical Goalposts". csicop.org. CSI. Retrieved 10 July 2018.
  64. McShane, Blake; Greenland, Sander; Amrhein, Valentin (March 2019). "Scientists rise up against statistical significance". Nature. 567 (7748): 305–307. Bibcode:2019Natur.567..305A. doi:10.1038/d41586-019-00857-9. PMID 30894741.
  65. Wasserstein, Ronald L.; Schirm, Allen L.; Lazar, Nicole A. (2019). "Moving to a World Beyond "p < 0.05"". The American Statistician. 73 (sup1): 1–19. doi:10.1080/00031305.2019.1583913.
