dc.contributor.author | Kocabas, Ilker | |
dc.contributor.author | Dincer, Bekir Taner | |
dc.contributor.author | Karaoglan, Bahar | |
dc.date.accessioned | 2020-11-20T16:33:39Z | |
dc.date.available | 2020-11-20T16:33:39Z | |
dc.date.issued | 2011 | |
dc.identifier.issn | 1300-0632 | |
dc.identifier.issn | 1303-6203 | |
dc.identifier.uri | https://doi.org/10.3906/elk-1003-448 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12809/4439 | |
dc.description | Dincer, Bekir Taner/0000-0002-0660-7239 | en_US |
dc.description | WOS: 000295497900013 | en_US |
dc.description.abstract | In this study, we show how Luhn's claim about the degree of importance of a word in a document can be related to information retrieval. His basic idea is transformed into z-scores as the weights of terms for the purpose of modeling term frequency (tf) within documents. The Luhn-based models represented in this paper are considered as the TF component of proposed TF x IDF weighting schemes. Moreover, the final term weighting functions appropriate for the TF x IDF weighting scheme are applied to TREC-6, -7, and -8 databases. The experimental results show relevance to Luhn's claim by having high mean average precision (MAP) for the terms with frequencies around the mean frequency of terms within a document. On the other hand, the weighting, which significantly discriminates the importance between low/high frequencies and medium frequencies, degrades the retrieval performance. Therefore, any weighting scheme (TF) that is directly proportional to tf has a probability of high retrieval performance, if this can optimally indicate the difference of the importance regarding tf values and also optimally eliminate the terms that have high frequencies. | en_US |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TUBITAK); Turkiye Bilimsel ve Teknolojik Arastirma Kurumu (TUBITAK) [107E192] | en_US |
dc.description.sponsorship | This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) within the scope of Project No. 107E192. The authors thank TUBITAK for supporting this project. | en_US |
dc.item-language.iso | eng | en_US |
dc.publisher | Tubitak Scientific & Technical Research Council Turkey | en_US |
dc.item-rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Luhn | en_US |
dc.subject | Information Retrieval | en_US |
dc.subject | Term Weighting | en_US |
dc.subject | Indexing | en_US |
dc.title | Investigation of Luhn's claim on information retrieval | en_US |
dc.item-type | article | en_US |
dc.contributor.department | MÜ | en_US |
dc.contributor.departmentTemp | [Kocabas, Ilker; Karaoglan, Bahar] Ege Univ, Int Comp Inst, TR-35100 Izmir, Turkey -- [Dincer, Bekir Taner] Mugla Univ, Dept Stat, TR-48100 Mugla, Turkey | en_US |
dc.identifier.doi | 10.3906/elk-1003-448 | |
dc.identifier.volume | 19 | en_US |
dc.identifier.issue | 6 | en_US |
dc.identifier.startpage | 993 | en_US |
dc.identifier.endpage | 1004 | en_US |
dc.relation.journal | Turkish Journal of Electrical Engineering and Computer Sciences | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |