{"id":844,"date":"2017-10-29T23:50:32","date_gmt":"2017-10-29T23:50:32","guid":{"rendered":"http:\/\/www.dongpingzhang.com\/?p=844"},"modified":"2017-11-01T04:07:52","modified_gmt":"2017-11-01T04:07:52","slug":"information-retrieval","status":"publish","type":"post","link":"http:\/\/www.dongpingzhang.com\/?p=844","title":{"rendered":"Information Retrieval"},"content":{"rendered":"<p><a href=\"http:\/\/www.dongpingzhang.com\/wordpress\/wp-content\/uploads\/2017\/11\/IR.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-845\" src=\"http:\/\/www.dongpingzhang.com\/wordpress\/wp-content\/uploads\/2017\/11\/IR-279x300.jpg\" alt=\"\" width=\"279\" height=\"300\" srcset=\"http:\/\/www.dongpingzhang.com\/wordpress\/wp-content\/uploads\/2017\/11\/IR-279x300.jpg 279w, http:\/\/www.dongpingzhang.com\/wordpress\/wp-content\/uploads\/2017\/11\/IR-768x826.jpg 768w, http:\/\/www.dongpingzhang.com\/wordpress\/wp-content\/uploads\/2017\/11\/IR-952x1024.jpg 952w\" sizes=\"auto, (max-width: 279px) 100vw, 279px\" \/><\/a><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">My book of this week is <\/span><i><span style=\"font-weight: 400;\">Information Retrieval: Implementing and Evaluating Search Engines<\/span><\/i><span style=\"font-weight: 400;\">, by <\/span><span style=\"font-weight: 400;\">\u00a0<\/span><a href=\"http:\/\/www.stefan.buettcher.org\"><span style=\"font-weight: 400;\">Stefan B\u00fcttcher<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/plg.uwaterloo.ca\/~claclark\/\"><span style=\"font-weight: 400;\">Charles L. A. Clarke<\/span><\/a><span style=\"font-weight: 400;\"> and <\/span><a href=\"http:\/\/cormack.uwaterloo.ca\"><span style=\"font-weight: 400;\">Gordon V. Cormack<\/span><\/a><span style=\"font-weight: 400;\">, published in February 2016. 
This appeared to be the latest and most comprehensive book on information retrieval and search engines that I found back in August when I wanted to learn more about this field. <\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">Clearly, this book is very different from all the other books I have written about this year, except two: <\/span><a href=\"http:\/\/www.dongpingzhang.com\/?p=794\"><span style=\"font-weight: 400;\">Introduction to Information Retrieval<\/span><\/a><span style=\"font-weight: 400;\"> by Manning et al. and <\/span><a href=\"http:\/\/www.dongpingzhang.com\/?p=656\"><span style=\"font-weight: 400;\">Text Data Management and Analysis<\/span><\/a><span style=\"font-weight: 400;\"> by Zhai et al. The former, freely available <\/span><a href=\"https:\/\/nlp.stanford.edu\/IR-book\/\"><span style=\"font-weight: 400;\">online<\/span><\/a><span style=\"font-weight: 400;\">, is a great place to start reading about information retrieval if you are unsure whether you want to invest in the topic yet. <\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">Here is my paradox. (a): I enjoy reading about computer science and, more broadly, science and technology in general, and I work in this field. (b): It would be cheating if I were to read and write about a book a week on subjects directly connected to my profession. It might advance my career and make me more of an expert, but it would not broaden my general view. But I do feel very tempted to read some, at least. So, here goes another book in the arena of computing. I hope I strike a reasonable balance in my choice of books. 
<\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">There is one more computer science book that I would very much like to read as part of this project, which is the upcoming computer architecture book that <\/span><a href=\"https:\/\/web.stanford.edu\/~hennessy\/\"><span style=\"font-weight: 400;\">John Hennessy<\/span><\/a><span style=\"font-weight: 400;\"> et al. have been working on, if available before the year of 2017 draws its curtain. \u00a0<\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">Back to this week\u2019s book, it is very impressively comprehensive. I love the plain explanations of the concepts, the right amount of equations that are clearly annotated and explained, and the superb discussions about practical implementation matters. There are many papers passing by my desk with symbols, equations and concepts that are poorly explained. I do realise I am ignorant of many subjects and by no means very bright at all, but I am under the impression that some papers are written to \u201cimpress\u201d people rather than to broadcast knowledge or to educate people on the topic covered. It is committing a crime to write like that. Just imagine how many bright young students might have taken up interesting research projects in that field and advance the science frontier, had they been able to understand what they read from those papers rather than feeling deeply doubtful about their own intellectual potential in pursuing advanced research. The good news is that this book does not fall into that category. <\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">Thanks to being more recent than the IR book by Manning et al., this book has updated some topics covered there and includes some new content such as learning to rank. A great amount of attention is given to evaluation. It also has a slightly more implementation-oriented flavor. 
There are many discussions of algorithms, data structures, search effectiveness, efficiency and so on. The authors provide a few sample chapters <\/span><a href=\"http:\/\/www.ir.uwaterloo.ca\/book\/\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">. Content-wise, the book covers: the fundamentals of information retrieval, search engine indexing, retrieval and ranking, measuring search engine effectiveness and efficiency, parallelisation of IR, and specifics related to web search. One great feature of this book is its coverage of computer performance, e.g., discussions of caching and data placement (such as in-memory or on-disk). <\/span><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">Overall, it is a great textbook for this field. By no means have I mastered it all. My colorful markers show me which sections I need to revisit. <\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>My book of this week is Information Retrieval: Implementing and Evaluating Search Engines, by \u00a0Stefan B\u00fcttcher, Charles L. A. Clarke and Gordon V. Cormack, published in February 2016. 
This appeared to be the latest and most comprehensive book on information retrieval and search engines that I found back in August when I wanted to learn &hellip; <a href=\"http:\/\/www.dongpingzhang.com\/?p=844\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Information Retrieval<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[4],"tags":[],"class_list":["post-844","post","type-post","status-publish","format-standard","hentry","category-computer-science"],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/paFL7T-dC","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=\/wp\/v2\/posts\/844","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=844"}],"version-history":[{"count":3,"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=\/wp\/v2\/posts\/844\/revisions"}],"predecessor-version":[{"id":848,"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=\/wp\/v2\/posts\/844\/revisions\/848"}],"wp:attachment":[{"href":"h
ttp:\/\/www.dongpingzhang.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=844"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=844"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.dongpingzhang.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=844"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}