The Linguist

The Linguist 52,4

The Linguist is a languages magazine for professional linguists, translators, interpreters, language professionals, language teachers, trainers, students and academics with articles on translation, interpreting, business, government, technology

Issue link: https://thelinguist.uberflip.com/i/148589

TRANSLATORS' WEB

operators, which offer numerous options to expand and/or narrow down users' search queries, according to their information needs. Google Advanced Search, for example, lets the user search for all query terms, an exact query phrase, at least one of the query terms, and/or none of the specified query terms. The last of these is typically used to remove ambiguity, eg in the case of polysemous words such as 'cat', which can be both a mammal and an acronym, as in 'computer-aided translation'.

The minus ('-' or NOT), plus ('+' or AND) and double quote operators narrow a search down (eg, cat -animal). The synonym ('~') and OR search operators broaden it; the former is also useful for finding different spellings of the same word ('gray' vs. 'grey'). Some search engines also allow users to construct 'nested searches', ie, more complex queries, by grouping search operator statements using parentheses (y OR (NOT x) AND z).

Most of today's commercial search engines, including Google, Yahoo! and Ask.com, support basic Boolean operators. Yet only a few, including Yahoo! and Exalead, allow for nesting or the use of proximity operators, such as NEAR (to find documents where the query terms occur within a short range of words of each other) and ADJ (to find documents where the query terms are next to each other). Nevertheless, proximity searching can be performed in most commercial search engines via phrase searching (which allows for adjacency in ordered searches), as well as the use of the asterisk (*) to find variations of the exact phrase when used with double quotes (eg, "the server was unable to * file on *"). Only a few web search engines, such as Exalead, support the use of the asterisk for text truncation, ie to find documents containing words with the same root (eg, educat*).
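To make the behaviour of the wildcard, truncation and proximity operators more concrete, the following minimal Python sketch emulates them locally with regular expressions. It is an illustration only and not how any search engine implements them. The function names (phrase_pattern, truncation_pattern, near) and the max_gap parameter are hypothetical, and the sketch assumes that each asterisk inside a phrase stands for one or more whole words and that query tokens are plain words.

```python
import re


def phrase_pattern(phrase):
    """Compile an exact phrase; each '*' is ASSUMED to stand for one or more words."""
    parts = []
    for token in phrase.split():
        if token == "*":
            # Lazy match of one or more whole words (an assumption of this sketch).
            parts.append(r"\w+(?:\s+\w+)*?")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def truncation_pattern(term):
    """Compile a truncated term such as 'educat*' to match words sharing that root."""
    stem = term.rstrip("*")
    return re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)


def near(text, first, second, max_gap=5):
    """Return True if the two terms occur within max_gap word positions of each other.

    The window size is a hypothetical parameter; max_gap=1 approximates adjacency.
    """
    words = re.findall(r"\w+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap for i in first_pos for j in second_pos)
```

For instance, phrase_pattern("the server was unable to * file on *") would match the sentence 'The server was unable to open the file on startup', while truncation_pattern("educat*") would pick up 'education', 'educators' and 'educate' in a text.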
Other operators that may prove particularly useful for translators are define: to search for term definitions; site: to restrict searches to a single website, top-level domain (such as .org and .edu) or country top-level domain (eg, .es and .jp); and filetype: to filter results according to the type of document (eg, PDF or Word).

Providing the right information

In general, the more information we provide about a specific search need, the higher the likelihood of the IR system (Google for most of us) retrieving relevant search results. Providing more information is, however, not as simple as typing in more query terms. 'The terms need to be highly relevant to the task, and they need to be entered in combination with system-specific correct syntax.'⁵ Hence, the notion of 'term selection' becomes highly significant in web search. The underlying implication is that not all terms carry the same weight or importance.

Another important notion is that of 'term co-occurrence' – the fact that a search engine (via automatic query expansion) or a searcher (via interactive query expansion) can use similar terms to those specified by other users to expand the current query and improve retrieval performance. Google Suggest is one of the most popular systems for interactive query expansion, providing real-time suggestions to complete a search query as the user types. Term co-occurrence is, however, a tricky area, as frequently occurring terms tend to discriminate poorly between relevant and non-relevant documents.

One of the main rules that search engines use to rank matches involves the location and frequency of keywords on a given webpage. Yet not all search engines retrieve and rank results in the same way, due to their different proprietary methods for indexing the web.
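The idea that keyword location and frequency both influence ranking can be illustrated with a toy scorer. The sketch below is a deliberately simplified assumption and does not reflect any real search engine's proprietary method: the page structure (a dict with 'title' and 'body' keys), the function names and the title_weight value are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word tokens."""
    return re.findall(r"\w+", text.lower())


def score(page, query_terms, title_weight=3.0):
    """Score a page by keyword frequency, counting title hits more than body hits.

    title_weight is an arbitrary illustrative value, not a known ranking constant.
    """
    title_counts = Counter(tokenize(page["title"]))
    body_counts = Counter(tokenize(page["body"]))
    total = 0.0
    for term in query_terms:
        term = term.lower()
        total += title_weight * title_counts[term] + body_counts[term]
    return total


def rank(pages, query_terms):
    """Return the pages ordered from highest to lowest score."""
    return sorted(pages, key=lambda p: score(p, query_terms), reverse=True)
```

Under these assumptions, a page that mentions 'CAT' and 'translation' repeatedly would outrank a page about cats as mammals for the query terms 'cat' and 'translation', which is one way of seeing why adding a second, well-chosen term helps to resolve ambiguity.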
To mitigate the lack of overlap in results, users can employ metasearch engines (such as Metacrawler and Dogpile), which leverage the content and ranking capabilities of top search engines to provide the top-ranked results. Other technologies that promise to leverage the power of search engines and improve the user's search and navigation experience are those used in the attempt to expand the web as we know it today – ie, to transform it from a medium of documents for humans to read, into a semantic web that includes data and information for computers to process and manipulate. For the time being, however, search success will continue to depend largely on users' knowledge about search engine capabilities.

Translation and Web Searching by Vanessa Enríquez Raído will be published in October by Routledge.

Notes

1 Purcell, K, Brenner, J and Rainie, L, 2012, 'Search Engine Use 2012', Pew Internet and American Life Project. Available from www.pewinternet.org/~/media//Files/Reports/2012/PIP_Search_Engine_Use_2012.pdf
2 Battelle, J, 2006, The Search: How Google and its rivals rewrote the rules of business and transformed our culture, Nicholas Brealey, London and Boston, 32
3 Aula, A, 2005, 'Studying User Strategies and Characteristics for Developing Web Search Interfaces', unpublished doctoral thesis, University of Tampere
4 Carpineto, C and Romano, G, 2012, 'A Survey of Automatic Query Expansion in Information Retrieval' in ACM Computing Surveys, 44(1), 1-50
5 Op cit, Aula, 18

© ROBERT KNESCHKE | DREAMSTIME.COM

USER PERCEPTIONS

Experience of search engines differs widely depending on the educational background of users. University educated users have a more negative experience

The Linguist, Vol 52 No 4, August/September 2013, page 23
