Annotating Web Search Results

Filed in Articles by on September 24, 2020


ABSTRACT

With many millions of pages, the Web has become an enormous information source. This information takes the form of documents, images, and videos as well as plain text. With data at this scale, finding the right information is a common problem.

Users often have to rely on search engines to find the content they want on the Web. Searching can be done manually through platforms such as Google, or automatically by means of web crawlers.

Because most Web content is not semantically structured, search results for the same query can include varying types of information. Sometimes these results cannot be directly analyzed to meet a specific interpretation need.

The search result records (SRRs) returned from the Web following manual or automatic queries come in the form of web pages that hold results obtained from underlying databases. Such results can be reused in many applications, such as data collection and price comparison.

Thus, there is a need to make the SRRs machine-processable. To achieve that, the SRRs must be annotated in a meaningful fashion. Annotation adds value to the SRRs: the collected data can be stored for further analysis, and the collection becomes easier to read and understand.

Annotation also prepares the data for visualization. SRRs bearing the same concepts are grouped together, which makes it easier to compare, analyze, and browse the collection.
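
The grouping idea can be illustrated with a minimal sketch. It assumes each SRR has already been split into text units and that concept labels can be assigned by simple pattern rules. The labels, patterns, function names, and sample records below are hypothetical illustrations, not the rules or data used in this research.

```python
import re
from collections import defaultdict

# Hypothetical labelling rules: each concept label is paired with a pattern.
# These labels and patterns are illustrative assumptions only.
LABEL_PATTERNS = [
    ("price", re.compile(r"^\$\d+(\.\d{2})?$")),
    ("year", re.compile(r"^(19|20)\d{2}$")),
]


def annotate(unit):
    """Return a concept label for one data unit, or 'text' if no rule matches."""
    for label, pattern in LABEL_PATTERNS:
        if pattern.match(unit):
            return label
    return "text"


def group_by_concept(records):
    """Annotate every data unit in every record and group the units by label."""
    groups = defaultdict(list)
    for record in records:
        for unit in record:
            groups[annotate(unit)].append(unit)
    return dict(groups)


# Made-up sample records, each already split into data units.
srrs = [
    ["Public Health Basics", "2015", "$40.00"],
    ["Epidemiology Primer", "2018", "$55.50"],
]
grouped = group_by_concept(srrs)
```

Once units that share a label sit in the same group, comparing them (for example, all prices) or passing them to a charting tool becomes a matter of reading one list per concept.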

The purpose of this research is to find out how search results from the Web can be automatically annotated and restructured to allow for data visualization for users in a specific domain of discourse.

A case study application is implemented that uses a web crawler to retrieve web pages about any topic in the public health domain.

This research is a continuation of the work done by Mr. Emanuel Onu in the project “Proposal of a Tool to Enhance Competitive Intelligence on the Web”.

INTRODUCTION

People from all walks of life use the internet for many different tasks, such as buying and selling items, social networking, accessing digital libraries, and reading news.

Researchers need information from digital libraries and other online document repositories to conduct their research and share findings; scholars need books as a source of information and knowledge; people communicate with one another by email over the Web.

Others use social media to exchange information and chat casually; some conduct transactions such as purchasing items and paying bills via the Web. The World Wide Web is today the main repository for all kinds of information and has so far been very successful in disseminating information to humans.

The Web has become the preferred medium for many database applications, such as e-commerce and digital libraries. Many of these applications store information in huge databases that users access, query, and update through the Web.

Improvements in hardware technology have increased the storage capacity of computers and servers. As a result, many web servers store large amounts of data on their storage drives.

On some social media websites, e.g. Facebook [1], users can upload pictures, videos, and other documents. YouTube [2] allows its users to post videos of varying lengths to its servers.

Other automated systems collect large amounts of data on a daily basis. For example, bank systems need to store daily Automated Teller Machine (ATM) transactions as well as other customer transactions.

Some monitoring systems collect data about a particular aspect of life, e.g. climate change, while online shopping systems keep information about their clients’ daily shopping experience.

