Web Usage Mining, also known as Web Log Mining, analyzes the data that result from user interaction with a Web server, including Web logs, click streams and database transactions, as well as the visits of search engine crawlers to a Website. Log files provide an immense source of information about the behavior of users as well as search engine crawlers. Web Usage Mining is concerned with discovering common browsing patterns, i.e. pages requested in sequence, from Web logs. These patterns can be used to improve the design of a Website and to guide its modification. Analyzing and discovering user behavior helps in understanding what information users look for online and how they behave. The results of the analysis can be used in intelligent online applications, in refining Websites, in improving search accuracy and in guiding decision makers towards better decisions in changing markets, for instance by placing advertisements in ideal places. Similarly, crawlers or spiders access Websites to index new and updated pages. The traces they leave in the logs help to analyze the behavior of search engine crawlers.
The log files are unstructured and very large. They need to be extracted and pre-processed before any data mining can be applied. Pre-processing is done differently for each application. Two pre-processing algorithms are proposed based on indiscernibility relations in rough set theory, which generate Equivalence Classes. The first algorithm generates a pre-processed file with successful user requests, while the second generates a pre-processed file for pre-fetching and caching purposes. Two further algorithms are proposed to extract usage analytics. The first identifies the origin of visits, the top referring sites and the most popular keywords used by visitors to arrive at a Website. The second extracts user agents, i.e. the browsers and operating systems used by visitors to access a Website.
In this study, users are clustered by their Entry Pages to a Website in order to analyze the deep-linked traffic at the Website. The Top Ten Entry Pages, their traffic and their temporal information are also studied.
The synopsis may refer to a different edition of this title.
Prof. Jeeva Jose was awarded a PhD in Computer Science by Mahatma Gandhi University, Kerala, India and is a faculty member at BPC College, Kerala. Her passion is teaching, and her areas of interest include the World Wide Web, Data Mining and Cyber laws. She has worked in higher education since 2000 and has completed three research projects funded by UGC and KSCSTE. She has authored and published five books and edited three more. She has published more than twenty research papers in various refereed journals and conference proceedings and has given many invited talks at various conferences. She is a recipient of the ACM-W Scholarship provided by the Association for Computing Machinery, New York.
Text sample:
Chapter 2: Pre-processing of Web Logs and Web Usage Analytics:
Web Usage Mining needs a tremendous amount of pre-processing before any data mining can be applied. Pre-processing removes irrelevant records which might otherwise affect the mining results. This chapter is divided into two sections, namely pre-processing of Web logs and Web usage analytics. Two pre-processing algorithms are proposed based on indiscernibility relations in rough set theory, which generate Equivalence Classes. The first algorithm pre-processes the raw file for further identification of users and user sessions. The second algorithm pre-processes the log file and gives the pages accessed, their frequency and the total bytes transferred. Two algorithms are proposed to extract usage analytics. The first algorithm identifies the origin of user visits, the top referring sites and the most popular keywords used by visitors to arrive at a Website. The second algorithm extracts the browsers and operating systems, each with its version, used by various visitors to access a Website. The browser and the operating system are together known as user agents. All algorithms are tested on two different data sets and the results are displayed.
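The first usage analytics algorithm described above (top referring sites and popular keywords) can be illustrated with a minimal Python sketch. It is not the book's algorithm: the function name `referrer_stats`, the list of search query parameter names and the sample referrer URLs are all assumptions made for illustration, and it only assumes that the referrer field of each log record is available as a URL string.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

def referrer_stats(referrers, query_params=("q", "query", "p")):
    """Count referring hosts and search keywords found in referrer URLs.

    `query_params` is an assumed list of parameter names that search
    engines use to carry the search phrase; adjust it for real data.
    """
    sites = Counter()
    keywords = Counter()
    for ref in referrers:
        if not ref or ref == "-":
            continue  # direct visit: no referrer was logged
        parsed = urlparse(ref)
        if not parsed.netloc:
            continue  # not an absolute URL, so no referring host
        sites[parsed.netloc.lower()] += 1
        params = parse_qs(parsed.query)  # decodes '+' and %xx escapes
        for name in query_params:
            for phrase in params.get(name, []):
                keywords.update(phrase.lower().split())
    return sites, keywords

# Hypothetical referrer values, as they might appear in a log file.
sample = [
    "http://www.google.com/search?q=web+usage+mining",
    "http://www.google.com/search?q=web+logs",
    "http://example.org/links.html",
    "-",
]
sites, keywords = referrer_stats(sample)
print(sites.most_common(2))     # top referring sites
print(keywords.most_common(3))  # most frequent keywords
```

Counting hosts and keywords separately keeps the two questions (where visitors come from, and what they searched for) independent, so referrers without a search phrase still contribute to the site ranking.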
2.1: Pre-processing of Web Logs:
The need for pre-processing is explained in section 1.3. Pre-processing eliminates the considerable amount of space needed to store irrelevant records and improves the precision of the mining results. This chapter deals with the pre-processing of Web log files for mining user behavior; hence all search engine crawler requests, unsuccessful requests and other irrelevant requests for files such as .jpg, .mpg, .gif, .png, .txt and .wav are removed. The indiscernibility relation in rough set theory is used for pre-processing [234Jose12] [240Jose12]. Table 2.1 shows various status codes of the Hyper Text Transfer Protocol [27 indicating the response status.
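The cleaning rules listed above can be sketched in Python as follows. This is a simplified illustration, not the rough set based algorithm of the book: it assumes log lines in the common or combined log format, treats only 2xx status codes as successful, and detects crawlers with a small hypothetical list of user agent substrings (`CRAWLER_HINTS`). The names `LOG_PATTERN` and `keep_record` and the sample lines are likewise assumptions.

```python
import re

# Assumed input: common/combined log format, e.g.
# host - - [time] "GET /page HTTP/1.1" 200 2326 "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)
IRRELEVANT_EXT = (".jpg", ".mpg", ".gif", ".png", ".txt", ".wav")
CRAWLER_HINTS = ("bot", "crawler", "spider", "slurp")  # illustrative only

def keep_record(line):
    """Return the parsed fields of a relevant user request, else None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # malformed line
    if not 200 <= int(match.group("status")) < 300:
        return None  # unsuccessful request (assumption: only 2xx counts)
    path = match.group("url").split("?", 1)[0].lower()
    if path.endswith(IRRELEVANT_EXT):
        return None  # image, media or text file, not a page view
    agent = (match.group("agent") or "").lower()
    if any(hint in agent for hint in CRAWLER_HINTS):
        return None  # request issued by a search engine crawler
    return match.groupdict()

# Hypothetical log lines for demonstration.
lines = [
    '10.0.0.1 - - [10/Oct/2012:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [10/Oct/2012:13:55:37 +0530] "GET /logo.png HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [10/Oct/2012:13:56:01 +0530] "GET /missing.html HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [10/Oct/2012:13:57:12 +0530] "GET /index.html HTTP/1.1" 200 2326 "-" "Googlebot/2.1"',
]
cleaned = [rec for rec in map(keep_record, lines) if rec is not None]
print(len(cleaned), cleaned[0]["url"])
```

Of the four sample lines, only the first survives: the second is an image request, the third is unsuccessful and the fourth comes from a crawler.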
2.1.1: Indiscernibility Relations in Rough Set Theory:
A rough set based feature selection for Web Usage Mining is used in [94Inbarani07]. The experimental results show the importance of Web data pre-processing, which reduces the size of the log file. Feature selection is a pre-processing step in data mining and is very effective in reducing dimensions. It refers to choosing a subset of attributes from the set of original attributes. The purpose of feature selection is to identify the significant features, eliminate the features that are irrelevant or dispensable to the learning task and build a good learning model. The indiscernibility relation in rough set theory is used for clustering in [95Hirano05]. The main advantage of this method is that it can be applied to proximity measures that do not satisfy the triangle inequality and that it handles relative proximity very well. Relative proximity is a class of proximity measures that is suitable for representing subjective similarity or dissimilarity, such as the degree of likeness between people. Indiscernibility relations in rough set theory [96Pawalak02] can be used for the data cleaning of Web log files. Rough set theory is based on the assumption that some information is associated with every object of the universe of discourse. Objects characterized by the same information are indiscernible (similar) in view of the available information about them. Any set of all indiscernible (similar) objects is called an elementary set and forms a basic granule of knowledge about the universe. Any union of elementary sets is referred to as a crisp (precise) set; otherwise the set is rough (imprecise, vague).
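The notion of elementary sets can be made concrete with a short Python sketch: two objects are indiscernible with respect to a set of attributes if they have the same value for every attribute in that set, and grouping objects by those values yields the equivalence classes. The function name `equivalence_classes` and the toy request table below are assumptions for illustration, not data from the book.

```python
from collections import defaultdict

def equivalence_classes(universe, attributes):
    """Partition objects into classes of the indiscernibility relation.

    `universe` maps an object name to a dict of attribute values;
    objects with equal values on all `attributes` share one class.
    """
    classes = defaultdict(list)
    for name, record in universe.items():
        signature = tuple(record[a] for a in attributes)
        classes[signature].append(name)
    return dict(classes)

# Hypothetical log records described by two attributes.
requests = {
    "r1": {"status": 200, "type": "html"},
    "r2": {"status": 200, "type": "html"},
    "r3": {"status": 404, "type": "html"},
    "r4": {"status": 200, "type": "gif"},
}
by_both = equivalence_classes(requests, ["status", "type"])
by_status = equivalence_classes(requests, ["status"])
print(by_both)    # r1 and r2 are indiscernible on both attributes
print(by_status)  # fewer attributes give coarser classes
```

Here r1 and r2 form one elementary set under both attributes, while dropping the `type` attribute merges r4 into the same class, which shows how the choice of attributes controls the granularity of the partition.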
Let S = (U, A) be a given pair of non-empty finite sets U and A, where U is the Universe of objects and A is the set of attributes. Every attribute a in A is a function $a: U \to V_a$, where $V_a$ is the set of values of attribute a, called the domain of a. The pair S = (U, A) is called an information system. Any information system can be represented by a data t
"About this title" may refer to a different edition of this title.
Seller: BuchWeltWeit Ludwig Meier e.K., Bergisch Gladbach, Germany
Paperback. Condition: New. This item is printed on demand - it takes 3-4 days longer - New stock. 212 pp. English. Seller inventory number 9783960670872
Quantity: 2 available
Seller: Biblios, Frankfurt am main, HESSE, Germany
Condition: New. PRINT ON DEMAND pp. 212. Seller inventory number 18378469394
Quantity: 4 available
Seller: buchversandmimpf2000, Emtmannsberg, BAYE, Germany
Paperback. Condition: New. This item is printed on demand - Print on Demand title. New stock. Diplomica Verlag, Hermannstal 119k, 22119 Hamburg. 212 pp. English. Seller inventory number 9783960670872
Quantity: 1 available
Seller: AHA-BUCH GmbH, Einbeck, Germany
Paperback. Condition: New. Printed after ordering. New stock. Seller inventory number 9783960670872
Quantity: 1 available
Seller: GreatBookPrices, Columbia, MD, USA
Condition: New. Seller inventory number 29692174-n
Quantity: More than 20 available
Seller: preigu, Osnabrück, Germany
Paperback. Condition: New. Gaining Insight into User and Search Engine Behaviour by Analyzing Web Logs | Jeeva Jose (et al.) | Paperback | 212 pp. | English | 2016 | Anchor Academic Publishing | EAN 9783960670872 | Responsible person for the EU: Dryas Verlag, an imprint of Bedey und Thoms Media GmbH, Hermannstal 119k, 22119 Hamburg, kontakt[at]dryas[dot]de | Seller: preigu Print on Demand. Seller inventory number 107753923
Quantity: 5 available
Seller: Lucky's Textbooks, Dallas, TX, USA
Condition: New. Seller inventory number ABLING22Oct2817100641421
Quantity: More than 20 available
Seller: PBShop.store US, Wood Dale, IL, USA
PAP. Condition: New. New Book. Shipped from UK. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller inventory number L0-9783960670872
Quantity: More than 20 available
Seller: PBShop.store UK, Fairford, GLOS, United Kingdom
PAP. Condition: New. New Book. Delivered from our UK warehouse in 4 to 14 business days. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller inventory number L0-9783960670872
Quantity: More than 20 available
Seller: Ria Christie Collections, Uxbridge, United Kingdom
Condition: New. In. Seller inventory number ria9783960670872_new
Quantity: More than 20 available