E-book

Author

Wachsmuth, Henning.

Title Text analysis pipelines : towards ad-hoc large scale text mining / Henning Wachsmuth

Published Cham : Springer, [2015]

©2015

Springer Computer Science eBooks

Springer eBooks

Description 1 online resource (xx, 302 pages) : illustrations

Series LNCS sublibrary. SL 1, Theoretical computer science and general issues

Lecture notes in computer science ; 9383. 1611-3349

Contents Intro -- Foreword -- Preface -- Symbols -- Contents -- 1 Introduction -- 1.1 Information Search in Times of Big Data -- 1.1.1 Text Mining to the Rescue -- 1.2 A Need for Efficient and Robust Text Analysis Pipelines -- 1.2.1 Basic Text Analysis Scenario -- 1.2.2 Shortcomings of Traditional Text Analysis Pipelines -- 1.2.3 Problems Approached in This Book -- 1.3 Towards Intelligent Pipeline Design and Execution -- 1.3.1 Central Research Question and Method -- 1.3.2 An Artificial Intelligence Approach -- 1.4 Contributions and Outline of This Book

1.4.1 New Findings in Ad-Hoc Large-Scale Text Mining -- 1.4.2 Contributions to the Concerned Research Fields -- 1.4.3 Structure of the Remaining Chapters -- 1.4.4 Published Research Within This Book -- 2 Text Analysis Pipelines -- 2.1 Foundations of Text Mining -- 2.1.1 Text Mining -- 2.1.2 Information Retrieval -- 2.1.3 Natural Language Processing -- 2.1.4 Data Mining -- 2.1.5 Development and Evaluation -- 2.2 Text Analysis Tasks, Processes, and Pipelines -- 2.2.1 Text Analysis Tasks -- 2.2.2 Text Analysis Processes -- 2.2.3 Text Analysis Pipelines -- 2.3 Case Studies in This Book

2.3.1 InfexBA -- Information Extraction for Business Applications -- 2.3.2 ArguAna -- Argumentation Analysis in Customer Opinions -- 2.3.3 Other Evaluated Text Analysis Tasks -- 2.4 State of the Art in Ad-Hoc Large-Scale Text Mining -- 2.4.1 Text Analysis Approaches -- 2.4.2 Design of Text Analysis Approaches -- 2.4.3 Efficiency of Text Analysis Approaches -- 2.4.4 Robustness of Text Analysis Approaches -- 3 Pipeline Design -- 3.1 Ideal Construction and Execution for Ad-Hoc Text Mining -- 3.1.1 The Optimality of Text Analysis Pipelines

3.1.2 Paradigms of Designing Optimal Text Analysis Pipelines -- 3.1.3 Case Study of Ideal Construction and Execution -- 3.1.4 Discussion of Ideal Construction and Execution -- 3.2 A Process-Oriented View of Text Analysis -- 3.2.1 Text Analysis as an Annotation Task -- 3.2.2 Modeling the Information to Be Annotated -- 3.2.3 Modeling the Quality to Be Achieved by the Annotation -- 3.2.4 Modeling the Analysis to Be Performed for Annotation -- 3.2.5 Defining an Annotation Task Ontology -- 3.2.6 Discussion of the Process-Oriented View -- 3.3 Ad-Hoc Construction via Partial Order Planning

3.3.1 Modeling Algorithm Selection as a Planning Problem -- 3.3.2 Selecting the Algorithms of a Partially Ordered Pipeline -- 3.3.3 Linearizing the Partially Ordered Pipeline -- 3.3.4 Properties of the Proposed Approach -- 3.3.5 An Expert System for Ad-Hoc Construction -- 3.3.6 Evaluation of Ad-Hoc Construction -- 3.3.7 Discussion of Ad-Hoc Construction -- 3.4 An Information-Oriented View of Text Analysis -- 3.4.1 Text Analysis as a Filtering Task -- 3.4.2 Defining the Relevance of Portions of Text -- 3.4.3 Specifying a Degree of Filtering for Each Relation Type

Summary This monograph proposes a comprehensive and fully automatic approach to designing text analysis pipelines for arbitrary information needs that are optimal in terms of run-time efficiency and that robustly mine relevant information from text of any kind. Based on state-of-the-art techniques from machine learning and other areas of artificial intelligence, novel pipeline construction and execution algorithms are developed and implemented in prototypical software. Formal analyses of the algorithms and extensive empirical experiments underline that the proposed approach represents an essential step towards the ad-hoc use of text mining in web search and big data analytics. Both web search and big data analytics aim to fulfill people's needs for information in an ad-hoc manner. The information sought is often hidden in large amounts of natural language text. Instead of simply returning links to potentially relevant texts, leading search and analytics engines have started to directly mine relevant information from the texts. To this end, they execute text analysis pipelines that may consist of several complex information-extraction and text-classification stages. Due to practical requirements of efficiency and robustness, however, the use of text mining has so far been limited to anticipated information needs that can be fulfilled with rather simple, manually constructed pipelines.

Notes English

Subject Data mining.

Text processing (Computer science)

Computer science.

Computers.

Logic, Symbolic and mathematical.

Database management.

Information retrieval.

Artificial intelligence.

Word processing operations.

Word processing.

Electronic data processing.

Electronic digital computers.

data processing.

computer science.

computers.

information retrieval.

artificial intelligence.

Mathematical theory of computation.

Databases.

User interface design & usability.

Computers -- Information Technology.

Computers -- Intelligence (AI) & Semantics.

Mathematics -- Logic.

Computers -- Database Management -- General.

Computers -- Machine Theory.

Computers -- System Administration -- Storage & Retrieval.

Word processing operations

Word processing

Electronic digital computers

Electronic data processing

Artificial intelligence

Computer science

Computers

Data mining

Database management

Information retrieval

Logic, Symbolic and mathematical

Text processing (Computer science)

Form Electronic book

ISBN 9783319257419

3319257412

3319257404

9783319257402
