Author Conway, Drew.

Title Machine learning for email / Drew Conway and John Myles White
Edition First edition
Published Sebastopol, CA : O'Reilly, [2012]
Online access available from:
Safari O'Reilly books online


Description 1 online resource (xi, 130 pages) : illustrations
Contents Machine generated contents note: 1. Using R -- R for Machine Learning -- Downloading and Installing R -- IDEs and Text Editors -- Loading and Installing R Packages -- R Basics for Machine Learning -- Further Reading on R -- 2. Data Exploration -- Exploration vs. Confirmation -- What is Data? -- Inferring the Types of Columns in Your Data -- Inferring Meaning -- Numeric Summaries -- Means, Medians, and Modes -- Quantiles -- Standard Deviations and Variances -- Exploratory Data Visualization -- Modes -- Skewness -- Thin Tails vs. Heavy Tails -- Visualizing the Relationships between Columns -- 3. Classification: Spam Filtering -- This or That: Binary Classification -- Moving Gently into Conditional Probability -- Writing Our First Bayesian Spam Classifier -- Defining the Classifier and Testing It with Hard Ham -- Testing the Classifier Against All Email Types -- Improving the Results -- 4. Ranking: Priority Inbox -- How Do You Sort Something When You Don't Know the Order? -- Ordering Email Messages by Priority -- Priority Features of Email -- Writing a Priority Inbox -- Functions for Extracting the Feature Set -- Creating a Weighting Scheme for Ranking -- Weighting from Email Thread Activity -- Training and Testing the Ranker
Summary If you're an experienced programmer willing to crunch data, this concise guide will show you how to use machine learning to work with email. You'll learn how to write algorithms that automatically sort and redirect email based on statistical patterns. Authors Drew Conway and John Myles White approach the process in a practical fashion, using a case-study driven approach rather than a traditional math-heavy presentation. This book also includes a short tutorial on using the popular R language to manipulate and analyze data. You'll get clear examples for analyzing sample data and writing machine learning programs with R. Mine email content with R functions, using a collection of sample files. Analyze the data and use the results to write a Bayesian spam classifier. Rank email by importance, using factors such as thread activity. Use your email ranking analysis to write a priority inbox program. Test your classifier and priority inbox with a separate email sample set.
Bibliography Includes bibliographical references (pages 129-130)
Notes Print version record
Subject Electronic mail messages -- Management.
Electronic mail systems.
Machine learning.
Spam (Electronic mail) -- Prevention.
Spam filtering (Electronic mail)
Form Electronic book
Author White, John Myles.
ISBN 1449320708 (electronic bk.)
1449320716 (electronic bk.)
9781449320706 (electronic bk.)
9781449320713 (electronic bk.)