|
The book breaks down each variant of machine learning, explaining how it works and how it is used in particular industries. It also covers the main algorithm types (supervised, unsupervised and so on) used during the training phases of machine learning. The reader learns that, with the right tools, any developer or technology professional can glean information from existing data. The book outlines the key types of machine learning and provides coded solutions to real-world examples. There is a strong focus on data preparation and data cleaning, which are fundamental to machine learning. Each chapter explains how the code works and includes running examples. Coverage includes:
Languages for Machine Learning: Hadoop, Mahout, Weka
Planning for Machine Learning: Data Storage/Data Cleaning
Decision Trees: Types/Working Examples
Bayesian Networks: Types/Working Examples
Artificial Neural Networks: Types/Examples/Working Code
Association Rule Learning
Support Vector Machines: Coded Examples
Clustering
Machine Learning as Batch: Hadoop, Mahout, MapReduce: Examples
Learning in Real Time: RabbitMQ

Introduction

Chapter 1 What Is Machine Learning?
History of Machine Learning; Alan Turing; Arthur Samuel; Tom M. Mitchell; Summary Definition; Algorithm Types for Machine Learning; Supervised Learning; Unsupervised Learning; The Human Touch; Uses for Machine Learning; Software; Stock Trading; Robotics; Medicine and Healthcare; Advertising; Retail and E-Commerce; Gaming Analytics; The Internet of Things; Languages for Machine Learning; Python; R; Matlab; Scala; Clojure; Ruby; Software Used in This Book; Checking the Java Version; Weka Toolkit; Mahout; SpringXD; Hadoop; Using an IDE; Data Repositories; UC Irvine Machine Learning Repository; Infochimps; Kaggle; Summary

Chapter 2 Planning for Machine Learning
The Machine Learning Cycle; It All Starts with a Question; I Don't Have Data!; Starting Local; Competitions; One Solution Fits All?; Defining the Process; Planning; Developing; Testing; Reporting; Refining; Production; Building a Data Team; Mathematics and Statistics; Programming; Graphic Design; Domain Knowledge; Data Processing; Using Your Computer; A Cluster of Machines; Cloud-Based Services; Data Storage; Physical Discs; Cloud-Based Storage; Data Privacy; Cultural Norms; Generational Expectations; The Anonymity of User Data; Don't Cross "The Creepy Line"; Data Quality and Cleaning; Presence Checks; Type Checks; Length Checks; Range Checks; Format Checks; The Britney Dilemma; What's in a Country Name?; Dates and Times; Final Thoughts on Data Cleaning; Thinking about Input Data; Raw Text; Comma Separated Variables; JSON; YAML; XML; Spreadsheets; Databases; Thinking about Output Data; Don't Be Afraid to Experiment; Summary

Chapter 3 Working with Decision Trees
The Basics of Decision Trees; Uses for Decision Trees; Advantages of Decision Trees; Limitations of Decision Trees; Different Algorithm Types; How Decision Trees Work; Decision Trees in Weka; The Requirement; Training Data; Using Weka to Create a Decision Tree; Creating Java Code from the Classification; Testing the Classifier Code; Thinking about Future Iterations; Summary

Chapter 4 Bayesian Networks
Pilots to Paperclips; A Little Graph Theory; A Little Probability Theory; Coin Flips; Conditional Probability; Winning the Lottery; Bayes Theorem; How Bayesian Networks Work; Assigning Probabilities; Calculating Results; Node Counts; Using Domain Experts; A Bayesian Network Walkthrough; Java APIs for Bayesian Networks; Planning the Network; Coding Up the Network; Summary

Chapter 5 Artificial Neural Networks
What Is a Neural Network?; Artificial Neural Network Uses; High-Frequency Trading; Credit Applications; Data Center Management; Robotics; Medical Monitoring; Breaking Down the Artificial Neural Network; Perceptrons; Activation Functions; Multilayer Perceptrons; Back Propagation; Data Preparation for Artificial Neural Networks; Artificial Neural Networks with Weka; Generating a Dataset; Loading the Data into Weka; Configuring the Multilayer Perceptron; Training the Network; Altering the Network; Increasing the Test Data Size; Implementing a Neural Network in Java; Create the Project; The Code; Converting from CSV to Arff; Running the Neural Network; Summary

Chapter 6 Association Rules Learning
Where Is Association Rules Learning Used?; Web Usage Mining; Beer and Diapers; How Association Rules Learning Works; Support; Confidence; Lift; Conviction; Defining the Process; Algorithms; Apriori; FP-Growth; Mining the Baskets - A Walkthrough; Downloading the Raw Data; Setting Up the Project in Eclipse; Setting Up the Items Data File; Setting Up the Data; Running Mahout; Inspecting the Results; Putting It All Together; Further Development; Summary

Chapter 7 Support Vector Machines
What Is a Support Vector Machine?; Where Are Support Vector Machines Used?; The Basic Classification Principles; Binary and Multiclass Classification; Linear Classifiers; Confidence; Maximizing and Minimizing to Find the Line; How Support Vector Machines Approach Classification; Using Linear Classification; Using Non-Linear Classification; Using Support Vector Machines in Weka; Installing LibSVM; A Classification Walkthrough; Implementing LibSVM with Java; Summary

Chapter 8 Clustering
What Is Clustering?; Where Is Clustering Used?; The Internet; Business and Retail; Law Enforcement; Computing; Clustering Models; How the K-Means Works; Calculating the Number of Clusters in a Dataset; K-Means Clustering with Weka; Preparing the Data; The Workbench Method; The Command-Line Method; The Coded Method; Summary

Chapter 9 Machine Learning in Real Time with Spring XD
Capturing the Firehose of Data; Considerations of Using Data in Real Time; Potential Uses for a Real-Time System; Using Spring XD; Spring XD Streams; Input Sources, Sinks, and Processors; Learning from Twitter Data; The Development Plan; Configuring the Twitter API Developer Application; Configuring Spring XD; Starting the Spring XD Server; Creating Sample Data; The Spring XD Shell; Streams 101; Spring XD and Twitter; Setting the Twitter Credentials; Creating Your First Twitter Stream; Where to Go from Here; Introducing Processors; How Processors Work within a Stream; Creating Your Own Processor; Real-Time Sentiment Analysis; How the Basic Analysis Works; Creating a Sentiment Processor; Spring XD Taps; Summary

Chapter 10 Machine Learning as a Batch Process
Is It Big Data?; Considerations for Batch Processing; Data Volume and Frequency; How Much Data?; Which Process Method?; Practical Examples of Batch Processes; Hadoop; Sqoop; Pig; Mahout; Cloud-Based Elastic Map Reduce; A Note about the Walkthroughs; Using the Hadoop Framework; The Hadoop Architecture; Setting Up a Single-Node Cluster; How Map Reduce Works; Mining the Hashtags; Hadoop Support in Spring XD; Objectives for This Walkthrough; What's a Hashtag?; Creating the Map Reduce Classes; Performing ETL on Existing Data; Product Recommendation with Mahout; Mining Sales Data; Welcome to My Coffee Shop!; Going Small Scale; Writing the Core Methods; Using Hadoop and Map Reduce; Using Pig to Mine Sales Data; Scheduling Batch Jobs; Summary

Chapter 11 Apache Spark
Spark: A Hadoop Replacement?; Java, Scala, or Python?; Scala Crash Course; Installing Scala; Packages; Data Types; Classes; Calling Functions; Operators; Control Structures; Downloading and Installing Spark; A Quick Intro to Spark; Starting the Shell; Data Sources; Testing Spark; Spark Monitor; Comparing Hadoop MapReduce to Spark; Writing Standalone Programs with Spark; Spark Programs in Scala; Installing SBT; Spark Programs in Java; Spark Program Summary; Spark SQL; Basic Concepts; Using SparkSQL with RDDs; Spark Streaming; Basic Concepts; Creating Your First Stream with Scala; Creating Your First Stream with Java; MLib: The Machine Learning Library; Dependencies; Decision Trees; Clustering; Summary

Chapter 12 Machine Learning with R
Installing R; Mac OSX; Windows; Linux; Your First Run; Installing R-Studio; The R Basics; Variables and Vectors; Matrices; Lists; Data Frames; Installing Packages; Loading in Data; Plotting Data; Simple Statistics; Simple Linear Regression; Creating the Data; The Initial Graph; Regression with the Linear Model; Making a Prediction; Basic Sentiment Analysis; Functions to Load in Word Lists; Writing a Function to Score Sentiment; Testing the Function; Apriori Association Rules; Installing the A Rules Package; The Training Data; Importing the Transaction Data; Running the Apriori Algorithm; Inspecting the Results; Accessing R from Java; Installing the rJava Package; Your First Java Code in R; Calling R from Java Programs; Setting Up an Eclipse Project; Creating the Java/R Class; Running the Example; Extending Your R Implementations; R and Hadoop; The RHadoop Project; A Sample Map Reduce Job in RHadoop; Connecting to Social Media with R; Summary

Appendix A SpringXD Quick Start
Installing Manually; Starting SpringXD; Creating a Stream; Adding a Twitter Application Key

Appendix B Hadoop 1.x Quick Start
Downloading and Installing Hadoop; Formatting the HDFS Filesystem; Starting and Stopping Hadoop; Process List of a Basic Job

Appendix C Useful Unix Commands
Using Sample Data; Showing the Contents: cat, more, and less; Example Command; Expected Output; Filtering Content: grep; Example Command for Finding Text; Example Output; Sorting Data: sort; Example Command for Basic Sorting; Example Output; Finding Unique Occurrences: unique; Showing the Top of a File: head; Counting Words: wc; Locating Anything: find; Combining Commands and Redirecting Output; Picking a Text Editor; Colon Frenzy: Vi and Vim; Nano; Emacs

Appendix D Further Reading
Machine Learning; Statistics; Big Data and Data Science; Hadoop; Visualization; Making Decisions; Datasets; Blogs; Useful Websites; The Tools of the Trade

Index

ISBN - 9788126553372
|
|
Pages: 304
|