Data Mining Catalogue

Welcome to the Catalogue of Data Mining tools and service providers. The material in this catalogue is migrating to form a new book, The Data Mining Desktop Survival Guide.

Many Data Mining tools, vendors, and service providers are now available to service the Data Mining market. This Catalogue provides pointers to Data Mining tool vendors and service providers. The information is provided as is, without any guarantees. If information is incorrect or out of date, or you have information relating to other Data Mining tools or services, please let me know.

This compilation is Copyright © 2000-2006 Graham J. Williams. Redistribution of this catalogue is permitted under the general terms and intentions of the GNU General Public License. This catalogue is provided for reference only. Neither I nor Togaware endorse or in any way recommend the products and organisations listed, and we expressly exclude liability for any damage, loss or injury that a person may suffer as a result of any dealing with any product or organisation listed.

Free Data Mining Software

Bayesian Knowledge Discoverer
Techniques: Bayesian Networks
Platforms: Win32
Vendor: Freely available from the Open University

Updated 1999/11/30 13:39:42


Polytomous Logistic regression trees with Unbiased Split. PLUS is implemented in a set of Fortran 90 routines and accepts numerical/continuous as well as categorical variables. Missing covariate values are allowed. If a test data set is available, an estimate of the misclassification error rate will be provided. Executables are available for Digital Alpha (Digital UNIX 4.0), Sun SPARCstation/Ultra (Sun Solaris 2.6), Pentium (Linux), and Pentium (Windows 95/98/NT, coming soon).

Techniques: Logistic Regression
Platforms: Unix, Linux, Win32
Author: Tjen-Sien Lim

Updated 1999/11/01 09:36:41


An open source graphical data mining tool written in R and ideally suited for binary classification tasks and quick data mining.

Techniques: decision trees, random forests, support vector machines, boosting, logistic regression, k-means
Platforms: Unix, Linux, Win32, MacOS
Author: Graham Williams


Updated 2006-07-31 07:00:36 Graham Williams

See The Data Mining Desktop Survival Guide

Vendors and Service Providers


4cData offers two state-of-the-art DM tools:

  • 4cRuleBuilder;
  • 4cDiscretizer.

Both products have an easy-to-use GUI, and free trial versions are available.

4cData also provides a full range of data mining and knowledge discovery services including:

  • building a data model
  • generating classification rules to perform a variety of tasks
  • discovering major trends in your data
  • performing data prediction
  • providing statistical analysis of the data
Techniques: decision rules, discretization
Platforms: Win95, Win98, WinNT, Win2000, WinXP
Vendor: 4cData, Golden, CO

Updated 2004/01/23 09:06:16


Angoss Software Corporation (Angoss), headquartered in Toronto, Canada, is a provider of Predictive Analytics Systems, for sales, marketing and risk management.

Angoss offer:

  • KnowledgeSTUDIO: multiple data mining models
  • KnowledgeSEEKER: high-performance decision tree tool
  • KnowledgeSERVER: deploy models, scoring, and rules
  • StrategyBUILDER: manage models and user business rules
Techniques: decision trees, k-means clustering, neural networks, linear and logistic regression
Platforms: Win32, Unix
Vendor: Angoss

Updated 1999/12/16 06:09:20

Bayesia: BayesiaLab

Bayesia is a company created by researchers specialising in Bayesian Networks and Machine Learning. The commercial product, BayesiaLab, offers a set of functionalities for modeling and data mining:

  • Ergonomic graphical interface for creating models with simple clicks, with an equation editor for compact description of probability distributions
  • Complete set of Bayesian network learning algorithms: supervised learning for profiling target variables, unsupervised learning for discovering all the probabilistic relations that hold in your data, and clustering for new concept discovery
  • Data import wizards that support data preprocessing
  • Processing of missing values and hidden variables
  • Evaluation of models with confusion matrices, lift and ROC curves
  • Analysis toolbox for understanding models: strength and type of the probabilistic relations, sensitivity analysis, causal analysis, contradiction analysis, HTML report generation
  • Powerful automatic layout algorithms for both tree networks and highly connected, complex graphs
  • High interoperability, with JDBC/ODBC connections to databases, an SQL interface, and export of Bayesian networks, tables, equations, graphs, matrices and reports by copying and pasting
  • Dynamic Bayesian networks for taking the temporal dimension into account
  • Decision-aiding tools with Decision and Utility nodes, direct representation of action policies, and reinforcement learning algorithms for automatic discovery of policies for static as well as dynamic Bayesian networks
Techniques: Bayesian nets
Platforms: Win32
Vendor: Bayesia

Updated 2004/02/17 08:54:07

Equbits Foresight
See The Data Mining Desktop Survival Guide

Fujitsu: GhostMiner
See The Data Mining Desktop Survival Guide


A company whose primary business is consulting in commercial data mining and VLDB decision support across Europe. It provides services in Technical Strategy and Tool Selection, Data Mining projects using CRISP-DM, Data Warehousing management consulting, and technical design and implementation. It also provides bureau services in Data Mining and Visualisation, and Web Information Publishing.
Markets: Communication, Finance, Retail, Manufacturing.
Training: Running Data Mining Projects using CRISP-DM; Understanding Data Mining in Business
Vendor: OpenMIND Consulting, Oxfordshire, UK

Updated 1999/12/15 15:20:51

PredictionWorks: InductionEngine
See The Data Mining Desktop Survival Guide

Purple Insight: MineSet

Purple Insight acquired the MineSet product from SGI in October 2003. MineSet provides interactive exploration of data through an advanced suite of visual tools for faster discovery of meaningful trends and relationships. The Splat Visualizer and the Scatter Visualizer represent complex data in up to eight dimensions. The Map Visualizer displays data with geographical relationships by using a map metaphor. Animation and view synchronization techniques are used to reveal patterns over critical dimensions such as time. The Tree Visualizer depicts data with hierarchical relationships utilizing a fly-through technique set in a 3D landscape. The Statistics Visualizer presents a visual summary of basic statistical information. Advanced drill-through techniques give you fast access to the original records that created entities within your visualization for additional analysis. The Record Viewer allows quick access to the original data with column sorting and HTML output capabilities.
Techniques: mlc++, visualisation, rules, associations, clustering
Platforms: MSWindows, IRIX
Vendor: Purple Insight

Updated 2004/02/18 06:03:14


RapAnalyst allows you to visualize and mine complex data.
Techniques: neural networks, genetic algorithms, self organizing maps
Platforms: MSWindows
Vendor: Raptor International

Updated 2006-01-06 15:30:27


Phil Sherrod has developed two statistical programs:

  • DTREG generates classification and regression decision trees. It uses V-fold cross-validation with pruning to generate the optimal size tree, and it uses surrogate splitters to handle missing data. A free demonstration copy is available;
  • NLREG performs general nonlinear regression. NLREG will fit a general function, whose form you specify, to a set of data values. A free demonstration copy is available.
Techniques: classification and regression decision trees, nonlinear regression
Platforms: MSWindows
Author: Phil Sherrod
Internet: for DTREG and for NLREG

Updated 2004/01/23 09:06:16
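
The DTREG entry above mentions V-fold cross-validation for choosing tree size. The following is a minimal sketch of that general idea only, not DTREG's actual procedure, which the entry does not describe. The names v_fold_splits, select_size, fit_leaves and mse are hypothetical, and the "tree" is a toy stand-in that cuts x in [0, 1) into equal-width leaves and predicts the mean of each leaf.

```python
def v_fold_splits(n, v):
    """Yield (train_indices, test_indices) for each of v folds over range(n)."""
    folds = [list(range(i, n, v)) for i in range(v)]
    for i in range(v):
        test = folds[i]
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test


def select_size(data, sizes, fit, error, v=10):
    """Return the candidate size with the lowest mean held-out error.

    Ties go to the earliest candidate, so list sizes from small to large
    to prefer the simpler model.
    """
    best_size, best_err = None, float("inf")
    for size in sizes:
        fold_errs = []
        for train, test in v_fold_splits(len(data), v):
            model = fit([data[j] for j in train], size)
            fold_errs.append(error(model, [data[j] for j in test]))
        mean_err = sum(fold_errs) / len(fold_errs)
        if mean_err < best_err:
            best_size, best_err = size, mean_err
    return best_size


def fit_leaves(train, size):
    """Toy stand-in for a tree (assumption): `size` equal-width leaves on x."""
    sums, counts = [0.0] * size, [0] * size
    for x, y in train:
        leaf = min(int(x * size), size - 1)
        sums[leaf] += y
        counts[leaf] += 1
    overall = sum(y for _, y in train) / len(train)
    # Empty leaves fall back to the overall training mean.
    return [s / c if c else overall for s, c in zip(sums, counts)]


def mse(model, test):
    """Mean squared error of the leaf predictions on held-out (x, y) pairs."""
    size = len(model)
    return sum(
        (y - model[min(int(x * size), size - 1)]) ** 2 for x, y in test
    ) / len(test)


# A step function: y jumps from 0 to 1 at x = 0.5, so two leaves suffice.
data = [(i / 40, 1.0 if i >= 20 else 0.0) for i in range(40)]
print(select_size(data, [1, 2, 4, 8], fit_leaves, mse, v=5))  # prints 2
```

On this toy data, sizes 2, 4 and 8 all reach zero held-out error, and the tie goes to the smallest, which mirrors the usual preference for the simplest tree that predicts well.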

AgentBase   from Daz Systems

An agent-oriented knowledge discovery system.

Vendor: Daz Systems, Inc, California
World Wide Web:

Updated 24 Jul 1997 09:41:18

RuleQuest   C5.0 and See5

This is an updated, commercial, and significantly faster version of C4.5, a very popular decision tree induction algorithm.

Techniques: Decision Trees, Rules
Platforms: Unix, Win95, WinNT
Vendor: RuleQuest, Sydney
World Wide Web:

Updated 25 Jul 1997 09:39:24


CART® (Classification and Regression Trees) is a widely used and respected Decision Tree induction and Regression system from the Statistical community with many applications to Data Mining.

Techniques: Classification and Regression Trees with Boosting
Applications: classification and predictions
Platforms: Win95, MacOS, Unix
Vendor: Salford Systems, San Diego, California
World Wide Web:

Updated 9 Sep 1998 10:01:36

Clementine   from SPSS

See The Data Mining Desktop Survival Guide

FICS   Cube-It

Techniques: OLAP, Visualisation
Market: Finance
Vendor: Financial Information Consulting Services, Belgium
World Wide Web:

Updated 30 Jul 1997 07:38:27

Darwin   from Thinking Machines

See Oracle Data Mining in The Data Mining Desktop Survival Guide

DataMind DataCruncher  

DataMind DataCruncher is a server based data mining software product. Able to handle an unlimited number of records, DataCruncher supports advanced data mining features such as incremental modeling, model merge, and native connectivity to Oracle and Informix database management systems.

Techniques: Agent Network Technology
Market: Telecommunications Marketing (Telecom Customer Life Cycle)
Platforms: Server: Sun, HP, IBM, SGI, and WinNT. Client: WinNT and Win95
Vendor: DataMind Corporation, San Mateo, California
World Wide Web:

Updated 10 Oct 1997 20:06:47

MIT    Data Engine

DataEngine combines conventional data analysis methods with fuzzy logic and neural networks.

Techniques: Fuzzy Logic, Neural Networks, Fuzzy Clustering.
Markets: Industry, Trade, Financial Services.
Applications: Quality Control, Process Optimization Forecasting, Database Marketing, Diagnosis
Platforms: Win95, WinNT
Vendor: MIT - Management Intelligenter Technologien GmbH, Aachen, Germany
World Wide Web:

Updated 30 Jan 1998 12:33:54

DataMite from LPA    from Logic Programming Associates

A tool for investigating data from ODBC-compliant relational databases through the synthesis of clusters and rule induction with queries direct to the database engine.

Techniques: Rule Induction, Clustering
Platforms: WinNT, Win95
Vendor: Logic Programming Associates (LPA), England
World Wide Web:

Updated 22 Jul 1998 06:27:56

Information Discovery, Inc    Data Mining Suite

The Data Mining Suite works directly on large SQL repositories with no need for sampling or file extraction. It accesses large volumes of multi-table relational data on the server, incrementally discovers patterns and delivers automatically generated English text and graphs as explainable documents on the intranet.

Techniques: Rule-based Influence; Dimensional Affinity; Trends
Vendor: Information Discovery, Inc, Hermosa Beach, California
World Wide Web:

Updated 13 Mar 1998 14:38:47


An OLAM tool supporting Data Cubes. An educational version (restricted to 1000 rows of data) is available for free.

Techniques: OLAP, Associations, Decision Trees, Regression
Platforms: Win32
Vendor: DBMiner Technology Inc, Canada
World Wide Web:

Updated 18 Jul 1998 17:16:08

Internetivity Inc

dbProbe is a lightweight Java Applet that enables multi-dimensional cube analysis via any of the leading web browsers. The user is presented with an effective, true GUI rather than an HTML-based dialogue. The server-based cube-builder component can access various data sources including ODBC and OLE/DB. dbProbe uses a unique architecture which enables the whole cube to be transferred to the client, allowing off-line analysis and a highly scalable server model. A client-server model is also available to allow analysis of massive cubes.

Techniques: Multi-dimensional cube construction and analysis, GUI representation, Web-based deployment
Platforms: MS-WindowsNT, MS-Windows95, Linux, UNIX
Vendor: InterNetivity Inc, Canada
World Wide Web:
Distributor: DBNet

Updated 1999/06/11 15:59:01

Trajecta   dbProphet

Techniques: Neural Network
Market: Sales and Marketing
Applications: Business retention, resource allocation, prospecting, cross selling and upselling, market basket analysis, pricing, and fraud detection
Platforms: WinNT, Unix
Vendor: Trajecta, Austin, Texas
World Wide Web:

Updated 28 Jan 1998 09:08:53

Quadstone Decisionhouse   

Decisionhouse is a scalable data mining product for (marketing and risk) customer modelling applications in finance, retail and telco markets.

Techniques: Visualisation, Selection, Drill-Down, Decision Trees, Discriminant Analysis, Logistic and Probit Regression, Gini and Kolmogorov-Smirnov Analysis
Markets: Finance, Retail, Telecommunications
Applications: Customer profiling, targeting, segmentation, assessment of risk, customer worth, lifetime value, purchasing propensity and churn modelling.
Users: Barclays Bank, J Sainsbury plc, Liverpool Victoria Group, Barclaycard, British Airways
Platforms: Unix Servers; Unix and Win95 Clients
Vendor: Quadstone, Scotland
World Wide Web:

Updated 19 Feb 1998 12:02:16

Pilot   Discovery Server

Techniques: Segmentation
Market: Sales, Marketing
Platforms: Win95, WinNT
Vendor: Pilot Software, Massachusetts
World Wide Web:

Updated 25 Jul 1997 10:12:25

Enterprise Miner    from SAS

See The Data Mining Desktop Survival Guide


Built on top of SAS, GainSmarts provides a statistically-based approach for the exploration of patterns and data.

Techniques: CHAID, Genetic Algorithms, Predictive Modelling, Expert Systems
Markets: Direct Marketing
Platforms: SAS, WinNT, Unix
Vendor: Urban Science, Detroit, Michigan
World Wide Web:

Updated 3 Apr 1998 16:25:51

IBM   Intelligent Miner

A collection of tools to handle organisational/transactional data for Data Mining.

Techniques: Association, Segmentation, Time Sequence, NN, Rules
Platforms: AIX, AS/400, OS/390, MVS, Win95, WinNT, OS/2
Vendor: IBM
World Wide Web:

Updated 24 Jul 1997 09:58:21

Kate   from AcknoSoft

Techniques: Decision Trees, Case-Based Reasoning
Vendor: AcknoSoft, Paris
World Wide Web:

Updated 20 Feb 1998 17:01:17


Market: Retail
Vendor: Knowledge Discovery One, Inc, Texas
World Wide Web:

Updated 30 Jul 1997 16:02:57


MARS® (Multivariate Adaptive Regression Splines) was developed by Jerry Friedman and is another of his widely respected data mining tools with sound statistical foundations.

Techniques: Multivariate Adaptive Regression Splines
Applications: classification and prediction
Vendor: Salford Systems, San Diego, California
World Wide Web:

Updated 19 Sep 1998 09:28:43

Unica    Model 1

A Windows-based family of data mining modules for database marketing applications. Model 1 includes a wizard-driven GUI, automated model building process using a variety of algorithms, easy-to-interpret reports, built-in campaign optimization, and run-time module deployment.

Applications: Response Modeling, Cross-Selling, Customer Valuation, Segmentation, Profiling
Platforms: Win32
Vendor: Unica
World Wide Web:

Updated 19 Sep 1998 06:29:25

ABTech    ModelQuest

Products include ModelQuest MarketMiner and ModelQuest Enterprise. Techniques automatically capture relationships in data using a network of mathematical functions that subdivide problems into manageable pieces, or nodes, and automatically apply regression techniques to solve each of these simpler problems. ABTech also offer Data Mining services.

Techniques: Neural Nets, Regression
Platforms: Win95, WinNT
Vendor: ABTech
World Wide Web:

Updated 14 Nov 1997 18:08:18

NeoVista   Decision Series

Techniques: Neural Nets, Association Rules, Clustering, Genetic Algorithms
Market: Retail and market basket analysis
Platforms: SMP
Vendor: Neovista
World Wide Web:

Updated 28 Aug 1997 13:57:10


Offer a variety of products and services for data mining.

Techniques: NN
Market: Telecom Fraud, Market Churn, Credit Scoring
Vendor: Neural Technologies, England
World Wide Web:
Distributor: Neural Mining Solutions Pty Ltd (Australia)

Updated 25 Jul 1997 11:35:22


ODBCMine generates decision rules from ODBC databases using the C4.5 classification model algorithm. It analyzes the data in any ODBC data source, and writes decision rules in ASCII to the standard output device. The 2.0 release outputs graphical decision trees in Scalable Vector Graphics (SVG) format. It also provides a new method for users to pre-categorize continuous (numeric) variables into discrete ranges, providing much better performance for these situations. It is intended to be a simple and inexpensive, yet powerful, implementation of a classic data mining algorithm. A single-user license costs USD $99.95. An academic license is available free on request.
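
The entry does not say how ODBCMine pre-categorizes continuous variables. As an illustration of the general idea only, here is a minimal sketch of equal-width binning, one common way to turn numeric values into discrete ranges. The function name equal_width_bins and the sample ages are hypothetical and are not taken from the product.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1.

    The range from min(values) to max(values) is cut into n_bins
    intervals of equal width (an assumed scheme, for illustration).
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [18, 22, 35, 41, 58, 63, 70]
print(equal_width_bins(ages, 3))  # prints [0, 0, 0, 1, 2, 2, 2]
```

A tree learner given the bin index instead of the raw number has only a handful of categories to test at each split rather than every distinct numeric threshold, which is one plausible reason discretization can speed up rule generation.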

Techniques: Decision Tree Induction
Platforms: MSWindows
Vendor: Intelligent Systems Research, Chicago
World Wide Web:

Updated 2003/03/31

Torrent    Orchestrate

A scalable, parallel, large volume data management tool, providing a data mining platform.

Techniques: Data Management, Statistics, Neural Networks, Decision Tree Induction
Platforms: Unix, Unix SMP
Vendor: Torrent Systems, Inc, Massachusetts
World Wide Web:

Updated 4 Mar 1998 13:16:30

Partek    Partek

A comprehensive pattern recognition and data visualization system. Direct manipulation visual data mining combined with statistical, neural, and other pattern recognition techniques. Capabilities: Interactive Visualization, Statistical Analysis, Normalization & Transformation, Variable Selection, Model Development & Deployment.

Platforms: Unix (SGI, Sun, HP, Linux), Windows 9x/NT
Vendor: Partek Inc, Missouri
World Wide Web:

Updated 24 Jul 1997 08:37:51

Magnify   PATTERN

Magnify is a provider of data mining services with their own sophisticated tool for data mining very large datasets.

Platforms: Unix
Vendor: Magnify Inc, Chicago
World Wide Web:

Updated 24 Jul 1997 08:37:51

Megaputer    PolyAnalyst

PolyAnalyst is a multi-strategy data mining environment with a selection of exploration engines to predict values of continuous variables, explicitly model complex phenomena, determine the most influential independent variables, and solve classification and clustering tasks. Features include object-oriented design, point-and-click GUI, versatile data manipulation, visualization, and reporting capabilities, minimum of statistics, and a simple interface to various data storage architectures.

Products: PolyAnalyst 4.0 (MSWindows)
Techniques: Genetic Algorithms, Multidimensional Analysis, Stepwise Linear Regression, Statistics, Localization of anomalies, Neural Networks, Associations, Memory-Based Reasoning
Platforms: Win32
Market: Finance, Marketing, Medicine, Telecom, Retailing
Vendor: Megaputer Intelligence, Indiana
Distributor: Megaputer
World Wide Web:

Updated 2000/01/21 08:30:39


A freeware (academic/personal use) rule finding tool for small to medium data mining tasks, with a commercial tool available for larger tasks.

Techniques: Rules
Platforms: Win95, WinNT
Vendor: Quintillion Corporation
World Wide Web:

Updated 24 Jul 1997 10:24:43

Cognos   Scenario

Techniques: Statistics, Segmentation, Rules, Visualisation
Platforms: Win95
Vendor: Cognos, Canada
World Wide Web:

Updated 24 Jul 1997 12:07:37

SPSS    Data Mining

"SPSS offers an integrated family of products and services that makes it easy to share intelligence found via data mining. From simple to sophisticated analysis, SPSS offers a range of products for modeling."

Techniques: CHAID, decision trees, neural networks, data visualization, multidimensional reporting, extensive range of statistics, plus statistical components.
Platforms: Win95, WinNT, UNIX
Vendor: SPSS, Chicago, IL
World Wide Web:
Press: DM Review

Updated 6 Aug 1998 06:41:10


See The Data Mining Desktop Survival Guide


Techniques: Rule Induction
Platforms: Win95, WinNT
Vendor: Azmy Thinkware Inc.
World Wide Web:

Updated 24 Jul 1997 07:46:17

Tandem   Object Relational Data Mining

Tandem provides the platform for running a variety of partner data mining products.

Vendor: Tandem, Silicon Valley
World Wide Web:

Updated 25 Jul 1997 11:04:47

VG Lab   Virtual Predict

See The Data Mining Desktop Survival Guide

Computer Science Innovations   Visualizer Workstation

Techniques: N-dimensional graphics, feature extraction and enhancement, sample, autocluster, descriptive knowledge-based trainer
Platforms: WinNT
Vendor: Computer Science Innovations, Inc, Melbourne, Florida
World Wide Web:

Updated 29 Jan 1998 15:33:11


Techniques: Rules
Platforms: Win95, WinNT
Vendor: WizSoft
World Wide Web:

Updated 24 Jul 1997 09:15:37


XELOPES is an open, platform-independent and data-source-independent library for Embedded Data Mining. It is based on the CWM standard of the OMG and supports other Data Mining standards like PMML, OLE DB for DM, and MLC++. It is available under Java, C++, and C#.

Techniques: Decision Trees, Parallel SQL
Platforms: Win95, WinNT
Vendor: Attar Software, England
World Wide Web:

Updated 2003/02/22 08:36:17

Attar   XpertRule

Techniques: Decision Trees, Parallel SQL
Platforms: Win95, WinNT
Vendor: Attar Software, England
World Wide Web:

Updated 25 Jul 1997 09:53:16

This document is maintained by Graham Williams


Last modified: Mon Jul 31 07:56:43 EST 2006

