rminer - An open-source library that facilitates the use of Machine Learning methods in R
This package has been used in several Machine Learning / Data Mining applications: intensive care medicine, meat and wine quality assessment, civil engineering, forest fire prediction, modeling student performance, time series forecasting, spam e-mail detection and others. Available at the Comprehensive R Archive Network (CRAN). The rminer package tutorial PDF is available here and the respective code is available here.
Springer Book: Modern Optimization with R (R code, data, ...)
Benchmark of AutoML tools: IJCNN 2021 paper results, data & code
Datasets:
Automotive Display Anomaly (anomaly detection in industrial process data, binary classification, available at since 2025).
Cross-source cross-domain sentiment analysis (classification, available at GitHub since 2019).
Twitter-country-geolocation (classification, available at GitHub since 2019).
CS Abstracts Dataset (sequential classification, available at GitHub since 2018).
Online News Popularity (regression/classification, donated to the UCI ML repository in 2015).
Stock Market Lexicon (with more than 20,000 microblog terms associated with positive or negative scores, available at GitHub, created in 2015).
Student Performance (regression/classification, donated to the UCI ML repository in 2014).
Bank Marketing (classification, donated to the UCI ML repository in 2014).
Input importance synthetic datasets (regression and classification, eXplainable Artificial Intelligence - XAI, see this paper: Elsevier or RepositoriUM, created in 2012).
Internet Traffic Time Series Datasets (time series forecasting, see this paper: Wiley or RepositoriUM; also available at tsdl R package - index 643 to 648, donated in 2012).
S-Enron corpus (personalized spam e-mail classification, 5 users, "Date:" field should be used to mix the ham and spam messages, see this paper: Elsevier or RepositoriUM, created in 2011).
Wine Quality (regression/classification, donated to the UCI ML repository in 2009).
Forest Fires (regression, donated to the UCI Machine Learning (ML) repository in 2008).