| Mahout Change Log |
| |
| Release 0.11.0 - unreleased |
| |
| MAHOUT-1744: Deprecate lucene2seq (apalumbo) |
| |
| MAHOUT-1761: Upgraded to Apache parent pom v17 (sslavic) |
| |
| MAHOUT-1745: Purge deprecated ConcatVectorsJob from codebase (apalumbo) |
| |
| MAHOUT-1757: small fix in spca formula (smarthi) |
| |
| MAHOUT-1756: Missing +=: and *=: operators on vectors (smarthi) |
| |
| NOJIRA: Clean up CLI help for spark-rowsimilarity and fix test that intermittently failed (pferrel) |
| |
| MAHOUT-1685: Move Mahout shell to Spark 1.3+ (dlyubimov, apalumbo) |
| |
| MAHOUT-1653: Spark 1.3 (pferrel, apalumbo) |
| |
| MAHOUT-1754: Distance and squared distance matrices routines (dlyubimov) |
| |
| MAHOUT-1753: First and second moment routines (dlyubimov) |
| |
| MAHOUT-1746: mxA ^ 2, mxA ^ 0.5 to mean the same thing as mxA * mxA and mxA ::= sqrt _ (dlyubimov) |
| |
| MAHOUT-1736: Implement allreduceBlock() on H2O (avati) |
| |
| MAHOUT-1752: Implement CbindScalar operator on H2O (avati) |
| |
| MAHOUT-1660: Hadoop1HDFSUtil.readDRMHEader should be taking Hadoop conf (dlyubimov) |
| |
| MAHOUT-1713: Performance and parallelization improvements for AB', A'B, A'A spark physical operators (dlyubimov) |
| |
| MAHOUT-1714: Add MAHOUT_OPTS environment when running Spark shell (dlyubimov) |
| |
| MAHOUT-1715: Closeable API for broadcast tensors (dlyubimov) |
| |
| MAHOUT-1716: Scala logging style (dlyubimov) |
| |
| MAHOUT-1717: allreduceBlock() operator api and Spark implementation (dlyubimov) |
| |
| MAHOUT-1718: Support for conversion of any type-keyed DRM into ordinally-keyed DRM (dlyubimov) |
| |
| MAHOUT-1719: Unary elementwise function operator and function fusions (dlyubimov) |
| |
| MAHOUT-1720: Support 1 cbind X, X cbind 1 etc. for both Matrix and DRM (dlyubimov) |
| |
| MAHOUT-1721: rowSumsMap() summary for non-int-keyed DRMs (dlyubimov) |
| |
| MAHOUT-1722: DRM row sampling api (dlyubimov) |
| |
| MAHOUT-1723: Optional structural "flavor" abstraction for in-core matrices (dlyubimov) |
| |
| MAHOUT-1724: Optimizations of matrix-matrix in-core multiplication based on structural flavors (dlyubimov) |
| |
| MAHOUT-1725: elementwise power operator ^ (dlyubimov) |
| |
| MAHOUT-1726: R-like vector concatenation operator (dlyubimov) |
| |
| MAHOUT-1727: Elementwise analogues of scala.math functions for tensor types (dlyubimov) |
| |
| MAHOUT-1728: In-core functional assignments (dlyubimov) |
| |
| MAHOUT-1729: Straighten out behavior of Matrix.iterator() and iterateNonEmpty() (dlyubimov) |
| |
| MAHOUT-1730: New mutable transposition view for in-core matrices (dlyubimov) |
| |
| MAHOUT-1731: Deprecate SparseColumnMatrix (dlyubimov) |
| |
| MAHOUT-1732: Native support for kryo serialization of tensor types (dlyubimov) |
| |
| Release 0.10.1 - unreleased |
| |
| MAHOUT-1704: Pare down dependency jar for h2o (apalumbo) |
| |
| MAHOUT-1697: Fixed paths to which math-scala and spark modules docs get packaged under in bin distribution archive (sslavic) |
| |
| MAHOUT-1696: QRDecomposition.solve(...) can return incorrect Matrix types (apalumbo) |
| |
| MAHOUT-1690: CLONE - Some vector dumper flags are expecting arguments. (smarthi) |
| |
| MAHOUT-1693: FunctionalMatrixView materializes row vectors in scala shell (apalumbo) |
| |
| MAHOUT-1680: Renamed mahout-distribution to apache-mahout-distribution (sslavic) |
| |
| Release 0.10.0 - 2015-04-11 |
| |
| MAHOUT-1630: Incorrect SparseColumnMatrix.numSlices() causes IndexException in toString() (Oleg Nitz, smarthi) |
| |
| MAHOUT-1665: Update hadoop commands in example scripts (akm) |
| |
| MAHOUT-1676: Deprecate MLP, ConcatenateVectorsJob and ConcatenateVectorsReducer in the codebase (apalumbo) |
| |
| MAHOUT-1622: MultithreadedBatchItemSimilarities outputs incorrect number of similarities (Jesse Daniels, Anand Avati via smarthi) |
| |
| MAHOUT-1605: Make VisualizerTest locale independent (Frank Rosner, Anand Avati via smarthi) |
| |
| MAHOUT-1635: Getting an exception when I provide classification labels manually for Naive Bayes (apalumbo) |
| |
| MAHOUT-1662: Potential Path bug in SequenceFileVaultIterator breaks DisplaySpectralKMeans (Shannon Quinn) |
| |
| MAHOUT-1656: Change SNAPSHOT version from 1.0 to 0.10.0 (smarthi) |
| |
| MAHOUT-1593: cluster-reuters.sh does not work complaining java.lang.IllegalStateException (smarthi via akm) |
| |
| MAHOUT-1661: All Lanczos modules marked as @Deprecated and slated for removal in future releases (Shannon Quinn) |
| |
| MAHOUT-1638: H2O bindings fail at drmParallelizeWithRowLabels(...) (Anand Avati via apalumbo) |
| |
| MAHOUT-1667: Hadoop 1 and 2 profile in POM (sslavic) |
| |
| MAHOUT-1564: Naive Bayes Classifier for New Text Documents (apalumbo) |
| |
| MAHOUT-1524: Script to auto-generate and view the Mahout website on a local machine (Saleem Ansari via apalumbo) |
| |
| MAHOUT-1589: Deprecate mahout.cmd due to lack of support |
| |
| MAHOUT-1655: Refactors mr-legacy into mahout-hdfs and mahout-mr, Spark now depends on much reduced mahout-hdfs |
| |
| MAHOUT-1522: Handle logging levels via log4j.xml (akm) |
| |
| MAHOUT-1602: Euclidean Distance Similarity Math (Leonardo Fernandez Sanchez, smarthi) |
| |
| MAHOUT-1619: HighDFWordsPruner overwrites cache files (Burke Webster, smarthi) |
| |
| MAHOUT-1516: classify-20newsgroups.sh failed: /tmp/mahout-work-jpan/20news-all does not exists in hdfs. (Jian Pan via apalumbo) |
| |
| MAHOUT-1559: Add documentation for and clean up the wikipedia classifier example (apalumbo) |
| |
| MAHOUT-1598: extend seq2sparse to handle multiple text blocks of same document (Wolfgang Buchnere via akm) |
| |
| MAHOUT-1659: Remove deprecated Lanczos solver from spectral clustering in mr-legacy (Shannon Quinn) |
| |
| MAHOUT-1612: NullPointerException happens during JSON output format for clusterdumper (smarthi, Manoj Awasthi) |
| |
| MAHOUT-1652: Java 7 update (smarthi) |
| |
| MAHOUT-1639: Streaming kmeans doesn't properly validate estimatedNumMapClusters -km (smarthi) |
| |
| MAHOUT-1493: Port Naive Bayes to Scala DSL (apalumbo) |
| |
| MAHOUT-1611: Preconditions.checkArgument in org.apache.mahout.utils.ConcatenateVectorsJob (Haishou Ma via smarthi) |
| |
| MAHOUT-1615: SparkEngine drmFromHDFS returning the same Key for all Key,Vec Pairs for Text-Keyed SequenceFiles (Anand Avati, dlyubimov, apalumbo) |
| |
| MAHOUT-1610: Update tests to pass in Java 8 (srowen) |
| |
| MAHOUT-1608: Add option in WikipediaToSequenceFile to remove category labels from documents (apalumbo) |
| |
| MAHOUT-1604: Spark version of rowsimilarity driver and associated additions to SimilarityAnalysis.scala (pferrel) |
| |
| MAHOUT-1500: H2O Integration (Anand Avati via apalumbo) |
| |
| MAHOUT-1606: Add rowSums, rowMeans and diagonal extraction operations to distributed matrices (dlyubimov) |
| |
| MAHOUT-1603: Tweaks for Spark 1.0.x (dlyubimov & pferrel) |
| |
| MAHOUT-1596: implement rbind() operator (Anand Avati and dlyubimov) |
| |
| MAHOUT-1597: A + 1.0 (element-wise scalar operation) gives wrong result if rdd is missing rows, Spark side (dlyubimov) |
| |
| MAHOUT-1595: MatrixVectorView - implement a proper iterateNonZero() (Anand Avati via dlyubimov) |
| |
| MAHOUT-1590: Mahout unit test failures due to guava version conflict on hadoop 2 (Venkat Ranganathan via sslavic) |
| |
| MAHOUT-1529(e): Move dense/sparse matrix test in mapBlock into spark (Anand Avati via dlyubimov) |
| |
| MAHOUT-1583: cbind() operator for Scala DRMs (dlyubimov) |
| |
| MAHOUT-1563: Eliminated warnings about multiple scala versions (sslavic) |
| |
| MAHOUT-1541, MAHOUT-1568, MAHOUT-1569: Created text-delimited file I/O traits and classes on spark, a MahoutDriver for a CLI and an ItemSimilarityDriver using the CLI |
| |
| MAHOUT-1573: More explicit parallelism adjustments in math-scala DRM apis; elements of automatic parallelism management (dlyubimov) |
| |
| MAHOUT-1580: Optimize getNumNonZeroElements() (ssc) |
| |
| MAHOUT-1464: Cooccurrence Analysis on Spark (pat) |
| |
| MAHOUT-1578: Optimizations in matrix serialization (ssc) |
| |
| MAHOUT-1572: blockify() to detect (naively) the data sparsity in the loaded data (dlyubimov) |
| |
| MAHOUT-1571: Functional Views are not serialized as dense/sparse correctly (dlyubimov) |
| |
| MAHOUT-1566: (Experimental) Regular ALS factorizer with conversion tests, optimizer enhancements and bug fixes (dlyubimov) |
| |
| MAHOUT-1537: Minor fixes to spark-shell (Anand Avati via dlyubimov) |
| |
| MAHOUT-1529: Finalize abstraction of distributed logical plans from backend operations (dlyubimov) |
| |
| MAHOUT-1489: Interactive Scala & Spark Bindings Shell & Script processor (dlyubimov) |
| |
| MAHOUT-1346: Spark Bindings (DRM) (dlyubimov) |
| |
| MAHOUT-1555: Exception thrown when a test example has the label not present in training examples (Karol Grzegorczyk via smarthi) |
| |
| MAHOUT-1446: Create an intro for matrix factorization (Jian Wang via ssc) |
| |
| MAHOUT-1480: Clean up website on 20 newsgroups (Andrew Palumbo via ssc) |
| |
| MAHOUT-1561: cluster-syntheticcontrol.sh not running locally with MAHOUT_LOCAL=true (Andrew Palumbo via ssc) |
| |
| MAHOUT-1558: Clean up classify-wiki.sh and add in a binary classification problem (Andrew Palumbo via ssc) |
| |
| MAHOUT-1560: Last batch is not filled correctly in MultithreadedBatchItemSimilarities (Jarosław Bojar) |
| |
| MAHOUT-1554: Provide more comprehensive classification statistics (Karol Grzegorczyk via ssc) |
| |
| MAHOUT-1548: Fix broken links in quickstart webpage (Andrew Palumbo via ssc) |
| |
| MAHOUT-1542: Tutorial for playing with Mahout's Spark shell (ssc) |
| |
| MAHOUT-1533: Remove Frequent Pattern Mining (ssc) |
| |
| MAHOUT-1532: Add solve() function to the Scala DSL (ssc) |
| |
| MAHOUT-1530: Custom prompt and welcome message for the Spark Shell (ssc) |
| |
| MAHOUT-1527: Fix wikipedia classifier example (Andrew Palumbo via ssc) |
| |
| MAHOUT-1526: Ant file in examples (ssc) |
| |
| MAHOUT-1523: Remove @author tags in sparkbindings (ssc) |
| |
| MAHOUT-1521: lucene2seq - Error trying to load data from stored field (when non-indexed) (Terry Blankers via frankscholten) |
| |
| MAHOUT-1520: Fix links in Mahout website documentation (Saleem Ansari via smarthi) |
| |
| MAHOUT-1519: Remove StandardThetaTrainer (Andrew Palumbo via ssc) |
| |
| MAHOUT-1517: Remove casts to int in ALSWRFactorizer (ssc) |
| |
| MAHOUT-1513: Deprecate Canopy Clustering (ssc) |
| |
| MAHOUT-1511: Renaming core to mrlegacy (frankscholten) |
| |
| MAHOUT-1510: Goodbye MapReduce (ssc) |
| |
| MAHOUT-1509: Invalid URL in link from "quick start/basics" page (Nick Martin, smarthi) |
| |
| MAHOUT-1508: Performance problems with sparse matrices (ssc) |
| |
| MAHOUT-1505: structure of clusterdump's JSON output (akm) |
| |
| MAHOUT-1504: Enable/fix thetaSummer job in TrainNaiveBayesJob (Andrew Palumbo, smarthi) |
| |
| MAHOUT-1503: TestNaiveBayesDriver fails in sequential mode (Andrew Palumbo, smarthi) |
| |
| MAHOUT-1502: Update Naive Bayes Webpage to Current Implementation (Andrew Palumbo via ssc) |
| |
| MAHOUT-1501: ClusterOutputPostProcessorDriver has private default constructor (ssc) |
| |
| MAHOUT-1498: DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie (Sergey via ssc) |
| |
| MAHOUT-1497: mahout resplit not producing split files (ssc) |
| |
| MAHOUT-1496: Create a website describing the distributed ALS recommender (Jian Wang via ssc) |
| |
| MAHOUT-1491: Spectral KMeans Clustering doesn't clean its /tmp dir and fails when seeing it again (smarthi) |
| |
| MAHOUT-1488: DisplaySpectralKMeans fails: examples/output/clusteredPoints/part-m-00000 does not exist (Saleem Ansari via smarthi) |
| |
| MAHOUT-1483: Organize links in web site navigation bar (akm) |
| |
| MAHOUT-1482: Rework quickstart website (Jian Wang via ssc) |
| |
| MAHOUT-1476: Cleanup website on Hidden Markov Models (akm) |
| |
| MAHOUT-1475: Cleanup website on Naive Bayes (smarthi) |
| |
| MAHOUT-1472: Cleanup website on fuzzy kmeans (smarthi) |
| |
| MAHOUT-1471: Cleanup website for Canopy clustering (smarthi) |
| |
| MAHOUT-1468: Creating a new page for StreamingKMeans documentation on mahout website (Maxim Arap and Pavan Kumar via akm) |
| |
| MAHOUT-1467: ClusterClassifier readPolicy leaks file handles (Avi Shinnar, smarthi) |
| |
| MAHOUT-1466: Cluster visualization fails to execute (ssc) |
| |
| MAHOUT-1465: Clean up README (akm) |
| |
| MAHOUT-1463: Modify OnlineSummarizers to use the TDigest dependency from Maven Central (tdunning, smarthi) |
| |
| MAHOUT-1460: Remove reference to Dirichlet in ClusterIterator (frankscholten) |
| |
| MAHOUT-1459: Move Hadoop related code out of CanopyClusterer (frankscholten) |
| |
| MAHOUT-1458: Remove KMeansConfigKeys and FuzzyKMeansConfigKeys (frankscholten) |
| |
| MAHOUT-1457: Move EigenSeedGenerator into spectral kmeans package (frankscholten) |
| |
| MAHOUT-1455: Forkcount config causes JVM crashes during build (frankscholten) |
| |
| MAHOUT-1451: Cleaning up the examples for clustering on the website (Gaurav Misra via ssc) |
| |
| MAHOUT-1450: Cleaning up clustering documentation on mahout website (Pavan Kumar) |
| |
| MAHOUT-1449: Update the Known Issues in Random Forests Page (Manoj Awasthi via ssc) |
| |
| MAHOUT-1448: In Random Forest, the training does not support multiple input files. The input dataset must be one single file. (Manoj Awasthi via ssc) |
| |
| MAHOUT-1447: ImplicitFeedbackAlternatingLeastSquaresSolver tests and features (Adam Ilardi via ssc) |
| |
| MAHOUT-1445: Create an intro for item based recommender (Nick Martin via ssc) |
| |
| MAHOUT-1440: Add option to set the RNG seed for initial cluster generation in Kmeans/fKmeans (Andrew Palumbo via ssc) |
| |
| MAHOUT-1438: "quickstart" tutorial for building a simple recommender (Maciej Mazur and Steve Cook via ssc) |
| |
| MAHOUT-1434: Dead links on the web site (Kevin Moulart, smarthi) |
| |
| MAHOUT-1433: Make SVDRecommender look at all unknown items of a user per default (ssc) |
| |
| MAHOUT-1429: Parallelize YtransposeY in ImplicitFeedbackAlternatingLeastSquaresSolver (Adam Ilardi via ssc) |
| |
| MAHOUT-1428: Recommending already consumed items (Dodi Hakim via ssc) |
| |
| MAHOUT-1425: SGD classifier example with bank marketing dataset. (frankscholten) |
| |
| MAHOUT-1420: Add solr-recommender to examples (Pat Ferrel via akm) |
| |
| MAHOUT-1419: Random decision forest is excessively slow on numeric features (srowen) |
| |
| MAHOUT-1417: Random decision forest implementation fails in Hadoop 2 (srowen) |
| |
| MAHOUT-1416: Make access of DecisionForest.read(dataInput) less restricted (Manoj Awasthi via smarthi) |
| |
| MAHOUT-1415: Clone method on sparse matrices fails if there is an empty row which has not been set explicitly (till.rohrmann via ssc) |
| |
| MAHOUT-1413: Rework Algorithms page (ssc) |
| |
| MAHOUT-1388: Add command line support and logging for MLP (Yexi Jiang via ssc) |
| |
| MAHOUT-1385: Caching Encoders don't cache (Johannes Schulte, Manoj Awasthi via ssc) |
| |
| MAHOUT-1356: Ensure unit tests fail fast when writing outside mvn target directory (isabel, smarthi, dweiss, frankscholten, akm) |
| |
| MAHOUT-1329: Mahout for hadoop 2 (gcapan, Sergey Svinarchuk) |
| |
| MAHOUT-1310: Mahout support for Windows (Sergey Svinarchuk via ssc) |
| |
| MAHOUT-1278: Upgraded to apache parent pom version 16 (sslavic) |
| |
| Release 0.9 - 2014-02-01 |
| |
| MAHOUT-1387: Create page for release notes (ssc) |
| |
| MAHOUT-1411: Random test failures from TDigestTest (smarthi) |
| |
| MAHOUT-1410: clusteredPoints do not contain a vector id (smarthi, Andrew Musselman) |
| |
| MAHOUT-1409: MatrixVectorView has index check error (tdunning) |
| |
| MAHOUT-1402: Zero clusters using streaming k-means option in cluster-reuters.sh (smarthi) |
| |
| MAHOUT-1401: Resurrect Frequent Pattern mining (smarthi) |
| |
| MAHOUT-1400: Remove references to deprecated and removed algorithms from examples scripts (ssc) |
| |
| MAHOUT-1399: Fixed multiple slf4j bindings when running Mahout examples issue (sslavic) |
| |
| MAHOUT-1398: FileDataModel should provide a constructor with a delimiterPattern (Roy Guo via ssc) |
| |
| MAHOUT-1396: Accidental use of commons-math won't work with next Hadoop 2 release (srowen) |
| |
| MAHOUT-1394: Undeprecate Lanczos (ssc) |
| |
| MAHOUT-1393: Remove duplicated code from getTopTerms and getTopFeatures in AbstractClusterWriter (Diego Carrion via smarthi) |
| |
| MAHOUT-1392: Streaming KMeans should write centroid output to a 'part-r-xxxx' file when executed in sequential mode (smarthi) |
| |
| MAHOUT-1390: SVD hangs for certain inputs (tdunning) |
| |
| MAHOUT-1389: Complementary Naive Bayes Classifier not getting called when "-c" option is activated (Gouri Shankar Majumdar via smarthi) |
| |
| MAHOUT-1384: Executing the MR version of Naive Bayes/CNB of classify_20newgroups.sh fails in seqdirectory step (smarthi) |
| |
| MAHOUT-1382: Upgrade Mahout third party jars for 0.9 Release (smarthi) |
| |
| MAHOUT-1380: Streaming KMeans fails when executed in Sequential Mode (smarthi) |
| |
| MAHOUT-1379: ClusterQualitySummarizer fails with the new T-Digest for clusters with 1 data point (smarthi) |
| |
| MAHOUT-1378: Running Random Forest with Ignored features fails when loading feature descriptor from JSON file (Sam Wu via smarthi) |
| |
| MAHOUT-1377: Exclude JUnit.jar from tarball (Sergey Svinarchuk via smarthi) |
| |
| MAHOUT-1374: Ability to provide input file with userid, itemid pair (Aliaksei Litouka via ssc) |
| |
| MAHOUT-1371: Arff loader can misinterpret nominals with integer, real or string (Mansur Iqbal via smarthi) |
| |
| MAHOUT-1370: Vectordump doesn't write to output file in MapReduce Mode (smarthi) |
| |
| MAHOUT-1368: Convert OnlineSummarizer to use the new TDigest (tdunning) |
| |
| MAHOUT-1367: WikipediaXmlSplitter --> Exception in thread "main" java.lang.NullPointerException (smarthi) |
| |
| MAHOUT-1364: Upgrade Mahout codebase to Lucene 4.6 (Frank Scholten) |
| |
| MAHOUT-1363: Rebase packages in mahout-scala (dlyubimov) |
| |
| MAHOUT-1362: Remove examples/bin/build-reuters.sh (smarthi) |
| |
| MAHOUT-1361: Online algorithm for computing accurate Quantiles using 1-D clustering (tdunning) |
| |
| MAHOUT-1358: StreamingKMeansThread throws IllegalArgumentException when REDUCE_STREAMING_KMEANS is set to true (smarthi) |
| |
| MAHOUT-1355: InteractionValueEncoder produces wrong traceDictionary entries (Johannes Schulte via smarthi) |
| |
| MAHOUT-1353: Visibility of preparePreferenceMatrix directory location (Pat Ferrel, ssc) |
| |
| MAHOUT-1352: Option to change RecommenderJob output format (Pat Ferrel, ssc) |
| |
| MAHOUT-1351: Adding DenseVector support to AbstractCluster (David DeBarr via smarthi) |
| |
| MAHOUT-1349: Clusterdumper/loadTermDictionary crashes when highest index in (sparse) dictionary vector is larger than dictionary vector size (Andrew Musselman via smarthi) |
| |
| MAHOUT-1347: Add Streaming K-Means clustering algorithm to examples/bin/cluster-reuters.sh (smarthi) |
| |
| MAHOUT-1345: Enable randomised testing for all Mahout modules (Dawid Weiss, Isabel, sslavic, Frank Scholten, smarthi) |
| |
| MAHOUT-1343: JSON output format support in cluster dumper (Telvis Calhoun via sslavic) |
| |
| MAHOUT-1333: Fixed examples bin directory permissions in distribution archives (Mike Percy via sslavic) |
| |
| MAHOUT-1319: seqdirectory -filter argument silently ignored when run as MR (smarthi) |
| |
| MAHOUT-1317: Clarify some of the messages in Preconditions.checkArgument (Nikolai Grinko, smarthi) |
| |
| MAHOUT-1314: StreamingKMeansReducer throws NullPointerException when REDUCE_STREAMING_KMEANS is set to true (smarthi) |
| |
| MAHOUT-1313: Fixed unwanted integral division bug in RowSimilarityJob downsampling code where precision should have been retained (sslavic) |
| |
| MAHOUT-1312: LocalitySensitiveHashSearch does not limit search results (sslavic) |
| |
| MAHOUT-1308: Cannot extend CandidateItemsStrategy due to restricted visibility (David Geiger, smarthi) |
| |
| MAHOUT-1301: toString() method of SequentialAccessSparseVector has excess comma at the end (Alexander Senov, smarthi) |
| |
| MAHOUT-1297: New module for linear algebra scala DSL (dlyubimov) |
| |
| MAHOUT-1296: Remove deprecated algorithms (ssc) |
| |
| MAHOUT-1295: Excluded all Maven's target directories from distribution archives (sslavic) |
| |
| MAHOUT-1294: Cleanup previously installed artifacts from CI server local repository (sslavic) |
| |
| MAHOUT-1293: Source distribution tar.gz archive cannot be unpacked on Linux (sslavic) |
| |
| MAHOUT-1292: lucene2seq should validate the 'id' field (Frank Scholten via smarthi) |
| |
| MAHOUT-1291: MahoutDriver yields cosmetically suboptimal exception when bin/mahout runs without args, on some Hadoop versions (srowen) |
| |
| MAHOUT-1290: Issue when running Mahout Recommender Demo (Helder Garay Martins via smarthi) |
| |
| MAHOUT-1289: Move downsampling code into RowSimilarityJob (ssc) |
| |
| MAHOUT-1287: classifier.sgd.CsvRecordFactory incorrectly parses CSV format (Alex Franchuk via smarthi) |
| |
| MAHOUT-1285: Arff loader can misparse string data as double (smarthi) |
| |
| MAHOUT-1284: DummyRecordWriter's bug with reused Writables (Maysam Yabandeh via smarthi) |
| |
| MAHOUT-1275: Dropped bz2 distribution format for source and binaries (sslavic) |
| |
| MAHOUT-1265: Multilayer Perceptron (Yexi Jiang via smarthi) |
| |
| MAHOUT-1261: TasteHadoopUtils.idToIndex can return an int that has size Integer.MAX_VALUE (Carl Clark, smarthi) |
| |
| MAHOUT-1242: No key redistribution function for associative maps (Tharindu Rusira via smarthi) |
| |
| MAHOUT-1030: Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable (Andrew Musselman, Pat Ferrel, Jeff Eastman, Lars Norskog, smarthi) |
| |
| Release 0.8 - 2013-07-25 |
| |
| MAHOUT-1272: Parallel SGD matrix factorizer for SVDrecommender (Peng Cheng via ssc) |
| |
| MAHOUT-1271: classify-20newsgroups.sh fails during the seqdirectory step (smarthi) |
| |
| MAHOUT-1269: Cleanup deprecated Lucene 3.x API calls in lucene2seq utility unit tests (smarthi) |
| |
| MAHOUT-833: Make conversion to sequence files map-reduce (Josh Patterson, smarthi) |
| |
| MAHOUT-1268: Wrong output directory for CVB (Mark Wicks via ssc) |
| |
| MAHOUT-1264: Performance optimizations in RecommenderJob (ssc) |
| |
| MAHOUT-1262: Cleanup LDA code (ssc) |
| |
| MAHOUT-1255: Fix for weights in Multinomial sometimes overflowing in BallKMeans (dfilimon) |
| |
| MAHOUT-1254: Final round of cleanup for StreamingKMeans (dfilimon) |
| |
| MAHOUT-1263: Serialise/Deserialise Lambda value for OnlineLogisticRegression (Mike Davy via smarthi) |
| |
| MAHOUT-1258: Another shot at findbugs and checkstyle (ssc) |
| |
| MAHOUT-1253: Add experiment tools for StreamingKMeans, part 1 (dfilimon) |
| |
| MAHOUT-884: Matrix Concatenate Utility (Lance Norskog via smarthi) |
| |
| MAHOUT-1250: Deprecate unused algorithms (ssc) |
| |
| MAHOUT-1251: Optimize MinHashMapper (ssc) |
| |
| MAHOUT-1211: Disabled swallowing of IOExceptions in Closeables.close for writers (dfilimon) |
| |
| MAHOUT-1164: Make ARFF integration generate meta-data in JSON format (Marty Kube via ssc) |
| |
| MAHOUT-1163: Make random forest classifier meta-data file human readable (Marty Kube via ssc) |
| |
| MAHOUT-1243: Dictionary file format in Lucene-Mahout integration is not in SequenceFileFormat (ssc) |
| |
| MAHOUT-974: org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId (ssc) |
| |
| MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension of vector to hash (indexes or values) (Elena Smirnova via smarthi) |
| |
| MAHOUT-1237: Total cluster cost isn't computed properly (dfilimon) |
| |
| MAHOUT-1196: LogisticModelParameters uses csv.getTargetCategories() even if csv is not used. (Vineet Krishnan via ssc) |
| |
| MAHOUT-1224: Add the option of running a StreamingKMeans pass in the Reducer before BallKMeans (dfilimon) |
| |
| MAHOUT-993: Some vector dumper flags are expecting arguments. (Andrew Look via robinanil) |
| |
| MAHOUT-1228: Cleanup .gitignore (Stevo Slavic via ssc) |
| |
| MAHOUT-1047: CVB hangs after completion (Angel Martinez Gonzalez via smarthi) |
| |
| MAHOUT-1235: ParallelALSFactorizationJob does not use VectorSumCombiner (ssc) |
| |
| MAHOUT-1230: SparseMatrix.clone() is not deep copy (Maysam Yabandeh via tdunning) |
| |
| MAHOUT-1232: VectorHelper.topEntries() throws a NPE when number of NonZero elements in vector < maxEntries (smarthi) |
| |
| MAHOUT-1229: Conf directory content from Mahout distribution archives cannot be unpacked (Stevo Slavic via smarthi) |
| |
| MAHOUT-1213: SSVD job doesn't clean its temp dir, and fails when seeing it again (smarthi) |
| |
| MAHOUT-1223: Fixed point skipped in StreamingKMeans when iterating through centroids from a reducer (dfilimon) |
| |
| MAHOUT-1222: Fix total weight in FastProjectionSearch (dfilimon) |
| |
| MAHOUT-1219: Remove LSHSearcher from StreamingKMeansTest. It causes it to sometimes fail (dfilimon) |
| |
| MAHOUT-1221: SparseMatrix.viewRow is sometimes readonly. (Maysam Yabandeh via smarthi) |
| |
| MAHOUT-1219: Remove LSHSearcher from SearchQualityTest. It causes it to fail, but the failure is not very meaningful (dfilimon) |
| |
| MAHOUT-1217: Nearest neighbor searchers sometimes fail to remove points: fix in FastProjectionSearch's searchFirst (dfilimon) |
| |
| MAHOUT-1216: Add locality sensitive hashing and a LocalitySensitiveHash searcher (dfilimon) |
| |
| MAHOUT-1181: Adding StreamingKMeans MapReduce classes (dfilimon) |
| |
| MAHOUT-1212: Incorrect classify-20newsgroups.sh file description (Julian Ortega via smarthi) |
| |
| MAHOUT-1209: DRY out maven-compiler-plugin configuration (Stevo Slavic via smarthi) |
| |
| MAHOUT-1207: Fix typos in description in parent pom (Stevo Slavic via smarthi) |
| |
| MAHOUT-1199: Improve javadoc comments of mahout-integration (Angel Martinez Gonzalez via smarthi) |
| |
| MAHOUT-1162: Adding BallKMeans and StreamingKMeans clustering algorithms (dfilimon) |
| |
| MAHOUT-1205: ParallelALSFactorizationJob should leverage the distributed cache (ssc) |
| |
| MAHOUT-1156: Adding nearest neighbor Searchers (dfilimon) |
| |
| MAHOUT-1202: Speed up Vector operations (dfilimon) |
| |
| MAHOUT-1155: Make MatrixSlice a Vector (and fix Centroid cloning; MAHOUT-1202) (dfilimon) |
| |
| MAHOUT-1189: CosineDistanceMeasure doesn't return 0 for two 0 vectors (dfilimon) |
| |
| MAHOUT-1180: Multinomial<T> throws ConcurrentModificationException when iterating and setting probabilities (dfilimon) |
| |
| MAHOUT-1192: Speed up Vector Operations (robinanil) |
| |
| MAHOUT-1191: Clean up Vector Benchmarks to make them less variable (robinanil) |
| |
| MAHOUT-1190: SequentialAccessSparseVector function assignment is very slow and other iterator woes (robinanil) |
| |
| MAHOUT-1188: Inconsistent reference to Lucene versions in code and POM (smarthi) |
| |
| MAHOUT-1161: Unable to run CJKAnalyzer for conversion of a sequence file to sparse vector due to instantiation exception (ssc) |
| |
| MAHOUT-1187: Update Commons Lang to Commons Lang3 (smarthi) |
| |
| MAHOUT-1184: Another take at pmd, findbugs and checkstyle (ssc) |
| |
| MAHOUT-1182: Remove useless append (Dave Brosius via tdunning) |
| |
| MAHOUT-1176: Introduce a changelog file to raise contributors attribution (ssc) |
| |
| MAHOUT-1108: Allows cluster-reuters.sh example to be executed on a cluster (elmer.garduno via gsingers) |
| |
| MAHOUT-961: Fix issue in decision forest tree visualizer to properly show stems of tree (Ikumasa Mukai via gsingers) |
| |
| MAHOUT-944: Create SequenceFiles out of Lucene document storage (no term vectors required) (Frank Scholten, gsingers) |
| |
| MAHOUT-958: Fix issue with globs in RepresentativePointsDriver (Adam Baron, Vikram Dixit K, ehgjr via gsingers) |
| |
| MAHOUT-1084: Fixed issue with too many clusters in synthetic control example (liutengfei, gsingers) |
| |
| MAHOUT-1103: Fixed issue with splitting clusters on Hadoop (Matt Molek, gsingers) |
| |
| MAHOUT-1126: Filter out bad META-INF files in job packaging (Pat Ferrel, gsingers) |
| |
| MAHOUT-1211: Change deprecated Closeables.closeQuietly calls (smarthi, gsingers, srowen, dlyubimov) |