| |
| Apache Solr Release Notes |
| |
| Introduction |
| ------------ |
| Solr is the popular, blazing fast open source enterprise search platform from |
| the Apache Lucene project. Its major features include powerful full-text |
| search, hit highlighting, faceted search, dynamic clustering, database |
| integration, and rich document (e.g., Word, PDF) handling. Solr is highly |
| scalable, providing distributed search and index replication, and it powers the |
| search and navigation features of many of the world's largest internet sites. |
| |
| Solr is written in Java and runs as a standalone full-text search server within |
| a servlet container such as Tomcat. Solr uses the Lucene Java search library at |
| its core for full-text indexing and search, and has REST-like HTTP/XML and JSON |
| APIs that make it easy to use from virtually any programming language. Solr's |
| powerful external configuration allows it to be tailored to almost any type of |
| application without Java coding, and it has an extensive plugin architecture |
| when more advanced customization is required. |
| |
| See README.txt and http://lucene.apache.org/solr for more information |
| on how to get started. |
| |
| ================== 3.5.0 ================== |
| |
| New Features |
| ---------------------- |
| * SOLR-2749: Add boundary scanners for FastVectorHighlighter. A <boundaryScanner/> |
| can be given a name in solrconfig.xml and then selected per request with the |
| hl.boundaryScanner=name parameter. (koji) |
| |
| * SOLR-2066,SOLR-2776: Added support for distributed grouping. |
| (Martijn van Groningen, Jasper van Veghel, Matt Beaumont) |
| |
| * SOLR-2769: Added factory for the new Hunspell stemmer capable of doing stemming |
| for 99 languages (janhoy, cmale) |
| |
| Optimizations |
| ---------------------- |
| |
| * SOLR-2742: SolrJ: Provide commitWithinMs as optional parameter for all add() methods, |
| making the feature more conveniently accessible for developers (janhoy) |
| |
| Bug Fixes |
| ---------------------- |
| |
| * SOLR-2762: (backport from 4.x line): FSTLookup could return duplicate |
| results or one result fewer than requested. (David Smiley, Dawid Weiss) |
| |
| * SOLR-2748: The CommitTracker used for commitWithin or autoCommit by maxTime |
| could commit too frequently and could block adds until a new searcher was |
| registered. (yonik) |
| |
| * SOLR-2726: Fixed NullPointerException when using spellcheck.q with Suggester. |
| (Bernd Fehling, valentin via rmuir) |
| |
| * SOLR-2772: Fixed Date parsing/formatting of years 0001-1000 (hossman) |
| |
| * SOLR-2763: Extracting update request handler throws exception and returns 400 |
| when zero-length file posted using multipart form post (janhoy) |
| |
| * SOLR-2780: Fixed issue where multi select facets didn't respect group.truncate parameter. |
| (Martijn van Groningen, Ramzi Alqrainy) |
| |
| Other Changes |
| ---------------------- |
| |
| * SOLR-2750: Make both "update.chain" and the deprecated "update.param" work |
| consistently everywhere; see also SOLR-2105. (Mark Miller, janhoy) |
| |
| * LUCENE-3410: Deprecated the WordDelimiterFilter constructors accepting multiple |
| ints masquerading as booleans. Preferred constructor now accepts a single int |
| bitfield (Chris Male) |
| |
| * SOLR-2758: Moved ConcurrentLRUCache from o.a.s.common.util package in the solrj |
| module to the o.a.s.util package in the Solr core module. |
| (David Smiley via Steve Rowe) |
| |
| * SOLR-2756: Maven configuration: Removed unused zookeeper dependency; |
| conditionalized geronimo-stax-api dependency to be in force only under |
| Java 1.5; excluded transitive stax:stax-api dependency from |
| org.codehaus.woodstox:wstx-asl dependency. (David Smiley, Steve Rowe) |
| |
| * SOLR-2770: 'ant generate-maven-artifacts' should generate the maven |
| artifact for the Solr-specific jdk15-compiled carrot2-core dependency. |
| (Steve Rowe) |
| |
| * SOLR-2766: Package individual javadoc sites for solrj and test-framework. |
| (Steve Rowe, Mike McCandless) |
| |
| * SOLR-2771: Solr modules' tests should not depend on solr-core test classes; |
| move BufferingRequestProcessor from solr-core tests to test-framework so that |
| the Solr Cell module can use it. (janhoy, Steve Rowe) |
| |
| ================== 3.4.0 ================== |
| |
| Upgrading from Solr 3.3 |
| ---------------------- |
| |
| * The Lucene index format has changed and as a result, once you upgrade, |
| previous versions of Solr will no longer be able to read your indices. |
| In a master/slave configuration, all searchers/slaves should be upgraded |
| before the master. If the master were to be updated first, the older |
| searchers would not be able to read the new index format. |
| |
| * Previous versions of Solr silently allow and ignore some contradictory |
| properties specified in schema.xml. For example: |
| - indexed="false" omitNorms="false" |
| - indexed="false" omitTermFreqAndPositions="false" |
| Field property validation has now been fixed so that contradictions |
| like these generate an error on startup. If an existing schema |
| produces one of these new "conflicting 'false' field options for |
| non-indexed field" error messages, the conflicting "omit*" properties |
| can safely be removed, or changed to "true", for behavior consistent |
| with previous Solr versions. See SOLR-2669. |
| |
| * FacetComponent no longer catches and embeds exceptions that occur during facet |
| processing; it throws HTTP 400 or 500 exceptions instead. |
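| |
| A minimal sketch of how an existing schema.xml could be checked for the |
| contradictory field properties before upgrading. The helper name |
| find_conflicting_fields and the sample schema fragment are hypothetical and |
| are not part of Solr; only the indexed="false" / omit*="false" combinations |
| come from the upgrade note for SOLR-2669. |
| |
```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment used only for illustration.
SAMPLE_SCHEMA = """
<schema name="example" version="1.4">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="raw" type="string" indexed="false" stored="true" omitNorms="false"/>
    <field name="blob" type="text" indexed="false" stored="true"
           omitTermFreqAndPositions="false"/>
    <field name="ok" type="text" indexed="false" stored="true" omitNorms="true"/>
  </fields>
</schema>
"""

# Element names inspected by this sketch (an assumption, not Solr's own list).
CHECKED_TAGS = ("field", "dynamicField", "fieldType")


def find_conflicting_fields(schema_xml):
    """Return (name, [omit* attributes set to "false"]) for non-indexed entries."""
    root = ET.fromstring(schema_xml.strip())
    conflicts = []
    for elem in root.iter():
        if elem.tag not in CHECKED_TAGS or elem.get("indexed") != "false":
            continue
        # Collect every omit* property that contradicts indexed="false".
        bad = sorted(k for k, v in elem.attrib.items()
                     if k.startswith("omit") and v == "false")
        if bad:
            conflicts.append((elem.get("name"), bad))
    return conflicts


if __name__ == "__main__":
    for name, attrs in find_conflicting_fields(SAMPLE_SCHEMA):
        print(name, attrs)
```
| |
| Each reported attribute can then be removed or set to "true", as the note |
| above recommends. |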
| |
| New Features |
| ---------------------- |
| |
| * SOLR-2540: CommitWithin as an Update Request parameter |
| You can now specify &commitWithin=N (ms) on the update request (janhoy) |
| |
| * SOLR-2383: /browse improvements: generalize range and date facet display |
| Backported from trunk. Removed facet.date support in Velocity, since this is |
| deprecated in favor of facet.range (gsingers, janhoy) |
| |
| * SOLR-2458: post.jar enhanced to handle JSON, CSV and <optimize> (janhoy) |
| |
| * LUCENE-3234: Add a new parameter hl.phraseLimit to speed up FastVectorHighlighter. |
| (Mike Sokolov via koji) |
| |
| * SOLR-2429: Ability to add cache=false to queries and query filters to avoid |
| using the filterCache or queryCache. A cost may also be specified and is used |
| to order the evaluation of non-cached filters from least to greatest cost. |
| For very expensive query filters (cost >= 100) if the query implements |
| the PostFilter interface, it will be used to obtain a Collector that is |
| checked only for documents that match the main query and all other filters. |
| The "frange" query now implements the PostFilter interface. (yonik) |
| |
| * SOLR-2630: Added new XsltUpdateRequestHandler that works like |
| XmlUpdateRequestHandler but allows the POSTed XML document to be transformed |
| using XSLT. This makes it possible to POST arbitrary XML documents to the update |
| handler, as long as you also provide an XSL to transform them to a valid |
| Solr input document. (Upayavira, Uwe Schindler) |
| |
| * SOLR-2615: Log individual updates (adds and deletes) at the FINE level |
| before adding to the index. Fix a null pointer exception in logging |
| when there was no unique key. (David Smiley via yonik) |
| |
| * LUCENE-2048: Added omitPositions to the schema, so you can omit position |
| information while still indexing term frequencies. (rmuir) |
| |
| * SOLR-2584: add UniqFieldsUpdateProcessor that removes duplicate values in the |
| specified fields. (Elmer Garduno, koji) |
| |
| * SOLR-2670: Added NIOFSDirectoryFactory (yonik) |
| |
| * SOLR-2523: Added support in SolrJ to easily interact with range facets. |
| The range facet response can be parsed and is retrievable from the |
| QueryResponse class. The SolrQuery class has convenient methods for using |
| range facets. (Martijn van Groningen) |
| |
| * SOLR-2637: Added support for group result parsing in SolrJ. |
| (Tao Cheng, Martijn van Groningen) |
| |
| * SOLR-2665: Added post group faceting. Facet counts are based on the most |
| relevant document of each group matching the query. This feature has the |
| same impact on the StatsComponent. (Martijn van Groningen) |
| |
| * SOLR-2675: CoreAdminHandler now allows arbitrary properties to be |
| specified when CREATEing a new SolrCore using property.* request |
| params. (Yury Kats, hossman) |
| |
| * SOLR-2714: JSON update format - "null" field values are now dropped |
| instead of causing an exception. (Trygve Laugstøl, yonik) |
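| |
| The filter handling described for SOLR-2429 can be modelled roughly as |
| follows. This is a simplified sketch of the behavior stated in that note, |
| not Solr's actual code: the function plan_filters and the dict keys are |
| hypothetical, and the filter labels stand in for real queries. |
| |
```python
POST_FILTER_MIN_COST = 100  # threshold quoted in the SOLR-2429 note


def plan_filters(filters):
    """Split filters into cached, ordered non-cached, and post filters.

    Each filter is a dict with the (hypothetical) keys:
      "q"           label of the filter query
      "cache"       False when the request said cache=false
      "cost"        integer cost given on the request
      "post_filter" True when the query implements PostFilter
    """
    cached, non_cached, post = [], [], []
    for f in filters:
        if f["cache"]:
            cached.append(f)
        elif f["cost"] >= POST_FILTER_MIN_COST and f["post_filter"]:
            # Checked only for documents that already match the main
            # query and all other filters.
            post.append(f)
        else:
            non_cached.append(f)
    # Non-cached filters are evaluated from least to greatest cost.
    non_cached.sort(key=lambda f: f["cost"])
    # Ordering among post filters is not described in the note,
    # so this sketch keeps them in request order.
    return cached, non_cached, post


if __name__ == "__main__":
    example = [
        {"q": "cached-filter", "cache": True, "cost": 0, "post_filter": False},
        {"q": "medium", "cache": False, "cost": 50, "post_filter": False},
        {"q": "cheap", "cache": False, "cost": 10, "post_filter": False},
        {"q": "expensive-range", "cache": False, "cost": 150, "post_filter": True},
    ]
    for group in plan_filters(example):
        print([f["q"] for f in group])
```
| |
| Note that a high cost alone is not enough: a filter with cost >= 100 whose |
| query does not implement PostFilter is still treated as an ordinary |
| non-cached filter in this model. |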
| |
| |
| Optimizations |
| ---------------------- |
| |
| * LUCENE-3233: Improved memory usage, build time, and performance of |
| SynonymFilterFactory. (Mike McCandless, Robert Muir) |
| |
| Bug Fixes |
| ---------------------- |
| |
| * SOLR-2625: TermVectorComponent throws NPE if TF-IDF option is used without DF |
| option. (Daniel Erenrich, Simon Willnauer) |
| |
| * SOLR-2631: PingRequestHandler should not be allowed to ping itself using the "qt" |
| param, to prevent an infinite loop. (Edoardo Tosca, Uwe Schindler) |
| |
| * SOLR-2636: Fix explain functionality for negative queries. (Tom Hill via yonik) |
| |
| * SOLR-2538: Range Faceting on long/double fields could overflow if values |
| bigger than the max int/float were used. |
| (Erbi Hanka, hossman) |
| |
| * SOLR-2230: CommonsHttpSolrServer.addFile could not be used to send |
| multiple files in a single request. |
| (Stephan Günther, hossman) |
| |
| * SOLR-2541: PluginInfos was not correctly parsing <long/> tags when |
| initializing plugins |
| (Frank Wesemann, hossman) |
| |
| * SOLR-2623: Solr JMX MBeans do not survive core reloads (Alexey Serba, shalin) |
| |
| * Fixed grouping bug where zero documents were returned, even if there were documents to display, |
| when start is bigger than rows and format is simple. (Martijn van Groningen, Nikhil Chhaochharia) |
| |
| * SOLR-2564: Fixed ArrayIndexOutOfBoundsException when using simple format and |
| start > 0 (Martijn van Groningen, Matteo Melli) |
| |
| * SOLR-2642: Fixed sorting by function when using grouping. (Thomas Heigl, Martijn van Groningen) |
| |
| * SOLR-2535: REGRESSION: in Solr 3.x and trunk the admin/file handler |
| fails to show directory listings (David Smiley, Peter Wolanin via Erick Erickson) |
| |
| * SOLR-2545: ExternalFileField file parsing would fail if any key |
| contained an "=" character. It now only looks for the last "=" delimiter |
| prior to the float value. |
| (Markus Jelsma, hossman) |
| |
| * SOLR-2662: When Solr is configured to have no queryResultCache, the |
| "start" parameter was not honored and the documents returned were |
| 0 through start+offset. (Markus Jelsma, yonik) |
| |
| * SOLR-2669: Fix backwards validation of field properties in |
| SchemaField.calcProps (hossman) |
| |
| * SOLR-2676: Add "welcome-file-list" to solr.war so admin UI works correctly |
| in servlet containers such as WebSphere that do not use a default list |
| (Jay R. Jaeger, hossman) |
| |
| * SOLR-2682: Remove addException() in SimpleFacet. FacetComponent no longer catches and embeds |
| exceptions that occur during facet processing; it throws HTTP 400 or 500 exceptions instead. (koji) |
| |
| * SOLR-2606: Fixed sort parsing of fields containing punctuation that |
| failed due to sort by function changes introduced in SOLR-1297 |
| (Mitsu Hadeishi, hossman) |
| |
| * SOLR-2734: Fix debug info for MoreLikeThisHandler (introduced when SOLR-860 was backported to 3x). |
| (Andrés Cobas, hossman via koji) |
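| |
| The SOLR-2545 fix (split on the last "=" before the float value) can be |
| illustrated with a short sketch. The function parse_external_line is a |
| hypothetical stand-in written for this example; it is not the parser used |
| by ExternalFileField. |
| |
```python
def parse_external_line(line):
    """Split a 'key=value' line on the LAST '=' so keys may contain '='."""
    key, sep, value = line.rstrip("\n").rpartition("=")
    if not sep:
        raise ValueError("no '=' delimiter in line: %r" % line)
    # Everything after the last '=' must be the float value.
    return key, float(value)


if __name__ == "__main__":
    for raw in ("doc1=0.25", "a=b=1.5"):
        print(parse_external_line(raw))
```
| |
| Splitting on the first "=" instead would turn "a=b=1.5" into the key "a" |
| and the unparsable value "b=1.5", which matches the failure described in |
| the entry above. |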
| |
| Other Changes |
| ---------------------- |
| |
| * SOLR-2629: Eliminate deprecation warnings in some JSPs. |
| (Bernd Fehling, hossman) |
| |
| Build |
| ---------------------- |
| |
| * SOLR-2452,SOLR-2653,LUCENE-3323,SOLR-2659,LUCENE-3329,SOLR-2666: |
| Rewrote the Solr build system: |
| - Integrated more fully with the Lucene build system: generalized the |
| Lucene build system and eliminated duplication. |
| - Converted all Solr contribs to the Lucene/Solr conventional src/ layout: |
| java/, resources/, test/, and test-files/<contrib-name>. |
| - Created a new Solr-internal module named "core" by moving the java/, |
| test/, and test-files/ directories from solr/src/ to solr/core/src/. |
| - Merged solr/src/webapp/src/ into solr/core/src/java/. |
| - Eliminated solr/src/ by moving all its directories up one level; |
| renamed solr/src/site/ to solr/site-src/ because solr/site/ already |
| exists. |
| - Merged solr/src/common/ into solr/solrj/src/java/. |
| - Moved o.a.s.client.solrj.* and o.a.s.common.* tests from |
| solr/src/test/ to solr/solrj/src/test/. |
| - Made the solrj tests not depend on the solr core tests by moving |
| some classes from solr/src/test/ to solr/test-framework/src/java/. |
| - Each internal module (core/, solrj/, test-framework/, and webapp/) |
| now has its own build.xml, from which it is possible to run |
| module-specific targets. solr/build.xml delegates all build |
| tasks (via <ant dir="internal-module-dir"> calls) to these |
| modules' build.xml files. |
| (Steve Rowe, Robert Muir) |
| |
| * LUCENE-3406: Add ant target 'package-local-src-tgz' to Lucene and Solr |
| to package sources from the local working copy. |
| (Seung-Yeoul Yang via Steve Rowe) |
| |
| Documentation |
| ---------------------- |
| |
| ================== 3.3.0 ================== |
| |
| Upgrading from Solr 3.2.0 |
| ---------------------- |
| * SolrCore's CloseHook API has been changed in a backward-incompatible way. It |
| has been changed from an interface to an abstract class. Any custom |
| components which use the SolrCore.addCloseHook method will need to |
| be modified accordingly. To migrate, put your old CloseHook#close impl into |
| CloseHook#preClose. |
| |
| New Features |
| ---------------------- |
| |
| * SOLR-2378: A new, automaton-based, implementation of suggest (autocomplete) |
| component, offering an order of magnitude smaller memory consumption |
| compared to ternary trees and jaspell and very fast lookups at runtime. |
| (Dawid Weiss) |
| |
| * SOLR-2400: Field- and DocumentAnalysisRequestHandler now provide a position |
| history for each token, so you can follow the token through all analysis stages. |
| The output contains a separate int[] attribute containing all positions from |
| previous Tokenizers/TokenFilters (called "positionHistory"). |
| (Uwe Schindler) |
| |
| * SOLR-2524: (SOLR-236, SOLR-237, SOLR-1773, SOLR-1311) Grouping / Field collapsing |
| using the Lucene grouping contrib. The search result can be grouped by field and query. |
| (Martijn van Groningen, Emmanuel Keller, Shalin Shekhar Mangar, Koji Sekiguchi, |
| Iván de Prado, Ryan McKinley, Marc Sturlese, Peter Karich, Bojan Smid, |
| Charles Hornberger, Dieter Grad, Dmitry Lihachev, Doug Steigerwald, |
| Karsten Sperling, Michael Gundlach, Oleg Gnatovskiy, Thomas Traeger, |
| Harish Agarwal, yonik, Michael McCandless, Bill Bell) |
| |
| * SOLR-1331: Added a srcCore parameter to CoreAdminHandler's mergeindexes action |
| to merge one or more cores' indexes to a target core (shalin) |
| |
| * SOLR-2610: Add an option to delete index through CoreAdmin UNLOAD action (shalin) |
| |
| Optimizations |
| ---------------------- |
| |
| * SOLR-2567: Solr now defaults to TieredMergePolicy. See http://s.apache.org/merging |
| for more information. (rmuir) |
| |
| Bug Fixes |
| ---------------------- |
| |
| * SOLR-2519: Improve text_* fieldTypes in example schema.xml: improve |
| cross-language defaults for text_general; break out separate |
| English-specific fieldTypes (Jan Høydahl, hossman, Robert Muir, |
| yonik, Mike McCandless) |
| |
| * SOLR-2462: Fix extremely high memory usage problems with spellcheck.collate. |
| Separately, an additional spellcheck.maxCollationEvaluations (default=10000) |
| parameter is added to avoid excessive CPU time in extreme cases (e.g. long |
| queries with many misspelled words). (James Dyer via rmuir) |
| |
| Other Changes |
| ---------------------- |
| |
| * SOLR-2620: Removed unnecessary log4j jar from clustering contrib (Dawid Weiss). |
| |
| * SOLR-2571: Add a commented out example of the spellchecker's thresholdTokenFrequency |
| parameter to the example solrconfig.xml, and also add a unit test for this feature. |
| (James Dyer via rmuir) |
| |
| * SOLR-2576: Deprecate SpellingResult.add(Token token, int docFreq), please use |
| SpellingResult.addFrequency(Token token, int docFreq) instead. |
| (James Dyer via rmuir) |
| |
| * SOLR-2574: Upgrade slf4j to v1.6.1 (shalin) |
| |
| * LUCENE-3204: The maven-ant-tasks jar is now included in the source tree; |
| users of the generate-maven-artifacts target no longer have to manually |
| place this jar in the Ant classpath. NOTE: when Ant looks for the |
| maven-ant-tasks jar, it looks first in its pre-existing classpath, so |
| any copies it finds will be used instead of the copy included in the |
| Lucene/Solr source tree. For this reason, it is recommended to remove |
| any copies of the maven-ant-tasks jar in the Ant classpath, e.g. under |
| ~/.ant/lib/ or under the Ant installation's lib/ directory. (Steve Rowe) |
| |
| * SOLR-2611: Fix typos in the example configuration (Eric Pugh via rmuir) |
| |
| ================== 3.2.0 ================== |
| Versions of Major Components |
| --------------------- |
| Apache Tika 0.8 |
| Carrot2 3.5.0 |
| |
| |
| Upgrading from Solr 3.1 |
| ---------------------- |
| |
| * The updateRequestProcessorChain for a RequestHandler is now defined |
| with update.chain rather than update.processor. The latter still works, |
| but has been deprecated. |
| |
| Detailed Change List |
| ---------------------- |
| |
| New Features |
| ---------------------- |
| |
| * SOLR-2496: Add ability to specify overwrite and commitWithin as request |
| parameters (e.g. specified in the URL) when using the JSON update format, |
| and added a simplified format for specifying multiple documents. |
| Example: [{"id":"doc1"},{"id":"doc2"}] |
| (yonik) |
| |
| * SOLR-2113: Add TermQParserPlugin, registered as "term". This is useful |
| when generating filter queries from terms returned from field faceting or |
| the terms component. Example: fq={!term f=weight}1.5 (hossman, yonik) |
| |
| * SOLR-1915: DebugComponent now supports using a NamedList to model |
| Explanation objects in its responses instead of |
| Explanation.toString (hossman) |
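| |
| As a rough illustration of SOLR-2496, the sketch below builds the URL and |
| body for a JSON update that uses the simplified multi-document format with |
| commitWithin and overwrite passed as request parameters. The handler path |
| in DEFAULT_UPDATE_URL and the helper build_json_update are assumptions |
| made for this example; check your own solrconfig.xml for the real path. |
| |
```python
import json
from urllib.parse import urlencode

# Assumed handler path; adjust to match your solrconfig.xml.
DEFAULT_UPDATE_URL = "http://localhost:8983/solr/update/json"


def build_json_update(docs, commit_within_ms=None, overwrite=None,
                      base_url=DEFAULT_UPDATE_URL):
    """Return (url, body) for a JSON update request; nothing is sent."""
    params = {}
    if commit_within_ms is not None:
        params["commitWithin"] = int(commit_within_ms)
    if overwrite is not None:
        params["overwrite"] = "true" if overwrite else "false"
    url = base_url + ("?" + urlencode(params) if params else "")
    # Simplified format: a plain JSON array of documents.
    body = json.dumps(docs, separators=(",", ":"))
    return url, body


if __name__ == "__main__":
    url, body = build_json_update([{"id": "doc1"}, {"id": "doc2"}],
                                  commit_within_ms=1000, overwrite=True)
    print(url)
    print(body)
```
| |
| The body produced for the two sample documents has the same shape as the |
| example given in the SOLR-2496 entry. |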
| |
| Optimizations |
| ---------------------- |
| |
| Bug Fixes |
| ---------------------- |
| |
| * SOLR-2445: Change the default qt to blank in form.jsp, because there is no "standard" |
| request handler unless you have it in your solrconfig.xml explicitly. (koji) |
| |
| * SOLR-2455: Prevent double submit of forms in admin interface. |
| (Jeffrey Chang via uschindler) |
| |
| * SOLR-2464: Fix potential slowness in QueryValueSource (the query() function) when |
| the query is very sparse and may not match any documents in a segment. (yonik) |
| |
| * SOLR-2469: When using java replication with replicateAfter=startup, the first |
| commit point on server startup is never removed. (yonik) |
| |
| * SOLR-2466: SolrJ's CommonsHttpSolrServer would retry requests on failure, regardless |
| of the configured maxRetries, due to HttpClient having its own retry mechanism |
| by default. The retryCount of HttpClient is now set to 0, and SolrJ does |
| the retry. (yonik) |
| |
| * SOLR-2409: edismax parser - treat the text of a fielded query as a literal if the |
| fieldname does not exist. For example Mission: Impossible should not search on |
| the "Mission" field unless it's a valid field in the schema. (Ryan McKinley, yonik) |
| |
| * SOLR-2403: facet.sort=index reported incorrect results for distributed search |
| in a number of scenarios when facet.mincount>0. This patch also adds some |
| performance/algorithmic improvements when (facet.sort=count && facet.mincount=1 |
| && facet.limit=-1) and when (facet.sort=index && facet.mincount>0) (yonik) |
| |
| * SOLR-2333: The "rename" core admin action does not persist the new name to solr.xml |
| (Rasmus Hahn, Paul R. Brown via Mark Miller) |
| |
| * SOLR-2390: Performance of usePhraseHighlighter is terrible on very large Documents, |
| regardless of hl.maxDocCharsToAnalyze. (Mark Miller) |
| |
| * SOLR-2474: The helper TokenStreams in analysis.jsp and AnalysisRequestHandlerBase |
| did not clear all attributes so they displayed incorrect attribute values for tokens |
| in later filter stages. (uschindler, rmuir, yonik) |
| |
| * SOLR-2467: Fix <analyzer class="..." /> initialization so any errors |
| are logged properly. (hossman) |
| |
| * SOLR-2493: SolrQueryParser was fixed to not parse the SolrConfig DOM tree on each |
| instantiation, which was a huge slowdown. (Stephane Bailliez via uschindler) |
| |
| * SOLR-2495: The JSON parser could hang on corrupted input and could fail |
| to detect numbers that were too large to fit in a long. (yonik) |
| |
| * SOLR-2520: Make JSON response format escape \u2029 as well as \u2028 |
| in strings since those characters are not valid in javascript strings |
| (although they are valid in JSON strings). (yonik) |
| |
| * SOLR-2536: Add ReloadCacheRequestHandler to fix ExternalFileField bug (if reopenReaders |
| is set to true and no index segments have changed, a commit cannot trigger a reload of the |
| external file). (koji) |
| |
| * SOLR-2539: VectorValueSource.floatVal incorrectly used byteVal on sub-sources. |
| (Tom Liu via yonik) |
| |
| * SOLR-2554: RandomSortField didn't work when used in a function query. (yonik) |
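| |
| The escaping described in SOLR-2520 can be sketched as below. This is an |
| illustration using Python's standard json module, not Solr's response |
| writer; the function name to_js_safe_json is hypothetical. |
| |
```python
import json


def to_js_safe_json(value):
    """Serialize to JSON, escaping U+2028 and U+2029 for JavaScript use."""
    # ensure_ascii=False leaves both separators as raw characters,
    # which is valid JSON but not valid inside a JavaScript string.
    text = json.dumps(value, ensure_ascii=False)
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")


if __name__ == "__main__":
    print(to_js_safe_json({"title": "line\u2028separator\u2029here"}))
```
| |
| The escaped output is still valid JSON and decodes back to the original |
| string, so clients that parse it as JSON are unaffected. |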
| |
| |
| Other Changes |
| ---------------------- |
| |
| * SOLR-2061: Pull base tests out into a new Solr Test Framework module, |
| and publish binary, javadoc, and source test-framework jars. |
| (Drew Farris, Robert Muir, Steve Rowe) |
| |
| * SOLR-2105: Rename RequestHandler param 'update.processor' to 'update.chain'. |
| (Jan Høydahl via Mark Miller) |
| |
| * SOLR-2485: Deprecate BaseResponseWriter, GenericBinaryResponseWriter, and |
| GenericTextResponseWriter. These classes will be removed in 4.0. (ryan) |
| |
| * SOLR-2451: Enhance assertJQ to allow individual tests to specify the |
| tolerance delta used in numeric equalities. This allows for slight |
| variance in asserting score comparisons in unit tests. |
| (David Smiley, Chris Hostetter) |
| |
| * SOLR-2528: Remove default="true" from HtmlEncoder in example solrconfig.xml, |
| because html encoding confuses non-ascii users. (koji) |
| |
| Build |
| ---------------------- |
| |
| * LUCENE-3006: Building javadocs will fail on warnings by default. Override with -Dfailonjavadocwarning=false (sarowe, gsingers) |
| |
| Documentation |
| ---------------------- |
| |
| |
| |
| ================== 3.1.0 ================== |
| Versions of Major Components |
| --------------------- |
| Apache Lucene 3.1.0 |
| Apache Tika 0.8 |
| Carrot2 3.4.2 |
| Velocity 1.6.1 and Velocity Tools 2.0-beta3 |
| Apache UIMA 2.3.1-SNAPSHOT |
| |
| |
| Upgrading from Solr 1.4 |
| ---------------------- |
| |
| * The Lucene index format has changed and as a result, once you upgrade, |
| previous versions of Solr will no longer be able to read your indices. |
| In a master/slave configuration, all searchers/slaves should be upgraded |
| before the master. If the master were to be updated first, the older |
| searchers would not be able to read the new index format. |
| |
| * The Solr JavaBin format has changed as of Solr 3.1. If you are using the |
| JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034) |
| |
| * The experimental ALIAS command has been removed (SOLR-1637) |
| |
| * Using solr.xml is recommended for single cores also (SOLR-1621) |
| |
| * Old syntax of <highlighting> configuration in solrconfig.xml |
| is deprecated (SOLR-1696) |
| |
| * The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and |
| HTMLStripStandardTokenizerFactory were removed. To strip HTML tags, |
| HTMLStripCharFilter should be used instead, and it works with any |
| Tokenizer of your choice. (SOLR-1657) |
| |
| * Field compression is no longer supported. Fields that were formerly |
| compressed will be uncompressed as index segments are merged. For |
| shorter fields, this may actually be an improvement, as the compression |
| used was not very good for short text. Some indexes may get larger though. |
| |
| * SOLR-1845: The TermsComponent response format was changed so that the |
| "terms" container is a map instead of a named list. This affects |
| response formats like JSON, but not XML. (yonik) |
| |
| * SOLR-1876: All Analyzers and TokenStreams are now final to enforce |
| the decorator pattern. (rmuir, uschindler) |
| |
| * LUCENE-2608: Added the ability to specify the accuracy on a per request basis. |
| It is recommended that implementations of SolrSpellChecker should change over to the new SolrSpellChecker |
| methods using the new SpellingOptions class, but are not required to. While this change is |
| backward compatible, the trunk version of Solr has already dropped support for all but the SpellingOptions method. (gsingers) |
| |
| * readercycle script was removed. (SOLR-2046) |
| |
| * In previous releases, sorting or evaluating function queries on |
| fields that were "multiValued" (either by explicit declaration in |
| schema.xml or by implicit behavior because the "version" attribute on |
| the schema was less than 1.2) did not generally work, but it would |
| sometimes silently act as if it succeeded and order the docs |
| arbitrarily. Solr will now fail on any attempt to sort, or apply a |
| function to, multi-valued fields. |
| |
| * The DataImportHandler jars are no longer included in the solr |
| WAR and should be added in Solr's lib directory, or referenced |
| via the <lib> directive in solrconfig.xml. |
| |
| |
| Detailed Change List |
| ---------------------- |
| |
| New Features |
| ---------------------- |
| |
| * SOLR-1302: Added several new distance based functions, including |
| Great Circle (haversine), Manhattan, Euclidean and String (using the |
| StringDistance methods in the Lucene spellchecker). |
| Also added geohash(), deg() and rad() convenience functions. |
| See http://wiki.apache.org/solr/FunctionQuery. (gsingers) |
| |
| * SOLR-1553: New dismax parser implementation (accessible as "edismax") |
| that supports full lucene syntax, improved reserved char escaping, |
| fielded queries, improved proximity boosting, and improved stopword |
| handling. Note: status is experimental for now. (yonik) |
| |
| * SOLR-1574: Add many new functions from java Math (e.g. sin, cos) (yonik) |
| |
| * SOLR-1569: Allow functions to take in literal strings by modifying the |
| FunctionQParser and adding LiteralValueSource (gsingers) |
| |
| * SOLR-1571: Added unicode collation support through Lucene's CollationKeyFilter |
| (Robert Muir via shalin) |
| |
| * SOLR-785: Distributed Search support for SpellCheckComponent |
| (Matthew Woytowitz, shalin) |
| |
| * SOLR-1625: Add regexp support for TermsComponent (Uri Boness via noble) |
| |
| * SOLR-1297: Add sort by Function capability (gsingers, yonik) |
| |
| * SOLR-1139: Add TermsComponent Query and Response Support in SolrJ (Matt Weber via shalin) |
| |
| * SOLR-1177: Distributed Search support for TermsComponent (Matt Weber via shalin) |
| |
| * SOLR-1621, SOLR-1722: Allow current single core deployments to be specified by solr.xml (Mark Miller, noble) |
| |
| * SOLR-1532: Allow StreamingUpdateSolrServer to use a provided HttpClient (Gabriele Renzi via shalin) |
| |
| * SOLR-1653: Add PatternReplaceCharFilter (koji) |
| |
| * SOLR-1131: FieldTypes can now output multiple Fields per Type and still be searched. This can be handy for hiding the details of a particular |
| implementation such as in the spatial case. (Chris Mattmann, shalin, noble, gsingers, yonik) |
| |
| * SOLR-1586: Add support for Geohash and Spatial Tile FieldType (Chris Mattmann, gsingers) |
| |
| * SOLR-1697: PluginInfo should load plugins w/o class attribute also (noble) |
| |
| * SOLR-1268: Incorporate FastVectorHighlighter (koji) |
| |
| * SOLR-1750: SolrInfoMBeanHandler added for simpler programmatic access |
| to info currently available from registry.jsp and stats.jsp |
| (ehatcher, hossman) |
| |
| * SOLR-1815: SolrJ now preserves the order of facet queries. (yonik) |
| |
| * SOLR-1677: Add support for choosing the Lucene Version for Lucene components within |
| Solr. (Uwe Schindler, Mark Miller) |
| |
| * SOLR-1379: Add RAMDirectoryFactory for non-persistent in memory index storage. |
| (Alex Baranov via yonik) |
| |
| * SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory |
| and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms. |
| Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the |
| performance of SnowballPorterFilterFactory. (rmuir) |
| |
| * SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr |
| TokenFilters now support custom Attributes, and some have improved performance: |
| especially WordDelimiterFilter and CommonGramsFilter. (rmuir, cmale, uschindler) |
| |
| * SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator" |
| parameters for controlling the minimum shingle size produced by the filter, and |
| the separator string that it uses, respectively. (Steven Rowe via rmuir) |
| |
| * SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles" |
| parameter, to output unigrams if the number of input tokens is fewer than |
| minShingleSize, and no shingles can be generated. |
| (Chris Harris via Steven Rowe) |
| |
| * SOLR-1923: PhoneticFilterFactory now has support for the |
| Caverphone algorithm. (rmuir) |
| |
| * SOLR-1957: The VelocityResponseWriter contrib moved to core. |
| Example search UI now available at http://localhost:8983/solr/browse |
| (ehatcher) |
| |
| * SOLR-1974: Add LimitTokenCountFilterFactory. (koji) |
| |
| * SOLR-1966: QueryElevationComponent can now return just the included results in the elevation file (gsingers, yonik) |
| |
| * SOLR-1556: TermVectorComponent now supports per field overrides. Also, it now throws an error |
| if passed in fields do not exist and returns warnings |
| for fields whose term vector options (termVectors, offsets, positions) |
| do not align with the schema declaration. (gsingers) |
| |
| * SOLR-1985: FastVectorHighlighter: add wrapper class for Lucene's SingleFragListBuilder (koji) |
| |
| * SOLR-1984: Add HyphenationCompoundWordTokenFilterFactory. (PB via rmuir) |
| |
| * SOLR-397: Date Faceting now supports a "facet.date.include" param |
| for specifying when the upper & lower end points of computed date |
| ranges should be included in the range. Legal values are: "all", |
| "lower", "upper", "edge", and "outer". For backwards compatibility |
| the default value is the set: [lower,upper,edge], so that all ranges |
| between start and end are inclusive of their endpoints, but the |
| "before" and "after" ranges are not. |
| |
| * SOLR-945: JSON update handler that accepts add, delete, commit |
| commands in JSON format. (Ryan McKinley, yonik) |
| |
| * SOLR-2015: Add a boolean attribute autoGeneratePhraseQueries to TextField. |
| autoGeneratePhraseQueries="true" (the default) causes the query parser to |
| generate phrase queries if multiple tokens are generated from a single |
| non-quoted analysis string. For example WordDelimiterFilter splitting text:pdp-11 |
| will cause the parser to generate text:"pdp 11" rather than (text:PDP OR text:11). |
| Note that autoGeneratePhraseQueries="true" tends to not work well for non whitespace |
| delimited languages. (yonik) |
| |
| * SOLR-1925: Add CSVResponseWriter (use wt=csv) that returns the list of documents |
| in CSV format. (Chris Mattmann, yonik) |
| |
| * SOLR-1240: "Range Faceting" has been added. This is a generalization |
| of the existing "Date Faceting" logic so that it now supports |
| all stock numeric field types that support range queries in addition |
| to dates. facet.date is now deprecated in favor of this generalized mechanism. |
| (Gijs Kunze, hossman) |
| |
| * SOLR-2021: Add SolrEncoder plugin to Highlighter. (koji) |
| |
| * SOLR-2030: Make FastVectorHighlighter use SolrEncoder. (koji) |
| |
| * SOLR-2053: Add support for custom comparators in Solr spellchecker, per LUCENE-2479 (gsingers) |
| |
| * SOLR-2049: Add hl.multiValuedSeparatorChar for FastVectorHighlighter, per LUCENE-2603. (koji) |
| |
| * SOLR-2059: Add "types" attribute to WordDelimiterFilterFactory, which |
| allows you to customize how WordDelimiterFilter tokenizes text with |
| a configuration file. (Peter Karich, rmuir) |
| |
| * SOLR-2099: Add ability to throttle rsync based replication using rsync option --bwlimit. |
| (Brandon Evans via koji) |
| |
| * SOLR-1316: Create autosuggest component. |
| (Ankul Garg, Jason Rutherglen, Shalin Shekhar Mangar, Grant Ingersoll, Robert Muir, ab) |
| |
| * SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See |
| http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial. |
| Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved. (gsingers) |
| |
| * SOLR-2128: Full parameter substitution for function queries. |
| Example: q=add($v1,$v2)&v1=mul(popularity,5)&v2=20.0 |
| (yonik) |
| |
| * SOLR-2133: Function query parser can now parse multiple comma separated |
| value sources. It also now fails if there is extra unexpected text |
| after parsing the functions, instead of silently ignoring it. |
| This allows expressions like q=dist(2,vector(1,2),$pt)&pt=3,4 (yonik) |
| |
| * SOLR-2157: Suggester should return alpha-sorted results when onlyMorePopular=false (ab) |
| |
| * SOLR-2010: Added ability to verify that spell checking collations have |
| actual results in the index. (James Dyer via gsingers) |
| |
| * SOLR-2188: Added "maxTokenLength" argument to the factories for ClassicTokenizer, |
| StandardTokenizer, and UAX29URLEmailTokenizer. (Steven Rowe) |
| |
| * SOLR-2129: Added a Solr module for dynamic metadata extraction/indexing with Apache UIMA. |
| See contrib/uima/README.txt for more information. (Tommaso Teofili via rmuir) |
| |
| * SOLR-2325: Allow tagging and exclusion of main query for faceting. (yonik) |
| |
| * SOLR-2263: Add ability for RawResponseWriter to stream binary files as well as |
| text files. (Eric Pugh via yonik) |
| |
| * SOLR-860: Add debug output for MoreLikeThis. (koji) |
| |
| * SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji) |
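| |
| As a rough illustration of the SOLR-1240 entry above, the following Python |
| sketch builds a range-faceting query string and lists the buckets that a |
| start/end/gap triple describes. The field name "price" and the helper |
| range_buckets are hypothetical, and the facet.range.* parameter names are |
| an assumption based on generalizing the existing facet.date.* parameters. |
| |
```python
from urllib.parse import urlencode, parse_qs

# Request parameters for a numeric range facet (field name is hypothetical).
params = {
    "q": "*:*",
    "facet": "true",
    "facet.range": "price",
    "facet.range.start": "0",
    "facet.range.end": "100",
    "facet.range.gap": "25",
}
query_string = urlencode(params)


def range_buckets(start, end, gap):
    """Return the (lower, upper) bounds described by a start/end/gap triple."""
    buckets = []
    lower = start
    while lower < end:
        buckets.append((lower, lower + gap))
        lower += gap
    return buckets


print(query_string)
print(range_buckets(0, 100, 25))
```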
| |
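| The parameter substitution example from SOLR-2128 contains characters such |
| as "$", "(" and "," that need URL encoding when sent over HTTP. A minimal |
| Python sketch, assuming a handler at /select on the example server (the |
| path is an assumption, not something these notes specify): |
| |
```python
from urllib.parse import urlencode, parse_qs

# Parameters copied from the SOLR-2128 example; $v1 and $v2 refer to the
# request parameters v1 and v2.
params = {
    "q": "add($v1,$v2)",
    "v1": "mul(popularity,5)",
    "v2": "20.0",
}

# Assumed base URL for illustration only.
base_url = "http://localhost:8983/solr/select"
url = base_url + "?" + urlencode(params)

# Decoding the query string recovers the original parameter values.
decoded = parse_qs(url.split("?", 1)[1])
print(url)
print(decoded)
```
| |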
| Optimizations |
| ---------------------- |
| |
| * SOLR-1679: Don't build up string messages in SolrCore.execute unless they |
| are necessary for the current log level. |
| (Fuad Efendi and hossman) |
| |
| * SOLR-1874: Optimize PatternReplaceFilter for better performance. (rmuir, uschindler) |
| |
| * SOLR-1968: speed up initial filter cache population for facet.method=enum and |
| also big terms for multi-valued facet.method=fc. The resulting speedup |
| for the first facet request is anywhere from 30% to 32x, depending on how many |
| terms are in the field and how many documents match per term. (yonik) |
| |
| * SOLR-2089: Speed up UnInvertedField faceting (facet.method=fc for |
| multi-valued fields) when facet.limit is both high, and a high enough |
| percentage of the number of unique terms in the field. Extreme cases |
| yield speedups over 3x. (yonik) |
| |
| * SOLR-2046: add common functions to scripts-util. (koji) |
| |
| Bug Fixes |
| ---------------------- |
| * SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble) |
| |
| * SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate |
| to the original ValueSource.getValues(reader) so custom sources |
| will work. (yonik) |
| |
| * SOLR-1572: FastLRUCache correctly implemented the LRU policy only |
| for the first 2B accesses. (yonik) |
| |
| * SOLR-1582: copyField was ignored for BinaryField types (gsingers) |
| |
| * SOLR-1563: Binary fields, including trie-based numeric fields, caused null |
| pointer exceptions in the luke request handler. (yonik) |
| |
| * SOLR-1577: The example solrconfig.xml defaulted to a solr data dir |
| relative to the current working directory, even if a different solr home |
| was being used. The new behavior changes the default to a zero length |
| string, which is treated the same as if no dataDir had been specified, |
| hence the "data" directory under the solr home will be used. (yonik) |
| |
| * SOLR-1584: SolrJ - SolrQuery.setIncludeScore() incorrectly added |
| fl=score to the parameter list instead of appending score to the |
| existing field list. (yonik) |
| |
| * SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always |
| uses Lucene default. (Lance Norskog via Mark Miller) |
| |
| * SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs |
| (i.e. code points outside of the BMP), resulting in incorrect |
| matching. This change requires reindexing for any content with |
| such characters. (Robert Muir, yonik) |
| |
| * SOLR-1596: A rollback operation followed by the shutdown of Solr |
| or the close of a core resulted in a warning: |
| "SEVERE: SolrIndexWriter was not closed prior to finalize()" although |
| there were no other consequences. (yonik) |
| |
| * SOLR-1595: StreamingUpdateSolrServer used the platform default character |
| set when streaming updates, rather than using UTF-8 as the HTTP headers |
| indicated, leading to an encoding mismatch. (hossman, yonik) |
| |
| * SOLR-1587: A distributed search request with fl=score, didn't match |
| the behavior of a non-distributed request since it only returned |
| the id,score fields instead of all fields in addition to score. (yonik) |
| |
| * SOLR-1601: Schema browser does not indicate presence of charFilter. (koji) |
| |
| * SOLR-1615: Backslash escaping did not work in quoted strings |
| for local param arguments. (Wojtek Piaseczny, yonik) |
| |
| * SOLR-1628: log contains incorrect number of adds and deletes. |
| (Thijs Vonk via yonik) |
| |
| * SOLR-343: Date faceting now respects facet.mincount limiting |
| (Uri Boness, Raiko Eckstein via hossman) |
| |
| * SOLR-1624: Highlighter only highlights values from the first field value |
| in a multivalued field when term positions (term vectors) are stored. |
| (Chris Harris via yonik) |
| |
| * SOLR-1635: Fixed error message when numeric values can't be parsed by |
| DOMUtils - notably for plugin init params in solrconfig.xml. |
| (hossman) |
| |
| * SOLR-1651: Fixed Incorrect dataimport handler package name in SolrResourceLoader |
| (Akshay Ukey via shalin) |
| |
| * SOLR-1660: CapitalizationFilter crashes if you use the maxWordCountOption |
| (Robert Muir via shalin) |
| |
| * SOLR-1667: PatternTokenizer does not reset attributes such as positionIncrementGap |
| (Robert Muir via shalin) |
| |
| * SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that |
| could halt the streaming of documents. The original patch to fix this |
| (never officially released) introduced another hanging bug due to |
| connections not being released. |
| (Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik) |
| |
| * SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers |
| retrieved from ContentStreams are not closed in various places, resulting |
| in file descriptor leaks. |
| (Christoff Brill, Mark Miller) |
| |
| * SOLR-1753: StatsComponent throws NPE when getting statistics for facets in distributed search |
| (Janne Majaranta via koji) |
| |
| * SOLR-1736: In the slave, if 'mov'ing a file does not succeed, copy the file (noble) |
| |
| * SOLR-1579: Fixes to XML escaping in stats.jsp |
| (David Bowen and hossman) |
| |
| * SOLR-1777: fieldTypes with sortMissingLast=true or sortMissingFirst=true can |
| result in incorrectly sorted results. (yonik) |
| |
| * SOLR-1798: Small memory leak (~100 bytes) in fastLRUCache for every |
| commit. (yonik) |
| |
| * SOLR-1823: Fixed XMLResponseWriter (via XMLWriter) so it no longer throws |
| a ClassCastException when a Map containing a non-String key is used. |
| (Frank Wesemann, hossman) |
| |
| * SOLR-1797: fix ConcurrentModificationException and potential memory |
| leaks in ResourceLoader. (yonik) |
| |
| * SOLR-1850: change KeepWordFilter so a new word set is not created for |
| each instance (John Wang via yonik) |
| |
| * SOLR-1706: fixed WordDelimiterFilter for certain combinations of options |
| where it would output incorrect tokens. (Robert Muir, Chris Male) |
| |
| * SOLR-1936: The JSON response format needed to escape unicode code point |
| U+2028 - 'LINE SEPARATOR' (Robert Hofstra, yonik) |
| |
| * SOLR-1914: Change the JSON response format to output float/double |
| values of NaN,Infinity,-Infinity as strings. (yonik) |
| |
| * SOLR-1948: PatternTokenizerFactory should use parent's args (koji) |
| |
| * SOLR-1870: Indexing documents using the 'javabin' format no longer |
| fails with a ClassCastException when SolrInputDocuments contain field |
| values which are Collections or other classes that implement |
| Iterable. (noble, hossman) |
| |
| * SOLR-1981: Solr will now fail correctly if solr.xml attempts to |
| specify multiple cores that have the same name (hossman) |
| |
| * SOLR-1791: Fix messed up core names on admin gui (yonik via koji) |
| |
| * SOLR-1995: Change date format from "hour in am/pm" to "hour in day" |
| in CoreContainer and SnapShooter. (Hayato Ito, koji) |
| |
| * SOLR-2008: avoid possible RejectedExecutionException w/autoCommit |
| by making SolrCore close the UpdateHandler before closing the |
| SearchExecutor. (NarasimhaRaju, hossman) |
| |
| * SOLR-2036: Avoid expensive fieldCache ram estimation for the |
| admin stats page. (yonik) |
| |
| * SOLR-2047: ReplicationHandler should accept bool type for enable flag. (koji) |
| |
| * SOLR-1630: Fix spell checking collation issue related to token positions (rmuir, gsingers) |
| |
| * SOLR-2100: The replication handler backup command didn't save the commit |
| point and hence could fail when a newer commit caused the older commit point |
| to be removed before it was finished being copied. This did not affect |
| normal master/slave replication. (Peter Sturge via yonik) |
| |
| * SOLR-2114: Fixed parsing error in hsin function. The function signature has changed slightly. (gsingers) |
| |
| * SOLR-2083: SpellCheckComponent misreports suggestions when distributed (James Dyer via gsingers) |
| |
| * SOLR-2111: Change exception handling in distributed faceting to work more |
| like non-distributed faceting, change facet_counts/exception from a String |
| to a List<String> to enable listing all exceptions that happened, and |
| prevent an exception in one facet command from affecting another |
| facet command. (yonik) |
| |
| * SOLR-2110: Remove the restriction on names for local params |
| substitution/dereferencing. Properly encode local params in |
| distributed faceting. (yonik) |
| |
| * SOLR-2135: Fix behavior of ConcurrentLRUCache when asking for |
| getLatestAccessedItems(0) or getOldestAccessedItems(0). |
| (David Smiley via hossman) |
| |
| * SOLR-2148: Highlighter doesn't support q.alt. (koji) |
| |
| * SOLR-2180: It was possible for EmbeddedSolrServer to leave searchers |
| open if a request threw an exception. (yonik) |
| |
| * SOLR-2173: Suggester should always rebuild Lookup data if Lookup.load fails. (ab) |
| |
| * SOLR-2081: BaseResponseWriter.isStreamingDocs causes |
| SingleResponseWriter.end to be called 2x |
| (Chris A. Mattmann via hossman) |
| |
| * SOLR-2219: The init() method of every SolrRequestHandler was being |
| called twice. (ambikeshwar singh and hossman) |
| |
| * SOLR-2285: duplicate SolrEventListeners no longer created (hossman) |
| |
| * SOLR-1993: fix String cast assumption in JavaBinCodec - specifically |
| addresses "commitWithin" option on Update requests. |
| (noble, hossman, and Maxim Valyanskiy) |
| |
| * SOLR-2261: fix velocity template layout.vm that referred to an older |
| version of jquery. (Eric Pugh via rmuir) |
| |
| * SOLR-2307: fix bug in PHPSerializedResponseWriter (wt=phps) when |
| dealing with SolrDocumentList objects -- ie: sharded queries. |
| (Antonio Verni via hossman) |
| |
| * SOLR-2127: Fixed serialization of default core and indentation of solr.xml when serializing. |
| (Ephraim Ofir, Mark Miller) |
| |
| * SOLR-2320: Fixed ReplicationHandler detail reporting for masters |
| (hossman) |
| |
| * SOLR-482: Provide more exception handling in CSVLoader (gsingers) |
| |
| * SOLR-1283: HTMLStripCharFilter sometimes threw a "Mark Invalid" exception. |
| (Julien Coloos, hossman, yonik) |
| |
| * SOLR-2085: Improve SolrJ behavior when FacetComponent comes before |
| QueryComponent (Tomas Salfischberger via hossman) |
| |
| * SOLR-1940: Fix SolrDispatchFilter behavior when Content-Type is |
| unknown (Lance Norskog and hossman) |
| |
| * SOLR-1983: snappuller fails when modifiedConfFiles is not empty and |
| full copy of index is needed. (Alexander Kanarsky via yonik) |
| |
| * SOLR-2156: SnapPuller fails to clean Old Index Directories on Full Copy |
| (Jayendra Patil via yonik) |
| |
| * SOLR-96: Fix XML parsing in XMLUpdateRequestHandler and |
| DocumentAnalysisRequestHandler to respect charset from XML file and only |
| use HTTP header's "Content-Type" as a "hint". (uschindler) |
| |
| * SOLR-2339: Fix sorting to explicitly generate an error if you |
| attempt to sort on a multiValued field. (hossman) |
| |
| * SOLR-2348: Fix field types to explicitly generate an error if you |
| attempt to get a ValueSource for a multiValued field. (hossman) |
| |
| * SOLR-2380: Distributed faceting could miss values when facet.sort=index |
| and when facet.offset was greater than 0. (yonik) |
| |
| * SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader |
| are fixed to be resolved using the URI standard (RFC 2396). The system |
| identifier is no longer a plain filename with path, it gets initialized |
| using a custom URI scheme "solrres:". This scheme is resolved using an |
| EntityResolver that utilizes ResourceLoader |
| (org.apache.solr.common.util.SystemIdResolver). This makes all relative |
| paths in Solr's config files behave as expected. This change |
| introduces some backwards breaks in the API: Some config classes |
| (Config, SolrConfig, IndexSchema) were changed to take |
| org.xml.sax.InputSource instead of InputStream. There may also be some |
| backwards breaks in existing config files; it is recommended to check |
| your config files / XSLTs and replace all XIncludes/HREFs that were |
| hacked to use absolute paths to use relative ones. (uschindler) |
| |
| * SOLR-309: Fix FieldType so setting an analyzer on a FieldType that |
| doesn't expect it will generate an error. Practically speaking this |
| means that Solr will now correctly generate an error on |
| initialization if the schema.xml contains an analyzer configuration |
| for a fieldType that does not use TextField. (hossman) |
| |
| * SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not |
| thread safe and could throw an exception. (yonik) |
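| |
| SOLR-1914 above matters because NaN and Infinity are not valid JSON number |
| literals. The Python sketch below shows the general idea of writing such |
| values as strings; json_safe is a hypothetical helper written for this |
| illustration and is not Solr's implementation. |
| |
```python
import json
import math


def json_safe(value):
    """Replace non-finite floats with strings so the output is strict JSON."""
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        return value
    if isinstance(value, dict):
        return {key: json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(item) for item in value]
    return value


response = {"score": float("nan"), "boost": [1.5, float("-inf")]}
# allow_nan=False makes json.dumps raise if a non-finite float slips through.
encoded = json.dumps(json_safe(response), allow_nan=False)
print(encoded)
```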
| |
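| The SOLR-1995 change from "hour in am/pm" to "hour in day" avoids ambiguous |
| timestamps. The Python sketch below uses strftime directives as stand-ins |
| for the two styles of hour field; the timestamp layout and variable names |
| are assumptions chosen for illustration. |
| |
```python
from datetime import datetime

morning = datetime(2010, 7, 1, 3, 30, 0)
afternoon = datetime(2010, 7, 1, 15, 30, 0)

# %I is the 12-hour clock ("hour in am/pm"); %H is the 24-hour clock
# ("hour in day").
twelve_hour = "%Y%m%d%I%M%S"
twenty_four_hour = "%Y%m%d%H%M%S"

# Without an am/pm marker, both times collapse to the same 12-hour stamp.
print(morning.strftime(twelve_hour), afternoon.strftime(twelve_hour))
# The 24-hour form keeps them distinct and sortable.
print(morning.strftime(twenty_four_hour), afternoon.strftime(twenty_four_hour))
```
| |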
| Other Changes |
| ---------------------- |
| |
| * SOLR-1602: Refactor SOLR package structure to include o.a.solr.response |
| and move QueryResponseWriters in there |
| (Chris A. Mattmann, ryan, hoss) |
| |
| * SOLR-1516: Addition of an abstract BaseResponseWriter class to simplify the |
| development of QueryResponseWriter implementations. |
| (Chris A. Mattmann via noble) |
| |
| * SOLR-1592: Refactor XMLWriter startTag to allow arbitrary attributes to be written |
| (Chris A. Mattmann via noble) |
| |
| * SOLR-1561: Added Lucene 2.9.1 spatial contrib jar to lib. (gsingers) |
| |
| * SOLR-1570: Log warnings if uniqueKey is multi-valued or not stored (hossman, shalin) |
| |
| * SOLR-1558: QueryElevationComponent only works if the uniqueKey field is |
| implemented using StrField. In previous versions of Solr no warning or |
| error would be generated if you attempted to use QueryElevationComponent, |
| it would just fail in unexpected ways. This has been changed so that it |
| will fail with a clear error message on initialization. (hossman) |
| |
| * SOLR-1611: Added Lucene 2.9.1 collation contrib jar to lib (shalin) |
| |
| * SOLR-1608: Extract base class from TestDistributedSearch to make |
| it easy to write test cases for other distributed components. (shalin) |
| |
| * Upgraded to Lucene 2.9-dev r888785 (shalin) |
| |
| * SOLR-1610: Generify SolrCache (Jason Rutherglen via shalin) |
| |
| * SOLR-1637: Remove ALIAS command |
| |
| * SOLR-1662: Added Javadocs in BufferedTokenStream and fixed incorrect cloning |
| in TestBufferedTokenStream (Robert Muir, Uwe Schindler via shalin) |
| |
| * SOLR-1674: Improve analysis tests and cut over to new TokenStream API. |
| (Robert Muir via Mark Miller) |
| |
| * SOLR-1661: Remove adminCore from CoreContainer. Removed deprecated methods setAdminCore(), getAdminCore() (noble) |
| |
| * SOLR-1704: Google collections moved from clustering to core (noble) |
| |
| * SOLR-1268: Add Lucene 2.9-dev r888785 FastVectorHighlighter contrib jar to lib. (koji) |
| |
| * SOLR-1538: Reordering of object allocations in ConcurrentLRUCache to eliminate |
| (an extremely small) potential for deadlock. |
| (gabriele renzi via hossman) |
| |
| * SOLR-1588: Removed some very old dead code. |
| (Chris A. Mattmann via hossman) |
| |
| * SOLR-1696: Deprecate old <highlighting> syntax and move configuration to HighlightComponent (noble) |
| |
| * SOLR-1727: SolrEventListener should extend NamedListInitializedPlugin (noble) |
| |
| * SOLR-1771: Improved error message when StringIndex cannot be initialized |
| for a function query (hossman) |
| |
| * SOLR-1695: Improved error messages when adding a document that does not |
| contain exactly one value for the uniqueKey field (hossman) |
| |
| * SOLR-1776: DismaxQParser and ExtendedDismaxQParser now use the schema.xml |
| "defaultSearchField" as the default value for the "qf" param instead of failing |
| with an error when "qf" is not specified. (hossman) |
| |
| * SOLR-1851: luceneAutoCommit no longer has any effect - it has been removed (Mark Miller) |
| |
| * SOLR-1865: SolrResourceLoader.getLines ignores Byte Order Markers (BOMs) at the |
| beginning of input files; these are often created by editors such as Windows |
| Notepad. (rmuir, hossman) |
| |
| * SOLR-1938: ElisionFilterFactory will use a default set of French contractions |
| if you do not supply a custom articles file. (rmuir) |
| |
| * SOLR-2003: SolrResourceLoader will report any encoding errors, rather than |
| silently using replacement characters for invalid inputs (blargy via rmuir) |
| |
| * SOLR-1804: Google collections updated to Google Guava (which is a superset of collections and contains bug fixes) (gsingers) |
| |
| * SOLR-2034: Switch to JavaBin codec version 2. Strings are now serialized |
| as the number of UTF-8 bytes, followed by the bytes in UTF-8. Previously |
| Strings were serialized as the number of UTF-16 chars, followed by the |
| bytes in Modified UTF-8. (hossman, yonik, rmuir) |
| |
| * SOLR-2013: Add mapping-FoldToASCII.txt to example conf directory. |
| (Steven Rowe via koji) |
| |
| * SOLR-2213: Upgrade to jQuery 1.4.3 (Erick Erickson via ryan) |
| |
| * SOLR-1826: Add unit tests for highlighting with termOffsets=true |
| and overlapping tokens. (Stefan Oestreicher via rmuir) |
| |
| * SOLR-2340: Add version infos to message in JavaBinCodec when throwing |
| exception. (koji) |
| |
| * SOLR-2350: Since Solr no longer requires XML files to be in UTF-8 |
| (see SOLR-96) SimplePostTool (aka: post.jar) has been improved to |
| work with files of any mime-type or charset. (hossman) |
| |
| * SOLR-2365: Move DIH jars out of solr.war (David Smiley via yonik) |
| |
| * SOLR-2381: Include a patched version of Jetty (6.1.26 + JETTY-1340) |
| to fix problematic UTF-8 handling for supplementary characters. |
| (Bernd Fehling, uschindler, yonik, rmuir) |
| |
| * SOLR-2391: The preferred Content-Type for XML was changed to |
| application/xml. XMLResponseWriter now only delivers using this |
| type; updating documents and analyzing documents is still supported |
| using text/xml as Content-Type, too. If you have clients that are |
| hardcoded on text/xml as Content-Type, you have to change them. |
| (uschindler, rmuir) |
| |
| * SOLR-2414: All ResponseWriters now use only ServletOutputStreams |
| and wrap their own Writer around it when serializing. This fixes |
| the bug in PHPSerializedResponseWriter that produced wrong string |
| length if the servlet container had a broken UTF-8 encoding that was |
| in fact CESU-8 (see SOLR-1091). The system property to enable the |
| CESU-8 byte counting in PHPSerializedResponseWriter for broken |
| servlet containers was therefore removed and is now ignored if set. |
| Output is always UTF-8. (uschindler, yonik, rmuir) |
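| |
| The two length conventions mentioned in SOLR-2034 differ whenever a string |
| contains non-ASCII characters. The Python sketch below only compares the |
| two counts; it does not reproduce the JavaBin wire format, and the helper |
| names are hypothetical. |
| |
```python
def utf8_byte_count(text):
    """Number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def utf16_unit_count(text):
    """Number of 16-bit code units in the UTF-16 encoding of text."""
    return len(text.encode("utf-16-le")) // 2


# "a" (ASCII), U+00E9 (two UTF-8 bytes), and U+1D11E (a supplementary
# character: four UTF-8 bytes, one UTF-16 surrogate pair).
sample = "a\u00e9\U0001D11E"

print(utf8_byte_count(sample), utf16_unit_count(sample))
```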
| |
| Build |
| ---------------------- |
| |
| * SOLR-1522: Automated release signing process. (gsingers) |
| |
| * SOLR-1891: Make lucene-jars-to-solr fail if copying any of the jars fails, and |
| update clean to remove the jars in that directory (Mark Miller) |
| |
| * LUCENE-2466: Commons-Codec was upgraded from 1.3 to 1.4. (rmuir) |
| |
| * SOLR-2042: Fixed some Maven deps (Drew Farris via gsingers) |
| |
| * LUCENE-2657: Switch from using Maven POM templates to full POMs when |
| generating Maven artifacts (Steven Rowe) |
| |
| Documentation |
| ---------------------- |
| |
| * SOLR-1590: Javadoc for XMLWriter#startTag |
| (Chris A. Mattmann via hossman) |
| |
| * SOLR-1792: Documented peculiar behavior of TestHarness.LocalRequestFactory |
| (hossman) |
| |
| ================== Release 1.4.1 ================== |
| Release Date: See http://lucene.apache.org/solr for the official release date. |
| |
| Upgrading from Solr 1.4 |
| ----------------------- |
| |
| This is a bug fix release - no changes are required when upgrading from Solr 1.4. |
| However, a reindex is needed for some of the analysis fixes to take effect. |
| |
| Versions of Major Components |
| ---------------------------- |
| Apache Lucene 2.9.3 |
| Apache Tika 0.4 |
| Carrot2 3.1.0 |
| |
| Lucene Information |
| ---------------- |
| |
| Since Solr is built on top of Lucene, many people add customizations to Solr |
| that are dependent on Lucene. Please see http://lucene.apache.org/java/2_9_3/, |
| especially http://lucene.apache.org/java/2_9_3/changes/Changes.html for more |
| information on the version of Lucene used in Solr. |
| |
| Bug Fixes |
| ---------------------- |
| |
| * SOLR-1934: Upgrade to Apache Lucene 2.9.3 to obtain several bug |
| fixes from the previous 2.9.1. See the Lucene 2.9.3 release notes |
| for details. (hossman, Mark Miller) |
| |
| * SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate |
| to the original ValueSource.getValues(reader) so custom sources |
| will work. (yonik) |
| |
| * SOLR-1572: FastLRUCache correctly implemented the LRU policy only |
| for the first 2B accesses. (yonik) |
| |
| * SOLR-1595: StreamingUpdateSolrServer used the platform default character |
| set when streaming updates, rather than using UTF-8 as the HTTP headers |
| indicated, leading to an encoding mismatch. (hossman, yonik) |
| |
| * SOLR-1660: CapitalizationFilter crashes if you use the maxWordCountOption |
| (Robert Muir via shalin) |
| |
| * SOLR-1662: Added Javadocs in BufferedTokenStream and fixed incorrect cloning |
| in TestBufferedTokenStream (Robert Muir, Uwe Schindler via shalin) |
| |
| * SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that |
| could halt the streaming of documents. The original patch to fix this |
| (never officially released) introduced another hanging bug due to |
| connections not being released. (Attila Babo, Erik Hetzner via yonik) |
| |
| * SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers |
| retrieved from ContentStreams are not closed in various places, resulting |
| in file descriptor leaks. |
| (Christoff Brill, Mark Miller) |
| |
| * SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always |
| uses Lucene default. (Lance Norskog via Mark Miller) |
| |
| * SOLR-1777: fieldTypes with sortMissingLast=true or sortMissingFirst=true can |
| result in incorrectly sorted results. (yonik) |
| |
| * SOLR-1797: fix ConcurrentModificationException and potential memory |
| leaks in ResourceLoader. (yonik) |
| |
| * SOLR-1798: Small memory leak (~100 bytes) in fastLRUCache for every |
| commit. (yonik) |
| |
| * SOLR-1522: Show proper message if <script> tag is missing for DIH |
| ScriptTransformer (noble) |
| |
| * SOLR-1538: Reordering of object allocations in ConcurrentLRUCache to eliminate |
| (an extremely small) potential for deadlock. |
| (gabriele renzi via hossman) |
| |
| * SOLR-1558: QueryElevationComponent only works if the uniqueKey field is |
| implemented using StrField. In previous versions of Solr no warning or |
| error would be generated if you attempted to use QueryElevationComponent, |
| it would just fail in unexpected ways. This has been changed so that it |
| will fail with a clear error message on initialization. (hossman) |
| |
| * SOLR-1563: Binary fields, including trie-based numeric fields, caused null |
| pointer exceptions in the luke request handler. (yonik) |
| |
| * SOLR-1579: Fixes to XML escaping in stats.jsp |
| (David Bowen and hossman) |
| |
| * SOLR-1582: copyField was ignored for BinaryField types (gsingers) |
| |
| * SOLR-1596: A rollback operation followed by the shutdown of Solr |
| or the close of a core resulted in a warning: |
| "SEVERE: SolrIndexWriter was not closed prior to finalize()" although |
| there were no other consequences. (yonik) |
| |
| * SOLR-1651: Fixed Incorrect dataimport handler package name in SolrResourceLoader |
| (Akshay Ukey via shalin) |
| |
| * SOLR-1936: The JSON response format needed to escape unicode code point |
| U+2028 - 'LINE SEPARATOR' (Robert Hofstra, yonik) |
| |
| * SOLR-1852: Fix WordDelimiterFilterFactory bug where position increments |
| were not being applied properly to subwords. (Peter Wolanin via Robert Muir) |
| |
| * SOLR-1706: fixed WordDelimiterFilter for certain combinations of options |
| where it would output incorrect tokens. (Robert Muir, Chris Male) |
| |
| * SOLR-1948: PatternTokenizerFactory should use parent's args (koji) |
| |
| * SOLR-1870: Indexing documents using the 'javabin' format no longer |
| fails with a ClassCastException when SolrInputDocuments contain field |
| values which are Collections or other classes that implement |
| Iterable. (noble, hossman) |
| |
| * SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (noble) |
| |
| |
| ================== Release 1.4.0 ================== |
| Release Date: See http://lucene.apache.org/solr for the official release date. |
| |
| Upgrading from Solr 1.3 |
| ----------------------- |
| |
| There is a new default faceting algorithm for multiValued fields that should be |
| faster for most cases. One can revert to the previous algorithm (which has |
| also been improved somewhat) by adding facet.method=enum to the request. |
| |
| Searching and sorting is now done on a per-segment basis, meaning that |
| the FieldCache entries used for sorting and for function queries are |
| created and used per-segment and can be reused for segments that don't |
| change between index updates. While generally beneficial, this can lead |
| to increased memory usage over 1.3 in certain scenarios: |
| 1) A single valued field that was used for both sorting and faceting |
| in 1.3 would have used the same top level FieldCache entry. In 1.4, |
| sorting will use entries at the segment level while faceting will still |
| use entries at the top reader level, leading to increased memory usage. |
| 2) Certain function queries such as ord() and rord() require a top level |
| FieldCache instance and can thus lead to increased memory usage. Consider |
| replacing ord() and rord() with alternatives, such as function queries |
| based on ms() for date boosting. |
| |
| If you use custom Tokenizer or TokenFilter components in a chain specified in |
| schema.xml, they must support reusability. If your Tokenizer or TokenFilter |
| maintains state, it should implement reset(). If your TokenFilterFactory does |
| not return a subclass of TokenFilter, then it should implement reset() and call |
| reset() on its input TokenStream. TokenizerFactory implementations must |
| now return a Tokenizer rather than a TokenStream. |
| |
| New users of Solr 1.4 will have omitTermFreqAndPositions enabled for non-text |
| indexed fields by default, which avoids indexing term frequency, positions, and |
| payloads, making the index smaller and faster. If you are upgrading from an |
| earlier Solr release and want to enable omitTermFreqAndPositions by default, |
| change the schema version from 1.1 to 1.2 in schema.xml. Remove any existing |
| index and restart Solr to ensure that omitTermFreqAndPositions completely takes |
| effect. |
| |
| The default QParserPlugin used by the QueryComponent for parsing the "q" param |
| has been changed, to remove support for the deprecated use of ";" as a separator |
| between the query string and the sort options when no "sort" param was used. |
| Users who wish to continue using the semi-colon based method of specifying the |
| sort options should explicitly set the defType param to "lucenePlusSort" on all |
| requests. (The simplest way to do this is by specifying it as a default param |
| for your request handlers in solrconfig.xml, see the example solrconfig.xml for |
| sample syntax.) |
| |
| If spellcheck.extendedResults=true, the response format for suggestions |
| has changed, see SOLR-1071. |
| |
| Use of the "charset" option when configuring the following Analysis |
| Factories has been deprecated and will cause a warning to be logged. |
| In future versions of Solr attempting to use this option will cause an |
| error. See SOLR-1410 for more information. |
| * GreekLowerCaseFilterFactory |
| * RussianStemFilterFactory |
| * RussianLowerCaseFilterFactory |
| * RussianLetterTokenizerFactory |
| |
| Versions of Major Components |
| ---------------------------- |
| Apache Lucene 2.9.1 (r832363 on 2.9 branch) |
| Apache Tika 0.4 |
| Carrot2 3.1.0 |
| |
| Lucene Information |
| ---------------- |
| |
| Since Solr is built on top of Lucene, many people add customizations to Solr |
| that are dependent on Lucene. Please see http://lucene.apache.org/java/2_9_0/, |
| especially http://lucene.apache.org/java/2_9_0/changes/Changes.html for more |
| information on the version of Lucene used in Solr. |
| |
| Detailed Change List |
| ---------------------- |
| |
| New Features |
| ---------------------- |
| 1. SOLR-560: Use SLF4J logging API rather than JDK logging. The packaged .war file is |
| shipped with a JDK logging implementation, so logging configuration for the .war should |
| be identical to solr 1.3. However, if you are using the .jar file, you can select |
| which logging implementation to use by dropping a different binding. |
| See: http://www.slf4j.org/ (ryan) |
| |
| 2. SOLR-617: Allow configurable index deletion policy and provide a default implementation which |
| allows deletion of commit points on various criteria such as number of commits, age of commit |
| point and optimized status. |
| See http://lucene.apache.org/java/2_3_2/api/org/apache/lucene/index/IndexDeletionPolicy.html |
| (yonik, Noble Paul, Akshay Ukey via shalin) |
| |
| 3. SOLR-658: Allow Solr to load index from arbitrary directory in dataDir |
| (Noble Paul, Akshay Ukey via shalin) |
| |
| 4. SOLR-793: Add 'commitWithin' argument to the update add command. This behaves |
| similarly to the global autoCommit maxTime argument except that it is set for |
| each request. (ryan) |
| |
| 5. SOLR-670: Add support for rollbacks in UpdateHandler. This allows users to roll back all changes |
| since the last commit. (Noble Paul, koji via shalin) |
| |
| 6. SOLR-813: Adding DoubleMetaphone Filter and Factory. Similar to the PhoneticFilter, |
| but this uses DoubleMetaphone specific calls (including alternate encoding) |
| (Todd Feak via ryan) |
| |
| 7. SOLR-680: Add StatsComponent. This gets simple statistics on matched numeric fields, |
| including: min, max, mean, median, stddev. (koji, ryan) |
| |
| 7.1 SOLR-1380: Added support for multi-valued fields (Harish Agarwal via gsingers) |
| |
| 8. SOLR-561: Added Replication implemented in Java as a request handler. Supports index replication |
| as well as configuration replication and exposes detailed statistics and progress information |
| on the Admin page. Works on all platforms. (Noble Paul, yonik, Akshay Ukey, shalin) |
| |
| 9. SOLR-746: Added "omitHeader" request parameter to omit the header from the response. |
| (Noble Paul via shalin) |
| |
| 10. SOLR-651: Added TermVectorComponent for serving up term vector information, plus IDF. |
| See http://wiki.apache.org/solr/TermVectorComponent (gsingers, Vaijanath N. Rao, Noble Paul) |
| |
| 12. SOLR-795: SpellCheckComponent supports building indices on optimize if configured in solrconfig.xml |
| (Jason Rennie, shalin) |
| |
| 13. SOLR-667: An LRU cache implementation based upon ConcurrentHashMap and other techniques to reduce |
| contention and synchronization overhead, to utilize multiple CPU cores more effectively. |
| (Fuad Efendi, Noble Paul, yonik via shalin) |
| |
| 14. SOLR-465: Add configurable DirectoryProvider so that alternate Directory |
| implementations can be specified via solrconfig.xml. The default |
| DirectoryProvider will use NIOFSDirectory for better concurrency |
| on non Windows platforms. (Mark Miller, TJ Laurenzo via yonik) |
| |
| 15. SOLR-822: Add CharFilter so that characters can be filtered (e.g. character normalization) |
| before Tokenizer/TokenFilters. (koji) |
| |
| 16. SOLR-829: Allow slaves to request compressed files from master during replication |
| (Simon Collins, Noble Paul, Akshay Ukey via shalin) |
| |
| 17. SOLR-877: Added TermsComponent for accessing Lucene's TermEnum capabilities. |
| Useful for auto suggest and possibly distributed search. Not distributed search compliant. (gsingers) |
| - Added mincount and maxcount options (Khee Chin via gsingers) |
| |
| 18. SOLR-538: Add maxChars attribute for copyField function so that the length limit for destination |
| can be specified. |
| (Georgios Stamatis, Lars Kotthoff, Chris Harris via koji) |
| |
| 19. SOLR-284: Added support for extracting content from binary documents like MS Word and PDF using Apache Tika. See also contrib/extraction/CHANGES.txt (Eric Pugh, Chris Harris, yonik, gsingers) |
| |
| 20. SOLR-819: Added factories for Arabic support (gsingers) |
| |
| 21. SOLR-781: Distributed search ability to sort field.facet values |
| lexicographically. facet.sort values "true" and "false" are |
| also deprecated and replaced with "count" and "lex". |
| (Lars Kotthoff via yonik) |
| |
| 22. SOLR-821: Add support for replication to copy conf file to slave with a different name. This allows replication |
| of solrconfig.xml |
| (Noble Paul, Akshay Ukey via shalin) |
| |
| 23. SOLR-911: Add support for multi-select faceting by allowing filters to be |
| tagged and facet commands to exclude certain filters. This patch also |
| added the ability to change the output key for facets in the response, and |
| optimized distributed faceting refinement by lowering parsing overhead and |
| by making requests and responses smaller. |
| |
| 24. SOLR-876: WordDelimiterFilter now supports a splitOnNumerics |
| option, as well as a list of protected terms. |
| (Dan Rosher via hossman) |
| |
| 25. SOLR-928: SolrDocument and SolrInputDocument now implement the Map<String,?> |
| interface. This should make plugging into other standard tools easier. (ryan) |
| |
| 26. SOLR-847: Enhance the snappull command in ReplicationHandler to accept masterUrl. |
| (Noble Paul, Preetam Rao via shalin) |
| |
| 27. SOLR-540: Add support for globbing in field names to highlight. |
| For example, hl.fl=*_text will highlight all fieldnames ending with |
| _text. (Lars Kotthoff via yonik) |
| |
| 28. SOLR-906: Adding a StreamingUpdateSolrServer that writes update commands to |
| an open HTTP connection. If you are using solrj for bulk update requests |
| you should consider switching to this implementation. However, note that |
| the error handling is not immediate as it is with the standard SolrServer. |
| (ryan) |
| |
| 29. SOLR-865: Adding support for document updates in binary format and corresponding support in Solrj client. |
| (Noble Paul via shalin) |
| |
| 30. SOLR-763: Add support for Lucene's PositionFilter (Mck SembWever via shalin) |
| |
| 31. SOLR-966: Enhance the map() function query to take in an optional default value (Noble Paul, shalin) |
| |
| 32. SOLR-820: Support replication on startup of master with new index. (Noble Paul, Akshay Ukey via shalin) |
| |
| 33. SOLR-943: Make it possible to specify dataDir in solr.xml and accept the dataDir as a request parameter for |
| the CoreAdmin create command. (Noble Paul via shalin) |
| |
| 34. SOLR-850: Addition of timeouts for distributed searching. Configurable through 'shard-socket-timeout' and |
| 'shard-connection-timeout' parameters in SearchHandler. (Patrick O'Leary via shalin) |
| |
| 35. SOLR-799: Add support for hash based exact/near duplicate document |
| handling. (Mark Miller, yonik) |
| |
| 36. SOLR-1026: Add protected words support to SnowballPorterFilterFactory (ehatcher) |
| |
| 37. SOLR-739: Add support for OmitTf (Mark Miller via yonik) |
| |
| 38. SOLR-1046: Nested query support for the function query parser |
| and lucene query parser (the latter existed as an undocumented |
| feature in 1.3) (yonik) |
| |
| 39. SOLR-940: Add support for Lucene's Trie Range Queries by providing new FieldTypes in |
| schema for int, float, long, double and date. Single-valued Trie based |
| fields with a precisionStep will index multiple precisions and enable |
| faster range queries. (Uwe Schindler, yonik, shalin) |
| |
| 40. SOLR-1038: Enhance CommonsHttpSolrServer to add docs in batch using an iterator API (Noble Paul via shalin) |
| |
| 41. SOLR-844: A SolrServer implementation that front-ends multiple Solr servers and provides load balancing and failover |
| support (Noble Paul, Mark Miller, hossman via shalin) |
| |
| 42. SOLR-939: ValueSourceRangeFilter/Query - filter based on values in a FieldCache entry or on any arbitrary function of field values. (yonik) |
| |
| 43. SOLR-1095: Fixed performance problem in the StopFilterFactory and simplified code. Added tests as well. (gsingers) |
| |
| 44. SOLR-1096: Introduced httpConnTimeout and httpReadTimeout in replication slave configuration to avoid stalled |
| replication. (Jeff Newburn, Noble Paul, shalin) |
| |
| 45. SOLR-1115: <bool>on</bool> and <bool>yes</bool> work as expected in solrconfig.xml. (koji) |
| |
| 46. SOLR-1099: A FieldAnalysisRequestHandler which provides the analysis functionality of the web admin page as |
| a service. The AnalysisRequestHandler is renamed to DocumentAnalysisRequestHandler which is enhanced with |
| query analysis and showMatch support. AnalysisRequestHandler is now deprecated. Support for both |
| FieldAnalysisRequestHandler and DocumentAnalysisRequestHandler is also provided in the Solrj client. |
| (Uri Boness, shalin) |
| |
| 47. SOLR-1106: Made CoreAdminHandler Actions pluggable so that additional actions may be plugged in or the existing |
| ones can be overridden if needed. (Kay Kay, Noble Paul, shalin) |
| |
| 48. SOLR-1124: Add a top() function query that causes its argument to |
| have its values derived from the top level IndexReader, even when |
| invoked from a sub-reader. top() is implicitly used for the |
| ord() and rord() functions. (yonik) |
| |
| 49. SOLR-1110: Support sorting on trie fields with Distributed Search. (Mark Miller, Uwe Schindler via shalin) |
| |
| 50. SOLR-1121: CoreAdminHandler should not need a core. This makes it possible to start a Solr server w/o a core. (noble) |
| |
| 51. SOLR-769: Added support for clustering in contrib/clustering. See http://wiki.apache.org/solr/ClusteringComponent for more info. (gsingers, Stanislaw Osinski) |
| |
| 52. SOLR-1175: Disable/enable replication on master side. Added two commands, 'enableReplication' and 'disableReplication' (noble) |
| |
| 53. SOLR-1179: DocSets can now be used as Lucene Filters via |
| DocSet.getTopFilter() (yonik) |
| |
| 54. SOLR-1116: Add a Binary FieldType (noble) |
| |
| 55. SOLR-1051: Support the merge of multiple indexes as a CoreAdmin and an update command (Ning Li via shalin) |
| |
| 56. SOLR-1152: Snapshoot on ReplicationHandler should accept location as a request parameter (shalin) |
| |
| 57. SOLR-1204: Enhance SpellingQueryConverter to handle UTF-8 instead of ASCII only. |
| Use the NMTOKEN syntax for matching field names. |
| (Michael Ludwig, shalin) |
| |
| 58. SOLR-1189: Support providing username and password for basic HTTP authentication in Java replication |
| (Matthew Gregg, shalin) |
| |
| 59. SOLR-243: Add configurable IndexReaderFactory so that alternate IndexReader implementations |
| can be specified via solrconfig.xml. Note that using a custom IndexReader may be incompatible |
| with ReplicationHandler (see comments in SOLR-1366). This should be treated as an experimental feature. |
| (Andrzej Bialecki, hossman, Mark Miller, John Wang) |
| |
| 60. SOLR-1214: Differentiate between solr home and instanceDir. Deprecates the method SolrResourceLoader#locateInstanceDir(), |
| which is renamed to locateSolrHome (noble) |
| |
| 61. SOLR-1216: Disambiguate the replication command names: 'snappull' becomes 'fetchindex' and 'abortsnappull' becomes 'abortfetch' (noble) |
| |
| 62. SOLR-1145: Add capability to specify an infoStream log file for the underlying Lucene IndexWriter in solrconfig.xml. |
| This is an advanced debug log file that can be used to aid developers in fixing IndexWriter bugs. See the commented |
| out example in the example solrconfig.xml under the indexDefaults section. |
| (Chris Harris, Mark Miller) |
| |
| 63. SOLR-1256: Show the output of CharFilters in analysis.jsp. (koji) |
| |
| 64. SOLR-1266: Added stemEnglishPossessive option (default=true) to WordDelimiterFilter |
| that allows disabling of english possessive stemming (removal of trailing 's from tokens) |
| (Robert Muir via yonik) |
| |
| 65. SOLR-1237: firstSearcher and newSearcher can now be identified via the CommonParams.EVENT (evt) parameter |
| in a request. This allows a RequestHandler or SearchComponent to know when a newSearcher or firstSearcher |
| event happened. QuerySenderListener is the only implementation in Solr that implements this, but outside |
| implementations may wish to. See the AbstractSolrEventListener for a helper method. (gsingers) |
| |
| 66. SOLR-1343: Added HTMLStripCharFilter and marked HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and |
| HTMLStripStandardTokenizerFactory deprecated. To strip HTML tags, HTMLStripCharFilter can be used |
| with an arbitrary Tokenizer. (koji) |
| |
| 67. SOLR-1275: Add expungeDeletes to DirectUpdateHandler2 (noble) |
| |
| 68. SOLR-1372: Enhance FieldAnalysisRequestHandler to accept field value from content stream (ehatcher) |
| |
| 69. SOLR-1370: Show the output of CharFilters in FieldAnalysisRequestHandler (koji) |
| |
| 70. SOLR-1373: Add Filter query to admin/form.jsp |
| (Jason Rutherglen via hossman) |
| |
| 71. SOLR-1368: Add ms() function query for getting milliseconds from dates and for |
| high precision date subtraction, add sub() for subtracting other arguments. |
| (yonik) |
| |
| 72. SOLR-1156: Sort TermsComponent results by frequency (Matt Weber via yonik) |
| |
| 73. SOLR-1335 : load core properties from a properties file (noble) |
| |
| 74. SOLR-1385 : Add an 'enable' attribute to all plugins (noble) |
| |
| 75. SOLR-1414 : implicit core properties are not set for single core (noble) |
| |
| 76. SOLR-659 : Adds shards.start and shards.rows to distributed search |
| to allow more efficient bulk queries (those that retrieve many or all |
| documents). (Brian Whitman via yonik) |
| |
| 77. SOLR-1321: Add better support for efficient wildcard handling (Andrzej Bialecki, Robert Muir, gsingers) |
| |
| 78. SOLR-1326 : New interface PluginInfoInitialized for all types of plugin (noble) |
| |
| 79. SOLR-1447 : Simple property injection. <mergePolicy> & <mergeScheduler> syntaxes are now deprecated |
| (Jason Rutherglen, noble) |
| |
| 80. SOLR-908 : CommonGramsFilterFactory/CommonGramsQueryFilterFactory for |
| speeding up phrase queries containing common words by indexing |
| n-grams and using them at query time. |
| (Tom Burton-West, Jason Rutherglen via yonik) |
| |
| 81. SOLR-1292: Add FieldCache introspection to stats.jsp and JMX Monitoring via |
| a new SolrFieldCacheMBean. (hossman) |
| |
| 82. SOLR-1167: Solr Config now supports XInclude for XML engines that can support it. (Bryan Talbot via gsingers) |
| |
| 83. SOLR-1478: Enable sort by Lucene docid. (ehatcher) |
| |
| 84. SOLR-1449: Add <lib> elements to solrconfig.xml for specifying additional |
| classpath directories and regular expressions. (hossman via yonik) |
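| |
| As a rough illustration of entry 4 above (SOLR-793), the sketch below builds an XML |
| add message carrying a per-request commitWithin value in milliseconds. It assumes the |
| argument is written as an attribute on the <add> element. The helper name |
| build_add_command and the sample field names are hypothetical, and the message is |
| only constructed here, not sent to a server. |

```python
import xml.etree.ElementTree as ET


def build_add_command(docs, commit_within_ms=None):
    """Build an XML <add> update message from a list of {field: value} dicts."""
    add = ET.Element("add")
    if commit_within_ms is not None:
        # Assumption: commitWithin is an attribute of <add>, in milliseconds,
        # and applies to this request only (unlike the global autoCommit maxTime).
        add.set("commitWithin", str(commit_within_ms))
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; ask for a commit within 10 seconds of this add.
add_xml = build_add_command(
    [{"id": "05991", "office": "Bridgewater"}], commit_within_ms=10000
)
print(add_xml)
```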
| |
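| For entry 23 above (SOLR-911), a minimal sketch of a multi-select faceting request: |
| one filter is tagged, and the facet command excludes that tag and renames its output |
| key. The {!tag=...}, {!ex=...} and key=... local-parameter syntax is taken from the |
| Solr faceting documentation rather than from this entry, and the field name doctype, |
| the tag dt and the key types are hypothetical. |

```python
from urllib.parse import urlencode

# Request parameters as (name, value) pairs; order is preserved by urlencode.
params = [
    ("q", "*:*"),
    # Tag the filter so that facet commands can refer to it by name.
    ("fq", "{!tag=dt}doctype:pdf"),
    ("facet", "true"),
    # Exclude the tagged filter when counting this facet, and change the
    # output key for the facet in the response to "types".
    ("facet.field", "{!ex=dt key=types}doctype"),
]

query_string = urlencode(params)
print(query_string)
```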
| |
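| Entry 39 above (SOLR-940) notes that a trie field with a precisionStep indexes |
| multiple precisions. The sketch below is a simplified picture of that idea, assuming |
| each additional precision is the value with precisionStep more low-order bits |
| dropped. Lucene's real term encoding (sign handling, prefix coding) is more involved |
| and is not reproduced here; trie_terms is a hypothetical name. |

```python
def trie_terms(value, precision_step, bits=32):
    """Return (shift, prefix) pairs, one per precision that would be indexed."""
    # Treat the value as an unsigned integer of the given width (a simplification).
    unsigned = value & ((1 << bits) - 1)
    # One term per shift: 0, precision_step, 2 * precision_step, ... below `bits`.
    return [(shift, unsigned >> shift) for shift in range(0, bits, precision_step)]


for shift, prefix in trie_terms(1000, precision_step=8):
    print(shift, prefix)
```

| With precision_step=8, the value 1000 yields the pairs (0, 1000), (8, 3), (16, 0) and |
| (24, 0). A smaller step produces more terms per value, which is the trade-off behind |
| the faster range queries mentioned in the entry. |
| |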
| Optimizations |
| ---------------------- |
| 1. SOLR-374: Use IndexReader.reopen to save resources by re-using parts of the |
| index that haven't changed. (Mark Miller via yonik) |
| |
| 2. SOLR-808: Write string keys in Maps as extern strings in the javabin format. (Noble Paul via shalin) |
| |
| 3. SOLR-475: New faceting method with better performance and smaller memory usage for |
| multi-valued fields with many unique values but relatively few values per document. |
| Controllable via the facet.method parameter - "fc" is the new default method and "enum" |
| is the original method. (yonik) |
| |
| 4. SOLR-970: Use an ArrayList in SolrPluginUtils.parseQueryStrings |
| since we know exactly how long the List will be in advance. |
| (Kay Kay via hossman) |
| |
| 5. SOLR-1002: Change SolrIndexSearcher to use insertWithOverflow |
| with reusable priority queue entries to reduce the amount of |
| generated garbage during searching. (Mark Miller via yonik) |
| |
| 6. SOLR-971: Replace StringBuffer with StringBuilder for instances that do not require thread-safety. |
| (Kay Kay via shalin) |
| |
| 7. SOLR-921: SolrResourceLoader must cache short class name vs fully qualified classname |
| (Noble Paul, hossman via shalin) |
| |
| 8. SOLR-973: CommonsHttpSolrServer writes the xml directly to the server. |
| (Noble Paul via shalin) |
| |
| 9. SOLR-1108: Remove un-needed synchronization in SolrCore constructor. |
| (Noble Paul via shalin) |
| |
| 10. SOLR-1166: Speed up docset/filter generation by avoiding top-level |
| score() call and iterating over leaf readers with TermDocs. (yonik) |
| |
| 11. SOLR-1169: SortedIntDocSet - a new small set implementation |
| that saves memory over HashDocSet, is faster to construct, |
| is ordered for easier implementation of skipTo, and is faster |
| in the general case. (yonik) |
| |
| 12. SOLR-1165: Use Lucene Filters and pass them down to the Lucene |
| search methods to filter earlier and improve performance. (yonik) |
| |
| 13. SOLR-1111: Use per-segment sorting to share fieldcache elements |
| across unchanged segments. This saves memory and reduces |
| commit times for incremental updates to the index. (yonik) |
| |
| 14. SOLR-1188: Minor efficiency improvement in TermVectorComponent related to ignoring positions or offsets (gsingers) |
| |
| 15. SOLR-1150: Load Documents for Highlighting one at a time rather than |
| all at once to avoid OOM with many large Documents. (Siddharth Gargate via Mark Miller) |
| |
| 16. SOLR-1353: Implement and use reusable token streams for analysis. (Robert Muir, yonik) |
| |
| 17. SOLR-1296: Enables setting IndexReader's termInfosIndexDivisor via a new attribute to StandardIndexReaderFactory. Enables |
| setting termIndexInterval to IndexWriter via SolrIndexConfig. (Jason Rutherglen, hossman, gsingers) |
| |
| Bug Fixes |
| ---------------------- |
| 1. SOLR-774: Fixed logging level display (Sean Timm via Otis Gospodnetic) |
| |
| 2. SOLR-771: CoreAdminHandler STATUS should display 'normalized' paths (koji, hossman, shalin) |
| |
| 3. SOLR-532: WordDelimiterFilter now respects payloads and other attributes of the original Token by |
| using Token.clone() (Tricia Williams, gsingers) |
| |
| 4. SOLR-805: DisMax queries are not being cached in QueryResultCache (Todd Feak via koji) |
| |
| 5. SOLR-751: WordDelimiterFilter didn't adjust the start offset of single |
| tokens that started with delimiters, leading to incorrect highlighting. |
| (Stefan Oestreicher via yonik) |
| |
| 7. SOLR-843: SynonymFilterFactory cannot handle multiple synonym files correctly (koji) |
| |
| 8. SOLR-840: BinaryResponseWriter does not handle incompatible data in fields (Noble Paul via shalin) |
| |
| 9. SOLR-803: CoreAdminRequest.createCore fails because name parameter isn't set (Sean Colombo via ryan) |
| |
| 10. SOLR-869: Fix file descriptor leak in SolrResourceLoader#getLines (Mark Miller, shalin) |
| |
| 11. SOLR-872: Better error message for incorrect copyField destination (Noble Paul via shalin) |
| |
| 12. SOLR-879: Enable position increments in the query parser and fix the |
| example schema to enable position increments for the stop filter in |
| both the index and query analyzers to fix the bug with phrase queries |
| with stopwords. (yonik) |
| |
| 13. SOLR-836: Add missing "a" to the example stopwords.txt (yonik) |
| |
| 14. SOLR-892: Fix serialization of booleans for PHPSerializedResponseWriter |
| (yonik) |
| |
| 15. SOLR-898: Fix null pointer exception for the JSON response writer |
| based formats when nl.json=arrarr with null keys. (yonik) |
| |
| 16. SOLR-901: FastOutputStream ignores write(byte[]) call. (Noble Paul via shalin) |
| |
| 17. SOLR-807: BinaryResponseWriter writes fieldType.toExternal if it is not a supported type, |
| otherwise it writes fieldType.toObject. This fixes the bug with encoding/decoding UUIDField. |
| (koji, Noble Paul, shalin) |
| |
| 18. SOLR-863: SolrCore.initIndex should close the directory it gets for clearing the lock and |
| use the DirectoryFactory. (Mark Miller via shalin) |
| |
| 19. SOLR-802: Fix a potential null pointer error in the distributed FacetComponent |
| (David Bowen via ryan) |
| |
| 20. SOLR-346: Use perl regex to improve accuracy of finding latest snapshot in snapinstaller (billa) |
| |
| 21. SOLR-830: Use perl regex to improve accuracy of finding latest snapshot in snappuller (billa) |
| |
| 22. SOLR-897: Fixed Argument list too long error when there are lots of snapshots/backups (Dan Rosher via billa) |
| |
| 23. SOLR-925: Fixed highlighting on fields with multiValued="true" and termOffsets="true" (koji) |
| |
| 24. SOLR-902: FastInputStream#read(byte b[], int off, int len) gives incorrect results when amount left to read is less |
| than buffer size (Noble Paul via shalin) |
| |
| 25. SOLR-978: Old files are not removed from slaves after replication (Jaco, Noble Paul, shalin) |
| |
| 26. SOLR-883: Implicit properties are not set for Cores created through CoreAdmin (Noble Paul via shalin) |
| |
| 27. SOLR-991: Better error message when parsing solrconfig.xml fails due to malformed XML. Error message notes the name |
| of the file being parsed. (Michael Henson via shalin) |
| |
| 28. SOLR-1008: Fix stats.jsp XML encoding for <stat> item entries with ampersands in their names. (ehatcher) |
| |
| 29. SOLR-976: deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a <delete>. |
| Now both delete by id and delete by query can be specified at the same time as follows. (koji) |
| <delete> |
| <id>05991</id><id>06000</id> |
| <query>office:Bridgewater</query><query>office:Osaka</query> |
| </delete> |
| |
| 30. SOLR-1016: HTTP 503 error changes to 500 in SolrCore (koji) |
| |
| 31. SOLR-1015: Incomplete information in replication admin page and http command response when server |
| is both master and slave i.e. when server is a repeater (Akshay Ukey via shalin) |
| |
| 32. SOLR-1018: Slave is unable to replicate when server acts as repeater (as both master and slave) |
| (Akshay Ukey, Noble Paul via shalin) |
| |
| 33. SOLR-1031: Fix XSS vulnerability in schema.jsp (Paul Lovvik via ehatcher) |
| |
| 34. SOLR-1064: registry.jsp incorrectly displaying info for last core initialized |
| regardless of what the current core is. (hossman) |
| |
| 35. SOLR-1072: absolute paths used in sharedLib attribute were |
| incorrectly treated as relative paths. (hossman) |
| |
| 36. SOLR-1104: Fix some rounding errors in LukeRequestHandler's histogram (hossman) |
| |
| 37. SOLR-1125: Use query analyzer rather than index analyzer for queryFieldType in QueryElevationComponent |
| (koji) |
| |
| 38. SOLR-1126: Replicated files have incorrect timestamp (Jian Han Guo, Jeff Newburn, Noble Paul via shalin) |
| |
| 39. SOLR-1094: Incorrect value of correctlySpelled attribute in some cases (David Smiley, Mark Miller via shalin) |
| |
| 40. SOLR-965: Better error message when <pingQuery> is not configured. |
| (Mark Miller via hossman) |
| |
| 41. SOLR-1135: Java replication creates Snapshot in the directory where Solr was launched (Jianhan Guo via shalin) |
| |
| 42. SOLR-1138: Query Elevation Component now gracefully handles missing queries. (gsingers) |
| |
| 43. SOLR-929: LukeRequestHandler should return "dynamicBase" only if the field is dynamic. |
| (Peter Wolanin, koji) |
| |
| 44. SOLR-1141: NullPointerException during snapshoot command in java based replication (Jian Han Guo, shalin) |
| |
| 45. SOLR-1078: Fixes to WordDelimiterFilter to avoid splitting or dropping |
| international non-letter characters such as non spacing marks. (yonik) |
| |
| 46. SOLR-825, SOLR-1221: Enables highlighting for range/wildcard/fuzzy/prefix queries if using hl.usePhraseHighlighter=true |
| and hl.highlightMultiTerm=true. Also make both options default to true. (Mark Miller, yonik) |
| |
| 47. SOLR-1174: Fix Logging admin form submit url for multicore. (Jacob Singh via shalin) |
| |
| 48. SOLR-1182: Fix bug in OrdFieldSource#equals which could cause a bug with OrdFieldSource caching |
| on OrdFieldSource#hashcode collisions. (Mark Miller) |
| |
| 49. SOLR-1207: equals method should compare this and other of DocList in DocSetBase (koji) |
| |
| 50. SOLR-1242: Human readable JVM info from system handler does integer cutoff rounding, even when dealing |
| with GB. Fixed to round to one decimal place. (Jay Hill, Mark Miller) |
| |
| 51. SOLR-1243: Admin RequestHandlers should not be cached over HTTP. (Mark Miller) |
| |
| 52. SOLR-1260: Fix implementations of set operations for DocList subclasses |
| and fix a bug in HashDocSet construction when offset != 0. These bugs |
| never manifested in normal Solr use and only potentially affect |
| custom code. (yonik) |
| |
| 53. SOLR-1171: Fix LukeRequestHandler so it doesn't rely on SolrQueryParser |
| and report incorrect stats when field names contain characters |
| SolrQueryParser considers special. |
| (hossman) |
| |
| 54. SOLR-1317: Fix CapitalizationFilterFactory to work when keep parameter is not specified. |
| (ehatcher) |
| |
| 55. SOLR-1342: CapitalizationFilterFactory uses incorrect term length calculations. |
| (Robert Muir via Mark Miller) |
| |
| 56. SOLR-1359: DoubleMetaphoneFilter didn't index original tokens if there was no |
| alternative, and could incorrectly skip or reorder tokens. (yonik) |
| |
| 57. SOLR-1360: Prevent PhoneticFilter from producing duplicate tokens. (yonik) |
| |
| 58. SOLR-1371: LukeRequestHandler/schema.jsp errored if schema had no |
| uniqueKey field. The new test for this also (hopefully) adds some |
| future proofing against similar bugs in the future. As a side |
| effect QueryElevationComponentTest was refactored, and a bug in |
| that test was found. (hossman) |
| |
| 59. SOLR-914: General finalize() improvements. No finalizer delegates |
| to the respective close/destroy method w/o first checking if it's |
| already been closed/destroyed; if it hasn't, a SEVERE error is |
| logged first. (noble, hossman) |
| |
| 60. SOLR-1362: WordDelimiterFilter had inconsistent behavior when setting |
| the position increment of tokens following a token consisting of all |
| delimiters, and could additionally lose big position increments. |
| (Robert Muir, yonik) |
| |
| 61. SOLR-1091: Jetty's use of CESU-8 for code points outside the BMP |
| resulted in invalid output from the serialized PHP writer. (yonik) |
| |
| 62. SOLR-1103: LukeRequestHandler (and schema.jsp) have been fixed to |
| include the "1" (ie: 2**0) bucket in the term histogram data. |
| (hossman) |
| |
| 63. SOLR-1398: Add offset corrections in PatternTokenizerFactory. |
| (Anders Melchiorsen, koji) |
| |
| 64. SOLR-1400: Properly handle zero-length tokens in TrimFilter. This |
| was not a bug in any released version. (Peter Wolanin, gsingers) |
| |
| 65. SOLR-1071: spellcheck.extendedResults returns an invalid JSON response |
| when count > 1. To fix, the extendedResults format was changed. |
| (Uri Boness, yonik) |
| |
| 66. SOLR-1381: Fixed improper handling of fields that have only term positions and not term offsets during Highlighting (Thorsten Fischer, gsingers) |
| |
| 67. SOLR-1427: Fixed registry.jsp issue with MBeans (gsingers) |
| |
| 68. SOLR-1468: SolrJ's XML response parsing threw an exception for null |
| names, such as those produced when facet.missing=true (yonik) |
| |
| 69. SOLR-1471: Fixed issue with calculating missing values for facets in single valued cases in Stats Component. |
| This is not correctly calculated for the multivalued case. (James Miller, gsingers) |
| |
| 70. SOLR-1481: Fixed omitHeader parameter for PHP ResponseWriter. (Jun Ohtani via billa) |
| |
| 71. SOLR-1448: Add weblogic.xml to solr webapp to enable correct operation in |
| WebLogic. (Ilan Rabinovitch via yonik) |
| |
| 72. SOLR-1504: empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co. |
| (koji) |
| |
| 73. SOLR-1394: HTMLStripCharFilter split tokens that contained entities and |
| often calculated offsets incorrectly for entities. |
| (Anders Melchiorsen via yonik) |
| |
| 74. SOLR-1517: Admin pages could stall waiting for localhost name resolution |
| if reverse DNS wasn't configured; this was changed so the DNS resolution |
| is attempted only once the first time an admin page is loaded. |
| (hossman) |
| |
| 75. SOLR-1529: More than 8 deleteByQuery commands in a single request |
| caused an error to be returned, although the deletes were |
| still executed. (asmodean via yonik) |
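| |
| The combined delete message shown in entry 29 above (SOLR-976) can be generated |
| programmatically. A minimal sketch, reusing the ids and queries from that entry; |
| build_delete_command is a hypothetical helper and nothing is sent to a server. |

```python
import xml.etree.ElementTree as ET


def build_delete_command(ids=(), queries=()):
    """Build a <delete> message mixing delete-by-id and delete-by-query."""
    delete = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(delete, "id").text = doc_id
    for query in queries:
        ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


delete_xml = build_delete_command(
    ids=["05991", "06000"],
    queries=["office:Bridgewater", "office:Osaka"],
)
print(delete_xml)
```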
| |
| Other Changes |
| ---------------------- |
| 1. Upgraded to Lucene 2.4.0 (yonik) |
| |
| 2. SOLR-805: Upgraded to Lucene 2.9-dev (r707499) (koji) |
| |
| 3. DumpRequestHandler (/debug/dump): changed 'fieldName' to 'sourceInfo'. (ehatcher) |
| |
| 4. SOLR-852: Refactored common code in CSVRequestHandler and XMLUpdateRequestHandler (gsingers, ehatcher) |
| |
| 5. SOLR-871: Removed dependency on stax-utils.jar. If you are using solr.jar and running |
| Java 6, you can also remove woodstox and geronimo. (ryan) |
| |
| 6. SOLR-465: Upgraded to Lucene 2.9-dev (r719351) (shalin) |
| |
| 7. SOLR-889: Upgraded to commons-io-1.4.jar and commons-fileupload-1.2.1.jar (ryan) |
| |
| 8. SOLR-875: Upgraded to Lucene 2.9-dev (r723985) and consolidated the BitSet implementations (Michael Busch, gsingers) |
| |
| 9. SOLR-819: Upgraded to Lucene 2.9-dev (r724059) to get access to Arabic public constructors (gsingers) |
| |
| 10. SOLR-900: Moved solrj into /src/solrj. The contents of solr-common.jar is now included |
| in the solr-solrj.jar. (ryan) |
| |
| 11. SOLR-924: Code cleanup: make all existing finalize() methods call |
| super.finalize() in a finally block. All current instances extend |
| Object, so this doesn't fix any bugs, but helps protect against |
| future changes. (Kay Kay via hossman) |
| |
| 12. SOLR-885: NamedListCodec is renamed to JavaBinCodec and returns Object instead of NamedList. |
| (Noble Paul, yonik via shalin) |
| |
| 13. SOLR-84: Use new Solr logo in admin (Michiel via koji) |
| |
| 14. SOLR-981: groupId for Woodstox dependency in maven solrj changed to org.codehaus.woodstox (Tim Taranov via shalin) |
| |
| 15. Upgraded to Lucene 2.9-dev r738218 (yonik) |
| |
| 16. SOLR-959: Refactored TestReplicationHandler to remove hardcoded port numbers (hossman, Akshay Ukey via shalin) |
| |
| 17. Upgraded to Lucene 2.9-dev r742220 (yonik) |
| |
| 18. SOLR-1022: Better "ignored" field in example schema.xml (Peter Wolanin via hossman) |
| |
| 19. SOLR-967: New type-safe constructor for NamedList (Kay Kay via hossman) |
| |
| 20. SOLR-1036: Change default QParser from "lucenePlusSort" to "lucene" to |
| reduce confusion of semicolon splitting behavior when no sort param is |
| specified (hossman) |
| |
| 21. Upgraded to Lucene 2.9-dev r752164 (shalin) |
| |
| 22. SOLR-1068: Use fsync on replicated index and configuration files (yonik, Noble Paul, shalin) |
| |
| 23. SOLR-952: Cleanup duplicated code in deprecated HighlightingUtils (hossman) |
| |
| 24. Upgraded to Lucene 2.9-dev r764281 (shalin) |
| |
| 25. SOLR-1079: Rename omitTf to omitTermFreqAndPositions (shalin) |
| |
| 26. SOLR-804: Added Lucene's misc contrib JAR (rev 764281). (gsingers) |
| |
| 27. Upgraded to Lucene 2.9-dev r768228 (shalin) |
| |
| 28. Upgraded to Lucene 2.9-dev r768336 (shalin) |
| |
| 29. SOLR-997: Wait for a longer time for slave to complete replication in TestReplicationHandler |
| (Mark Miller via shalin) |
| |
| 30. SOLR-748: FacetComponent helper classes are made public as an experimental API. |
| (Wojtek Piaseczny via shalin) |
| |
| 31. Upgraded to Lucene 2.9-dev r773862 (Mark Miller) |
| |
| 32. Upgraded to Lucene 2.9-dev r776177 (shalin) |
| |
| 33. SOLR-1149: Made QParserPlugin and related classes extendible as an experimental API. |
| (Kaktu Chakarabati via shalin) |
| |
| 34. Upgraded to Lucene 2.9-dev r779312 (yonik) |
| |
| 35. SOLR-786: Refactor DisMaxQParser to allow overriding certain features of DisMaxQParser |
| (Wojciech Biela via shalin) |
| |
| 36. SOLR-458: Add equals and hashCode methods to NamedList (Stefan Rinner, shalin) |
| |
| 37. SOLR-1184: Add option in solrconfig to open a new IndexReader rather than |
| using reopen. Done mainly as a fail-safe in the case that a user runs into |
| a reopen bug/issue. (Mark Miller) |
| |
| 38. SOLR-1215: Use double quotes to enclose attributes in solr.xml (noble) |
| |
| 39. SOLR-1151: add dynamic copy field and maxChars example to example schema.xml. |
| (Peter Wolanin, Mark Miller) |
| |
| 40. SOLR-1233: remove /select?qt=/whatever restriction on /-prefixed request handlers. |
| (ehatcher) |
| |
| 41. SOLR-1257: logging.jsp has been removed and now passes through to the |
| hierarchical log level tool added in Solr 1.3. Users still |
| hitting "/admin/logging.jsp" should switch to "/admin/logging". |
| (hossman) |
| |
| 42. Upgraded to Lucene 2.9-dev r794238. Other changes include: |
| LUCENE-1614 - Use Lucene's DocIdSetIterator.NO_MORE_DOCS as the sentinel value. |
| LUCENE-1630 - Add acceptsDocsOutOfOrder method to Collector implementations. |
| LUCENE-1673, LUCENE-1701 - Trie has moved to Lucene core and renamed to NumericRangeQuery. |
| LUCENE-1662, LUCENE-1687 - Replace usage of ExtendedFieldCache by FieldCache. |
| (shalin) |
| |
| 42. SOLR-1241: Solr's CharFilter has been moved to Lucene. Remove CharFilter and related classes |
| from Solr and use Lucene's corresponding code (koji via shalin) |
| |
| 43. SOLR-1261: Lucene trunk renamed RangeQuery & Co to TermRangeQuery (Uwe Schindler via shalin) |
| |
| 44. Upgraded to Lucene 2.9-dev r801856 (Mark Miller) |
| |
| 45. SOLR-1276: Added StatsComponentTest (Rafał Kuć, gsingers) |
| |
| 46. SOLR-1377: The TokenizerFactory API has changed to explicitly return a Tokenizer |
| rather than a TokenStream (that may or may not be a Tokenizer). This change |
| is required to take advantage of the Token reuse improvements in Lucene 2.9. (ryan) |
| |
| 47. SOLR-1410: Log a warning if the deprecated charset option is used |
| on GreekLowerCaseFilterFactory, RussianStemFilterFactory, |
| RussianLowerCaseFilterFactory or RussianLetterTokenizerFactory. |
| (Robert Muir via hossman) |
| |
| 48. SOLR-1423: Due to LUCENE-1906, Solr's tokenizer should use Tokenizer.correctOffset() instead of CharStream.correctOffset(). |
| (Uwe Schindler via koji) |
| |
| 49. SOLR-1319, SOLR-1345: Upgrade Solr Highlighter classes to new Lucene Highlighter API. This upgrade has |
| resulted in a back compat break in the DefaultSolrHighlighter class - getQueryScorer is no longer |
| protected. If you happened to be overriding that method in custom code, override getHighlighter instead. |
| Also, HighlightingUtils#getQueryScorer has been removed as it was deprecated and backcompat has been |
| broken with it anyway. (Mark Miller) |
| |
| 50. SOLR-1357: SolrInputDocument cannot process dynamic fields (Lars Grote via noble) |
| |
| Build |
| ---------------------- |
| 1. SOLR-776: Added in ability to sign artifacts via Ant for releases (gsingers) |
| |
| 2. SOLR-854: Added run-example target (Mark Miller via ehatcher) |
| |
| 3. SOLR-1054: Fix dist-src target for DataImportHandler (Ryuuichi Kumai via shalin) |
| |
| 4. SOLR-1219: Added proxy.setup target (koji) |
| |
| 5. SOLR-1386: In build.xml, use longfile="gnu" in tar task to avoid warnings about long file names |
| (Mark Miller via shalin) |
| |
| 6. SOLR-1441: Make it possible to run all tests in a package (shalin) |
| |
| |
| Documentation |
| ---------------------- |
| 1. SOLR-789: The javadoc of RandomSortField is not readable (Nicolas Lalevée via koji) |
| |
| 2. SOLR-962: Note about null handling in ModifiableSolrParams.add javadoc |
| (Kay Kay via hossman) |
| |
| 3. SOLR-1409: Added Solr Powered By Logos |
| |
| ================== Release 1.3.0 ================== |
| |
| Upgrading from Solr 1.2 |
| ----------------------- |
| IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves |
| should be upgraded before the master! If the master were to be updated |
| first, the older searchers would not be able to read the new index format. |
| |
| The Porter snowball based stemmers in Lucene were updated (LUCENE-1142), |
| and are not guaranteed to be backward compatible at the index level |
| (the stem of certain words may have changed). Re-indexing is recommended. |
| |
| Older Apache Solr installations can be upgraded by replacing |
| the relevant war file with the new version. No changes to configuration |
| files should be needed. |
| |
| This version of Solr contains a new version of Lucene implementing |
| an updated index format. This version of Solr/Lucene can still read |
| and update indexes in the older formats, and will convert them to the new |
| format on the first index change. Be sure to back up your index before |
| upgrading in case you need to downgrade. |
| |
| Solr now recognizes HTTP Request headers related to HTTP Caching (see |
| RFC 2616 sec13) and will by default respond with "304 Not Modified" |
| when appropriate. This should only affect users who access Solr via |
| an HTTP Cache, or via a Web-browser that has an internal cache, but if |
| you wish to suppress this behavior an '<httpCaching never304="true"/>' |
| option can be added to your solrconfig.xml. See the wiki (or the |
| example solrconfig.xml) for more details... |
| http://wiki.apache.org/solr/SolrConfigXml#HTTPCaching |
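
The decision described above can be sketched as follows. This is a minimal illustration of conditional-GET handling, not Solr's actual implementation: the class name, the method name, and the ETag value are hypothetical, and the sketch only compares a single If-None-Match value (real requests may carry a list of validators or "*"). The never304 flag is assumed to mirror the '<httpCaching never304="true"/>' option.

```java
public class ConditionalGetSketch {

    /**
     * Decide which HTTP status to send for a GET request.
     *
     * @param ifNoneMatch value of the request's If-None-Match header, or null if absent
     * @param currentEtag validator for the current index state (hypothetical value)
     * @param never304    assumed to mirror never304="true": always send a full response
     */
    public static int statusFor(String ifNoneMatch, String currentEtag, boolean never304) {
        if (never304) {
            return 200; // caching-aware responses suppressed by configuration
        }
        if (ifNoneMatch != null && ifNoneMatch.equals(currentEtag)) {
            return 304; // the client's cached copy is still current
        }
        return 200; // no validator sent, or the index has changed
    }

    public static void main(String[] args) {
        String etag = "\"index-v42\""; // hypothetical validator
        System.out.println(statusFor(etag, etag, false));
        System.out.println(statusFor("\"index-v41\"", etag, false));
        System.out.println(statusFor(etag, etag, true));
    }
}
```

A client that never sends validators, or a server configured to suppress 304 responses, always takes the full-response path.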
| |
| In Solr 1.2, DateField did not enforce the canonical representation of |
| the ISO 8601 format when parsing incoming data, and did not generate |
| the canonical format when generating dates from "Date Math" strings |
| (particularly as it pertains to milliseconds ending in trailing zeros) |
| -- As a result equivalent dates could not always be compared properly. |
| This problem is corrected in Solr 1.3, but DateField users that might |
| have been affected by indexing inconsistent formats of equivalent |
| dates (e.g., 1995-12-31T23:59:59Z vs 1995-12-31T23:59:59.000Z) may want |
| to consider reindexing to correct these inconsistencies. Users who |
| depend on some of the "broken" behavior of DateField in Solr 1.2 |
| (specifically: accepting any input that ends in a 'Z') should consider |
| using the LegacyDateField class as a possible alternative. Users that |
| desire 100% backwards compatibility should consider using the Solr 1.2 |
| version of DateField. |
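
To illustrate why the two spellings of the same instant need to be normalized, here is a minimal sketch of a canonicalizer. It assumes that "canonical" means fractional seconds carry no trailing zeros and are dropped entirely when they are all zero, as the example dates above imply. The class and method names are hypothetical, and this is not the DateField code itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CanonicalDateSketch {

    // Date-time with optional fractional seconds, in UTC ("Z" suffix).
    private static final Pattern ISO_UTC =
        Pattern.compile("(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2})(?:\\.(\\d+))?Z");

    /** Strip trailing zeros from the fractional seconds; drop the fraction if it is all zeros. */
    public static String canonicalize(String input) {
        Matcher m = ISO_UTC.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an ISO 8601 UTC timestamp: " + input);
        }
        String seconds = m.group(1);
        String fraction = m.group(2);
        if (fraction == null) {
            return seconds + "Z";
        }
        String trimmed = fraction.replaceAll("0+$", "");
        return trimmed.isEmpty() ? seconds + "Z" : seconds + "." + trimmed + "Z";
    }

    public static void main(String[] args) {
        // Both spellings of the same instant map to one string, so they compare equal.
        System.out.println(canonicalize("1995-12-31T23:59:59Z"));
        System.out.println(canonicalize("1995-12-31T23:59:59.000Z"));
    }
}
```

Normalizing values this way before indexing is one way to make equivalent dates compare equal as strings; reindexing remains the fix for data that was already indexed inconsistently.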
| |
| Due to some changes in the lifecycle of TokenFilterFactories, users of |
| Solr 1.2 who have written Java code which constructs new instances of |
| StopFilterFactory, SynonymFilterFactory, or EnglishPorterFilterFactory |
| will need to modify their code by adding a line like the following |
| prior to using the factory object... |
| factory.inform(SolrCore.getSolrCore().getSolrConfig().getResourceLoader()); |
| These lifecycle changes do not affect people who use Solr "out of the |
| box" or who have developed their own TokenFilterFactory plugins. More |
| info can be found in SOLR-594. |
| |
| The python client that used to ship with Solr is no longer included in |
| the distribution (see client/python/README.txt). |
| |
| Detailed Change List |
| -------------------- |
| |
| New Features |
| 1. SOLR-69: Adding MoreLikeThisHandler to search for similar documents using |
| lucene contrib/queries MoreLikeThis. MoreLikeThis is also available from |
| the StandardRequestHandler using ?mlt=true. (bdelacretaz, ryan) |
| |
| 2. SOLR-253: Adding KeepWordFilter and KeepWordFilterFactory. A TokenFilter |
| that keeps tokens with text in the registered keeplist. This behaves like |
| the inverse of StopFilter. (ryan) |
| |
| 3. SOLR-257: WordDelimiterFilter has a new parameter splitOnCaseChange, |
| which can be set to 0 to disable splitting "PowerShot" => "Power" "Shot". |
| (klaas) |
| |
| 4. SOLR-193: Adding SolrDocument and SolrInputDocument to represent documents |
| outside of the lucene Document infrastructure. This class will be used |
| by clients and for processing documents. (ryan) |
| |
| 5. SOLR-244: Added ModifiableSolrParams - a SolrParams implementation that |
| helps you change values after initialization. (ryan) |
| |
| 6. SOLR-20: Added a java client interface with two implementations. One |
| implementation uses commons httpclient to connect to solr via HTTP. The |
| other connects to solr directly. Check client/java/solrj. This addition |
| also includes tests that start jetty and test a connection using the full |
| HTTP request cycle. (Darren Erik Vengroff, Will Johnson, ryan) |
| |
| 7. SOLR-133: Added StaxUpdateRequestHandler that uses StAX for XML parsing. |
| This implementation has much better error checking and lets you configure |
| a custom UpdateRequestProcessor that can selectively process update |
| requests depending on the request attributes. This class will likely |
| replace XmlUpdateRequestHandler. (Thorsten Scherler, ryan) |
| |
| 8. SOLR-264: Added RandomSortField, a utility field with a random sort order. |
| The seed is based on a hash of the field name, so a dynamic field |
| of this type is useful for generating different random sequences. |
| This field type should only be used for sorting or as a value source |
| in a FunctionQuery (ryan, hossman, yonik) |
| |
| 9. SOLR-266: Adding show=schema to LukeRequestHandler to show the parsed |
| schema fields and field types. (ryan) |
| |
| 10. SOLR-133: The UpdateRequestHandler now accepts multiple delete options |
| within a single request. For example, sending: |
| <delete><id>1</id><id>2</id></delete> will delete both 1 and 2. (ryan) |
| |
| 11. SOLR-269: Added UpdateRequestProcessor plugin framework. This provides |
| a reasonable place to process documents after they are parsed and |
| before they are committed to the index. This is a good place for custom |
| document manipulation or document based authorization. (yonik, ryan) |
| |
| 12. SOLR-260: Converting to a standard PluginLoader framework. This reworks |
| RequestHandlers, FieldTypes, and QueryResponseWriters to share the same |
| base code for loading and initializing plugins. This adds a new |
| configuration option to define the default RequestHandler and |
| QueryResponseWriter in XML using default="true". (ryan) |
| |
| 13. SOLR-225: Enable pluggable highlighting classes. Allow configurable |
| highlighting formatters and Fragmenters. (ryan) |
| |
| 14. SOLR-273/376/452/516: Added hl.maxAnalyzedChars highlighting parameter, defaulting |
| to 50k, hl.alternateField, which allows the specification of a backup |
| field to use as summary if no keywords are matched, and hl.mergeContiguous, |
| which combines fragments if they are adjacent in the source document. |
| (klaas, Grant Ingersoll, Koji Sekiguchi via klaas) |
| |
| 15. SOLR-291: Control maximum number of documents to cache for any entry |
| in the queryResultCache via queryResultMaxDocsCached solrconfig.xml |
| entry. (Koji Sekiguchi via yonik) |
| |
| 16. SOLR-240: New <lockType> configuration setting in <mainIndex> and |
| <indexDefaults> blocks supports all Lucene builtin LockFactories. |
| 'single' is the recommended setting, but 'simple' is the default for total |
| backwards compatibility. |
| (Will Johnson via hossman) |
| |
| 17. SOLR-248: Added CapitalizationFilterFactory that creates tokens with |
| normalized capitalization. This filter is useful for facet display, |
| but will not work with a prefix query. (ryan) |
| SOLR-468: Change to the semantics to keep the original token, not the |
| token in the Map. Also switched to use Lucene's new reusable token |
| capabilities. (gsingers) |
| |
| 18. SOLR-307: Added NGramFilterFactory and EdgeNGramFilterFactory. |
| (Thomas Peuss via Otis Gospodnetic) |
| |
| 19. SOLR-305: analysis.jsp can be given a fieldtype instead of a field |
| name. (hossman) |
| |
| 20. SOLR-102: Added RegexFragmenter, which splits text for highlighting |
| based on a given pattern. (klaas) |
| |
| 21. SOLR-258: Date Faceting added to SimpleFacets. Facet counts |
| computed for ranges of size facet.date.gap (a DateMath expression) |
| between facet.date.start and facet.date.end. (hossman) |
| |
| 22. SOLR-196: A PHP serialized "phps" response writer that returns a |
| serialized array that can be used with the PHP function unserialize, |
| and a PHP response writer "php" that may be used by eval. |
| (Nick Jenkin, Paul Borgermans, Pieter Berkel via yonik) |
| |
| 23. SOLR-308: A new UUIDField class which accepts UUID string values, |
| as well as the special value of "NEW" which triggers generation of |
| a new random UUID. |
| (Thomas Peuss via hossman) |
| |
| 24. SOLR-349: New FunctionQuery functions: sum, product, div, pow, log, |
| sqrt, abs, scale, map. Constants may now be used as a value source. |
| (yonik) |
| |
| 25. SOLR-359: Add field type className to Luke response, and enabled access |
| to the detailed field information from the solrj client API. |
| (Grant Ingersoll via ehatcher) |
| |
| 26. SOLR-334: Pluggable query parsers. Allows specification of query |
| type and arguments as a prefix on a query string. (yonik) |
| |
| 27. SOLR-351: External Value Source. An external file may be used |
| to specify the values of a field, currently usable as |
| a ValueSource in a FunctionQuery. (yonik) |
| |
| 28. SOLR-395: Many new features for the spell checker implementation, including |
| an extended response mode with much richer output, multi-word spell checking, |
| and a bevy of new and renamed options (see the wiki). |
| (Mike Krimerman, Scott Taber via klaas). |
| |
| 29. SOLR-408: Added PingRequestHandler and deprecated SolrCore.getPingQueryRequest(). |
| Ping requests should be configured using standard RequestHandler syntax in |
| solrconfig.xml rather than using the <pingQuery></pingQuery> syntax. |
| (Karsten Sperling via ryan) |
| |
| 30. SOLR-281: Added a 'Search Component' interface and converted StandardRequestHandler |
| and DisMaxRequestHandler to use this framework. |
| (Sharad Agarwal, Henri Biestro, yonik, ryan) |
| |
| 31. SOLR-176: Add detailed timing data to query response output. The SearchHandler |
| interface now returns how long each section takes. (klaas) |
| |
| 32. SOLR-414: Plugin initialization now supports SolrCore and ResourceLoader "Aware" |
| plugins. Plugins that implement SolrCoreAware or ResourceLoaderAware are |
| informed about the SolrCore/ResourceLoader. (Henri Biestro, ryan) |
| |
| 33. SOLR-350: Support multiple SolrCores running in the same Solr instance and allow |
| runtime management of any running SolrCore. If a solr.xml file exists |
| in solr.home, this file is used to instantiate multiple cores and enables runtime |
| core manipulation. For more information see: http://wiki.apache.org/solr/CoreAdmin |
| (Henri Biestro, ryan) |
| |
| 34. SOLR-447: Added a single request handler that will automatically register all |
| standard admin request handlers. This replaces the need to register (and maintain) |
| the set of admin request handlers. Assuming solrconfig.xml includes: |
| <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" /> |
| This will register: Luke/SystemInfo/PluginInfo/ThreadDump/PropertiesRequestHandler. |
| (ryan) |
| |
| 35. SOLR-142: Added RawResponseWriter and ShowFileRequestHandler. This returns config |
| files directly. If AdminHandlers are configured, this will be added automatically. |
| The jsp files /admin/get-file.jsp and /admin/raw-schema.jsp have been deprecated. |
| The deprecated <admin><gettableFiles> will be automatically registered with |
| a ShowFileRequestHandler instance for backwards compatibility. (ryan) |
| |
| 36. SOLR-446: TextResponseWriter can write SolrDocuments and SolrDocumentLists the |
| same way it writes Document and DocList. (yonik, ryan) |
| |
| 37. SOLR-418: Adding a query elevation component. This is an optional component to |
| elevate some documents to the top positions (or exclude them) for a given query. |
| (ryan) |
| |
| 38. SOLR-478: Added ability to get back unique key information from the LukeRequestHandler. |
| (gsingers) |
| |
| 39. SOLR-127: HTTP Caching awareness. Solr now recognizes HTTP Request |
| headers related to HTTP Caching (see RFC 2616 sec13) and will respond |
| with "304 Not Modified" when appropriate. New options have been added |
| to solrconfig.xml to influence this behavior. |
| (Thomas Peuss via hossman) |
| |
| 40. SOLR-303: Distributed Search over HTTP. Specification of shards |
| argument causes Solr to query those shards and merge the results |
| into a single response. Querying, field faceting (sorted only), |
| query faceting, highlighting, and debug information are supported |
| in distributed mode. |
| (Sharad Agarwal, Patrick O'Leary, Sabyasachi Dalal, Stu Hood, |
| Jayson Minard, Lars Kotthoff, ryan, yonik) |
| |
| 41. SOLR-356: Pluggable functions (value sources) that allow |
| registration of new functions via solrconfig.xml |
| (Doug Daniels via yonik) |
| |
| 42. SOLR-494: Added cool admin Ajaxed schema explorer. |
| (Greg Ludington via ehatcher) |
| |
| 43. SOLR-497: Added date faceting to the QueryResponse in SolrJ |
| and QueryResponseTest (Shalin Shekhar Mangar via gsingers) |
| |
| 44. SOLR-486: Binary response format, faster and smaller |
| than XML and JSON response formats (use wt=javabin). |
| Added BinaryResponseParser for using the binary format via SolrJ; |
| it is now the default. |
| (Noble Paul, yonik) |
| |
| 45. SOLR-521: StopFilterFactory support for "enablePositionIncrements" |
| (Walter Ferrara via hossman) |
| |
| 46. SOLR-557: Added SolrCore.getSearchComponents() to return an unmodifiable Map. (gsingers) |
| |
| 47. SOLR-516: Added hl.maxAlternateFieldLength parameter, to set max length for hl.alternateField |
| (Koji Sekiguchi via klaas) |
| |
| 48. SOLR-319: Changed SynonymFilterFactory to "tokenize" synonyms file. |
| To use a tokenizer, specify "tokenizerFactory" attribute in <filter>. |
| For example: |
| <tokenizer class="solr.CJKTokenizerFactory"/> |
| <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" expand="true" |
| ignoreCase="true" tokenizerFactory="solr.CJKTokenizerFactory"/> |
| (koji) |
| |
| 49. SOLR-515: Added SimilarityFactory capability to schema.xml, |
| making config file parameters usable in the construction of |
| the global Lucene Similarity implementation. |
| (ehatcher) |
| |
| 50. SOLR-536: Add a DocumentObjectBinder to solrj that converts Objects to and |
| from SolrDocuments. (Noble Paul via ryan) |
| |
| 51. SOLR-595: Add support for Field level boosting in the MoreLikeThis Handler. |
| (Tom Morton, gsingers) |
| |
| 52. SOLR-572: Added SpellCheckComponent and org.apache.solr.spelling package to support more spell |
| checking functionality. Also includes ability to add your own SolrSpellChecker implementation that |
| plugs in. See http://wiki.apache.org/solr/SpellCheckComponent for more details |
| (Shalin Shekhar Mangar, Bojan Smid, gsingers) |
| |
| 53. SOLR-679: Added accessor methods to Lucene based spell checkers (gsingers) |
| |
| 54. SOLR-423: Added Request Handler close hook notification so that RequestHandlers can be notified |
| when a core is closing. (gsingers, ryan) |
| |
| 55. SOLR-603: Added ability to partially optimize. (gsingers) |
| |
| 56. SOLR-483: Add byte/short sorting support (gsingers) |
| |
| 57. SOLR-14: Add preserveOriginal flag to WordDelimiterFilter |
| (Geoffrey Young, Trey Hyde, Ankur Madnani, yonik) |
| |
| 58. SOLR-502: Add search timeout support. (Sean Timm via yonik) |
| |
| 59. SOLR-605: Add the ability to register callbacks programmatically (ryan, Noble Paul) |
| |
| 60. SOLR-610: hl.maxAnalyzedChars can be -1 to highlight everything (Lars Kotthoff via klaas) |
| |
| 61. SOLR-522: Make analysis.jsp show payloads. (Tricia Williams via yonik) |
| |
| 62. SOLR-611: Expose sort_values returned by QueryComponent in SolrJ's QueryResponse |
| (Dan Rosher via shalin) |
| |
| 63. SOLR-256: Support exposing Solr statistics through JMX (Sharad Agrawal, shalin) |
| |
| 64. SOLR-666: Expose warmup time in statistics for SolrIndexSearcher and LRUCache (shalin) |
| |
| 65. SOLR-663: Allow multiple files for stopwords, keepwords, protwords and synonyms |
| (Otis Gospodnetic, shalin) |
| |
| 66. SOLR-469: Added DataImportHandler as a contrib project which makes indexing data from Databases, |
| XML files and HTTP data sources into Solr quick and easy. Includes API and implementations for |
| supporting multiple data sources, processors and transformers for importing data. Supports full |
| data imports as well as incremental (delta) indexing. See http://wiki.apache.org/solr/DataImportHandler |
| for more details. (Noble Paul, shalin) |
| |
| 67. SOLR-622: SpellCheckComponent supports auto-loading indices on startup and optionally, (re)builds |
| indices on newSearcher event, if configured in solrconfig.xml (shalin) |
| |
| 68. SOLR-554: Hierarchical JDK log level selector for SOLR Admin replaces logging.jsp |
| (Sean Timm via shalin) |
| |
| 69. SOLR-506: Emitting HTTP Cache headers can be enabled or disabled through configuration on a |
| per-handler basis (shalin) |
| |
| 70. SOLR-716: Added support for properties in configuration files. Properties can be specified in |
| solr.xml and can be used in solrconfig.xml and schema.xml (Henri Biestro, hossman, ryan, shalin) |
| |
| 71. SOLR-1129: Support binding dynamic fields to beans in SolrJ (Avlesh Singh, noble) |
| |
| 72. SOLR-920: Cache and reuse IndexSchema. Added a new attribute to solr.xml called 'shareSchema'. (noble) |
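
As an illustration of item 40 (SOLR-303) in the list above, the following sketch builds a distributed query URL by passing a comma-separated list of shard addresses in the shards parameter. The host names, the base URL, and the class and method names are hypothetical; only the shards parameter itself comes from the entry above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class ShardsUrlSketch {

    /** Build a /select URL that asks one Solr instance to query and merge the given shards. */
    public static String buildUrl(String baseUrl, String query, List<String> shards) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        // Shards are joined with commas, then URL-encoded as a single parameter value.
        String shardParam = URLEncoder.encode(String.join(",", shards), StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q + "&shards=" + shardParam;
    }

    public static void main(String[] args) {
        // Hypothetical hosts; each entry is host:port/path without the scheme.
        List<String> shards = List.of("host1:8983/solr", "host2:8983/solr");
        System.out.println(buildUrl("http://localhost:8983/solr", "solr", shards));
    }
}
```

The instance that receives the request fans the query out to each listed shard and merges the results into a single response.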
| |
| Changes in runtime behavior |
| 1. SOLR-559: use Lucene updateDocument, deleteDocuments methods. This |
| removes the maxBufferedDeletes parameter added by SOLR-310 as Lucene |
| now manages the deletes. This provides slightly better indexing |
| performance and makes overwrites atomic, eliminating the possibility of |
| a crash causing duplicates. (yonik) |
| |
| 2. SOLR-689 / SOLR-695: If you have used "MultiCore" functionality in an unreleased |
| version of 1.3-dev, many classes and configs have been renamed for the official |
| 1.3 release. Specifically, solr.xml has replaced multicore.xml, and uses a slightly |
| different syntax. The solrj classes: MultiCore{Request/Response/Params} have been |
| renamed: CoreAdmin{Request/Response/Params} (hossman, ryan, Henri Biestro) |
| |
| 3. SOLR-647: reference count the SolrCore uses to prevent a premature |
| close while a core is still in use. (Henri Biestro, Noble Paul, yonik) |
| |
| 4. SOLR-737: SolrQueryParser now uses a ConstantScoreQuery for wildcard |
| queries, which prevents an exception from being thrown when the number |
| of matching terms exceeds the BooleanQuery clause limit. (yonik) |
| |
| Optimizations |
| 1. SOLR-276: improve JSON writer speed. (yonik) |
| |
| 2. SOLR-310: bound and reduce memory usage by providing <maxBufferedDeletes> parameter, |
| which flushes deletes without forcing the user to use <commit/> for this purpose. |
| (klaas) |
| |
| 3. SOLR-348: short-circuit faceting if less than mincount docs match. (yonik) |
| |
| 4. SOLR-354: Optimize removing all documents. Now when a delete by query |
| of *:* is issued, the current index is removed. (yonik) |
| |
| 5. SOLR-377: Speed up response writers. (yonik) |
| |
| 6. SOLR-342: Added support into the SolrIndexWriter for using several new features of the new |
| Lucene IndexWriter, including: setRAMBufferSizeMB(), setMergePolicy(), setMergeScheduler(). |
| Also, added support to specify Lucene's autoCommit functionality (not to be confused with Solr's |
| similarly named autoCommit functionality) via the <luceneAutoCommit> config item. See the test |
| and example solrconfig.xml <indexDefaults> section for usage. Performance during indexing should |
| be significantly increased by moving up to 2.3 due to Lucene's new indexing capabilities. |
| Furthermore, the setRAMBufferSizeMB makes it more logical to decide on tuning factors related to |
| indexing. For best performance, leave the mergePolicy and mergeScheduler as the defaults and set |
| ramBufferSizeMB instead of maxBufferedDocs. The best value for this depends on the types of |
| documents in use. 32 should be a good starting point, but reports have shown up to 48 MB provides |
| good results. Note, it is acceptable to set both ramBufferSizeMB and maxBufferedDocs, and Lucene |
| will flush based on whichever limit is reached first. (gsingers) |
| |
| 7. SOLR-330: Converted TokenStreams to use Lucene's new char array based |
| capabilities. (gsingers) |
| |
| 8. SOLR-624: Only take snapshots if there are differences to the index (Richard Trey Hyde via gsingers) |
| |
| 9. SOLR-587: Delete by Query performance greatly improved by using |
| new underlying Lucene IndexWriter implementation. (yonik) |
| |
| 10. SOLR-730: Use read-only IndexReaders that don't synchronize |
| isDeleted(). This will speed up function queries and *:* queries |
| as well as improve their scalability on multi-CPU systems. |
| (Mark Miller via yonik) |
| |
| Bug Fixes |
| 1. Make TextField respect sortMissingFirst and sortMissingLast fields. |
| (J.J. Larrea via yonik) |
| |
| 2. autoCommit/maxDocs was not working properly when large autoCommit/maxTime |
| was specified (klaas) |
| |
| 3. SOLR-283: autoCommit was not working after delete. (ryan) |
| |
| 4. SOLR-286: ContentStreamBase was not using default encoding for getBytes() |
| (Toru Matsuzawa via ryan) |
| |
| 5. SOLR-292: Fix MoreLikeThis facet counting. (Pieter Berkel via ryan) |
| |
| 6. SOLR-297: Fix bug in RequiredSolrParams where requiring a field |
| specific param would fail if a general default value had been supplied. |
| (hossman) |
| |
| 7. SOLR-331: Fix WordDelimiterFilter handling of offsets for synonyms or |
| other injected tokens that can break highlighting. (yonik) |
| |
| 8. SOLR-282: Snapshooter does not work on Solaris and OS X since the cp command |
| there does not have the -l option. Also updated commit/optimize related |
| scripts to handle both old and new response format. (bill) |
| |
| 9. SOLR-294: Logging of elapsed time broken on Solaris because the date command |
| there does not support the %s output format. (bill) |
| |
| 10. SOLR-136: Snappuller - "date -d" and locales don't mix. (Jürgen Hermann via bill) |
| |
| 11. SOLR-333: Changed distributiondump.jsp to use Solr HOME instead of CWD to set path. |
| |
| 12. SOLR-393: Removed duplicate contentType from raw-schema.jsp. (bill) |
| |
| 13. SOLR-413: Requesting a large number of documents to be returned (limit) |
| can result in an out-of-memory exception, even for a small index. (yonik) |
| |
| 14. The CSV loader incorrectly threw an exception when given |
| header=true (the default). (ryan, yonik) |
| |
| 15. SOLR-449: the python and ruby response writers are now able to correctly |
| output NaN and Infinity in their respective languages. (klaas) |
| |
| 16. SOLR-42: HTMLStripReader tokenizers now preserve correct source |
| offsets for highlighting. (Grant Ingersoll via yonik) |
| |
| 17. SOLR-481: Handle UnknownHostException in _info.jsp (gsingers) |
| |
| 18. SOLR-324: Add proper support for Long and Doubles in sorting, etc. (gsingers) |
| |
| 19. SOLR-496: Cache-Control max-age changed to Long so Expires |
| calculation won't cause overflow. (Thomas Peuss via hossman) |
| |
| 20. SOLR-535: Fixed typo (Tokenzied -> Tokenized) in schema.jsp (Thomas Peuss via billa) |
| |
| 21. SOLR-529: Better error messages from SolrQueryParser when field isn't |
| specified and there is no defaultSearchField in schema.xml |
| (Lars Kotthoff via hossman) |
| |
| 22. SOLR-530: Better error messages/warnings when parsing schema.xml: |
| field using bogus fieldtype and multiple copyFields to a non-multiValue |
| field. (Shalin Shekhar Mangar via hossman) |
| |
| 23. SOLR-528: Better error message when defaultSearchField is bogus or not |
| indexed. (Lars Kotthoff via hossman) |
| |
| 24. SOLR-533: Fixed tests so they don't use hardcoded port numbers. |
| (hossman) |
| |
| 25. SOLR-400: SolrExceptionTest should now handle using OpenDNS as a DNS provider (gsingers) |
| |
| 26. SOLR-541: Legacy XML update support (provided by SolrUpdateServlet |
| when no RequestHandler is mapped to "/update") now logs error correctly. |
| (hossman) |
| |
| 27. SOLR-267: Changed logging to report number of hits, and also provide a mechanism to add log |
| messages to be output by the SolrCore via a NamedList toLog member variable. |
| (Will Johnson, yseeley, gsingers) |
| |
| SOLR-267: Removed adding values to the HTTP headers in SolrDispatchFilter (gsingers) |
| |
| 28. SOLR-509: Moved firstSearcher event notification to the end of the SolrCore constructor |
| (Koji Sekiguchi via gsingers) |
| |
| 29. SOLR-470, SOLR-552, SOLR-544, SOLR-701: Multiple fixes to DateField |
| regarding lenient parsing of optional milliseconds, and correct |
| formatting using the canonical representation. LegacyDateField has |
| been added for people who have come to depend on the existing |
| broken behavior. (hossman, Stefan Oestreicher) |
| |
| 30. SOLR-539: Fix for non-atomic long counters and a cast fix to avoid divide |
| by zero. (Sean Timm via Otis Gospodnetic) |
| |
| 31. SOLR-514: Added explicit media-type with UTF* charset to *.xsl files that |
| don't already have one. (hossman) |
| |
| 32. SOLR-505: Give RequestHandlers the possibility to suppress the generation |
| of HTTP caching headers. (Thomas Peuss via Otis Gospodnetic) |
| |
| 33. SOLR-553: Handle highlighting of phrase terms better when |
| hl.usePhraseHighlighter=true URL param is used. |
| (Bojan Smid via Otis Gospodnetic) |
| |
| 34. SOLR-590: Limitation in pgrep on Linux platform breaks script-utils fixUser. |
| (Hannes Schmidt via billa) |
| |
| 35. SOLR-597: SolrServlet no longer "caches" SolrCore. This was causing |
| problems in Resin, and could potentially cause problems for customized |
| usages of SolrServlet. |
| |
| 36. SOLR-585: Now sets the QParser on the ResponseBuilder (gsingers) |
| |
| 37. SOLR-604: If the spellchecking path is relative, make it relative to the Solr Data Directory. |
| (Shalin Shekhar Mangar via gsingers) |
| |
| 38. SOLR-584: Make stats.jsp and stats.xsl more robust. |
| (Yousef Ourabi and hossman) |
| |
| 39. SOLR-443: SolrJ: Declare UTF-8 charset on POSTed parameters |
| to avoid problems with servlet containers that default to latin-1 |
| and allow switching of the exact POST mechanism for parameters |
| via useMultiPartPost in CommonsHttpSolrServer. |
| (Lars Kotthoff, Andrew Schurman, ryan, yonik) |
| |
| 40. SOLR-556: multi-valued fields always highlighted in disparate snippets |
| (Lars Kotthoff via klaas) |
| |
| 41. SOLR-501: Fix admin/analysis.jsp UTF-8 input for some other servlet |
| containers such as Tomcat. (Hiroaki Kawai, Lars Kotthoff via yonik) |
| |
| 42. SOLR-616: SpellChecker accuracy configuration is not applied for FileBasedSpellChecker. |
| Apply it for FileBasedSpellChecker and IndexBasedSpellChecker both. |
| (shalin) |
| |
| 43. SOLR-648: SpellCheckComponent throws NullPointerException on using spellcheck.q request |
| parameter after restarting Solr, if reload is called but build is not called. |
| (Jonathan Lee, shalin) |
| |
| 44. SOLR-598: DebugComponent now always occurs last in the SearchHandler list unless the |
| components are explicitly declared. (gsingers) |
| |
| 45. SOLR-676: DataImportHandler should use UpdateRequestProcessor API instead of directly |
| using UpdateHandler. (shalin) |
| |
| 46. SOLR-696: Fixed bug in NamedListCodec in regards to serializing Iterable objects. (gsingers) |
| |
| 47. SOLR-669: snappuller fix for FreeBSD/Darwin (Richard "Trey" Hyde via Otis Gospodnetic) |
| |
| 48. SOLR-606: Fixed spell check collation offset issue. (Stefan Oestreicher, Geoffrey Young, gsingers) |
| |
| 49. SOLR-589: Improved handling of badly formatted query strings (Sean Timm via Otis Gospodnetic) |
| |
| 50. SOLR-749: Allow QParser and ValueSourceParsers to be extended with same name (hossman, gsingers) |
| |
| Other Changes |
| 1. SOLR-135: Moved common classes to org.apache.solr.common and altered the |
| build scripts to make two jars: apache-solr-1.3.jar and |
| apache-solr-1.3-common.jar. This common.jar can be used in client code; |
| it does not have Lucene or JUnit dependencies. The original classes |
| have been replaced with a @Deprecated extended class and are scheduled |
| to be removed in a later release. While this change does not affect API |
| compatibility, it is recommended to update references to these |
| deprecated classes. (ryan) |
| |
| 2. SOLR-268: Tweaks to post.jar so it prints the error message from Solr. |
| (Brian Whitman via hossman) |
| |
| 3. Upgraded to Lucene 2.2.0; June 18, 2007. |
| |
| 4. SOLR-215: Static access to SolrCore.getSolrCore() and SolrConfig.config |
| have been deprecated in order to support multiple loaded cores. |
| (Henri Biestro via ryan) |
| |
| 5. SOLR-367: The create method in all TokenFilter and Tokenizer Factories |
| provided by Solr now declare their specific return types instead of just |
| using "TokenStream" (hossman) |
| |
| 6. SOLR-396: Hooks added to build system for automatic generation of (stub) |
| Tokenizer and TokenFilter Factories. |
| Also: new Factories for all Tokenizers and TokenFilters provided by the |
| lucene-analyzers-2.2.0.jar -- includes support for German, Chinese, |
| Russian, Dutch, Greek, Brazilian, Thai, and French. (hossman) |
| |
| 7. Upgraded to commons-CSV r609327, which fixes escaping bugs and |
| introduces new escaping and whitespace handling options to |
| increase compatibility with different formats. (yonik) |
| |
| 8. Upgraded to Lucene 2.3.0; Jan 23, 2008. |
| |
| 9. SOLR-451: Changed analysis.jsp to use POST instead of GET, also made the input area a |
| bit bigger (gsingers) |
| |
| 10. Upgrade to Lucene 2.3.1 |
| |
| 11. SOLR-531: Different exit code for rsyncd-start and snappuller if disabled (Thomas Peuss via billa) |
| |
| 12. SOLR-550: Clarified DocumentBuilder addField javadocs (gsingers) |
| |
| 13. Upgrade to Lucene 2.3.2 |
| |
| 14. SOLR-518: Changed luke.xsl to use divs w/css for generating histograms |
| instead of SVG (Thomas Peuss via hossman) |
| |
| 15. SOLR-592: Added ShardParams interface and changed several string literals |
| to references to constants in CommonParams. |
| (Lars Kotthoff via Otis Gospodnetic) |
| |
| 16. SOLR-520: Deprecated unused LengthFilter since already core in |
| Lucene-Java (hossman) |
| |
| 17. SOLR-645: Refactored SimpleFacetsTest (Lars Kotthoff via hossman) |
| |
| 18. SOLR-591: Changed Solrj default value for facet.sort to true (Lars Kotthoff via Shalin) |
| |
| 19. Upgraded to Lucene 2.4-dev (r669476) to support SOLR-572 (gsingers) |
| |
| 20. SOLR-636: Improve/simplify example configs; and make index.jsp |
| links more resilient to configs loaded via an InputStream |
| (Lars Kotthoff, hossman) |
| |
| 21. SOLR-682: Scripts now support FreeBSD (Richard Trey Hyde via gsingers) |
| |
| 22. SOLR-489: Added in deprecation comments. (Sean Timm, Lars Kotthoff via gsingers) |
| |
| 23. SOLR-692: Migrated to stable released builds of StAX API 1.0.1 and StAX 1.2.0 (shalin) |
| 24. Upgraded to Lucene 2.4-dev (r686801) (yonik) |
| 25. Upgraded to Lucene 2.4-dev (r688745) 27-Aug-2008 (yonik) |
| 26. Upgraded to Lucene 2.4-dev (r691741) 03-Sep-2008 (yonik) |
| 27. Replaced the StAX reference implementation with the geronimo |
| StAX API jar, and the Woodstox StAX implementation. (yonik) |
| |
| Build |
| 1. SOLR-411. Changed the names of the Solr JARs to use the de facto standard JAR names based on |
| project-name-version.jar. This yields, for example: |
| apache-solr-common-1.3-dev.jar |
| apache-solr-solrj-1.3-dev.jar |
| apache-solr-1.3-dev.jar |
| |
| 2. SOLR-479: Added clover code coverage targets for committers and the nightly build. Requires |
| the Clover library, as licensed to Apache and only available privately. To run: |
| ant -Drun.clover=true clean clover test generate-clover-reports |
| |
| 3. SOLR-510: Nightly release includes client sources. (koji) |
| |
| 4. SOLR-563: Modified the build process to build contrib projects |
| (Shalin Shekhar Mangar via Otis Gospodnetic) |
| |
| 5. SOLR-673: Modify build file to create javadocs for core, solrj, contrib and "all inclusive" (shalin) |
| |
| 6. SOLR-672: Nightly release includes contrib sources. (Jeremy Hinegardner, shalin) |
| |
| 7. SOLR-586: Added ant target and POM files for building maven artifacts of the Solr core, common, |
| client and contrib. The target can publish artifacts with source and javadocs. |
| (Spencer Crissman, Craig McClanahan, shalin) |
| |
| ================== Release 1.2 ================== |
| |
| Upgrading from Solr 1.1 |
| ------------------------------------- |
| IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves |
| should be upgraded before the master! If the master were to be updated |
| first, the older searchers would not be able to read the new index format. |
| |
| Older Apache Solr installations can be upgraded by replacing |
| the relevant war file with the new version. No changes to configuration |
| files should be needed. |
| |
| This version of Solr contains a new version of Lucene implementing |
| an updated index format. This version of Solr/Lucene can still read |
| and update indexes in the older formats, and will convert them to the new |
| format on the first index change. One change in the new index format |
| is that all "norms" are kept in a single file, greatly reducing the number |
| of files per segment. Users of compound file indexes will want to consider |
| converting to the non-compound format for faster indexing and slightly better |
| search concurrency. |
| |
| The JSON response format for facets has changed to make it easier for |
| clients to retain sorted order. Use json.nl=map explicitly in clients |
| to get the old behavior, or add it as a default to the request handler |
| in solrconfig.xml |
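
For clients that stay on the new default, the following is a minimal sketch of reading facet counts while keeping the server's sort order. It assumes the facet constraints arrive as a flat, alternating name/count list that a JSON library has already parsed into a java.util.List; the class and method names are hypothetical and not part of Solr.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlatFacetCounts {
    // Converts an alternating [name, count, name, count, ...] list into a map
    // that preserves the order in which the constraints were returned.
    public static Map<String, Long> toOrderedMap(List<Object> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected alternating name/count pairs");
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String name = (String) flat.get(i);
            long count = ((Number) flat.get(i + 1)).longValue();
            counts.put(name, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical facet values, already decoded from the JSON response.
        List<Object> flat = Arrays.<Object>asList("electronics", 14, "memory", 3, "cable", 1);
        System.out.println(toOrderedMap(flat));
    }
}
```

A LinkedHashMap is used because a plain JSON object (the json.nl=map form) gives no ordering guarantee in many client libraries, which is the problem the format change addresses.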
| |
| The Lucene based Solr query syntax is slightly more strict. |
| A ':' in a field value must be escaped or the whole value must be quoted. |
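
The two options above can be sketched on the client side as follows. This is an illustrative helper only: the class, the method names, and the "id" field are assumptions, and it handles just the characters discussed here rather than the full set of query syntax special characters.

```java
public class FieldValueEscaper {
    // Backslash-escapes every ':' in a raw value (existing backslashes are
    // escaped first so they are not misread as escape characters).
    public static String escapeColons(String value) {
        return value.replace("\\", "\\\\").replace(":", "\\:");
    }

    // Wraps the whole value in double quotes, escaping embedded quotes and
    // backslashes so the quoted phrase stays well formed.
    public static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String raw = "urn:isbn:0451450523"; // hypothetical field value
        System.out.println("id:" + escapeColons(raw));
        System.out.println("id:" + quote(raw));
    }
}
```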
| |
| The Solr "Request Handler" framework has been updated in two key ways: |
| First, if a Request Handler is registered in solrconfig.xml with a name |
| starting with "/" then it can be accessed using a path-based URL, instead of |
| using the legacy "/select?qt=name" URL structure. Second, the Request |
| Handler framework has been extended making it possible to write Request |
| Handlers that process streams of data for doing updates, and there is a |
| new-style Request Handler for XML updates given the name of "/update" in |
| the example solrconfig.xml. Existing installations without this "/update" |
| handler will continue to use the old update servlet and should see no |
| changes in behavior. For new-style update handlers, errors are now |
| reflected in the HTTP status code, Content-type checking is more strict, |
| and the response format has changed and is controllable via the wt |
| parameter. |
| |
| |
| |
| Detailed Change List |
| -------------------- |
| |
| New Features |
| 1. SOLR-82: Default field values can be specified in the schema.xml. |
| (Ryan McKinley via hossman) |
| |
| 2. SOLR-89: Two new TokenFilters with corresponding Factories... |
| * TrimFilter - Trims leading and trailing whitespace from Tokens |
| * PatternReplaceFilter - applies a Pattern to each token in the |
| stream, replacing match occurrences with a specified replacement. |
| (hossman) |
| |
| 3. SOLR-91: allow configuration of a limit of the number of searchers |
| that can be warming in the background. This can be used to avoid |
| out-of-memory errors, or contention caused by more and more searchers |
| warming in the background. An error is thrown if the limit specified |
| by maxWarmingSearchers in solrconfig.xml is exceeded. (yonik) |
| |
| 4. SOLR-106: New faceting parameters that allow specification of a |
| minimum count for returned facets (facet.mincount), paging through facets |
| (facet.offset, facet.limit), and explicit sorting (facet.sort). |
| facet.zeros is now deprecated. (yonik) |
| |
| 5. SOLR-80: Negative queries are now allowed everywhere. Negative queries |
| are generated and cached as their positive counterpart, speeding |
| generation and generally resulting in smaller sets to cache. |
| Set intersections in SolrIndexSearcher are more efficient, |
| starting with the smallest positive set, subtracting all negative |
| sets, then intersecting with all other positive sets. (yonik) |
| |
| 6. SOLR-117: Limit a field faceting to constraints with a prefix specified |
| by facet.prefix or f.<field>.facet.prefix. (yonik) |
| |
| 7. SOLR-107: JAVA API: Change NamedList to use Java5 generics |
| and implement Iterable<Map.Entry> (Ryan McKinley via yonik) |
| |
| 8. SOLR-104: Support for "Update Plugins" -- RequestHandlers that want |
| access to streams of data for doing updates. ContentStreams can come |
| from the raw POST body, multi-part form data, or remote URLs. |
| Included in this change is a new SolrDispatchFilter that allows |
| RequestHandlers registered with names that begin with a "/" to be |
| accessed using a URL structure based on that name. |
| (Ryan McKinley via hossman) |
| |
| 9. SOLR-126: DirectUpdateHandler2 supports autocommitting after a specified time |
| (in ms), using <autoCommit><maxTime>10000</maxTime></autoCommit>. |
| (Ryan McKinley via klaas). |
| |
| 10. SOLR-116: IndexInfoRequestHandler added. (Erik Hatcher) |
| |
| 11. SOLR-79: Add system property ${<sys.prop>[:<default>]} substitution for |
| configuration files loaded, including schema.xml and solrconfig.xml. |
| (Erik Hatcher with inspiration from Andrew Saar) |
| |
| 12. SOLR-149: Changes to make Solr more easily embeddable, in addition |
| to logging which request handler handled each request. |
| (Ryan McKinley via yonik) |
| |
| 13. SOLR-86: Added standalone Java-based command-line updater. |
| (Erik Hatcher via Bertrand Delacretaz) |
| |
| 14. SOLR-152: DisMaxRequestHandler now supports configurable alternate |
| behavior when q is not specified. A "q.alt" param can be specified |
| using SolrQueryParser syntax as a mechanism for specifying what query |
| the dismax handler should execute if the main user query (q) is blank. |
| (Ryan McKinley via hossman) |
| |
| 15. SOLR-158: new "qs" (Query Slop) param for DisMaxRequestHandler |
| allows for specifying the amount of default slop to use when parsing |
| explicit phrase queries from the user. |
| (Adam Hiatt via hossman) |
| |
| 16. SOLR-81: SpellCheckerRequestHandler that uses the SpellChecker from |
| the Lucene contrib. |
| (Otis Gospodnetic and Adam Hiatt) |
| |
| 17. SOLR-182: allow lazy loading of request handlers on first request. |
| (Ryan McKinley via yonik) |
| |
| 18. SOLR-81: More SpellCheckerRequestHandler enhancements, including |
| support for relative or absolute directory path configurations, as |
| well as RAM based directory. (hossman) |
| |
| 19. SOLR-197: New parameters for input: stream.contentType for specifying |
| or overriding the content type of input, and stream.file for reading |
| local files. (Ryan McKinley via yonik) |
| |
| 20. SOLR-66: CSV data format for document additions and updates. (yonik) |
| |
| 21. SOLR-184: add echoHandler=true to responseHeader, support echoParams=all |
| (Ryan McKinley via ehatcher) |
| |
| 22. SOLR-211: Added a regex PatternTokenizerFactory. This extracts tokens |
| from the input string using a regex Pattern. (Ryan McKinley) |
| |
| 23. SOLR-162: Added a "Luke" request handler and other admin helpers. |
| This exposes the system status through the standard requestHandler |
| framework. (ryan) |
| |
| 24. SOLR-212: Added a DirectSolrConnection class. This lets you access |
| solr using the standard request/response formats, but does not require |
| an HTTP connection. It is designed for embedded applications. (ryan) |
| |
| 25. SOLR-204: The request dispatcher (added in SOLR-104) can handle |
| calls to /select. This offers uniform error handling for /update and |
| /select. To enable this behavior, you must add: |
| <requestDispatcher handleSelect="true" > to your solrconfig.xml |
| See the example solrconfig.xml for details. (ryan) |
| |
| 26. SOLR-170: StandardRequestHandler now supports a "sort" parameter. |
| Using the ';' syntax is still supported, but it is recommended to |
| transition to the new syntax. (ryan) |
| |
| 27. SOLR-181: The index schema now supports "required" fields. Attempts |
| to add a document without a required field will fail, returning a |
| descriptive error message. By default, the uniqueKey field is |
| a required field. This can be disabled by setting required=false |
| in schema.xml. (Greg Ludington via ryan) |
| |
| 28. SOLR-217: Fields configured in the schema to be neither indexed nor |
| stored will now be quietly ignored by Solr when Documents are added. |
| The example schema has a comment explaining how this can be used to |
| ignore any "unknown" fields. |
| (Will Johnson via hossman) |
| |
| 29. SOLR-227: If schema.xml defines multiple fieldTypes, fields, or |
| dynamicFields with the same name, a severe error will be logged rather |
| than quietly continuing. Depending on the <abortOnConfigurationError> |
| settings, this may halt the server. Likewise, if solrconfig.xml |
| defines multiple RequestHandlers with the same name it will also add |
| an error. (ryan) |
| |
| 30. SOLR-226: Added support for dynamic field as the destination of a |
| copyField using glob (*) replacement. (ryan) |
| |
| 31. SOLR-224: Adding a PhoneticFilterFactory that uses apache commons codec |
| language encoders to build phonetically similar tokens. This currently |
| supports: DoubleMetaphone, Metaphone, Soundex, and RefinedSoundex (ryan) |
| |
| 32. SOLR-199: new n-gram tokenizers available via NGramTokenizerFactory |
| and EdgeNGramTokenizerFactory. (Adam Hiatt via yonik) |
| |
| 33. SOLR-234: TrimFilter can update the Token's startOffset and endOffset |
| if updateOffsets="true". By default the Token offsets are unchanged. |
| (ryan) |
| |
| 34. SOLR-208: new example_rss.xsl and example_atom.xsl to provide more |
| examples for people about the Solr XML response format and how they |
| can transform it to suit different needs. |
| (Brian Whitman via hossman) |
| |
| 35. SOLR-249: Deprecated SolrException( int, ... ) constructors in favor |
| of constructors that take an ErrorCode enum. This will ensure that |
| all SolrExceptions use a valid HTTP status code. (ryan) |
| |
| 36. SOLR-386: Abstracted SolrHighlighter and moved existing implementation |
| to DefaultSolrHighlighter. Adjusted SolrCore and solrconfig.xml so |
| that highlighter is configurable via a class attribute. Allows users |
| to use their own highlighter implementation. (Tricia Williams via klaas) |
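
The set-combination order described in item 5 (SOLR-80) can be sketched with plain java.util sets. This is not the SolrIndexSearcher code; the class and method names are hypothetical, and sets of Integer stand in for Solr's DocSet implementations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetIntersectionOrder {
    // Starts from the smallest positive set, subtracts every negative set,
    // then intersects with the remaining positive sets.
    public static Set<Integer> combine(List<Set<Integer>> positives,
                                       List<Set<Integer>> negatives) {
        if (positives.isEmpty()) {
            throw new IllegalArgumentException("need at least one positive set");
        }
        List<Set<Integer>> sorted = new ArrayList<>(positives);
        sorted.sort(Comparator.comparingInt(Set::size));

        // Copy the smallest set so the inputs are never modified.
        Set<Integer> result = new HashSet<>(sorted.get(0));
        for (Set<Integer> negative : negatives) {
            result.removeAll(negative);
        }
        for (int i = 1; i < sorted.size(); i++) {
            result.retainAll(sorted.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Set<Integer>> positives = new ArrayList<>();
        positives.add(new HashSet<>(Arrays.asList(1, 2, 3, 4, 5)));
        positives.add(new HashSet<>(Arrays.asList(2, 3, 4)));
        List<Set<Integer>> negatives = new ArrayList<>();
        negatives.add(new HashSet<>(Arrays.asList(3)));
        System.out.println(combine(positives, negatives));
    }
}
```

Working from the smallest positive set keeps every intermediate result no larger than that set, which is the reason the ordering tends to be cheaper than intersecting in arbitrary order.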
| |
| Changes in runtime behavior |
| 1. Highlighting using DisMax will only pick up terms from the main |
| user query, not boost or filter queries (klaas). |
| |
| 2. SOLR-125: Change default of json.nl to flat, change so that |
| json.nl only affects items where order matters (facet constraint |
| listings). Fix JSON output bug for null values. Internal JAVA API: |
| change most uses of NamedList to SimpleOrderedMap. (yonik) |
| |
| 3. A new method "getSolrQueryParser" has been added to the IndexSchema |
| class for retrieving a new SolrQueryParser instance with all options |
| specified in the schema.xml's <solrQueryParser> block set. The |
| documentation for the SolrQueryParser constructor and its use of |
| IndexSchema have also been clarified. |
| (Erik Hatcher and hossman) |
| |
| 4. DisMaxRequestHandler's bq, bf, qf, and pf parameters can now accept |
| multiple values (klaas). |
| |
| 5. Queries are rewritten before highlighting is performed. This enables |
| proper highlighting of prefix and wildcard queries (klaas). |
| |
| 6. A meaningful exception is raised when attempting to add a doc missing |
| a unique id if it is declared in the schema and allowDups=false. |
| (ryan via klaas) |
| |
| 7. SOLR-183: Exceptions with error code 400 are raised when |
| numeric argument parsing fails. RequiredSolrParams class added |
| to facilitate checking for parameters that must be present. |
| (Ryan McKinley, J.J. Larrea via yonik) |
| |
| 8. SOLR-179: By default, solr will abort after any severe initialization |
| errors. This behavior can be disabled by setting: |
| <abortOnConfigurationError>false</abortOnConfigurationError> |
| in solrconfig.xml (ryan) |
| |
| 9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using |
| the new request dispatcher (SOLR-104). This requires posted content to |
| have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8' |
| The response format matches that of /select and returns standard error |
| codes. To enable solr1.1 style /update, do not map "/update" to any |
| handler in solrconfig.xml (ryan) |
| |
| 10. SOLR-231: If a charset is not specified in the contentType, |
| ContentStream.getReader() will use UTF-8 encoding. (ryan) |
| |
| 11. SOLR-230: More options for post.jar to support stdin, xml on the |
| commandline, and deferring commits. Tutorial modified to take |
| advantage of these options so there is no need for curl. |
| (hossman) |
| |
| 12. SOLR-128: Upgraded Jetty to the latest stable release 6.1.3 (ryan) |
| |
| Optimizations |
| 1. SOLR-114: HashDocSet specific implementations of union() and andNot() |
| for a 20x performance improvement for those set operations, and a new |
| hash algorithm speeds up exists() by 10% and intersectionSize() by 8%. |
| (yonik) |
| |
| 2. SOLR-115: Solr now uses BooleanQuery.clauses() instead of |
| BooleanQuery.getClauses() in any situation where there is no risk of |
| modifying the original query. |
| (hossman) |
| |
| 3. SOLR-221: Speed up sorted faceting on multivalued fields by ~60% |
| when the base set consists of a relatively large portion of the |
| index. (yonik) |
| |
| 4. SOLR-221: Added a facet.enum.cache.minDf parameter which avoids |
| using the filterCache for terms that match few documents, trading |
| decreased memory usage for increased query time. (yonik) |
| |
| Bug Fixes |
| 1. SOLR-87: Parsing of synonym files did not correctly handle escaped |
| whitespace such as \r\n\t\b\f. (yonik) |
| |
| 2. SOLR-92: DOMUtils.getText (used when parsing config files) did not |
| work properly with many DOM implementations when dealing with |
| "Attributes". (Ryan McKinley via hossman) |
| |
| 3. SOLR-9,SOLR-99: Tighten up sort specification error checking, throw |
| exceptions for missing sort specifications or a sort on a non-indexed |
| field. (Ryan McKinley via yonik) |
| |
| 4. SOLR-145: Fix for bug introduced in SOLR-104 where some Exceptions |
| were being ignored by all "out of the box" RequestHandlers. (hossman) |
| |
| 5. SOLR-166: JNDI solr.home code refactoring. SOLR-104 moved |
| some JNDI related code to the init method of a Servlet Filter - |
| according to the Servlet Spec, all Filters should be initialized |
| prior to initializing any Servlets, but this is not the case in at |
| least one Servlet Container (Resin). This "bug fix" refactors |
| this JNDI code so that it should be executed the first time any |
| attempt is made to use the solr.home dir. |
| (Ryan McKinley via hossman) |
| |
| 6. SOLR-173: Bug fix to SolrDispatchFilter to reduce the "too many open |
| files" problem: SolrDispatchFilter was not closing requests |
| when finished. Also modified ResponseWriters to only fetch a Searcher |
| reference if necessary for writing out DocLists. |
| (Ryan McKinley via hossman) |
| |
| 7. SOLR-168: Fix display positioning of multiple tokens at the same |
| position in analysis.jsp (yonik) |
| |
| 8. SOLR-167: The SynonymFilter sometimes generated incorrect offsets when |
| multi token synonyms were matched in the source text. (yonik) |
| |
| 9. SOLR-188: bin scripts do not support non-default webapp names. Added "-U" |
| option to specify a full path to the update url, overriding the |
| "-h" (hostname), "-p" (port) and "-w" (webapp name) parameters. |
| (Jeff Rodenburg via billa) |
| |
| 10. SOLR-198: RunExecutableListener always waited for the process to |
| finish, even when wait="false" was set. (Koji Sekiguchi via yonik) |
| |
| 11. SOLR-207: Changed distribution scripts to remove recursive find |
| and avoid use of "find -maxdepth" on platforms where it is not |
| supported. (yonik) |
| |
| 12. SOLR-222: Changing writeLockTimeout in solrconfig.xml did not |
| change the effective timeout. (Koji Sekiguchi via yonik) |
| |
| 13. Changed the SOLR-104 RequestDispatcher so that /select?qt=xxx can not |
| access handlers that start with "/". This makes path based authentication |
| possible for path based request handlers. (ryan) |
| |
| 14. SOLR-214: Some servlet containers (including Tomcat and Resin) do not |
| obey the specified charset. Rather than letting the container handle |
| it, solr now uses the charset from the header contentType to decode posted |
| content. Using the contentType: "text/xml; charset=utf-8" will force |
| utf-8 encoding. If you do not specify a contentType, it will use the |
| platform default. (Koji Sekiguchi via ryan) |
| |
| 15. SOLR-241: Undefined system properties used in configuration files now |
| cause a clear message to be logged rather than an obscure exception thrown. |
| (Koji Sekiguchi via ehatcher) |
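
Item 15 builds on the ${<sys.prop>[:<default>]} substitution added in SOLR-79. The following is a minimal sketch of that kind of substitution with a clear error for undefined properties. It is not Solr's implementation: the class name, the regular expression, and the error message wording are assumptions, and properties are passed in explicitly (rather than read from System.getProperties()) to keep the example self-contained.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertySubstitution {
    // Matches ${name} or ${name:default}; group 1 is the name, group 2 the
    // optional default (null when no ":default" part is present).
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    public static String substitute(String text, Properties props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            // Falls back to the inline default when the property is not set.
            String value = props.getProperty(name, m.group(2));
            if (value == null) {
                throw new IllegalArgumentException(
                        "No property or default value specified for '" + name + "'");
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("solr.data.dir", "/var/solr/data"); // hypothetical property
        System.out.println(substitute("<dataDir>${solr.data.dir:./data}</dataDir>", props));
        System.out.println(substitute("<dataDir>${other.dir:./data}</dataDir>", props));
    }
}
```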
| |
| Other Changes |
| 1. Updated to Lucene 2.1 |
| |
| 2. Updated to Lucene 2007-05-20_00-04-53 |
| |
| ================== Release 1.1.0 ================== |
| |
| Status |
| ------ |
| This is the first release since Solr joined the Incubator, and brings many |
| new features and performance optimizations including highlighting, |
| faceted browsing, and JSON/Python/Ruby response formats. |
| |
| |
| Upgrading from previous Solr versions |
| ------------------------------------- |
| Older Apache Solr installations can be upgraded by replacing |
| the relevant war file with the new version. No changes to configuration |
| files are needed and the index format has not changed. |
| |
| The default version of the Solr XML response syntax has been changed to 2.2. |
| Behavior can be preserved for those clients not explicitly specifying a |
| version by adding a default to the request handler in solrconfig.xml |
| |
| By default, Solr will no longer use a searcher that has not fully warmed, |
| and requests will block in the meantime. To change back to the previous |
| behavior of using a cold searcher in the event there is no other |
| warm searcher, see the useColdSearcher config item in solrconfig.xml |
| |
| The XML response format when adding multiple documents to the collection |
| in a single <add> command has changed to return a single <result>. |
| |
| |
| Detailed Change List |
| -------------------- |
| |
| New Features |
| 1. added support for setting Lucene's positionIncrementGap |
| 2. Admin: new statistics for SolrIndexSearcher |
| 3. Admin: caches now show config params on stats page |
| 3. max() function added to FunctionQuery suite |
| 4. postOptimize hook, mirroring the functionality of the postCommit hook, |
| but only called on an index optimize. |
| 5. Ability to HTTP POST query requests to /select in addition to HTTP-GET |
| 6. The default search field may now be overridden by requests to the |
| standard request handler using the df query parameter. (Erik Hatcher) |
| 7. Added DisMaxRequestHandler and SolrPluginUtils. (Chris Hostetter) |
| 8. Support for customizing the QueryResponseWriter per request |
| (Mike Baranczak / SOLR-16 / hossman) |
| 9. Added KeywordTokenizerFactory (hossman) |
| 10. copyField accepts dynamicfield-like names as the source. |
| (Darren Erik Vengroff via yonik, SOLR-21) |
| 11. new DocSet.andNot(), DocSet.andNotSize() (yonik) |
| 12. Ability to store term vectors for fields. (Mike Klaas via yonik, SOLR-23) |
| 13. New abstract BufferedTokenStream for people who want to write |
| Tokenizers or TokenFilters that require arbitrary buffering of the |
| stream. (SOLR-11 / yonik, hossman) |
| 14. New RemoveDuplicatesToken - useful in situations where |
| synonyms, stemming, or word delimiting produce identical tokens at |
| the same position. (SOLR-11 / yonik, hossman) |
| 15. Added highlighting to SolrPluginUtils and implemented in StandardRequestHandler |
| and DisMaxRequestHandler (SOLR-24 / Mike Klaas via hossman,yonik) |
| 16. SnowballPorterFilterFactory language is configurable via the "language" |
| attribute, with the default being "English". (Bertrand Delacretaz via yonik, SOLR-27) |
| 17. ISOLatin1AccentFilterFactory, instantiates ISOLatin1AccentFilter to remove accents. |
| (Bertrand Delacretaz via yonik, SOLR-28) |
| 18. JSON, Python, Ruby QueryResponseWriters: use wt="json", "python" or "ruby" |
| (yonik, SOLR-31) |
| 19. Make web admin pages return UTF-8, change Content-type declaration to include a |
| space between the mime-type and charset (Philip Jacob, SOLR-35) |
| 20. Made query parser default operator configurable via schema.xml: |
| <solrQueryParser defaultOperator="AND|OR"/> |
| The default operator remains "OR". |
| 21. JAVA API: new version of SolrIndexSearcher.getDocListAndSet() which takes |
| flags (Greg Ludington via yonik, SOLR-39) |
| 22. A HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin |
| words that were hyphenated and split by a newline. (Boris Vitez via yonik, SOLR-41) |
| 23. Added a CompressableField base class which allows fields of derived types to |
| be compressed using the compress=true setting. The field type also gains the |
| ability to specify a size threshold at which field data is compressed. |
| (klaas, SOLR-45) |
| 24. Simple faceted search support for fields (enumerating terms) |
| and arbitrary queries added to both StandardRequestHandler and |
| DisMaxRequestHandler. (hossman, SOLR-44) |
| 25. In addition to specifying default RequestHandler params in the |
| solrconfig.xml, support has been added for configuring values to be |
| appended to the multi-val request params, as well as for configuring |
| invariant params that cannot be overridden in the query. (hossman, SOLR-46) |
| 26. Default operator for query parsing can now be specified with q.op=AND|OR |
| from the client request, overriding the schema value. (ehatcher) |
| 27. New XSLTResponseWriter does server side XSLT processing of XML Response. |
| In the process, an init(NamedList) method was added to QueryResponseWriter |
| which works the same way as SolrRequestHandler. |
| (Bertrand Delacretaz / SOLR-49 / hossman) |
| 28. json.wrf parameter adds a wrapper-function around the JSON response, |
| useful in AJAX with dynamic script tags for specifying a JavaScript |
| callback function. (Bertrand Delacretaz via yonik, SOLR-56) |
| 29. autoCommit can be specified every so many documents added (klaas, SOLR-65) |
| 30. ${solr.home}/lib directory can now be used for specifying "plugin" jars |
| (hossman, SOLR-68) |
| 31. Support for "Date Math" relative "NOW" when specifying values of a |
| DateField in a query -- or when adding a document. |
| (hossman, SOLR-71) |
| 32. useColdSearcher control in solrconfig.xml prevents the first searcher |
| from being used before it's done warming. This can help prevent |
| thrashing on startup when multiple requests hit a cold searcher. |
| The default is "false", preventing use before warm. (yonik, SOLR-77) |
| |
| Changes in runtime behavior |
| 1. classes reorganized into different packages, package names changed to Apache |
| 2. force read of document stored fields in QuerySenderListener |
| 3. Solr now looks in ./solr/conf for config, ./solr/data for data |
| configurable via solr.solr.home system property |
| 4. Highlighter params changed to be prefixed with "hl."; allow fragmentsize |
| customization and per-field overrides on many options |
| (Andrew May via klaas, SOLR-37) |
| 5. Default param values for DisMaxRequestHandler should now be specified |
| using a '<lst name="defaults">...</lst>' init param. For backwards |
| compatibility, all init params will be used as defaults if an init param |
| with that name does not exist. (hossman, SOLR-43) |
| 6. The DisMaxRequestHandler now supports multiple occurrences of the "fq" |
| param. (hossman, SOLR-44) |
| 7. FunctionQuery.explain now uses ComplexExplanation to provide more |
| accurate score explanations when composed in a BooleanQuery. |
| (hossman, SOLR-25) |
| 8. Document update handling locking is much sparser, allowing performance gains |
| through multiple threads. Large commits also might be faster (klaas, SOLR-65) |
| 9. Lazy field loading can be enabled via a solrconfig directive. This will be faster when |
| not all stored fields are needed from a document (klaas, SOLR-52) |
| 10. Made admin JSPs return XML and transform them with new XSL stylesheets |
| (Otis Gospodnetic, SOLR-58) |
| 11. If the "echoParams=explicit" request parameter is set, request parameters are copied |
| to the output. In an XML output, they appear in new <lst name="params"> list inside |
| the new <lst name="responseHeader"> element, which replaces the old <responseHeader>. |
| Adding a version=2.1 parameter to the request produces the old format, for backwards |
| compatibility (bdelacretaz and yonik, SOLR-59). |
| |
| Optimizations |
| 1. getDocListAndSet can now generate both a DocList and a DocSet from a |
| single lucene query. |
| 2. BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate |
| set |
| 3. OpenBitSet completed, replaces BitSet as the implementation for BitDocSet. |
| Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize |
| are between 3 and 4 times faster. (yonik, SOLR-15) |
| 4. much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size) |
| 5. Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field |
| queries where DocSets aren't cached (for example, if the number of terms in the field |
| is larger than the filter cache.) (yonik) |
| 6. Optimized facet.field faceting by as much as 500 times when the field has |
| a single token per document (not multiValued & not tokenized) by using the |
| Lucene FieldCache entry for that field to tally term counts. The first request |
| utilizing the FieldCache will take longer than subsequent ones. |
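
The general idea behind item 6 can be sketched as follows: when every document has exactly one term in the field, counts can be tallied with one array lookup per matching document instead of one set intersection per term. Plain arrays stand in for the FieldCache entry here; the class, method, and parameter names are hypothetical, and this is not the Solr code.

```java
import java.util.Arrays;

public class OrdinalFacetCounter {
    // docToOrd[doc] holds the index (ordinal) of the single term that the
    // document has in the faceted field; termCount is the number of distinct
    // terms; matchingDocs lists the ids of documents in the current result set.
    public static int[] countByOrdinal(int[] docToOrd, int termCount, int[] matchingDocs) {
        int[] counts = new int[termCount];
        for (int doc : matchingDocs) {
            counts[docToOrd[doc]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] terms = {"book", "dvd", "music"};   // hypothetical field terms
        int[] docToOrd = {0, 2, 0, 1, 2, 0};          // term ordinal for docs 0..5
        int[] matchingDocs = {0, 1, 2, 5};            // documents matching the query
        int[] counts = countByOrdinal(docToOrd, terms.length, matchingDocs);
        for (int i = 0; i < terms.length; i++) {
            System.out.println(terms[i] + "=" + counts[i]);
        }
        System.out.println(Arrays.toString(counts));
    }
}
```

Building the per-document array is the up-front cost, which matches the note that the first request utilizing the FieldCache takes longer than subsequent ones.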
| |
| Bug Fixes |
| 1. Fixed delete-by-id for field types whose indexed form is different |
| from the printable form (mainly sortable numeric types). |
| 2. Added escaping of attribute values in the XML response (Erik Hatcher) |
| 3. Added empty extractTerms() to FunctionQuery to enable use in |
| a MultiSearcher (Yonik) |
| 4. WordDelimiterFilter sometimes lost token positionIncrement information |
| 5. Fix reverse sorting for fields where sortMissingFirst=true |
| (Rob Staveley, yonik) |
| 6. Worked around a Jetty bug that caused invalid XML responses for fields |
| containing non ASCII chars. (Bertrand Delacretaz via yonik, SOLR-32) |
| 7. WordDelimiterFilter can throw exceptions if configured with both |
| generate and catenate off. (Mike Klaas via yonik, SOLR-34) |
| 8. Escape '>' in XML output (because ]]> is illegal in CharData) |
| 9. field boosts weren't being applied and doc boosts were being applied to fields (klaas) |
| 10. Multiple-doc update generates well-formed xml (klaas, SOLR-65) |
| 11. Better parsing of pingQuery from solrconfig.xml (hossman, SOLR-70) |
| 12. Fixed bug with "Distribution" page introduced when Versions were |
| added to "Info" page (hossman) |
| 13. Fixed HTML escaping issues with user input to analysis.jsp and action.jsp |
| (hossman, SOLR-74) |
| |
| Other Changes |
| 1. Upgrade to Lucene 2.0 nightly build 2006-06-22, lucene SVN revision 416224, |
| http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=markup&pathrev=416224 |
| 2. Modified admin styles to improve display in Internet Explorer (Greg Ludington via billa, SOLR-6) |
| 3. Upgrade to Lucene 2.0 nightly build 2006-07-15, lucene SVN revision 422302 |
| 4. Included unique key field name/value (if available) in log message of add (billa, SOLR-18) |
| 5. Updated to Lucene 2.0 nightly build 2006-09-07, SVN revision 462111 |
| 6. Added javascript to catch empty query in admin query forms (Tomislav Nakic-Alfirevic via billa, SOLR-48) |
| 7. backslash escape * in ssh command used in snappuller for zsh compatibility, SOLR-63 |
| 8. check solr return code in admin scripts, SOLR-62 |
| 9. Updated to Lucene 2.0 nightly build 2006-11-15, SVN revision 475069 |
| 10. Removed src/apps containing the legacy "SolrTest" app (hossman, SOLR-3) |
| 11. Simplified index.jsp and form.jsp, primarily by removing/hiding XML |
| specific params, and adding an option to pick the output type. (hossman) |
| 12. Added new numeric build property "specversion" to allow clean |
| MANIFEST.MF files (hossman) |
| 13. Added Solr/Lucene versions to "Info" page (hossman) |
| 14. Explicitly set mime-type of .xsl files in web.xml to |
| application/xslt+xml (hossman) |
| 15. Config parsing should now work using DOM Level 2 parsers -- Solr |
| previously relied on getTextContent which is a DOM Level 3 addition |
| (Alexander Saar via hossman, SOLR-78) |
| |
| 2006/01/17 Solr open sourced, moves to Apache Incubator |