At Opensolr, we don’t count the number of requests you make—because, let’s face it, not all requests are created equal.
Instead, we use a Traffic Bandwidth Limit to keep things fair. You’re only billed (on a pre-paid plan) for the outgoing bytes sent from your Opensolr index to your site or app.
Translation:
- 1 GB of traffic could be a million ultra-efficient requests (if you optimize your queries)
- …or it could be just one monster request (if you don’t).
Yes, size matters!
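Curious what a single query actually costs you? As a sketch (the hostname and index name are placeholders you'd copy from your Opensolr Index Control Panel), curl can report the size of the response body, which is exactly the outgoing traffic that counts against your bandwidth:

```shell
# Placeholders: replace SOLR_URL with your real index URL from the
# Opensolr Index Control Panel before running the printed command.
SOLR_URL="https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>"
QUERY="q=*:*&rows=10&fl=id,title"

# -w '%{size_download}' makes curl print the number of body bytes received,
# i.e. the bandwidth this one request consumed.
CMD="curl -s -o /dev/null -w '%{size_download}' \"$SOLR_URL/select?$QUERY\""
echo "$CMD"
```

Run the printed command against your real index and it prints the response size in bytes; trimming `rows` and `fl` shrinks that number directly.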
Bonus: Opensolr transparently logs every single request. You get full access to see all the action, via:
- Our Automation API
- The Opensolr Index Control Panel
Want deeper insights or custom advice? Contact our team. We love a good bandwidth optimization challenge!
If you were using Solr's DataImportHandler, note that starting with Solr 9.x it is no longer available, as it was removed from the core Solr distribution.
Here's how to write a small script that will import data into your Opensolr Index, from XML files:
```bash
#!/bin/bash
USERNAME="<OPENSOLR_INDEX_HTTP_AUTH_USERNAME>"
PASSWORD="<OPENSOLR_INDEX_HTTP_AUTH_PASSWORD>"

echo "Starting import on all indexes..."
echo ""
echo "Importing: <YOUR_OPENSOLR_INDEX_NAME>"

echo "Downloading the xml data file"
wget -q <URL_TO_YOUR_XML_FILE>/<YOUR_XML_FILE_NAME>

echo "Removing all data"
# Delete-by-query: the update handler expects a <delete> XML envelope,
# not a bare query string.
curl -s -u $USERNAME:$PASSWORD "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" -H "Content-Type: text/xml" -d "<delete><query>*:*</query></delete>"
echo ""

echo "Uploading and Importing all data into <YOUR_OPENSOLR_INDEX_NAME>"
curl -u $USERNAME:$PASSWORD "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" --progress-bar -H "Content-Type: text/xml" --data-binary @<YOUR_XML_FILE_NAME> | tee -a "/dev/null" ; test ${PIPESTATUS[0]} -eq 0
echo ""

rm -f <YOUR_XML_FILE_NAME>
echo "Done!"
```
Everything within the <> brackets must be replaced with your own values: your Opensolr index name, your Opensolr index hostname, the URL of your XML file, and so forth. You can find all of that info in your Opensolr Index Control Panel, except for the URL of your XML file, which is hosted somewhere on your end.
Format your XML file using the classic Solr XML update format.
This article should show you more about the Solr XML data file format.
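If you don't have a sample handy, here is a minimal file in that classic format (the field names are illustrative; match them to your own schema):

```shell
# Write a tiny Solr XML update file; <add>/<doc>/<field> is the classic
# format the import script above POSTs to the /update handler.
cat > sample.xml <<'EOF'
<add>
  <doc>
    <field name="id">1</field>
    <field name="title">Hello Opensolr</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="title">Second document</field>
  </doc>
</add>
EOF
echo "wrote sample.xml ($(wc -c < sample.xml) bytes)"
```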
Solr is a beast—it loves RAM like a dog loves a steak. If your Solr server is gobbling up memory and crashing, don’t panic! Here’s what you need to know, plus battle-tested ways to keep things lean, mean, and not out-of-memory.
Solr eats memory to build search results, cache data, and keep things fast.
But:
- Bad configuration or huge, inefficient requests can cause even the biggest server to choke and burn through RAM.
- Sometimes, small indexes on giant machines will still crash if your setup isn’t right.
- Good news: Opensolr has self-healing—if Solr crashes, it’ll be back in under a minute. Still, prevention is better than panic.
Want to save bandwidth and RAM? Read these tips.
Optimizing your queries is a win-win: less data in and out, and less stress on your server.
- Keep the `rows` parameter below 100 for most queries: `&rows=100`
- Avoid deep pagination. With something like `&start=500000&rows=100`, Solr has to allocate a ton of memory for all those results. Keep `start` under 50,000 if possible.
- Use `docValues`: set `docValues="true"` in `schema.xml`. Example:

```xml
<field name="name" docValues="true" type="text_general" indexed="true" stored="true" />
```

For highlighting, you may want even more settings:

```xml
<field name="description" type="text_general" indexed="true" stored="true" docValues="true" termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true" />
```
Solr caches are great… until they eat all your memory and leave nothing for real work.
The big four:
- `filterCache`: stores document ID lists for filter queries (`fq`)
- `queryResultCache`: stores doc IDs for search results
- `documentCache`: caches stored field values
- `fieldCache`: stores all values for a field in memory (dangerous for big fields!)

Solution: Tune these in `solrconfig.xml` and keep sizes low.

```xml
<filterCache size="1" initialSize="1" autowarmCount="0"/>
```
Questions? Want a config review or more tips? Contact the Opensolr team!
Solr’s RAM appetite is legendary. Don’t worry, you’re not alone. Let’s help you keep your heap happy, your queries snappy, and your boss off your back.
Fewer bytes → less RAM.
See our bandwidth tips.
- Limit the `rows` parameter! Don't return all the docs unless you want Solr to host a BBQ in your memory: `&rows=100`
- Huge `start` values = huge RAM usage. Try not to cross `start=50000` unless you really like chaos.
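If you genuinely need to walk deep into a result set, Solr's standard `cursorMark` feature pages through without the memory cost of huge `start` values. A sketch (the hostname and index name are placeholders; `cursorMark` requires a sort that includes your uniqueKey field):

```shell
# Build (and print) a cursor-based paging request instead of start=500000.
# Replace the placeholders with values from your Opensolr Control Panel.
SOLR_URL="https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>"
PARAMS="q=*:*&rows=100&sort=id+asc&cursorMark=*"
CMD="curl -s \"$SOLR_URL/select?$PARAMS\""
echo "$CMD"
# Each JSON response includes a nextCursorMark; feed it back as cursorMark
# on the next request, and stop when it no longer changes.
```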
For faceting, sorting, grouping, and highlighting, enable `docValues`:

```xml
<field name="my_field" docValues="true" type="string" indexed="true" stored="true"/>
```
Tighten up your caches in `solrconfig.xml`:

```xml
<filterCache size="1" initialSize="1" autowarmCount="0"/>
```

Monitor cache hit ratios; below 10% = wasted RAM.
For most setups, a heap of `4g` or `8g` is enough:

```
-Xms4g -Xmx4g
-XX:+UseG1GC
-XX:+UseStringDeduplication
-XX:MaxGCPauseMillis=200
-Xlog:gc*:file=/var/solr/gc.log:time,uptime,level,tags:filecount=10,filesize=10M
```
If you're on Drupal, keep the Search API Solr module current or face the wrath of bugs.
Quick recap: keep an eye on `rows` and `start`, and use `docValues` for anything you facet, sort, or group.

| JVM Option | What It Does | Default/Example |
|---|---|---|
| `-Xms` / `-Xmx` | Min/Max heap size | `-Xms4g -Xmx4g` |
| `-XX:+UseG1GC` | Use the G1 Garbage Collector | Always for Java 8+ |
| `-XX:MaxGCPauseMillis=200` | Target max GC pause time (ms) | `-XX:MaxGCPauseMillis=200` |
| `-XX:+UseStringDeduplication` | Remove duplicate strings in heap | Java 8u20+ |
| `-Xlog:gc*` | GC logging | See above |
| `-XX:+HeapDumpOnOutOfMemoryError` | Write heap dump on OOM | Always! |
| `-XX:HeapDumpPath=/tmp/solr-heapdump.hprof` | Path for OOM heap dump | Set to a safe disk |
“How many docs can I return? Solr: Yes.”
👉 Contact Opensolr Support — bring logs, configs, and memes. We love a challenge.
Enabling spellcheck in Apache Solr is like giving your users a helpful nudge whenever they make a typo—because we all know “seach” is not “search.”
Here’s how to get those “Did you mean…?” suggestions working for your queries!
Add the following to `schema.xml` (in your Solr core's `conf` directory):

```xml
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="content" type="textSpell" indexed="true" stored="true"/>
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
```
Open `solrconfig.xml` (in your Solr core's `conf` directory). Find the `<requestHandler>` for `/select` and add the spellcheck component:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- ... -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```
Still in `solrconfig.xml`, define your `<searchComponent>` for spellcheck:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">3</int>
    <float name="maxQueryFrequency">0.5</float>
  </lst>
</searchComponent>
```
Pro tip: You can tune these parameters based on your data and performance needs. For instance, a higher `maxEdits` means more generous suggestions, but potentially more noise!
After any schema/config changes, reindex your content.
Otherwise, your spellcheck dictionary will be lonely and unhelpful.
When making a search query, simply add the `spellcheck` parameter:

```
/select?q=your_query&spellcheck=true
```

You'll get spellcheck suggestions in your Solr response, usually under the `"spellcheck"` section.
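For example, a full query URL with a deliberate typo might look like this (a sketch; the hostname and index name are placeholders, and `spellcheck.collate=true` is an optional extra that asks Solr to propose a corrected full query):

```shell
# Print a spellcheck-enabled query; "seach" is the deliberate typo, and
# spellcheck.collate=true asks Solr to suggest a corrected whole query.
SOLR_URL="https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>"
PARAMS="q=seach&spellcheck=true&spellcheck.collate=true&wt=json"
CMD="curl -s \"$SOLR_URL/select?$PARAMS\""
echo "$CMD"
```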
Voilà! No more missed searches due to typos. 🎉
Now your Solr is smart enough to fix “teh” into “the.” Happy searching! 🪄
schema_extra_types.xml
Leverage the power of Natural Language Processing (NLP) right inside Solr!
With built-in support for OpenNLP models, you can add advanced tokenization, part-of-speech tagging, named entity recognition, and much more—no PhD required.
Integrating NLP in your schema allows you to:
In short: your Solr becomes smarter and your users get better search results.
Here’s a typical `fieldType` in your `schema_extra_types.xml` using OpenNLP:
```xml
<fieldType name="text_edge_nouns_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_nl.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```
Model Paths:
Always reference the full absolute path for NLP model files. For example:

```
sentenceModel="/opt/nlp/nl-sent.bin"
tokenizerModel="/opt/nlp/nl-token.bin"
posTaggerModel="/opt/nlp/nl-pos-maxent.bin"
```
This ensures Solr always finds your precious language models—no “file not found” drama!
Type Token Filtering:
The `TypeTokenFilterFactory` with `useWhitelist="true"` will only keep tokens matching the allowed parts of speech (like nouns, verbs, etc.), as defined in `pos_edge_nouns_nl.txt`. This keeps your index tight and focused.

Synonym Graphs:
Add `SynonymGraphFilterFactory` to enable query-side expansion. This is great for handling multiple word forms, synonyms, and local lingo.

Use `RemoveDuplicatesTokenFilterFactory` to keep things clean and efficient.

You can set up similar analyzers for English, an undefined language, or anything you like. For example:
```xml
<fieldType name="text_nouns_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_en.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_en.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_en.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```
Keep all models in one place (e.g. `/opt/nlp/`), and keep a README so you know what's what.

Using NLP models in your Solr analyzers will supercharge your search, make autocomplete smarter, and help users find what they're actually looking for (even if they type like my cat walks on a keyboard).
Need more examples?
Check out the Solr Reference Guide - OpenNLP Integration or Opensolr documentation.
Happy indexing, and may your tokens always be well-typed! 😸🤓
Solr thrives on configuration files—each with its own special job.
Whether you’re running a classic Solr install, a CMS like Drupal, or even going rogue with WordPress and WPSOLR, proper configuration is key.
Solr configurations often reference each other (think: dependencies). If you upload them in the wrong order, you’ll get errors, failed indexes, and possibly even a mild existential crisis.
When uploading your Solr config files via the Opensolr Index Control Panel, follow this foolproof order:
Dependencies First!
Create and upload a `.zip` containing all dependency files (such as `.txt` files, `schema-extra.xml`, `solrconfig-extra.xml`, synonyms, stopwords, etc.).
Basically, everything except the main `schema.xml` and `solrconfig.xml`.
Schema Second!
Zip and upload just your `schema.xml` file.
This file defines all fields and refers to resources from the previous archive.
solrconfig Last!
Finally, zip and upload your `solrconfig.xml` file.
This references your schema fields and ties all the magic together.
In summary:
1️⃣ Dependencies → 2️⃣ schema.xml → 3️⃣ solrconfig.xml
Absolutely!
Use the Opensolr Automation REST API to upload your config files programmatically.
Because, let’s face it, real wizards script things.
Now go forth and upload with confidence! 🦾
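Scripting the same three-step order might take this shape. Caution: the endpoint and form field below are pure placeholders, not the real Opensolr API; copy the actual upload endpoint and parameters from the Automation API documentation before using this.

```shell
# Print the three uploads in dependency-safe order. The API URL below is a
# PLACEHOLDER, not a real endpoint -- see the Opensolr Automation API docs.
API_URL="https://opensolr.com/<AUTOMATION_API_CONFIG_UPLOAD_ENDPOINT>"
ORDER=""
for archive in dependencies.zip schema.zip solrconfig.zip; do
  echo "curl -X POST -F \"file=@$archive\" \"$API_URL\""
  ORDER="$ORDER$archive "
done
```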
The AutoPhrase TokenFilter is a powerful Solr plugin that helps you recognize and index multi-word expressions as single tokens (think: “New York City” as one unit, not three). This can significantly improve the quality of search, autocomplete, and analytics.
Not on all Opensolr environments!
If you’re trying to use the `AutoPhraseTokenFilterFactory` and see errors like:

```
Plugin not found: solr.AutoPhraseTokenFilterFactory
```

…then the JAR isn’t active on your server (yet).
Contact Us
Simply send us a request and we’ll install the AutoPhrase library (or pretty much any other custom Solr plugin) for you.
How to Request a Plugin
Optionally, send the JAR file directly if it’s a custom or non-public library.
After Installation
Add a `<filter class="solr.AutoPhraseTokenFilterFactory" ... />` element to your field type in `schema.xml`.
.Questions? Contact Opensolr Support — we’re happy to help!
(If you’re a plugin power user, give us a heads up and we’ll have your Solr instance doing backflips in no time. 🕺)
Need a special Solr plugin or custom filter?
No problem! Opensolr supports custom JAR libraries—so you can fine-tune your search platform with advanced features.
Send Us Your JAR
Email your custom JAR file (or a link to the official plugin page where binaries are already compiled) to support@opensolr.com.
Include This Info
The Opensolr Index Name (where you want the plugin installed)
Installation Timeline
We'll get to work as soon as we receive your `.jar` binary!

Once we've installed the plugin:
- Update your `schema.xml` or `solrconfig.xml` to use your new library (we can help with this if needed).
- Reload your Solr core to activate the changes.
- Test your configuration. Give it a spin!
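Reloading the core can itself be scripted via Solr's standard CoreAdmin API. A sketch (hostname, credentials, and core name are placeholders; on some managed setups a reload is triggered from the control panel instead):

```shell
# Print a CoreAdmin RELOAD request; action=RELOAD re-reads solrconfig.xml
# and the schema without restarting Solr.
SOLR_HOST="https://<YOUR_OPENSOLR_INDEX_HOSTNAME>"
CORE="<YOUR_OPENSOLR_INDEX_NAME>"
CMD="curl -s -u USER:PASS \"$SOLR_HOST/solr/admin/cores?action=RELOAD&core=$CORE\""
echo "$CMD"
```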
Questions? Stuck?
Email support@opensolr.com and our tech team will leap into action (well, at least open their laptops and get right on it).
With Opensolr, you’re never stuck with just the basics. Power up your index—your way! ⚡️
Click on the Tools menu item on the right-hand side, and then simply use the form to create your query and delete data.
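If you prefer the command line, the same thing can be done with a delete-by-query against the update handler. A sketch (hostname, index name, and credentials are placeholders, and `category:old_stuff` is just an example query):

```shell
# Print a delete-by-query command; every document matching the query is
# removed once the commit goes through.
SOLR_URL="https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>"
DELETE_XML="<delete><query>category:old_stuff</query></delete>"
CMD="curl -u USER:PASS \"$SOLR_URL/update?commit=true\" -H \"Content-Type: text/xml\" -d '$DELETE_XML'"
echo "$CMD"
```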
To move from using the managed-schema to schema.xml, simply follow the steps below:
In your `solrconfig.xml` file, look for the `schemaFactory` definition. If you have one, remove it and add this instead:
<schemaFactory class="ClassicIndexSchemaFactory"/>
If you don't have one, just add the above snippet somewhere above the requestHandler definitions.
To move from using the classic schema.xml in your opensolr index, to the managed-schema simply follow the steps below:
In your solrconfig.xml, look for a SchemaFactory definition, and replace it with this snippet:
```xml
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
```
If you don't have any schemaFactory definition, just paste the above snippet into your solrconfig.xml file, just above any requestHandler definition.
With Opensolr, your project’s Solr version is never a limitation—it’s a superpower! 🦾
Contact Opensolr Support to spin up any version, or just ask us which one makes sense for your needs!
Please go to https://opensolr.com/pricing and make sure you select the Analytics option from the Extra Features tab when you upgrade your account.
If you can see Analytics but no data, make sure your Solr queries are correctly formatted in the form:
https://server.opensolr.com/solr/index_name/select?q=your_query&other_params...
So, the search query must be clearly visible in the q parameter in order for it to show in analytics.
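As a concrete illustration (server, index name, and search terms are invented):

```shell
# The first URL carries a visible q parameter, so Analytics can pick it up;
# the second buries the search in fq only, so it won't show as a query.
TRACKED="https://server.opensolr.com/solr/index_name/select?q=laptops&rows=10"
UNTRACKED="https://server.opensolr.com/solr/index_name/select?fq=category:laptops&rows=10"
echo "tracked:   $TRACKED"
echo "untracked: $UNTRACKED"
```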
Bandwidth: you don’t notice it… until you run out. Here’s how to keep your Opensolr search snappy without burning through your monthly gigabytes.
Result: fewer requests, happier users, and much lower bandwidth usage!
🔄 Solr Replication Magic
(Think of it as the “BOGO” deal for bandwidth.)
🎯 Return Only What You Need
Trim your `/select` requests using the `rows` and `fl` parameters to only fetch the records and fields you truly need:

```
/select?q=mysearch&rows=10&fl=id,title
```
Master these tricks and your bandwidth will go further, your bills will shrink, and your search users will never know you’ve become a traffic ninja. 🥷
Need more ideas?
Sometimes, you’ve got to make a trade: a bit less speed for a lot less disk space.
Here’s how you can shrink your Solr index like a pro (and keep your server from bursting at the seams):
Use `int` instead of `tint`: a plain `int` field takes up less space than a trie integer (`tint`). The catch: searches on `int` will be slower than on `tint`. (It’s a classic “pick two out of three” scenario: fast, small, cheap.)

Take a hard look at your fields.
Sometimes, to get a slimmer index, you need to be ruthless.
Are you hoarding stored fields?
If you’ve got lots of stored fields, try this power move:
- Add `omitNorms="true"` on text fields that don't need length normalization. (Translation: if you don't care about short/long document bias, ditch the norms and reclaim space!)
- Add `omitPositions="true"` on text fields that don't require phrase matching. (You lose phrase search on those fields, but win back precious bytes.)
Beware the NGram monster!
Special fields like NGrams can gobble up a ton of space.
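Putting the two omit flags together, a trimmed-down field might look like this (a sketch; the field and type names are illustrative, not from your schema):

```xml
<!-- Illustrative only: a big text field slimmed down.
     stored="false" skips the stored copy, omitNorms drops length norms,
     and omitPositions gives up phrase queries to save space. -->
<field name="body" type="text_general" indexed="true" stored="false"
       omitNorms="true" omitPositions="true"/>
```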
Shrink smart, and may your search be speedy and your indexes svelte! 🚀🗜️
OpenSolr is more than just a place to host your Apache Solr instance—it’s your full-service, hands-off search infrastructure butler, working around the clock so you don’t have to! Here’s what makes OpenSolr the trusted choice for devs and businesses worldwide:
OpenSolr is a cloud-based search service that takes all the hassle out of hosting, scaling, and managing Apache Solr, the legendary open-source search platform known for:
🛠️ Managed Solr Hosting
Let OpenSolr handle the dirty work—setup, upgrades, security patches, scaling—so you can focus on what matters: building awesome stuff.
📈 Scalability & Performance
Need to handle millions of searches? No sweat. OpenSolr lets you ramp up or down in seconds, delivering reliable performance at any scale.
🔒 Data Security & Backups
Rest easy with industry-standard SSL encryption, regular data backups, and built-in recovery tools. Your data’s safe, come rain or ransomware.
⚙️ Customizable Search Indexes
Define your own schemas, play with analyzers, import data your way. It’s Solr, but without the migraine.
🖥️ User-Friendly Control Panel
Forget the CLI—manage, monitor, and tweak your search environment in a slick web interface. Analytics, logs, real-time stats—one click away.
🙋 Rockstar Support & Consulting
Stuck? OpenSolr’s experts are on standby, offering guidance, troubleshooting, and performance tips. (We don’t judge your config typos.)
🔌 Easy Integration & APIs
Plug OpenSolr into your e-commerce platform, CMS, data warehouse, or even your secret AI project. REST APIs and connectors included!
🌍 Global Data Centers
Your users are everywhere—so is OpenSolr. Pick the region closest to you for lightning-fast, reliable service worldwide.
Anyone who wants powerful, scalable, professional search without the burden of self-hosting:
- E-commerce stores 🛒
- Content management systems 📝
- News & media websites 🗞️
- SaaS products ☁️
- Any data-hungry app that needs to search like a boss!
OpenSolr: Where world-class search meets old-school reliability (and a dash of wit).
Ready to search smarter?
Sign up for a free trial!