SolrConnection object

Connect to a set of Solr servers.

To get a SolrCollection instance from a SolrConnection use either dictionary-style or attribute-style access:

>>> from solrcloudpy.connection import SolrConnection
>>> conn = SolrConnection()
>>> conn.list()
>>> conn['collection1']
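The two access styles can be pictured with a small stand-in class. This is only a sketch of the general Python technique (`__getitem__` plus `__getattr__`); `ToyConnection` and `ToyCollection` are hypothetical names, not solrcloudpy's implementation, and nothing here contacts a Solr server.

```python
class ToyCollection:
    """Hypothetical stand-in for SolrCollection; it only stores a name."""

    def __init__(self, connection, name):
        self.connection = connection
        self.name = name


class ToyConnection:
    """Hypothetical stand-in for SolrConnection showing both access styles."""

    def __init__(self, server="localhost:8983"):
        self.server = server

    def __getitem__(self, name):
        # Dictionary-style access: conn["collection1"]
        return ToyCollection(self, name)

    def __getattr__(self, name):
        # Attribute-style access: conn.collection1
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        return self[name]


conn = ToyConnection()
by_key = conn["collection1"]
by_attr = conn.collection1
```

Both lookups yield an object bound to the same collection name; attribute-style access is the shorter form but only works for names that are valid Python identifiers.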
class solrcloudpy.connection.SolrConnection(server='localhost:8983', detect_live_nodes=False, user=None, password=None, timeout=10)

Connection to one or more Solr servers

  • server – The server address. Can be a single server or a list of servers, for example localhost:8983 or [localhost,solr1.domain.com:8983].
  • detect_live_nodes – whether to detect live nodes automatically or not. This assumes that one is able to access the IPs listed by Zookeeper. The default value is False.
  • user – HTTP basic auth user name
  • password – HTTP basic auth password
  • timeout – timeout for HTTP requests
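Because `server` accepts either a single address or a list, calling code often normalizes it to a list first. A minimal sketch, assuming a hypothetical helper `as_server_list` (not part of solrcloudpy) that leaves the addresses themselves untouched:

```python
def as_server_list(server):
    """Return the server argument as a list of address strings."""
    if isinstance(server, str):
        # A single address such as "localhost:8983"
        return [server]
    # Any other iterable of addresses is copied into a new list
    return list(server)


single = as_server_list("localhost:8983")
several = as_server_list(["localhost", "solr1.domain.com:8983"])
```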

Determine the state of all nodes and collections in the cluster. Problematic nodes or collections are returned along with their state; otherwise an OK message is returned.


Gets the cluster leader

create_collection(collname, *args, **kwargs)

Create a collection.

  • collname – The collection name
  • *args – additional arguments
  • **kwargs – additional named parameters

Lists out the current collections in the cluster


Lists all nodes that are currently online

SolrCollection object

Manage and search a Solr Collection.

The Collections API lets you create, remove, or reload collections. Consult the Collections API documentation for more details.

>>> from solrcloudpy.connection import SolrConnection
>>> conn = SolrConnection()
>>> coll = conn['test1'].create()
>>> coll

This class is also used to query a Solr collection. The endpoints supported by default are:

  • /select : the default Solr request handler
  • /mlt : the request handler for More Like This searches
  • /clustering: Solr’s clustering component

Support will be coming for the following endpoints:

  • /get: Solr’s real-time get request handler
  • /highlight: Solr’s search highlight component
  • /terms: Term component

>>> from solrcloudpy import SolrConnection
>>> coll = SolrConnection()['collection1']
>>> response = coll.search({'q':'money'})
>>> response
<SolrResponse [200]>
>>> response.result
"response": "SolrResponse << {'start': 0, 'numFound': 0, 'docs': []} >>"
class solrcloudpy.SolrCollection(connection, name)

Add a list of documents to the collection

Parameters: docs – a list of documents to add

Perform clustering on a query

Parameters: params – query parameters. Here params can be a SearchOptions instance, a dictionary or a list of tuples

Commit changes to a collection

create(replication_factor=1, force=False, **kwargs)

Create a collection

  • num_shards – an integer indicating the number of shards for this collection
  • replication_factor – an integer indicating the number of replicas for this collection
  • force – a boolean value indicating whether to force the operation
  • kwargs – additional parameters to be passed to this operation
Additional Parameters:
  • router_name: router name that will be used. defines how documents will be distributed among the shards
  • num_shards: number of shards to create for this collection
  • shards: A comma separated list of shard names. Required when using the implicit router
  • max_shards_per_node: max number of shards/replicas to put on a node for this collection
  • create_node_set: Allows defining which nodes to spread the new collection across.
  • collection_config_name: the name of the configuration to use for this collection
  • router_field: if this field is specified, the router will look at the value of the field in an input document to compute the hash and identify a shard instead of looking at the uniqueKey field

Additional parameters are further documented at https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateaCollection
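The keyword names listed above correspond to parameters of Solr's Collections API CREATE action, which uses names such as `numShards` and `router.name`. The sketch below shows one way such a mapping could be built into a query string. The helper `build_create_params` and the `SOLR_NAMES` table are hypothetical, and how solrcloudpy itself translates these arguments is an assumption here.

```python
from urllib.parse import urlencode

# Solr Collections API names for the keyword arguments listed above.
SOLR_NAMES = {
    "num_shards": "numShards",
    "replication_factor": "replicationFactor",
    "max_shards_per_node": "maxShardsPerNode",
    "create_node_set": "createNodeSet",
    "collection_config_name": "collection.configName",
    "router_name": "router.name",
    "router_field": "router.field",
    "shards": "shards",
}


def build_create_params(name, **kwargs):
    """Return (key, value) pairs for a CREATE request (hypothetical helper)."""
    params = [("action", "CREATE"), ("name", name)]
    for key, value in kwargs.items():
        # Unknown keys are passed through unchanged
        params.append((SOLR_NAMES.get(key, key), value))
    return params


params = build_create_params("test1", num_shards=2, replication_factor=1)
query_string = urlencode(params)
```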


Create or modify an alias for a collection

Parameters: alias – the name of the alias
create_shard(shard, create_node_set=None)

Create a new shard

  • shard – The name of the shard to be created.
  • create_node_set – Allows defining the nodes to spread the new collection across.
delete(id=None, q=None, commit=True)

Delete documents in a collection. Deletes occur either by id or by query

  • id – the id of the document to delete
  • q – the query matching the set of documents to delete
  • commit – whether to commit the change or not
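For orientation, Solr's JSON update format expresses the two delete modes as `{"delete": {"id": ...}}` and `{"delete": {"query": ...}}`. The sketch below builds those bodies with a hypothetical helper, `build_delete_payload`. Whether solrcloudpy sends exactly this payload is an assumption, and giving `id` precedence when both arguments are supplied is a choice made only for this example.

```python
import json


def build_delete_payload(id=None, q=None):
    """Build a Solr JSON delete command by id or by query (hypothetical helper)."""
    if id is not None:
        return {"delete": {"id": id}}
    if q is not None:
        return {"delete": {"query": q}}
    raise ValueError("either id or q is required")


by_id = json.dumps(build_delete_payload(id="doc-1"))
by_query = json.dumps(build_delete_payload(q="category:obsolete"))
```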

Delete an alias for a collection

Parameters: alias – the name of the alias
delete_replica(replica, shard)

Delete a replica

  • replica – The name of the replica to remove.
  • shard – The name of the shard that includes the replica to be removed.

Delete a collection


Finds if a collection exists in the cluster

Parameters: collection – the collection to find

Get a high-level overview of this collection’s index


Determines if this collection is an alias for a ‘real’ collection


Perform MLT on this index

Parameters: params – query parameters. Here params can be a SearchOptions instance, a dictionary or a list of tuples
optimize(waitsearcher=False, softcommit=False)

Optimize a collection for searching

  • waitsearcher – whether to make the changes to the collection visible or not by opening a new searcher
  • softcommit – whether to perform a soft commit when optimizing

Reload a collection


Search this index

Parameters: params – query parameters. Here params can be a SearchOptions instance, a dictionary or a list of tuples
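The difference between the dictionary and list-of-tuples forms matters when a parameter repeats, as `fq` often does in Solr queries. The sketch below uses only the standard library to show how each form encodes into a query string. The field values are made up, and how solrcloudpy encodes parameters internally is not shown here.

```python
from urllib.parse import urlencode

# A dictionary holds one value per key
as_dict = {"q": "money", "rows": 10}

# A list of tuples can repeat a key, e.g. several filter queries
as_pairs = [("q", "money"), ("fq", "type:book"), ("fq", "lang:en")]

dict_query = urlencode(as_dict)
pairs_query = urlencode(as_pairs)
```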
split_shard(shard, ranges=None, split_key=None)

Split a shard into two new shards

  • shard – The name of the shard to be split.
  • ranges – A comma-separated list of hash ranges in hexadecimal, e.g. ranges=0-1f4,1f5-3e8,3e9-5dc
  • split_key – The key to use for splitting the index
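To make the `ranges` format concrete, the sketch below decodes the example string into integer bounds and checks that the ranges are contiguous. `parse_hash_ranges` is a hypothetical helper for illustration only; it is not part of solrcloudpy or Solr.

```python
def parse_hash_ranges(ranges):
    """Decode 'start-end,start-end,...' hexadecimal ranges into integer pairs."""
    parsed = []
    for part in ranges.split(","):
        start, end = part.split("-")
        parsed.append((int(start, 16), int(end, 16)))
    return parsed


bounds = parse_hash_ranges("0-1f4,1f5-3e8,3e9-5dc")

# Each range should begin right after the previous one ends
contiguous = all(prev[1] + 1 == nxt[0] for prev, nxt in zip(bounds, bounds[1:]))
```

Here the three ranges decode to 0 to 500, 501 to 1000, and 1001 to 1500.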

Get the state of this collection

SolrIndexStats object

class solrcloudpy.collection.stats.SolrIndexStats(connection, name)

Get different statistics about the underlying index in a collection


Get cache statistics about the index. We retrieve cache stats for the document, filter, fieldvalue, and fieldcache caches


Get query handler statistics for all of the handlers used in this Solr node

SolrSchema object

class solrcloudpy.collection.schema.SolrSchema(connection, collection_name)

Get and modify schema


Add fields to the schema

Parameters: json_schema – specs for the fields to add

Get information about a copy field in the schema

Parameters: ftype – the name of the field type

Get information about all copy fields in the schema


Get information about a dynamic field in the schema

Parameters: field – the name of the field

Get information about all dynamic fields in the schema


Get information about a field in the schema

Parameters: field – the name of the field

Get information about all fields in the schema


Get information about a field type in the schema

Parameters: ftype – the name of the field type

Get information about field types in the schema

SearchOptions object

class solrcloudpy.parameters.SearchOptions(**kwargs)

Manage options to pass to a solr query

Although one can use plain dictionaries to pass parameters to Solr, this class makes the task more convenient. Currently, it covers the options for:

  • MLT search via the mltparams member variable
  • normal search via the commonparams member variable
  • faceted search via the facetparams member variable


>>> from solrcloudpy.parameters import SearchOptions
>>> se = SearchOptions()
>>> se.commonparams.q("*:*").fl('*,score')
{'q': set(['*:*']), 'fl': set(['*,score'])}
>>> se.facetparams.field("id")
{'facet.field': set(['id'])}
>>> se
{'commonparams': {'q': set(['*:*']), 'fl': set(['*,score'])}, 'facetparams': {'facet.field': set(['id'])}, 'mltparams': {}}
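The chained calls above can be pictured with a small stand-in class. This is a sketch of the general technique only: `ToyParams` is a hypothetical name, and the real SearchOptions may be built differently. Each call stores its value in a set under the parameter name and returns the same object, which is what makes chaining possible.

```python
class ToyParams(dict):
    """Hypothetical stand-in for one parameter group, e.g. commonparams."""

    def __init__(self, prefix=""):
        super().__init__()
        self._prefix = prefix  # e.g. "facet." for facet parameters

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)

        def add(value):
            # Collect values in a set, mirroring the output shown above
            self.setdefault(self._prefix + name, set()).add(value)
            return self  # returning self is what allows chaining

        return add


common = ToyParams().q("*:*").fl("*,score")
facet = ToyParams("facet.").field("id")
```

One limitation of this toy version: parameter names that collide with built-in dict methods, such as `get` or `items`, would resolve to those methods instead.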

SolrResponse object

class solrcloudpy.utils.SolrResponse(response_obj)

A generic representation of a Solr response. This object contains both the Response object from the requests package and the parsed content in a SolrResult instance.


Status code of this response

SolrResult object

class solrcloudpy.utils.SolrResult(obj)

Generic representation of a Solr search result. The response is an object whose attributes can also be accessed as dictionary keys.


>>> result
"response": "SolrResponse << {'start': 0, 'numFound': 0, 'docs': []} >>"
>>> result['response'].start
>>> result.response.numFound
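The dual attribute and key access shown above can be sketched with a thin wrapper around a nested dictionary. `ToyResult` and its `to_dict` method are hypothetical names used only for illustration; the actual SolrResult implementation and method names may differ.

```python
class ToyResult:
    """Hypothetical stand-in for SolrResult wrapping a parsed JSON dict."""

    def __init__(self, obj):
        self._obj = obj

    def _wrap(self, value):
        # Nested dicts are wrapped so access styles can be mixed
        return ToyResult(value) if isinstance(value, dict) else value

    def __getitem__(self, key):
        return self._wrap(self._obj[key])

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def to_dict(self):
        # Return the underlying plain dict
        return self._obj


result = ToyResult({"response": {"start": 0, "numFound": 0, "docs": []}})
start = result["response"].start
num_found = result.response.numFound
```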

Convert this result into a python dict for easier manipulation

