API

SolrConnection object

Connecting to a set of solr servers.

To get a SolrCollection instance from a SolrConnection use either dictionary-style or attribute-style access:

>>> from solrcloudpy.connection import SolrConnection
>>> conn = SolrConnection()
>>> conn.list()
[u'collection1']
>>> conn['collection1']
SolrCollection<collection1>
class solrcloudpy.connection.SolrConnection(server='localhost:8983', detect_live_nodes=False, user=None, password=None, timeout=10)

Connection to one or more Solr servers

Parameters:
  • server – The server to connect to. Can be a single server or a list of servers, for example localhost:8983 or [localhost,solr1.domain.com:8983].
  • detect_live_nodes – whether to detect live nodes automatically. This assumes that the IPs listed by Zookeeper are reachable from the client. The default value is False.
  • user – HTTP basic auth user name
  • password – HTTP basic auth password
  • timeout – timeout for HTTP requests
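
As an illustration of the two forms accepted by server, here is a minimal sketch of how a single host string or a list of hosts could be normalized into base URLs. The helper normalize_servers is hypothetical and is not part of solrcloudpy; the choice of plain HTTP for hosts without a scheme is also an assumption.

```python
def normalize_servers(server="localhost:8983"):
    """Return a list of base URLs for one server or a list of servers.

    Hypothetical helper for illustration; not part of solrcloudpy.
    """
    servers = [server] if isinstance(server, str) else list(server)
    urls = []
    for host in servers:
        # Assumption: hosts given without a scheme are reached over plain HTTP.
        if not host.startswith(("http://", "https://")):
            host = "http://" + host
        urls.append(host.rstrip("/"))
    return urls


single = normalize_servers("localhost:8983")
several = normalize_servers(["localhost", "solr1.domain.com:8983"])
```
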
cluster_health

Determine the state of all nodes and collections in the cluster. Problematic nodes or collections are returned along with their state; otherwise an OK message is returned.
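
The kind of check described above can be sketched as follows. This is not the library's implementation: the function find_problem_replicas is hypothetical, and the nested collection/shards/replicas layout of the sample dictionary is an assumption modelled on Solr's cluster state format.

```python
def find_problem_replicas(cluster_state):
    """Return (collection, shard, replica, state) for replicas that are not active.

    Hypothetical helper; the input layout is an assumption.
    """
    problems = []
    for coll_name, coll in cluster_state.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state", "unknown")
                if state != "active":
                    problems.append((coll_name, shard_name, replica_name, state))
    return problems


sample_state = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"state": "active"},
                    "core_node2": {"state": "down"},
                }
            }
        }
    }
}

problems = find_problem_replicas(sample_state)
# Report the problems if there are any, otherwise an OK message.
report = problems if problems else "OK"
```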

cluster_leader

Gets the cluster leader

create_collection(collname, *args, **kwargs)

Create a collection.

Parameters:
  • collname – The collection name
  • *args – additional arguments
  • **kwargs – additional named parameters
list()

Lists out the current collections in the cluster

live_nodes

Lists all nodes that are currently online

SolrCollection object

Manage and search a Solr Collection.

The Collections API lets you create, remove, or reload collections. Consult the Collections API documentation for more details.

>>> from solrcloudpy.connection import SolrConnection
>>> conn = SolrConnection()
>>> coll = conn['test1'].create()
>>> coll
SolrCollection<test1>

This class is also used for querying a Solr collection. The endpoints supported by default are:

  • /select : the default Solr request handler
  • /mlt: the request handler for More Like This searches
  • /clustering: Solr’s clustering component

Support will be coming for the following endpoints:

  • /get: Solr’s real-time get request handler

  • /highlight: Solr’s search highlight component

  • /terms: Term component

>>> from solrcloudpy import SolrConnection
>>> coll = SolrConnection()['collection1']
>>> response = coll.search({'q':'money'})
>>> response
<SolrResponse [200]>
>>> response.result
{
    "response": "SolrResponse << {'start': 0, 'numFound': 0, 'docs': []} >>"
}

class solrcloudpy.SolrCollection(connection, name)
add(docs)

Add a list of documents to the collection

Parameters:docs – a list of documents to add
clustering(params)

Perform clustering on a query

Parameters:params – query parameters. Here params can be a SearchOptions instance, a dictionary or a list of tuples
commit()

Commit changes to a collection

create(replication_factor=1, force=False, **kwargs)

Create a collection

Parameters:
  • num_shards – an integer indicating the number of shards for this collection
  • replication_factor – an integer indicating the number of replicas for this collection
  • force – a boolean value indicating whether to force the operation
  • kwargs – additional parameters to be passed to this operation
Additional Parameters:
 
  • router_name: the name of the router to use. The router defines how documents are distributed among the shards
  • num_shards: number of shards to create for this collection
  • shards: A comma separated list of shard names. Required when using the implicit router
  • max_shards_per_node: max number of shards/replicas to put on a node for this collection
  • create_node_set: Allows defining which nodes to spread the new collection across.
  • collection_config_name: the name of the configuration to use for this collection
  • router_field: if this field is specified, the router will look at the value of the field in an input document to compute the hash and identify a shard instead of looking at the uniqueKey field

Additional parameters are further documented at https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateaCollection
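
To make the relationship between these keyword names and the Collections API concrete, here is a minimal sketch that translates them into Solr's CREATE parameter names and builds a query string. The Solr names (numShards, replicationFactor, router.name and so on) come from the Collections API; the helper build_create_query and the mapping table are hypothetical, and whether solrcloudpy translates names in exactly this way is an assumption.

```python
from urllib.parse import urlencode

# Keyword names from the list above mapped to Collections API parameter names.
# Assumption: a translation of this kind happens before the request is sent.
SOLR_PARAM_NAMES = {
    "router_name": "router.name",
    "num_shards": "numShards",
    "shards": "shards",
    "replication_factor": "replicationFactor",
    "max_shards_per_node": "maxShardsPerNode",
    "create_node_set": "createNodeSet",
    "collection_config_name": "collection.configName",
    "router_field": "router.field",
}


def build_create_query(name, **kwargs):
    """Build the query string of a CREATE request (hypothetical helper)."""
    params = [("action", "CREATE"), ("name", name)]
    # Sort the keywords so the resulting query string is deterministic.
    for key in sorted(kwargs):
        params.append((SOLR_PARAM_NAMES.get(key, key), kwargs[key]))
    return urlencode(params)


query = build_create_query("test1", num_shards=2, replication_factor=1)
```

Unknown keywords are passed through unchanged in this sketch, which mirrors the idea that extra parameters are forwarded to the operation.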

create_alias(alias)

Create or modify an alias for a collection

Parameters:alias – the name of the alias
create_shard(shard, create_node_set=None)

Create a new shard

Parameters:
  • shard – The name of the shard to be created.
  • create_node_set – Allows defining the nodes to spread the new shard across.
delete(id=None, q=None, commit=True)

Delete documents in a collection. Deletes occur either by id or by query

Parameters:
  • id – the id of the document to delete
  • q – the query matching the set of documents to delete
  • commit – whether to commit the change or not
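
The two deletion modes correspond to the two delete forms of Solr's JSON update format. The sketch below builds such a body; the helper build_delete_payload and its argument names are hypothetical, and requiring exactly one of the two arguments is an assumption rather than documented library behaviour.

```python
import json


def build_delete_payload(doc_id=None, query=None):
    """Return a JSON update body for a delete by id or by query.

    Hypothetical helper for illustration; not part of solrcloudpy.
    """
    # Assumption: exactly one of the two deletion modes must be chosen.
    if (doc_id is None) == (query is None):
        raise ValueError("pass exactly one of doc_id or query")
    if doc_id is not None:
        return json.dumps({"delete": {"id": doc_id}})
    return json.dumps({"delete": {"query": query}})


by_id = build_delete_payload(doc_id="doc-1")
by_query = build_delete_payload(query="category:obsolete")
```
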
delete_alias(alias)

Delete an alias for a collection

Parameters:alias – the name of the alias
delete_replica(replica, shard)

Delete a replica

Parameters:
  • replica – The name of the replica to remove.
  • shard – The name of the shard that includes the replica to be removed.
drop()

Delete a collection

exists()

Finds if a collection exists in the cluster

Parameters:collection – the collection to find
index_info

Get a high-level overview of this collection’s index

is_alias()

Determines if this collection is an alias for a ‘real’ collection

mlt(params)

Perform MLT on this index

Parameters:params – query parameters. Here params can be a SearchOptions instance, a dictionary or a list of tuples
optimize(waitsearcher=False, softcommit=False)

Optimize a collection for searching

Parameters:
  • waitsearcher – whether to make the changes to the collection visible or not by opening a new searcher
  • softcommit – whether to perform a soft commit when optimizing
reload()

Reload a collection

search(params)

Search this index

Parameters:params – query parameters. Here params can be a SearchOptions instance, a dictionary or a list of tuples
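
One practical difference between the dictionary and list-of-tuples forms is that a list of tuples can repeat a parameter name, which is useful for parameters such as fq. The sketch below shows both forms encoded with the standard library; encode_params is a hypothetical helper and does not reflect how solrcloudpy encodes requests internally.

```python
from urllib.parse import urlencode


def encode_params(params):
    """Encode a dict or a list of (name, value) tuples as a query string.

    Hypothetical helper for illustration; not part of solrcloudpy.
    """
    pairs = list(params.items()) if isinstance(params, dict) else list(params)
    return urlencode(pairs)


from_dict = encode_params({"q": "money", "rows": 10})
# A list of tuples may repeat a name, here two filter queries.
from_tuples = encode_params([("q", "money"), ("fq", "type:a"), ("fq", "lang:en")])
```
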
split_shard(shard, ranges=None, split_key=None)

Split a shard into two new shards

Parameters:
  • shard – The name of the shard to be split.
  • ranges – A comma-separated list of hash ranges in hexadecimal e.g. ranges=0-1f4,1f5-3e8,3e9-5dc
  • split_key – The key to use for splitting the index
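
To show how the ranges string reads, here is a minimal sketch that parses the example value into integer pairs. The helper parse_hash_ranges is hypothetical and only handles the simple start-end form shown above.

```python
def parse_hash_ranges(ranges):
    """Parse 'start-end,start-end' hexadecimal ranges into integer pairs.

    Hypothetical helper for illustration; not part of solrcloudpy.
    """
    parsed = []
    for part in ranges.split(","):
        start, end = part.split("-")
        parsed.append((int(start, 16), int(end, 16)))
    return parsed


pairs = parse_hash_ranges("0-1f4,1f5-3e8,3e9-5dc")
# Each range starts right after the previous one ends.
contiguous = all(b[0] == a[1] + 1 for a, b in zip(pairs, pairs[1:]))
```
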
state

Get the state of this collection

SolrIndexStats object

class solrcloudpy.collection.stats.SolrIndexStats(connection, name)

Get different statistics about the underlying index in a collection

cache_stats

Get cache statistics about the index. Cache stats are retrieved for the document, filter, fieldvalue, and fieldcache caches

queryhandler_stats

Get query handler statistics for all of the handlers used in this Solr node

SolrSchema object

class solrcloudpy.collection.schema.SolrSchema(connection, collection_name)

Get and modify schema

add_fields(json_schema)

Add fields to the schema

Parameters:json_schema – specs for the fields to add
get_copyfield(field)

Get information about a copy field in the schema

Parameters:field – the name of the copy field
get_copyfields()

Get information about all copy fields in the schema

get_dynamic_field(field)

Get information about a dynamic field in the schema

Parameters:field – the name of the field
get_dynamic_fields()

Get information about all dynamic fields in the schema

get_field(field)

Get information about a field in the schema

Parameters:field – the name of the field
get_fields()

Get information about all fields in the schema

get_fieldtype(ftype)

Get information about a field type in the schema

Parameters:ftype – the name of the field type
get_fieldtypes()

Get information about field types in the schema

SearchOptions object

class solrcloudpy.parameters.SearchOptions(**kwargs)

Manage options to pass to a solr query

Although one can use plain dictionaries to pass parameters to Solr, this class makes the task more convenient. Currently, it covers the options for:

  • MLT search via the mltparams member variable
  • normal search via the commonparams member variable
  • faceted search via the facetparams member variable

Example:

>>> se = SearchOptions()
>>> se.commonparams.q("*:*").fl('*,score')
{'q': set(['*:*']), 'fl': set(['*,score'])}
>>> se.facetparams.field("id")
{'facet.field': set(['id'])}
>>> se
{'commonparams': {'q': set(['*:*']), 'fl': set(['*,score'])}, 'facetparams': {'facet.field': set(['id'])}, 'mltparams': {}}
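
The nested structure printed above (groups of parameter names, each holding a set of values) eventually has to become flat query parameters. The following is a minimal sketch of such a flattening step on a plain dictionary with the same shape; flatten_options is hypothetical, and how SearchOptions is actually serialized by the library is not shown in this document.

```python
def flatten_options(options):
    """Flatten {'group': {'name': set(values)}} into sorted (name, value) pairs.

    Hypothetical helper for illustration; not part of solrcloudpy.
    """
    pairs = []
    for group in options.values():
        for name, values in group.items():
            for value in values:
                pairs.append((name, value))
    # Sort so the output does not depend on set iteration order.
    return sorted(pairs)


options = {
    "commonparams": {"q": {"*:*"}, "fl": {"*,score"}},
    "facetparams": {"facet.field": {"id"}},
    "mltparams": {},
}
pairs = flatten_options(options)
```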

SolrResponse object

class solrcloudpy.utils.SolrResponse(response_obj)

A generic representation of a Solr response. This object contains both the Response object from the requests package and the parsed content in a SolrResult instance.

code

Status code of this response

SolrResult object

class solrcloudpy.utils.SolrResult(obj)

Generic representation of a Solr search result. The response is an object whose attributes can also be accessed as dictionary keys.

Example:

>>> result
{
"response": "SolrResponse << {'start': 0, 'numFound': 0, 'docs': []} >>"
}
>>> result['response'].start
0
>>> result.response.numFound
0
dict

Convert this result into a python dict for easier manipulation
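
The combined attribute and key access described for this object can be sketched with a small dictionary subclass. AttrDict below is a hypothetical stand-in written for illustration; it is not the SolrResult implementation.

```python
class AttrDict(dict):
    """Dictionary whose keys can also be read as attributes (illustrative only)."""

    def __init__(self, data):
        super().__init__()
        for key, value in data.items():
            # Wrap nested dictionaries so both access styles work at every level.
            self[key] = AttrDict(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


result = AttrDict({"response": {"start": 0, "numFound": 0, "docs": []}})
start = result["response"].start
num_found = result.response.numFound
```

Because the sketch subclasses dict, the object can be passed to anything that expects a dictionary.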
