I. Collections API
Reference: https://cwiki.apache.org/confluence/display/solr/Collections+API
There are many APIs, so I will not list them all, only the more important ones.
1. Create a collection
Official example: /admin/collections?action=CREATE&name=name&numShards=number&replicationFactor=number&maxShardsPerNode=number&createNodeSet=nodelist&collection.configName=configname
(1) My example:
http://192.168.66.99:8080/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2&maxShardsPerNode=3
name: the name of the collection
numShards: the number of shards
replicationFactor: the number of replicas
maxShardsPerNode: the maximum number of shards per node (default 1)
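The same request can be assembled programmatically. Below is a minimal Python sketch, assuming the host and port from my example; the function name `create_collection_url` is hypothetical, and the code only builds the URL without sending it.

```python
from urllib.parse import urlencode

# Base URL taken from the example above; adjust it for your own cluster.
SOLR_BASE = "http://192.168.66.99:8080/solr"


def create_collection_url(name, num_shards, replication_factor,
                          max_shards_per_node=1, base=SOLR_BASE):
    """Build (but do not send) a Collections API CREATE request URL."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", num_shards),
        ("replicationFactor", replication_factor),
        ("maxShardsPerNode", max_shards_per_node),
    ]
    return base + "/admin/collections?" + urlencode(params)


print(create_collection_url("test", 2, 2, max_shards_per_node=3))
```

Keeping the parameters in a list of pairs preserves their order, so the printed URL is identical to the one in my example.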
(2) To specify configuration files or the index directory, add the following parameter:
property.name=value | string | No | Set core property name to value. See core.properties file contents. |
The available core properties are:
Key | Description |
name | The name of the SolrCore. You'll use this name to reference the SolrCore when running commands with the CoreAdminHandler. |
config | The configuration file name for a given core. The default is solrconfig.xml. |
schema | The schema file name for a given core. The default is schema.xml |
dataDir | Core's data directory as a path relative to the instanceDir, data by default. |
configSet | If set, the name of the configset to use to configure the core (see Config Sets). |
properties | The name of the properties file for this core. The value can be an absolute pathname or a path relative to the value of instanceDir. |
transient | If true, the core can be unloaded if Solr reaches the transientCacheSize. The default if not specified is false. Cores are unloaded in order of least recently used first. |
loadOnStartup | If true (the default if not specified), the core will be loaded when Solr starts. |
coreNodeName | Added in Solr 4.2, this attribute allows naming a core. The name can then be used later if you need to replace a machine with a new one. By assigning the new machine the same coreNodeName as the old core, it will take over for the old SolrCore. |
ulogDir | The absolute or relative directory for the update log for this core (SolrCloud) |
shard | The shard to assign this core to (SolrCloud) |
collection | The name of the collection this core is part of (SolrCloud) |
roles | Future param for SolrCloud or a way for users to mark nodes for their own use. |
(3) Run: http://192.168.66.99:8080/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2&maxShardsPerNode=3&property.schema=schema2.xml&property.dataDir=/usr/local/data/solr
This command creates the collection test, uses schema2.xml as its schema file, and uses /usr/local/data/solr as its data directory.
(Note: if you specify configuration files, you must first upload them to ZooKeeper. Run the following command to upload schema2.xml to ZooKeeper:
java -classpath .:/usr/local/solr/solrhome-1/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost 127.0.0.1:1181,127.0.0.1:2181,127.0.0.1:3181 -confdir /usr/local/solr/solrhome-1/update/ -confname solr-conf
)
Running this on my machine produced the following error:
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'test_shard1_replica1': Unable to create core: test_shard1_replica1 Caused by: Lock obtain timed out: NativeFSLock@/usr/local/data/solr/index/write.lock
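This happens because all 3 nodes run on my local machine and the request points every core at the same index directory, so the data directories collide and the cores compete for the same write.lock. To avoid this, specify a separate data directory for each shard replica.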
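One way to give every core its own directory is to create the cores one at a time through the Cores API and derive `dataDir` from the core name. The following is a minimal sketch under stated assumptions: the node list is hypothetical (in particular the ports), the helper name `core_create_url` is my own, and the code only prints the URLs.

```python
from urllib.parse import urlencode

DATA_ROOT = "/usr/local/data/solr"  # data root from the example above

# Hypothetical node list: three Solr instances on one machine.
NODES = [
    "http://192.168.66.99:8080/solr",
    "http://192.168.66.99:7080/solr",
    "http://192.168.66.99:6080/solr",
]


def core_create_url(node_base, collection, shard, replica_no,
                    data_root=DATA_ROOT):
    """Build a Cores API CREATE URL whose dataDir is unique to the core."""
    core_name = f"{collection}_{shard}_replica{replica_no}"
    params = [
        ("action", "CREATE"),
        ("name", core_name),
        ("collection", collection),
        ("shard", shard),
        # One directory per core, so no two cores share a write.lock.
        ("dataDir", f"{data_root}/{core_name}"),
    ]
    # safe="/" keeps the path readable instead of encoding it as %2F.
    return node_base + "/admin/cores?" + urlencode(params, safe="/")


urls = [
    core_create_url(NODES[0], "test", "shard1", 1),
    core_create_url(NODES[1], "test", "shard1", 2),
    core_create_url(NODES[2], "test", "shard2", 1),
]
for url in urls:
    print(url)
```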
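2. Delete a collection
Official example: /admin/collections?action=DELETE&name=collection
My example: http://192.168.66.99:8080/solr/admin/collections?action=DELETE&name=test
3. Create a shard
Official example: /admin/collections?action=CREATESHARD&shard=shardName&collection=name
/admin/collections?action=SPLITSHARD: split a shard into two new shards
My example: http://192.168.66.99:8080/solr/admin/collections?action=CREATESHARD&collection=test&shard=shard1&name=test_shard1_replica1&property.schema=schema2.xml&property.dataDir=/usr/local/data/solr/test_shard1_replica1
In my tests, if the collection was created as described in section 1, creating a shard this way does not execute correctly. I have not yet investigated the cause. (One possible explanation: the Solr reference guide states that CREATESHARD only applies to collections that use the "implicit" router, which is not the case for a collection created with numShards.)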
4. Other
/admin/collections?action=RELOAD: reload a collection
/admin/collections?action=SPLITSHARD: split a shard into two new shards
/admin/collections?action=CREATESHARD: create a new shard
/admin/collections?action=DELETESHARD: delete an inactive shard
/admin/collections?action=CREATEALIAS: create or modify an alias for a collection
/admin/collections?action=DELETEALIAS: delete an alias for a collection
/admin/collections?action=DELETEREPLICA: delete a replica of a shard
/admin/collections?action=ADDREPLICA: add a replica of a shard
/admin/collections?action=CLUSTERPROP: Add/edit/delete a cluster-wide property
/admin/collections?action=MIGRATE: Migrate documents to another collection
/admin/collections?action=ADDROLE: Add a specific role to a node in the cluster
/admin/collections?action=REMOVEROLE: Remove an assigned role
/admin/collections?action=OVERSEERSTATUS: Get status and statistics of the overseer
/admin/collections?action=CLUSTERSTATUS: Get cluster status
/admin/collections?action=REQUESTSTATUS: Get the status of a previous asynchronous request
/admin/collections?action=LIST: List all collections
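Since all of these actions share the same endpoint and differ only in the action name and its parameters, a single small helper can build any of them. The sketch below is an illustration only: `collections_url` is a hypothetical helper, the base URL is the one from my examples, and the parameters passed for each action must still follow the official reference.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://192.168.66.99:8080/solr"  # host from the examples above

# The actions mentioned in this section.
COLLECTION_ACTIONS = {
    "CREATE", "DELETE", "RELOAD", "SPLITSHARD", "CREATESHARD", "DELETESHARD",
    "CREATEALIAS", "DELETEALIAS", "DELETEREPLICA", "ADDREPLICA", "CLUSTERPROP",
    "MIGRATE", "ADDROLE", "REMOVEROLE", "OVERSEERSTATUS", "CLUSTERSTATUS",
    "REQUESTSTATUS", "LIST",
}


def collections_url(action, *, base=SOLR_BASE, **params):
    """Build a Collections API URL, rejecting action names not listed above."""
    action = action.upper()
    if action not in COLLECTION_ACTIONS:
        raise ValueError(f"unknown Collections API action: {action}")
    # Sort the extra parameters so the resulting URL is deterministic.
    query = [("action", action)] + sorted(params.items())
    return base + "/admin/collections?" + urlencode(query)


print(collections_url("RELOAD", name="test"))
print(collections_url("LIST"))
```

Rejecting unknown action names locally catches typos before a request ever reaches the cluster.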
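II. Cores API
As I understand it, a Solr core is the unit through which operations on a shard are carried out: each core manages one shard or one replica of a shard. The Cores API can nevertheless also be used to create a collection.
Reference: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage
Access: http://localhost:8983/solr/admin/cores?action=action, where action is one of the following.
1. Check status
Official example: http://localhost:8983/solr/admin/cores?action=STATUS&core=core0
2. Create a core
Official example: http://localhost:8983/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path/to/dir&config=config_file_name.xml&schema=schema_file_name.xml&dataDir=data
The optional parameters are largely the same as those for creating a collection.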
Parameter | Description |
name | The name of the new core. Same as "name" on the <core> element. |
instanceDir | The directory where files for this SolrCore should be stored. Same as instanceDir on the <core> element. |
config | (Optional) Name of the config file (solrconfig.xml) relative to instanceDir. |
schema | (Optional) Name of the schema file (schema.xml) relative to instanceDir. |
dataDir | (Optional) Name of the data directory relative to instanceDir. |
configSet | (Optional) Name of the configset to use for this core (see Config Sets) |
collection | (Optional) The name of the collection to which this core belongs. The default is the name of the core. collection.<param>=<value> causes a property of <param>=<value> to be set if a new collection is being created. Use collection.configName=<configname> to point to the configuration for a new collection. |
shard | (Optional) The shard id this core represents. Normally you want to be auto-assigned a shard id. |
property.name=value | (Optional) Sets the core property name to value. See core.properties file contents. |
async | (Optional) Request ID to track this action which will be processed asynchronously |
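My example:
http://192.168.66.99:8080/solr/admin/cores?action=CREATE&name=test&collection=test&shard=shard1&instanceDir=/usr/local/data/solr/solr-1/test/&schema=schema2.xml
name: the core name. This is the name of the folder under solrhome that holds the data files for this shard.
collection: the collection name. If the collection does not exist, it is created; if it exists, the shard parameter is checked.
shard: the shard name. If the shard does not exist, it is created; if it exists, a replica of that shard is created.
This command creates a collection named test on http://192.168.66.99:8080, creates a shard named shard1, and makes this machine the leader of that shard.
http://192.168.66.99:8080/solr/admin/cores?action=CREATE&name=test_shard1_replica_2&collection=test&shard=shard1
This command creates a replica of shard1 for test on http://192.168.66.99:8080.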
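The create-or-reuse rules for collection and shard described above can be summarised with a toy model. This is only an in-memory illustration of the behaviour as I described it, not Solr's actual implementation; `register_core` and the dictionary layout are hypothetical.

```python
def register_core(cluster, core, collection, shard):
    """Toy model of the rules above; returns what the request would create."""
    created = []
    if collection not in cluster:
        # Unknown collection: it is created on the fly.
        cluster[collection] = {}
        created.append("collection")
    shards = cluster[collection]
    if shard not in shards:
        # Unknown shard: the new core becomes its first member.
        shards[shard] = []
        created.append("shard")
    else:
        # Existing shard: the new core is an additional replica.
        created.append("replica")
    shards[shard].append(core)
    return created


cluster = {}
print(register_core(cluster, "test", "test", "shard1"))
# ['collection', 'shard']
print(register_core(cluster, "test_shard1_replica_2", "test", "shard1"))
# ['replica']
```

The two calls mirror my two example requests: the first creates the collection and shard1, the second only adds a replica.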
3. Reload a core
Official example: http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0
4. Rename a core
Official example: http://localhost:8983/solr/admin/cores?action=RENAME&core=core0&other=core5
5. Swap cores
Official example: http://localhost:8983/solr/admin/cores?action=SWAP&core=core1&other=core0
6. Unload a core
Official example: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0
Optional parameters:
deleteIndex: if true, will remove the index when unloading the core.
deleteDataDir: if true, removes the data directory and all sub-directories.
deleteInstanceDir: if true, removes everything related to the core, including the index directory, configuration files, and other related files.
async: if set to a value, makes the call asynchronous. This call can then be tracked using the REQUESTSTATUS API.
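The optional flags combine naturally into one helper. Below is a minimal sketch, assuming the host from the official example; `unload_core_url` is a hypothetical name and the function only builds the URL. Because the delete flags are destructive, they all default to off.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host from the official example


def unload_core_url(core, delete_index=False, delete_data_dir=False,
                    delete_instance_dir=False, async_id=None, base=SOLR_BASE):
    """Build a Cores API UNLOAD URL; only the requested flags are added."""
    params = [("action", "UNLOAD"), ("core", core)]
    flags = [
        ("deleteIndex", delete_index),
        ("deleteDataDir", delete_data_dir),
        ("deleteInstanceDir", delete_instance_dir),
    ]
    params += [(name, "true") for name, enabled in flags if enabled]
    if async_id is not None:
        params.append(("async", async_id))
    return base + "/admin/cores?" + urlencode(params)


print(unload_core_url("core0"))
print(unload_core_url("core0", delete_index=True, async_id="1000"))
```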
7. Merge indexes
Official examples:
Method 1: http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=core0&indexDir=/opt/solr/core1/data/index&indexDir=/opt/solr/core2/data/index
Method 2: http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&srcCore=core1&srcCore=core2
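Both methods repeat a parameter (indexDir or srcCore) once per source. In Python this maps onto `urlencode(..., doseq=True)`, which expands a list value into repeated parameters. The sketch below is an illustration only; `merge_indexes_url` is a hypothetical helper and nothing is sent to Solr.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host from the official examples


def merge_indexes_url(core, index_dirs=(), src_cores=(), base=SOLR_BASE):
    """Build a MERGEINDEXES URL from index directories and/or source cores."""
    if not index_dirs and not src_cores:
        raise ValueError("give at least one index directory or source core")
    params = {
        "action": "MERGEINDEXES",
        "core": core,
        "indexDir": list(index_dirs),  # repeated once per directory
        "srcCore": list(src_cores),    # repeated once per source core
    }
    # doseq=True expands lists; an empty list contributes no parameter.
    return base + "/admin/cores?" + urlencode(params, doseq=True, safe="/")


print(merge_indexes_url("core0", index_dirs=["/opt/solr/core1/data/index",
                                             "/opt/solr/core2/data/index"]))
print(merge_indexes_url("core0", src_cores=["core1", "core2"]))
```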
8. Split
Official example: http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2
Optional parameters:
Parameter | Description | Multi-valued |
core | The name of the core to be split. | false |
path | The directory path in which a piece of the index will be written. | true |
targetCore | The target Solr core to which a piece of the index will be merged | true |
ranges | A comma-separated list of hash ranges in hexadecimal format | false |
split.key | The key to be used for splitting the index | false |
async | (Optional) Request ID to track this action which will be processed asynchronously | false |
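The table distinguishes multi-valued parameters (path, targetCore), which are repeated, from ranges, which is a single comma-separated value in hexadecimal. The following sketch shows both; the helper names are hypothetical, and for simplicity `format_ranges` accepts only non-negative integers, which is my own restriction rather than something the reference states.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host from the official example


def format_ranges(ranges):
    """Render (start, end) integer pairs as comma-separated hex ranges."""
    for start, end in ranges:
        if not 0 <= start <= end:
            raise ValueError(f"invalid range: {start}-{end}")
    return ",".join(f"{start:x}-{end:x}" for start, end in ranges)


def split_core_url(core, target_cores, ranges=None, base=SOLR_BASE):
    """Build a SPLIT URL; targetCore is repeated, ranges is one value."""
    params = {"action": "SPLIT", "core": core,
              "targetCore": list(target_cores)}
    if ranges:
        params["ranges"] = format_ranges(ranges)
    # safe="," keeps the separator in the ranges value readable.
    return base + "/admin/cores?" + urlencode(params, doseq=True, safe=",")


print(split_core_url("core0", ["core1", "core2"]))
print(split_core_url("core0", ["core1", "core2"],
                     ranges=[(0, 500), (501, 1000)]))
```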
9. Check request status
Official example: http://localhost:8983/solr/admin/cores?action=REQUESTSTATUS&requestid=1
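REQUESTSTATUS is the counterpart of the async parameter: a request submitted with async=ID can later be looked up with requestid=ID. A minimal sketch of that pairing follows; both helper names are hypothetical, and the code only builds the two URLs.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host from the official example


def async_core_request(action, request_id, base=SOLR_BASE, **params):
    """Build a Cores API URL that is processed asynchronously under request_id."""
    query = ([("action", action)] + sorted(params.items())
             + [("async", request_id)])
    return base + "/admin/cores?" + urlencode(query)


def request_status_url(request_id, base=SOLR_BASE):
    """Build the REQUESTSTATUS URL for a previously submitted request."""
    query = [("action", "REQUESTSTATUS"), ("requestid", request_id)]
    return base + "/admin/cores?" + urlencode(query)


print(async_core_request("UNLOAD", 1, core="core0"))
print(request_status_url(1))
```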
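III. Collections in practice
The APIs above give us a set of methods for operating on collections and cores. Now consider the problems that may come up in real scenarios.
1. Scenario 1: adding a collection
After setting up SolrCloud, the first thing to consider is creating a collection and sharding it. There are two ways to do this:
(1) Let SolrCloud do the sharding and assign the shard names automatically, i.e. run:
http://192.168.66.99:8080/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2&maxShardsPerNode=3
(2) Choose the machine for each shard ourselves, i.e. run a command like the following for each core:
http://192.168.66.99:7080/solr/admin/cores?action=CREATE&name=test_shard1_replica_1&collection=test&shard=shard1
...
Both approaches allow specifying the configuration files and the storage path.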
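For approach (2), the per-core requests can be generated rather than typed by hand. The sketch below spreads the cores round-robin over a node list so that the replicas of one shard land on different nodes. The node list and the helper name `plan_cores` are hypothetical, and the code only produces the URLs.

```python
from urllib.parse import urlencode

# Hypothetical node list; replace with the base URLs of your own Solr nodes.
NODES = [
    "http://192.168.66.99:7080/solr",
    "http://192.168.66.99:8080/solr",
]


def plan_cores(collection, num_shards, replication_factor, nodes=NODES):
    """Return one Cores API CREATE URL per replica, round-robin over nodes."""
    if replication_factor > len(nodes):
        raise ValueError("need at least as many nodes as replicas per shard")
    urls = []
    for s in range(1, num_shards + 1):
        for r in range(1, replication_factor + 1):
            # Consecutive cores go to consecutive nodes, so the replicas
            # of one shard never share a node.
            node = nodes[len(urls) % len(nodes)]
            params = [
                ("action", "CREATE"),
                ("name", f"{collection}_shard{s}_replica_{r}"),
                ("collection", collection),
                ("shard", f"shard{s}"),
            ]
            urls.append(node + "/admin/cores?" + urlencode(params))
    return urls


for url in plan_cores("test", num_shards=2, replication_factor=2):
    print(url)
```

With the node list above, the first generated URL is the same as the example command in (2).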
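2. Scenario 2: scaling out
As data volume and traffic grow, SolrCloud has to be expanded to keep it running. This covers two cases:
(1) Adding a shard to a collection
Method 1: use action=SPLITSHARD to split one shard into two, then carry out renaming and other follow-up operations
Method 2: create it directly with cores?action=CREATE&name=test&collection=test&shard=shard1
(2) Adding a replica of a shard
Likewise, create it directly with cores?action=CREATE&name=test&collection=test&shard=shard1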
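For Method 1, SPLITSHARD takes the collection and the shard to split, and per the Solr reference guide the two new shards are named by appending _0 and _1 to the parent shard name. The sketch below builds the request and lists the expected sub-shard names; the helper names are hypothetical and the base URL is the one from my examples.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://192.168.66.99:8080/solr"  # host from the examples above


def split_shard_url(collection, shard, base=SOLR_BASE):
    """Build a Collections API SPLITSHARD URL for one shard."""
    params = [("action", "SPLITSHARD"), ("collection", collection),
              ("shard", shard)]
    return base + "/admin/collections?" + urlencode(params)


def sub_shard_names(shard):
    """Names the two sub-shards are expected to get after the split."""
    return [f"{shard}_0", f"{shard}_1"]


print(split_shard_url("test", "shard1"))
print(sub_shard_names("shard1"))
```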
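3. Scenario 3: replacing a server
My suggestion: first add the new server to SolrCloud and let it sync the index files, then take the old server offline. This is safe and quick, and can be done directly through the admin UI.
The scenarios above show that the Cores API is often the quicker option in practice, so it is worth studying closely.
4. In addition, various configuration errors can occur while setting up SolrCloud, and they are reported in the SolrCloud admin UI. For example, a collection may be configured with a schema.xml that does not exist in ZooKeeper. SolrCloud then reports:
test3_shard2_replica1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load core configuration for core test3_shard2_replica1
To handle this kind of error:
(1) Delete the relevant folders under solrhome
(2) Restart the SolrCloud nodes one by one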