Monday, October 17, 2011

FAST ESP : Deleting all documents in a collection




A typical FAST ESP solution has multiple collections. For example, there might be one collection for data crawled and indexed from SharePoint, another for data crawled from a database using the JDBC Connector, and so on.


Suppose that you are developing and implementing such a solution. You will have set up a number of collections and configured their crawlers so that the appropriate documents end up in each collection. During the implementation, you will at some point need to discard all the documents in a particular collection and crawl everything from scratch. A typical scenario is when you fix a bug in indexing and want the fix applied to the documents already in the index as well.
The tool for this is the 'collection-admin.cmd' command-line utility in the \bin directory. The exact command is:

collection-admin -m clearcollection -n <collection-name>

In the above command, the -m flag selects the mode the tool runs in. Here we use 'clearcollection' to indicate that we want to clear (reset) the collection. '<collection-name>' is the name of the collection you want to reset. You can find the name in the admin GUI or by running the same tool in 'listcollections' mode:
collection-admin -m listcollections
When you invoke the command, the output looks like this:

[Screenshot: console output of the collection-admin command]

When this command returns, all documents in the collection will have been deleted. You can then run a full crawl to repopulate the collection with the appropriate documents.
More information about the 'collection-admin' tool and its command-line options can be found in the tool's own command-line help.
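
If you reset collections often during development, it can help to wrap the two invocations above in a small script that asks for confirmation before deleting anything. The following Python sketch is only an illustration: the helper names (build_command, run_tool, clear_collection) are made up for this example, it assumes the 'collection-admin' tool can be found on your PATH, and it does not interpret the tool's exit code, since the post does not document what the tool returns.

```python
import shutil
import subprocess

# Assumption: the tool is reachable via PATH; adjust if your install differs.
TOOL = "collection-admin"


def build_command(mode, collection=None):
    """Build the argument list for one collection-admin invocation.

    Only the -m and -n flags described in this post are used.
    """
    args = [TOOL, "-m", mode]
    if collection is not None:
        if not collection.strip():
            raise ValueError("collection name must not be empty")
        args += ["-n", collection]
    return args


def run_tool(args):
    """Run the tool and return its raw exit code (hypothetical helper)."""
    exe = shutil.which(args[0])  # on Windows this also resolves the .cmd file
    if exe is None:
        raise FileNotFoundError(f"{args[0]} not found on PATH")
    return subprocess.run([exe] + args[1:]).returncode


def clear_collection(collection, confirm=input):
    """Ask for confirmation, then run the tool in 'clearcollection' mode.

    Returns None if the user declines, otherwise the tool's exit code.
    """
    answer = confirm(f"Delete ALL documents in '{collection}'? [y/N] ")
    if answer.strip().lower() != "y":
        return None
    return run_tool(build_command("clearcollection", collection))
```

Calling run_tool(build_command("listcollections")) prints the collection names, and clear_collection with one of those names then clears that collection once you confirm. The confirmation step is a design choice of this sketch, not something the tool itself is known to do.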
