findDocumentsByElastic

Updated pdexter 2022-12-08

(Applies to server-side and client-side rulesets)

Runs an ElasticSearch query and retrieves a result set of matching documents.

Syntax

ft3.findDocumentsByElastic(query, user-id, [options,] function(err, result) {

        ...

})

Part      Description
query     A structured ElasticSearch query.
user-id   documentId of the user, to enforce security of data visibility.
options   (Optional) An object containing options; see below.
err       If an error occurred, contains the error.
result    An object containing the full ElasticSearch result structure.

options

The optional options argument is an object of switches that change how the query is run.

Option                Description
includeAllVersions    Fetches all versions of the document.
                      Normally, only the current version of a document is fetched.
                      Value: true | false (default false)
includeDeleted        Fetches deleted documents as well as undeleted ones.
                      Value: true | false (default false)
searchOfflineOnly     (unverified)
sourceFilter.include  Sets which fields are returned in the _source object of the results.
trackTotalHits        Whether to lift the 10,000 cap on result.data.hits.total, so that it reports the correct count of potential matches.
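
The callback form and the optional options argument can be combined with async/await by wrapping the call in a Promise. The following is a minimal sketch only: findDocumentsAsync is a hypothetical helper name, not part of ft3, and it assumes nothing beyond the callback signature shown under Syntax.

```javascript
// Hypothetical helper (not part of ft3): wraps the callback form in a Promise.
// The options argument is optional, mirroring the syntax above.
function findDocumentsAsync(ft3, query, userId, options) {
    return new Promise(function (resolve, reject) {
        var done = function (err, result) {
            if (err) {
                reject(err);
            } else {
                resolve(result);
            }
        };
        if (options) {
            ft3.findDocumentsByElastic(query, userId, options, done);
        } else {
            ft3.findDocumentsByElastic(query, userId, done);
        }
    });
}

// Possible usage inside an async function, fetching every version of each
// matching document, deleted ones included:
//
//     var result = await findDocumentsAsync(ft3, eqry, ntf.user.documentId, {
//         includeAllVersions: true,
//         includeDeleted: true
//     });
```

The wrapper passes options through untouched, so any combination of the switches in the table can be supplied.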

ElasticSearch result size

By default, this method returns at most 50 documents.

The result still reports the full number of matching documents (result.data.hits.total), but result.data.hits.hits will contain no more than 50 of them.

To change this limit, add a "size": n element to the ElasticSearch query. This overrides the default size added by the appserver (see example).

In ElasticSearch 7, result.data.hits.total has a practical limit of 10,000. This can be overridden by adding "trackTotalHits" : true to the options of the function; the total is then correct even when there are more than 10,000 potential hits.
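
When more documents match than one call should return, the size element can be paired with from to walk through the matches page by page. The sketch below is illustrative only: pageQuery and pageCount are hypothetical helper names, not ft3 functions. Note also that a default Elasticsearch configuration rejects requests where from + size exceeds 10,000 (the index.max_result_window setting); whether the appserver changes that setting is not covered here.

```javascript
// Hypothetical helper: returns a copy of a base query with from/size set for one page.
// pageIndex is zero-based.
function pageQuery(baseQuery, pageIndex, pageSize) {
    var paged = Object.assign({}, baseQuery); // shallow copy; the base query is not modified
    paged.from = pageIndex * pageSize;
    paged.size = pageSize;
    return paged;
}

// Hypothetical helper: number of pages needed to cover `total` matches.
function pageCount(total, pageSize) {
    return Math.ceil(total / pageSize);
}

var base = { query: { bool: { filter: [{ term: { type: 'insect' } }] } } };
var thirdPage = pageQuery(base, 2, 500);
// thirdPage.from is 1000 and thirdPage.size is 500; base itself has no from or size.
```

Each page query can then be passed to findDocumentsByElastic as usual, using result.data.hits.total from the first call to decide how many pages to request.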

Example

    ft3 = ntf.scope;
    ...

    var eqry = {
        'size' : 1000,
        'query': {'bool': {'filter' : [
            {'term': {'type' : 'insect'}},
            {'term': {'color' : 'blue'}}
        ]}}
    };

    // from, size and sort can also be set (or overridden) after eqry is declared
    eqry.from = 100;
    eqry.size = 500;
    eqry.sort = [{'commonName':'asc'}];

    // Add option to get a true total over 10000
    var options = {
        trackTotalHits : true
    };

    ft3.findDocumentsByElastic(eqry, ntf.user.documentId, options, function(err, result) {
        if (err) {
            ntf.logger.error('Error: ' + err.message);
        } 
        else {
            var i;
            ntf.logger.info('# Total Matches in DB: ' + result.data.hits.total);
            ntf.logger.info('# Returned Results: ' + result.data.hits.hits.length);
            for (i=0;i < result.data.hits.hits.length;i++) {
                var doc = result.data.hits.hits[i]._source;
                ntf.logger.info('Blue Insect [' + i + ']: ' + doc.name);
            }
        }
    });
    return;

Tip

Use the JavaScript Array map function to convert an ElasticSearch result to an array of documents:

var docs = result.data.hits.hits.map(function(hit) { return hit._source; });
// OR
var docs = result.data.hits.hits.map(hit => hit._source);

Suggested Shortcut

The ruleset include "ft3 Extension Functions" contains a ready-made function, getDocsByEqry, which returns (or "resolves") an array of documents that can be used directly.

This function is written as an async function; the ruleAction that calls it with await must therefore be declared async.

    ruleAction : async function(ntf, callback) {
        var ft3 = ntf.scope;

        // Get configuration via query
        var eqry = {"query": {"bool": {"filter": [
            {"term" : {"systemHeader.systemType" : "configuration"}}    
        ]}}};

        var docs = await ft3.getDocsByEqry(ntf, eqry);

        var configDoc = docs?.[0];
        ...

For the large majority of a developer's query needs, this shortcut should suffice.

Limiting the fields returned

To reduce the number of fields returned for each document, use the sourceFilter option.

var eqry = {"query": {"bool": {"filter": [
    {"term" : {"appTags" : "spider"}},
    {"term" : {"systemHeader.systemType" : "document"}}
]}}};

var options = {
    sourceFilter : {
        include : ['commonName','genus','species']
    }
};

ft3.findDocumentsByElastic(eqry, ntf.userId, options, function(err, result) {
    if (err) {
        ntf.logger.error('Error on query: ' + err.message);
    }
    else {
        var firstDoc = result.data.hits.hits[0]._source;
        ntf.logger.info('first document: ' + JSON.stringify(firstDoc));
        // shows {"commonName": "Huntsman", "genus": "Beregama", "species": "Aurea"}
    }
});

This is useful when retrieving a large number of result documents of which the script needs only one or two fields each.
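
As an illustration of that case, the sketch below builds a sourceFilter options object for a list of field names and then collects one field from every hit. fieldsOnly and pluckField are hypothetical helper names introduced here, not ft3 functions; they rely only on the options and result shapes described on this page.

```javascript
// Hypothetical helper: builds an options object that limits _source to the given fields.
function fieldsOnly(fieldNames) {
    return { sourceFilter: { include: fieldNames } };
}

// Hypothetical helper: collects one field from every hit, skipping hits that lack it.
function pluckField(result, fieldName) {
    return result.data.hits.hits
        .map(function (hit) { return hit._source[fieldName]; })
        .filter(function (value) { return value !== undefined; });
}

// Possible usage:
//
//     ft3.findDocumentsByElastic(eqry, userId, fieldsOnly(['commonName']), function (err, result) {
//         if (!err) {
//             var names = pluckField(result, 'commonName');
//         }
//     });
```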

Result topology

The result object returned by this function is a full ElasticSearch result structure, so it can be awkward to pick apart. The following properties are the most useful ones for our purposes.

Property                          Description
result.data.hits.total            Number. The full count of matching documents in the database.
                                  This may be limited to a count of 10,000, unless the option trackTotalHits is employed.
                                  This is not the same as the count of documents returned, which may be limited by the size property of the submitted query.
result.data.hits.hits             Array of objects representing each fetched document.
result.data.hits.hits[n]._source  The full document found on the nth hit of the hits array.
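
The properties above can be gathered into one small summary object. This is a sketch under the assumption, taken from the table, that result.data.hits.total is a plain number; summarizeResult is a hypothetical helper name, not part of ft3.

```javascript
// Hypothetical helper: reduces a result to the parts most scripts need.
// `from` is the offset used in the submitted query (0 when none was set).
function summarizeResult(result, from) {
    var hits = result.data.hits;
    var offset = from || 0;
    return {
        total: hits.total,                  // count of matching documents in the database
        returned: hits.hits.length,         // count of documents actually fetched
        docs: hits.hits.map(function (hit) { return hit._source; }),
        hasMore: offset + hits.hits.length < hits.total
    };
}
```

hasMore is true when matching documents remain beyond the fetched ones, which signals that a larger size or a further from offset is needed to reach them.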

Sample ElasticSearch result

{
  "statusCode": 200,
  "data": {
    "took": 2,
    "timed_out": false,
    "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
    },
    "hits": {
      "total": 1,
      "max_score": 0,
      "hits": [
        {
          "_index": "gobble",
          "_type": "documents",
          "_id": "584f25cf22478a32c64c440e",
          "_score": 0,
          "_source": {
            "event": "OnLoad",
            "systemHeader": {
              "summaryName": "Dexwise Spider - OnLoad",
              "serverUpdatedDate": "2017-12-19T04:14:14.207000+00:00",
              "systemType": "document",
              "excludeGeneralSearch": false,
              "templateId": "224bc62858d73ce57a9cb85e",
              "serverDate": "2017-03-07T05:28:24.514Z",
              "currentVersion": true,
              "versionId": "12e837f0-e473-11e7-a189-39e2cecb7850",
              "createdWith": "224bc62858d73ce57a9cb85e",
              "summaryDescription": "RuleSet for processing of template Spider (et al), OnLoad event.",
              "createdBy": "5476946bb671fc07650beed1",
              "createdDate": "2017-12-19T04:14:14.767000+00:00",
              "serverCreatedDate": "2017-12-19T04:14:14.207000+00:00",
              "previousVersionId": "91c0bd80-e46f-11e7-a189-39e2cecb7850"
            },
            "businessType": "RuleSet",
            "flagInPreSave": true,
            "syncTemplateRule": true,
            "versionTag": "20171018A",
            "ruleSetGroup": "dexwise",
            "addSecurityKeys": [
              {
                "name": "DA-Dexwise-Rulesets",
                "documentId": "b8d9eeb0-02f6-11e7-965b-4799f2a39188"
              }
            ],
            "objectEventDescriptor": "Dexwise Spider - OnLoad",
            "template": [
              {
                "name": "Spider",
                "documentId": "c6a8eb50-c029-11e6-9eea-13b66331d97f"
              }
            ],
            "vtn": "system.ruleset",
            "appTags": [
              "system",
              "ruleset"
            ],
            "documentId": "abbff110-c0b7-11e6-9eea-13b66331d97f"
          }
        }
      ]
    }
  },
  "filter": false
}