Sort with search#DataDocuments service orderByFields field

michael · July 28, 2022, 5:02pm

I’m getting an error from OpenSearch when using the orderByFields field of the search#DataDocuments service. It says:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [categoryName] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ]
  },
  "status": 400
}

I’m wondering if anyone has encountered this before, and if they have any ideas on how to fix it. I believe that this error has something to do with aggregations needing a specific format to untokenize the fields to their raw text values.

One interesting thing is that sorting works on what, as far as I can tell, are primary key fields, but not on others. Is there a way to change this behavior? I’m not seeing anything about it in the Data Search Documentation or the data model.

I was able to replicate this problem in ElasticSearch as well. To replicate it, connect to an ElasticSearch or OpenSearch instance with a tool like Elastic View. Navigate to the mantle_product index and add the string productId to the sort query parameter in Custom Search:

If you then replace productId with name and search, it should look like this:

This can also be done with code.
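As a rough sketch of what "with code" could look like, the two request bodies can be built as Groovy maps and serialized to JSON. The index name and the productId and name fields come from the steps above; the helper name sortBody is hypothetical, and actually sending the body (a POST to the index's _search endpoint) is left out.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper: builds a match-all search body sorted ascending by one field.
Map sortBody(String fieldName) {
    return [query: [match_all: [:]], sort: [[(fieldName): [order: 'asc']]]]
}

// Sorting on productId is the case that works in the steps above
String workingBody = JsonOutput.toJson(sortBody('productId'))
// Sorting on name is the case that produces the 400 error quoted above
String failingBody = JsonOutput.toJson(sortBody('name'))

// Either body would be POSTed to /mantle_product/_search
println workingBody
println failingBody
```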

jonesde · July 30, 2022, 6:52pm

The data types for text in ElasticSearch are funny: a string field is optimized either for search (tokenized) or for lookup and sort. For more info see the sortable field on the DataDocumentField entity:

github.com

moqui/moqui-framework/blob/master/framework/entity/EntityEntities.xml#L145

    
      
                  follow that path of relationships to get to the field.
          
          
        This may also contain a Groovy expression using other fields in the current Map/Object in the document by
                  the path or any parent Map/Object above it in the document. When an expression is used a fieldNameAlias is required.
              </description></field>
              <field name="fieldNameAlias" type="text-medium"><description>Alias to put in document output for field name
                  (ie final part of fieldPath only). Defaults to final part of fieldPath. Must be unique within the document
                  and can be used in EntityCondition passed into the EntityFacade.findDataDocuments() method.</description></field>
              <field name="fieldType" type="text-short"><description>The ElasticSearch field type to use, default is based on entity
                  field type or for expression fields defaults to 'double'.</description></field>
              <field name="sortable" type="text-indicator"><description>Indicates the field should be sortable. This is needed because
                  in ElasticSearch we have two string types to work with: text (tokenized for search, not sortable) and keyword (sortable
                  but not tokenized for search). In ElasticSearch this adds [field name].keyword field of type keyword to sort on if the
                  entity field is a 'text' type ElasticSearch field.</description></field>
              <field name="defaultDisplay" type="text-indicator"><description>Fields displayed by default, set to N to not display in output.</description></field>
              <field name="functionName" type="text-short"><description>If specific the field is queried with the given function.
                  Must be one of the functions available in the view-entity.alias.@function attribute.</description></field>
              <field name="sequenceNum" type="number-integer" default="fieldSeqId as int"/>
              <relationship type="one" related="moqui.entity.document.DataDocument"/>
          </entity>
          <entity entity-name="DataDocumentRelAlias" package="moqui.entity.document" use="configuration">

This is used by the code that sends the JSON document schema info, which ElasticSearch calls a ‘mapping’. That code is in the method ElasticFacadeImpl.makeElasticSearchMapping():

github.com

moqui/moqui-framework/blob/master/framework/src/main/groovy/org/moqui/impl/context/ElasticFacadeImpl.groovy#L728

    
      
          }
          static String esIndexToDdId(String index) {
              return EntityJavaUtil.underscoredToCamelCase(index, true)
          }
          
          
          static final Map<String, String> esTypeMap = [id:'keyword', 'id-long':'keyword', date:'date', time:'text',
                  'date-time':'date', 'number-integer':'long', 'number-decimal':'double', 'number-float':'double',
                  'currency-amount':'double', 'currency-precise':'double', 'text-indicator':'keyword', 'text-short':'text',
                  'text-medium':'text', 'text-intermediate':'text', 'text-long':'text', 'text-very-long':'text', 'binary-very-long':'binary']
          
          
          static Map makeElasticSearchMapping(String dataDocumentId, ExecutionContextFactoryImpl ecfi) {
              EntityValue dataDocument = ecfi.entityFacade.find("moqui.entity.document.DataDocument")
                      .condition("dataDocumentId", dataDocumentId).useCache(true).one()
              if (dataDocument == null) throw new EntityException("No DataDocument found with ID [${dataDocumentId}]")
              EntityList dataDocumentFieldList = dataDocument.findRelated("moqui.entity.document.DataDocumentField", null, null, true, false)
              EntityList dataDocumentRelAliasList = dataDocument.findRelated("moqui.entity.document.DataDocumentRelAlias", null, null, true, false)
          
          
              Map<String, String> relationshipAliasMap = [:]
              for (EntityValue dataDocumentRelAlias in dataDocumentRelAliasList)
                  relationshipAliasMap.put((String) dataDocumentRelAlias.relationshipName, (String) dataDocumentRelAlias.documentAlias)

What this does is add a sub-field called ‘keyword’, of type keyword, to the field. Then in search code you sort by the categoryName.keyword field instead of the non-sortable categoryName field (which is tokenized for search).
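To make that concrete, here is a minimal sketch of the general ElasticSearch multi-field shape, written as Groovy maps. It is not the actual output of makeElasticSearchMapping(); the helper name textFieldMapping is hypothetical and categoryName is taken from the error message in the first post.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper: mapping for a text field, with a keyword sub-field when sortable.
Map textFieldMapping(boolean sortable) {
    Map mapping = [type: 'text']
    if (sortable) mapping.fields = [keyword: [type: 'keyword']]
    return mapping
}

// Mapping fragment for a sortable categoryName field
Map properties = [categoryName: textFieldMapping(true)]

// The sort clause targets the sub-field, not the tokenized parent field
List sortClause = [['categoryName.keyword': [order: 'asc']]]

println JsonOutput.prettyPrint(JsonOutput.toJson([mappings: [properties: properties]]))
println JsonOutput.toJson([sort: sortClause])
```

The parent field stays tokenized for full-text search, so queries against categoryName keep working while sorting uses categoryName.keyword.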

For one example of code that handles this see the orderByFields field in the search#Party service:

github.com

moqui/mantle-usl/blob/master/service/mantle/party/PartyServices.xml#L184

    
      
          <script><![CDATA[
              if (queryString) {
                  if ('MANTLE_USER_ORG'.equals(entityFilterSetId)) {
                      queryString = (queryString ? queryString + ' AND ' : '' ) + 'ownerPartyId:(' + (filterOrgIds ? filterOrgIds.join(' OR ') + ' OR ' : '') + '_NA_)'
                  } else if ('MANTLE_ACTIVE_ORG'.equals(entityFilterSetId) && activeOrgId) {
                      queryString = (queryString ? queryString + ' AND ' : '' ) + 'ownerPartyId:(' + activeOrgId + ' OR _NA_)'
                  }
              }
          ]]></script>
          
          
          <set field="orderByFields" from="orderByField ? [orderByField.replace('combinedName', 'combinedName.keyword').replace('username', 'userAccounts.username.keyword')] : null"/>
          <set field="highlightFields" from="['roles.roleTypeId', 'classifications.class', 'pseudoId',
                  'identifications.idValue', 'combinedName', 'firstName', 'lastName', 'organizationName', 'userAccounts.username',
                  'contactMech.infoString', 'contactMechs.address1', 'contactMechs.unitNumber', 'contactMechs.postalCode',
                  'contactMechs.city', 'contactMechs.state', 'contactMechs.areaCode', 'contactMechs.contactNumber', 'contactMechs.city']"/>
          <!-- <log level="warn" message="Product search queryString: ${queryString}"/> -->
          <if condition="queryString">
              <service-call name="org.moqui.search.SearchServices.search#DataDocuments" out-map="context"
                      in-map="[indexName:indexName, documentType:documentType, queryString:queryString, flattenDocument:false,
                           orderByFields:orderByFields, highlightFields:highlightFields, nestedQueryMap:nestedQueryMap,
                           pageIndex:pageIndex, pageSize:pageSize, pageNoLimit:pageNoLimit]"/>
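The replace() chain above works for a couple of fields. If more fields need it, one possible variation is a lookup map, sketched below. The names sortableAliases and toSortableField are hypothetical, and the handling of a leading + or - assumes the caller may pass a sort-direction prefix on the field name.

```groovy
// Hypothetical lookup: text fields that have a keyword sub-field in the document mapping
Map<String, String> sortableAliases = [combinedName: 'combinedName.keyword',
        username: 'userAccounts.username.keyword']

// Hypothetical helper: swap a field name for its sortable variant, keeping any +/- prefix
String toSortableField(String orderByField, Map<String, String> aliases) {
    String prefix = ''
    String name = orderByField
    if (name.startsWith('-') || name.startsWith('+')) {
        prefix = name.substring(0, 1)
        name = name.substring(1)
    }
    // Fields without an alias (for example keyword-typed IDs) pass through unchanged
    return prefix + (aliases[name] ?: name)
}

List<String> requested = ['combinedName', '-username', 'pseudoId']
List<String> orderByFields = requested.collect { toSortableField(it, sortableAliases) }
println orderByFields
```

Unlike replace(), the lookup matches whole field names only, so a field that merely contains 'username' as a substring is left alone.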

michael · July 30, 2022, 9:12pm

That makes a ton of sense. I guess I was missing the sortable field. I could’ve sworn I’ve looked for that before.