Category Rules (Designer)
About Category Rules
Category rules determine which sentences should be assigned to each category. Category rules typically consist of words that you want to include in or exclude from the category.
Basic Rules
Basic rules categorize sentences by specifying words that should either be included or excluded in a sentence.
- Select a category or category group.
- Type words within quotation marks, or drag a key word from the all words tab, into one of the 4 rule lanes.
- Click the preview all rules button to preview the sentences that will go into your category.
Some characters are used to create advanced search operators and will therefore affect the results if used to build a rule. The following characters can be used in basic rule creation:
- Letters
- Numbers
- Percent sign ( % )
- Emojis
- Emoticons
- Currency symbols ( $, €, £ )
Using the 4 Rule Lanes
There are 4 rule lanes that determine the relationship of the words in the rule: OR, AND, AND, NOT.
- OR: This rule lane should contain words that you would like to be present in the sentences in this category.
- AND: These two rule lanes should contain words that must be present together with any of the words from the OR lane.
- NOT: This rule lane should contain words that you do not want in the category but that could otherwise be matched by the words in the OR lane.
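The way the four lanes combine can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how the product evaluates rules: the function names are hypothetical, it assumes that an empty lane adds no constraint and that any one word in a lane is enough to satisfy that lane, and it handles single words only (no phrases or wildcards).

```python
def tokenize(sentence):
    """Lowercase the sentence and split it into a set of punctuation-free words."""
    return {word.strip(".,!?;:\"'()") for word in sentence.lower().split()}


def lane_allows(words, lane):
    """Assumption: an empty lane adds no constraint; otherwise any one term suffices."""
    return not lane or any(term.lower() in words for term in lane)


def matches_rule(sentence, or_lane, and_lane_1=(), and_lane_2=(), not_lane=()):
    """Return True if the sentence satisfies the OR and AND lanes and no NOT word appears."""
    words = tokenize(sentence)
    required = all(
        lane_allows(words, lane) for lane in (or_lane, and_lane_1, and_lane_2)
    )
    excluded = any(term.lower() in words for term in not_lane)
    return required and not excluded


# Hypothetical rule: (car OR truck) AND fast, NOT rental
print(matches_rule("The sports car was fast.", ["car", "truck"], ["fast"], (), ["rental"]))
print(matches_rule("The rental car was fast.", ["car", "truck"], ["fast"], (), ["rental"]))
```

In this sketch the first sentence matches, and the second is rejected because it contains a word from the NOT lane.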
Rule Lane Suggestions
Rule lane suggestions help you build categories faster by analyzing the words typed into a rule lane and suggesting synonyms, related concepts, and common misspellings.
- Type word(s) into a rule lane.
- Click the light bulb icon to the right of the rule lane.
- A window with suggested words will appear above the rule lanes.
Qtip: Sort suggested words by relevance or name.
- Select a suggestion by clicking the checkbox.
- Once finished with suggestions, click Add.
Suggestions are compatible with the following input:
- Simple terms: e.g. “car”
- Exact phrases: e.g. “sports car”
- Single-character wildcards: e.g. “c?r”
- Multiple-character wildcards: e.g. “technol*”
- Key words: e.g. _mtoken:CAR
Context Rules
Context rules categorize tweets and comments based on the content of the original post or parent document. These rules are useful when categorizing data from threaded conversations on social media.
A context rule is true if it matches any sentence in the parent document. When it is true, all sentences from all child documents are categorized. If a context rule is applied to a category group, it is also applied to every category in that group as an extended query on every basic rule.
- Select a category or category group.
- Click the Extended tab above the 4 rule lanes.
- Select Parent document from the drop-down list.
- Specify the rule that should apply to the original post.
- Add more rules, if you’d like.
- Click Save node.
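The behavior of a context rule can be illustrated with a minimal Python sketch. The `Document` class, its field names, and `apply_context_rule` are hypothetical and exist only for this example. The rule is modeled as a plain function that takes a sentence and returns true or false.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Document:
    natural_id: str
    sentences: List[str]
    parent_id: Optional[str] = None  # None for original posts


def apply_context_rule(documents: List[Document], rule: Callable[[str], bool]) -> List[str]:
    """Return every sentence of each child whose parent has at least one matching sentence."""
    by_id = {doc.natural_id: doc for doc in documents}
    categorized = []
    for doc in documents:
        parent = by_id.get(doc.parent_id) if doc.parent_id else None
        if parent is not None and any(rule(s) for s in parent.sentences):
            categorized.extend(doc.sentences)
    return categorized


docs = [
    Document("p1", ["Our new phone launches today.", "Tell us what you think."]),
    Document("c1", ["Looks great!", "When can I buy it?"], parent_id="p1"),
    Document("p2", ["Happy holidays to everyone."]),
    Document("c2", ["Same to you!"], parent_id="p2"),
]

# Hypothetical context rule: the parent post mentions "phone".
result = apply_context_rule(docs, lambda s: "phone" in s.lower())
print(result)
```

Note that the child sentences themselves never mention "phone". They are categorized only because one sentence in their parent post satisfies the rule.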
CONTEXT RULES FROM FACEBOOK DATA
Data from Facebook is uploaded as a post or a comment. Only comments can contain a link to a parent post, which can be used in context rules.
CONTEXT RULES FROM TWITTER DATA
Data from Twitter can be uploaded as a tweet, a retweet, or a reply. Only replies contain a link to a parent tweet, which can be used in context rules. Replies to a retweet will have the original tweet as their parent.
CONTEXT RULES FROM FILE DATA
Uploaded files can have a parent-child hierarchy as long as they have a column mapped to the Parent Natural ID system attribute. The value in this column should match the Natural ID of a parent document.
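Before uploading a file, it can help to check that every parent ID actually points at a row in the same file. The sketch below assumes a CSV with hypothetical column names `natural_id` and `parent_natural_id`. Real files may use any column names, since the link is created by mapping columns to system attributes.

```python
import csv
import io

# Hypothetical sample file; comment-2 points at a parent that is not in the file.
SAMPLE = """natural_id,parent_natural_id,text
post-1,,Our store opens at nine.
comment-1,post-1,Great news!
comment-2,post-9,Is parking free?
"""


def link_children(rows):
    """Group child IDs under their parent ID and collect children with unknown parents."""
    known_ids = {row["natural_id"] for row in rows}
    children = {}
    orphans = []
    for row in rows:
        parent = row["parent_natural_id"]
        if not parent:
            continue  # a row with no parent is a top-level document
        if parent in known_ids:
            children.setdefault(parent, []).append(row["natural_id"])
        else:
            orphans.append(row["natural_id"])
    return children, orphans


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
children, orphans = link_children(rows)
print(children)
print(orphans)
```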
Verbatim-Specific Rules
A verbatim-specific rule categorizes sentences by terms that occur in the verbatim rather than the sentence. These rules are useful when working with social media data where sentence boundaries and style can differ. For more information, see Verbatim-Specific Rules.
Advanced Rule Operators
Special characters can be used for building more advanced rules.
Character | Use |
? | Single-character wildcard search. Example: To search for “text” or “test”, use te?t. |
* | Multiple-character wildcard search. Example: To search for “test”, “tests”, or “tester”, use test*. |
Boolean operators: AND, OR, NOT | Combine search terms with Boolean logic. |
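To make the wildcard behavior concrete, here is a small Python sketch that translates a wildcard term such as c?r or technol* into a regular expression. It is an approximation for illustration only: the helper names are hypothetical, and it assumes that ? stands for exactly one word character and * for zero or more word characters, which may differ from the product's matching engine.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * into regex pieces and escape every other character."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(r"\w")   # assumption: exactly one word character
        elif ch == "*":
            parts.append(r"\w*")  # assumption: zero or more word characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matches_word(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


print(matches_word("c?r", "car"))
print(matches_word("technol*", "technology"))
print(matches_word("c?r", "cart"))
```

Because the sketch uses a full match, c?r accepts "car" but not "cart".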
Applying Multiple Rules per Category
You can apply several rules per category. When multiple rules are specified, their queries always have an OR relationship, meaning sentences get categorized if they match at least one of the rules.
- Delete a rule using the trash icon.
- Add a new rule.
- Scroll between rules.
- Preview the currently selected rule.
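The OR relationship between rules amounts to an "any" check, as this short Python sketch shows. Each rule is modeled as a function from a sentence to true or false. The names and the two sample rules are hypothetical.

```python
from typing import Callable, Iterable

Rule = Callable[[str], bool]


def matches_category(sentence: str, rules: Iterable[Rule]) -> bool:
    """A sentence is categorized if at least one rule matches it."""
    return any(rule(sentence) for rule in rules)


# Two hypothetical rules attached to the same category.
refund_rules = [
    lambda s: "refund" in s.lower(),
    lambda s: "money back" in s.lower(),
]

print(matches_category("I want my money back.", refund_rules))
print(matches_category("The delivery was quick.", refund_rules))
```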
Referencing Categories in Rules
When you reference a category, you can reuse one category’s rules in another category. Referencing works across all category models, so you can share rules between different models in a project. If you make changes to the referenced category, you need to re-run classification on all categories that reference it to apply those changes.
- Click the category that you would like to add the reference to.
- Click the category reference button above the rule lanes.
- Select the model containing the category you would like to reference.
- Drag the category into a rule lane to refer to its rules.
Category references use the _catRef syntax and include the category model and category path.
_catRef:[model:"Model Name" path:"Parent Category" node:"CHILD CATEGORY"]
Once you add a category reference, you can click the “Referring Category Nodes” button above the rule lanes of the category that has been referenced to see all references to that node.
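If you keep rules in scripts or spreadsheets before entering them, a small helper can assemble the reference string in the format shown above. This is a sketch with a hypothetical function name. It assumes straight double quotes around each value and does not handle quotation marks inside names.

```python
def format_cat_ref(model, path, node):
    """Build a category reference string from a model name, a path, and a node name."""
    return f'_catRef:[model:"{model}" path:"{path}" node:"{node}"]'


print(format_cat_ref("Model Name", "Parent Category", "CHILD CATEGORY"))
```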
Best Practices
- Use Ad Hoc Search to explore data and experiment with rules without altering your category model.
- Fill nodes with basic rules or reuse rules from existing categories.
- Use Theme Detection to refine results.
- Filter your data based on structured attributes and system attributes.
- Use the Source Highlighter to preview a sentence and see which categories the sentence has been assigned to.