This article explains the Fredhopper data quality (DQ) reports, which help you identify why specific input data was not processed and fed to the environment. Input data that does not comply with the Fredhopper data specifications is rejected. To troubleshoot the reason, check the following report files under the fas/<INSTANCE ID>/load-data/<TRIGGER ID>/logs/data-quality/ path of the file API.
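As a minimal sketch of how the report location described above could be assembled and a downloaded report read, consider the following Python helpers. The function names and the example identifiers are hypothetical, and the code assumes the reports are gzip-compressed UTF-8 text (as the .txt.gz extension suggests). How the raw bytes are fetched from the file API is deliberately left out.

```python
import gzip


def dq_report_dir(instance_id: str, trigger_id: str) -> str:
    """Build the relative data-quality directory for one load-data trigger.

    Hypothetical helper: it only fills in the path pattern from this article.
    """
    return f"fas/{instance_id}/load-data/{trigger_id}/logs/data-quality/"


def read_dq_report(raw: bytes) -> list[str]:
    """Decompress a downloaded report and return its lines.

    Assumption: the report is gzip-compressed UTF-8 text.
    """
    return gzip.decompress(raw).decode("utf-8").splitlines()


if __name__ == "__main__":
    # "my-instance" and "my-trigger" are placeholder identifiers.
    report_dir = dq_report_dir("my-instance", "my-trigger")
    print(report_dir + "dq_category-missing-name.txt.gz")
```

An empty or absent report would typically mean the corresponding check found nothing, but that is an assumption worth confirming for your environment.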
- Categories with missing definitions
  - Logic: Check for categories that are not defined in categories.csv for a given locale but are present in at least one category path
  - Output DQ report: dq_category-missing-definition.txt.gz
  - Consequence: Child category is rejected
  - Severity: High
- Categories with missing name value
  - Logic: Check for categories with missing name values
  - Output DQ report: dq_category-missing-name.txt.gz
  - Consequence: Category name will not be translated for that specific locale
  - Severity: Medium
- '0' root category presence
  - Logic: Check for a "0" root category id defined in categories.csv. If found, the FAS reindex will fail, so this is reported as ERROR.
  - Output DQ report: dq_category-zero-root-category-id.txt.gz
  - Consequence: No correct category tree can be built
  - Severity: Urgent
- Duplicate category definitions with different name values
  - Logic: Check if there are duplicate category id entries with different names for the same locale
  - Output DQ report: dq_category-duplicates.txt.gz
  - Consequence: FAS randomly indexes one of the two category names
  - Severity: Low/Medium
- Categories detached from category tree
  - Logic: Check if categories or branches are detached from the category tree
  - Output DQ report: dq_category-detached-categories.gz
  - Consequence: Detached categories will be rejected. If products are only attached to the detached tree, those will be rejected as well.
  - Severity: High
- Duplicate meta definitions with different types
  - Logic: Check for attributes defined multiple times in custom_attributes_meta.csv with different basetypes
  - Output DQ report: dq_meta-inconsistency.txt.gz
  - Consequence: An attribute in FAS can only have one basetype. If the data is formatted for the other basetype, the data, and therefore the items, can be rejected
  - Severity: Medium
- Duplicate meta definitions with the same types
  - Logic: Check for repeating metadata definitions in custom_attributes_meta.csv
  - Output DQ report: dq_meta-duplication.txt.gz
  - Consequence: Larger file size than strictly necessary
  - Severity: Low
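The two duplicate meta definition checks above can be approximated locally before uploading data. The sketch below is not the Fredhopper implementation. It assumes the metadata has already been parsed into (attribute_id, basetype) pairs, ignores any other columns such as locale or name, and uses a hypothetical function name.

```python
from collections import defaultdict


def classify_meta_duplicates(rows: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Split repeated attribute definitions into two groups.

    rows: (attribute_id, basetype) pairs, assumed to be pre-parsed.
    Returns (inconsistent, duplicated):
      inconsistent - defined more than once with different basetypes
      duplicated   - defined more than once with the same basetype
    """
    basetypes_by_attribute = defaultdict(list)
    for attribute_id, basetype in rows:
        basetypes_by_attribute[attribute_id].append(basetype)

    inconsistent = sorted(
        attr for attr, types in basetypes_by_attribute.items() if len(set(types)) > 1
    )
    duplicated = sorted(
        attr
        for attr, types in basetypes_by_attribute.items()
        if len(types) > 1 and len(set(types)) == 1
    )
    return inconsistent, duplicated
```

Attributes in the first group correspond to the Medium severity inconsistency, while those in the second group only inflate the file.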
- Attributes with invalid base type in meta definitions
  - Logic: Check for any attributes whose type is not one of the following valid Fredhopper types: int, float, list, list64, set, set64, text, asset
  - Output DQ report: dq_meta-invalid-basetypes.txt.gz
  - Consequence: Attribute will not be indexed
  - Severity: Medium
- Incorrect (different from 'hierarchical') basetype for 'categories' attribute in meta definitions
  - Logic: Check if the 'categories' attribute is defined with a type other than 'hierarchical' in the metadata. If found, the FAS reindex will fail, so this is reported as ERROR.
  - Output DQ report: dq_meta-incorrect-basetype-for-categories-attribute.txt.gz
  - Consequence: FAS can only have one hierarchical attribute and will fail to reindex
  - Severity: Urgent
- Attributes named after FAS reserved words in meta definitions
  - Logic: Check if attributes are named after FAS reserved words (secondid, universe, countries, itemid, _match_rate) in the metadata definition. If found, the FAS reindex will fail, so this is reported as ERROR.
  - Output DQ report: dq_meta-reserved-attribute-names.txt.gz
  - Consequence: FAS will fail to reindex
  - Severity: Urgent
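Because a reserved attribute name makes the reindex fail, it can be worth screening attribute ids before sending data. The following is a minimal sketch using the reserved words listed above. The function name is hypothetical, and exact, case-sensitive matching is an assumption, since the article does not say how FAS compares names.

```python
# Reserved words as listed in the check above.
RESERVED_WORDS = frozenset({"secondid", "universe", "countries", "itemid", "_match_rate"})


def reserved_attribute_ids(attribute_ids: list[str]) -> list[str]:
    """Return the attribute ids that collide with a reserved word.

    Assumption: comparison is exact and case-sensitive.
    """
    return sorted(set(attribute_ids) & RESERVED_WORDS)


if __name__ == "__main__":
    print(reserved_attribute_ids(["color", "universe", "itemid", "size"]))
```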
- Duplicate attributes
  - Logic: Check for attributes with the same 'attribute_id' on product and variant level
  - Output DQ report: dq_meta-duplicate-attribute-ids.txt.gz
  - Consequence: Attribute will be rejected by FAS
  - Severity: Medium/High
- 'categories' attribute presence in custom attributes files
  - Logic: Check if the 'categories' attribute is present in custom attributes files. If found, the FAS reindex will fail, so this is reported as ERROR.
  - Output DQ report: dq_attributes-categories-attribute-presence-in-attributes-files.txt.gz
  - Consequence: FAS will fail to reindex
  - Severity: Urgent
- Attributes in custom attributes files with missing basetype definition
  - Logic: Check for attributes in custom attributes files with missing metadata definition in 'custom_attributes_meta.csv'
  - Output DQ report: dq_meta-missing-basetype-meta-definition.txt.gz
  - Consequence: Attribute will not be indexed
  - Severity: Medium/High
- Localizable attributes with missing locales in custom attributes files
  - Logic: Check for localizable attributes (set/set64/list/list64/asset) with missing 'locale' values in custom attributes files
  - Output DQ report: dq_attributes-localizable-without-locale.txt.gz
  - Consequence: Attribute will have no value for the missing locale(s) and the attribute_value_id will be shown
  - Severity: Medium
- Invalid attribute values for int/float attributes
  - Logic: Check for int/float attributes with non-numeric values
  - Output DQ report: dq_meta-invalid-int_float.txt.gz
  - Consequence: Item will be rejected
  - Severity: Medium/High
- Attributes with missing 'attribute_value' in custom attributes files
  - Logic: Check for attributes with missing values in custom attributes files
  - Output DQ report: dq_attributes-undefined-att-value.txt.gz
  - Consequence: Attribute has no value; the attribute value id will be used instead
  - Severity: Medium
- Single value basetype attributes with multiple values in custom attributes files
  - Logic: Check for single value attributes (int/float/asset/text/list/list64) with multiple values in custom attributes files
  - Output DQ report: dq_meta-multivalues-for-singlevalue-basetypes.txt.gz
  - Consequence: Product will get rejected
  - Severity: High
- Set/list attributes with more than 10,000 attribute_value_ids in custom attributes files
  - Logic: Check for list/set attributes with more than 10,000 unique values
  - Output DQ report: dq_meta-invalid-list_set.txt.gz
  - Consequence: List/set attributes with more than 10,000 values can impact performance
  - Severity: Low/Medium
- Set64/list64 attributes with more than 64 attribute_value_ids in custom attributes files
  - Logic: Check for list64/set64 attributes with more than 64 unique values
  - Output DQ report: dq_meta-invalid-list64_set64.txt.gz
  - Consequence: List64/set64 attributes allow for only 64 different values
  - Severity: Medium/High
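The value-count limits for list/set and list64/set64 attributes come down to counting distinct attribute_value_ids per attribute. As a rough local approximation, assuming the rows have already been parsed into (attribute_id, attribute_value_id) pairs and filtered to the relevant basetype, a sketch could look like this. The function name and input shape are hypothetical.

```python
from collections import defaultdict


def attributes_over_limit(rows: list[tuple[str, str]], limit: int = 64) -> dict[str, int]:
    """Count distinct value ids per attribute and report those above `limit`.

    rows: (attribute_id, attribute_value_id) pairs, assumed to be pre-parsed.
    limit: 64 for set64/list64; pass 10000 for plain set/list attributes.
    """
    value_ids_by_attribute = defaultdict(set)
    for attribute_id, attribute_value_id in rows:
        value_ids_by_attribute[attribute_id].add(attribute_value_id)

    return {
        attribute_id: len(value_ids)
        for attribute_id, value_ids in value_ids_by_attribute.items()
        if len(value_ids) > limit
    }
```

Repeated value ids are counted once, which matches the "unique values" wording of the checks.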
- List(64)/set(64) attributes with missing 'attribute_value_id' in custom attributes files
  - Logic: Check for list/list64/set/set64 attributes with missing 'attribute_value_id' in custom attributes files
  - Output DQ report: dq_attributes-undefined-att-value-id.txt.gz
  - Consequence: The attribute value will be used to generate the attribute_value_id. For a multi-language setup this means the attribute_value_id will differ per locale and all values will be visible for all locales
  - Severity: Medium/High
- Duplicate products in 'products.csv'
  - Logic: Check for duplicate product definitions in 'products.csv'
  - Output DQ report: dq_ids-repeating-product-secondids.txt.gz
  - Consequence: First product might be overwritten
  - Severity: Low/Medium
- Products with invalid (not matching ^[a-z0-9_]+$) 'secondid' values in 'products.csv'
  - Logic: Check for products with 'product_id' not matching ^[a-z0-9_]+$
  - Output DQ report: dq_ids-invalid-product-secondids.txt.gz
  - Consequence: Products with invalid secondids are likely to be rejected
  - Severity: Medium/High
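The pattern ^[a-z0-9_]+$ allows only lowercase letters, digits, and underscores, and requires at least one character. A minimal sketch for screening ids locally follows. The function name is hypothetical, and the ids are assumed to have been extracted from the input file already.

```python
import re

# Pattern quoted in the check above.
SECONDID_PATTERN = re.compile(r"^[a-z0-9_]+$")


def invalid_secondids(ids: list[str]) -> list[str]:
    """Return the ids that do not fully match the allowed pattern.

    fullmatch is used so that an id with a trailing newline is also flagged.
    """
    return [value for value in ids if not SECONDID_PATTERN.fullmatch(value)]


if __name__ == "__main__":
    print(invalid_secondids(["sku_001", "SKU-001", "", "a b", "9x"]))
```

Uppercase letters, hyphens, spaces, and empty ids are all flagged by this sketch.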
- Products with invalid categories in 'products.csv'
  - Logic: Check for products with categories that are not defined in 'categories.csv' but are referenced in 'products.csv'
  - Output DQ report: dq_category-products-with-invalid-categories.txt.gz
  - Consequence: If a product belongs to only an invalid category, it will be rejected.
  - Severity: Medium/High
- Product attributes without products
  - Logic: Check for product attributes present in a custom attributes file whose product is not defined in 'products.csv'
  - Output DQ report: dq_orphans-custom-attributes-without-products.txt.gz
  - Consequence: Attribute will not be indexed
  - Severity: Medium
- Variant attributes without variants
  - Logic: Check for variant attributes present in a custom attributes file whose variant is not defined in 'variants.csv'
  - Output DQ report: dq_orphans-custom-attributes-without-variants.txt.gz
  - Consequence: Attribute will not be indexed
  - Severity: Low/Medium
- Products without variants and vice versa
  - Logic: Check for products defined in 'products.csv' but without corresponding variants defined in 'variants.csv' and vice versa
  - Output DQ report: dq_orphans-products-without-variants_variants-without-products.txt.gz
  - Consequence: If a product has no variant or a variant no product, they will be rejected
  - Severity: Medium/High
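The product/variant orphan check is a two-way set comparison. The sketch below illustrates the idea under the assumption that product ids and (variant_id, product_id) pairs have already been read from the two files. The function name and input shapes are hypothetical and do not reflect the actual file layout.

```python
def find_orphans(
    product_ids: list[str], variant_rows: list[tuple[str, str]]
) -> tuple[list[str], list[str]]:
    """Find products without variants and variants without products.

    variant_rows: (variant_id, product_id) pairs, assumed to be pre-parsed.
    Returns (products_without_variants, variants_without_products).
    """
    known_products = set(product_ids)
    referenced_products = {product_id for _, product_id in variant_rows}

    products_without_variants = sorted(known_products - referenced_products)
    variants_without_products = sorted(
        variant_id
        for variant_id, product_id in variant_rows
        if product_id not in known_products
    )
    return products_without_variants, variants_without_products
```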
- Variants referencing multiple products
  - Logic: Check for variants that reference different products
  - Output DQ report: dq_variant-references-multiple-products.txt.gz
  - Consequence: Variant will be rejected
  - Severity: Medium/High
- Variants with invalid (not matching ^[a-z0-9_]+$) 'secondid' values in 'variants.csv'
  - Logic: Check for variants with 'product_id' not matching ^[a-z0-9_]+$
  - Output DQ report: dq_ids-invalid-variant-secondids.txt.gz
  - Consequence: Variant will be rejected
  - Severity: Medium
- Missing operation_type attribute during INCREMENTAL updates
  - Logic: Check for missing 'operation_type' attribute during INCREMENTAL updates
  - Output DQ report: dq_incremental-products-variants-without-operation_type.txt.gz
  - Consequence: Product/variant will not be updated
  - Severity: Medium
- 'operation_type' attribute with non-null 'attribute_value_id'
  - Logic: Check for an 'operation_type' attribute with a non-null 'attribute_value_id'. If found, the FAS reindex will fail, so this is reported as ERROR.
  - Output DQ report: dq_incremental-non-null-attribute-value-id-for-operation_type.txt.gz
  - Consequence: FAS will fail to reindex
  - Severity: Urgent
- Products with 'secondid' values used as variants 'secondid' values
  - Logic: Check for product 'secondid' values used as variant 'secondid' values. If found, the FAS reindex will fail, so this is reported as ERROR.
  - Output DQ report: dq_ids-duplicate-secondids.txt.gz
  - Consequence: FAS will fail to reindex
  - Severity: Urgent
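Since a secondid shared between a product and a variant causes the reindex to fail, a quick local comparison of the two id sets can catch the problem early. This is a minimal sketch with a hypothetical function name, assuming both id lists have already been extracted from the input files.

```python
def shared_secondids(product_ids: list[str], variant_ids: list[str]) -> list[str]:
    """Return ids that appear both as a product secondid and a variant secondid."""
    return sorted(set(product_ids) & set(variant_ids))


if __name__ == "__main__":
    # Any id printed here would be used on both levels.
    print(shared_secondids(["p1", "p2", "x9"], ["v1", "x9"]))
```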