Understanding the ACM Data Export

ACM Data Export Feature - Customer Specifications

Overview

The Advanced Compliance Monitoring (ACM) Data Export feature enables organizations to automatically export their compliance monitoring data to external cloud storage platforms or secure SFTP servers. This feature provides seamless integration with your existing data infrastructure, allowing for enhanced analysis and reporting capabilities outside the Didomi ecosystem.

Access: This feature is available through the Didomi Marketplace and requires premium permissions.


Key Benefits

  • Cloud & SFTP Support: Export to AWS S3 or secure SFTP servers (Google Cloud Storage and Microsoft Azure coming soon)
  • Daily Automated Exports: Exports run automatically once a day and contain the previous day's data
  • Multiple Data Types: Access to raw tracker data, aggregated vendor information, and property-level compliance metrics
  • CSV Format: Export data in CSV format (Parquet format coming in future releases)
  • Secure Authentication: Enterprise-grade security with encrypted credential storage
  • Connection Testing: Built-in validation to ensure proper configuration before activation

Supported Export Destinations

Amazon Web Services (AWS) S3

  • Requirements: S3 bucket with appropriate IAM role permissions
  • Configuration Fields:
    • Bucket name *: The name of your S3 bucket
    • Bucket key: Path to the object inside the bucket (optional)
    • Region *: AWS region where your bucket is located
    • STS Role ARN *: AWS IAM role ARN for secure access
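
The role referenced by the STS Role ARN needs write access to the bucket. As a rough sketch, the snippet below builds a minimal IAM permissions policy for that role. It assumes that s3:PutObject on the configured bucket key prefix is enough, which this article does not confirm, and it does not cover the role's trust policy. The function name and the bucket and prefix values are placeholders.

```python
import json


def build_write_policy(bucket_name: str, bucket_key: str = "") -> dict:
    """Build a minimal IAM policy document granting object writes to a bucket.

    Assumption: s3:PutObject is sufficient for the export; the exact
    permission set required by the feature is not documented here.
    """
    # Normalise the optional key prefix so the resource ARN has no double slashes.
    prefix = bucket_key.strip("/")
    resource = f"arn:aws:s3:::{bucket_name}/{prefix + '/' if prefix else ''}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [resource],
            }
        ],
    }


if __name__ == "__main__":
    # "my-acm-exports" and "didomi/acm" are placeholder values.
    print(json.dumps(build_write_policy("my-acm-exports", "didomi/acm"), indent=2))
```

Scoping the resource to the bucket key prefix keeps the role from writing elsewhere in the bucket. Confirm the required actions with your Didomi representative before relying on this policy.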

SFTP Server

  • Requirements: SFTP server with write permissions
  • Configuration Fields:
    • SFTP Server Host *: Server hostname or IP address
    • Username *: SFTP username for authentication
    • Password *: SFTP password for authentication

Coming Soon

  • Google Cloud Platform (GCP) Storage: Service account-based authentication
  • Microsoft Azure Blob Storage: App registration with storage account permissions

*Required fields


Available Data Types

Raw Tracker Data

Description: Individual tracking events detected during compliance monitoring scans. Each row represents a single tracker (cookie, script, etc.) found on your websites during different consent scenarios.

Key Fields:

  • ID: Unique identifier for this tracker instance within the scan
  • EVENT_TIME: Timestamp when the crawler registered the event during scanning
  • CREATED_AT: Timestamp when tracker data was processed at end of page scan (for data ingestion)
  • COLLECT_ID: Unique identifier for the collect (scan scenario: accept/refuse/no action)
  • COLLECT_PAGE_ID: Unique identifier for the specific page scanned within a collect
  • HOST: Domain hosting the tracker
  • TYPE: Type of tracker (e.g., "cookie", "script")
  • PAGE_URL: Full URL where tracker was detected
  • PAGE_URL_HOST: Host domain of the page being scanned
  • PAGE_URL_SLD: Second-level domain of the page
  • PROPERTY_ID: Your website property identifier
  • REPORT_ID: Unique report identifier
  • ORGANIZATION_ID: Your organization identifier
  • IS_TC_STRING_IDENTICAL: Whether the TCF API tc_string matches the gdpr_consent query parameter
  • INITIATOR: The vendor actually dropping/creating the tracker
  • INITIAL_NAME: Original tracker name before processing
  • NAME: Processed tracker name (after regex pattern matching)
  • NAME_PATTERN: Regex pattern used to process the initial name
  • VENDOR: Vendor that owns the tracker (if successfully matched in database)
  • VALUE: Tracker value/content
  • LIFETIME_SECONDS: How long the tracker persists
  • IS_THIRD_PARTY: Whether tracker is from a third-party domain
  • IS_HTTP_ONLY, IS_SECURE_ONLY, IS_PERSISTENT: Security and persistence flags
  • SOURCE: How tracker was created ("javascript" or "http request")
  • CMP: Consent scenario context (JSON format showing when tracker was detected)
  • HASH_NAME_HOST_TYPE: Hash for aggregating trackers (processed name + host + type)
  • HASH_INITIAL_NAME_HOST_TYPE: Hash for aggregating trackers (initial name + host + type)

Use Cases:

  • Detailed tracker behavior analysis across consent scenarios
  • Technical debugging of tracking implementations
  • Understanding when specific trackers fire relative to consent actions
  • Vendor compliance auditing at the individual tracker level
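
As an illustration of tracker-level analysis, the sketch below counts exported tracker rows per vendor using only the Python standard library. It assumes that an unmatched tracker has an empty VENDOR column, which is not confirmed here. The sample rows are invented and show only a few of the columns listed above.

```python
import csv
import io
from collections import Counter


def count_trackers_by_vendor(csv_text: str) -> Counter:
    """Count tracker rows per VENDOR; unmatched trackers fall back to HOST."""
    counts: Counter = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: VENDOR is empty when the tracker was not matched in the database.
        key = row.get("VENDOR") or f"unmatched:{row.get('HOST', '')}"
        counts[key] += 1
    return counts


# Illustrative sample only: real exports contain many more columns.
SAMPLE = """ID,HOST,TYPE,VENDOR
1,ads.example.com,cookie,ExampleAds
2,ads.example.com,script,ExampleAds
3,cdn.example.net,cookie,
"""

if __name__ == "__main__":
    for vendor, n in count_trackers_by_vendor(SAMPLE).most_common():
        print(vendor, n)
```

For a real export, read the file with open(path, newline="", encoding="utf-8") and pass the file object to csv.DictReader directly.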

Aggregated Vendor Data

Description: Consolidated information about vendors detected across your properties, aggregated by consent scenarios. Each row represents a vendor's activity summary within a specific report.

Key Fields:

  • REPORT_ID: Unique report identifier
  • PROPERTY_ID: Website property identifier
  • ORGANIZATION_ID: Your organization identifier
  • ID: Vendor identifier (database ID if matched, or host domain if not)
  • REQUEST_COUNT: Number of tracker requests from this vendor
  • CMP: Consent scenarios where vendor was active (JSON array format)
  • PARTNER: Detailed vendor information and metadata (JSON format)
  • CREATED_AT: Report generation timestamp

Data Relationships: This dataset can be joined with Raw Tracker Data using the PARTNER field (vendor data) and the VENDOR field (tracker data) to retrieve detailed tracker information for each vendor.
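
A join of this kind might look like the sketch below. The structure of the PARTNER JSON is not described in this article, so the caller supplies a vendor_key function that extracts the value to compare with VENDOR. The "name" key used in the usage example is purely hypothetical. Matching on REPORT_ID as well is an assumption, made so that rows from different reports are not mixed.

```python
import json
from collections import defaultdict
from typing import Callable, Iterable


def join_vendors_with_trackers(
    vendors: Iterable[dict],
    trackers: Iterable[dict],
    vendor_key: Callable[[dict], str],
) -> list[dict]:
    """Attach matching tracker rows to each vendor row.

    Rows match when they share REPORT_ID and vendor_key(parsed PARTNER JSON)
    equals the tracker's VENDOR value.
    """
    # Index tracker rows once so each vendor lookup is a dictionary access.
    index: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for t in trackers:
        index[(t["REPORT_ID"], t["VENDOR"])].append(t)

    joined = []
    for v in vendors:
        partner = json.loads(v["PARTNER"]) if v.get("PARTNER") else {}
        key = (v["REPORT_ID"], vendor_key(partner))
        joined.append({**v, "TRACKERS": index.get(key, [])})
    return joined


if __name__ == "__main__":
    # Invented rows; the "name" key inside PARTNER is a hypothetical example.
    vendors = [{"REPORT_ID": "r1", "PARTNER": '{"name": "ExampleAds"}'}]
    trackers = [{"REPORT_ID": "r1", "VENDOR": "ExampleAds", "NAME": "_ex"}]
    result = join_vendors_with_trackers(vendors, trackers, lambda p: p.get("name", ""))
    print(len(result[0]["TRACKERS"]))
```

Inspect a real PARTNER value from your export first, then adjust the vendor_key function to whichever attribute actually corresponds to VENDOR.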

Use Cases:

  • Vendor performance reporting across consent scenarios
  • Third-party risk assessment and vendor inventory
  • Compliance dashboard creation showing vendor-level metrics
  • Understanding which vendors are active in different consent contexts

Aggregated Properties Data

Description: Property-level compliance status and collection results. Each row represents the overall compliance status for a website property.

Key Fields:

  • WEBSITE: Full website URL
  • PROPERTY_ID: Unique property identifier
  • CREATED_AT: Report generation timestamp
  • REPORTSTATUS: Overall report processing status (e.g., "Processed", "Partially Processed", "Failed")
  • COLLECT_CONSENT_TO_ALL: Status/errors for the "accept all" scenario collection
  • COLLECT_REFUSE_TO_ALL: Status/errors for the "refuse all" scenario collection
  • COLLECT_NO_USER_CHOICE: Status/errors for the "no user action" scenario collection

Status Codes: The collect status fields contain human-readable status messages indicating the outcome of each consent scenario collection:

Success Status:

  • SUCCESS: Collection completed successfully for that scenario

Error Categories:

  • 1. Website not reachable: Network issues, website down, or crawler cannot access the site
  • 2. Anti-bot system: Bot protection or anti-automation measures detected and blocking crawler
  • 3. No existing or clear CMP on the page: No Consent Management Platform found or CMP not clearly identifiable
  • 4. No refuse options found: CMP detected but no refuse/reject consent options available
  • 4. No accept button found: CMP detected but no accept consent button found
  • 5. Not able to click on the button in the page: UI interaction failed - button exists but cannot be clicked
  • 6. General Error: Other unspecified errors not covered by above categories

Important Notes:

  • Multiple status codes can appear in a single field if multiple issues occurred
  • The "no_user_choice" scenario typically has fewer error types as it doesn't require button interactions
  • Status codes help identify specific areas for website optimization or troubleshooting
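
Because a single field can hold several status messages, and the separator between them is not specified here, one cautious way to interpret a field is to search it for each known message. The sketch below does that. The function name is illustrative, and the message texts are copied from the list above without their numeric prefixes.

```python
KNOWN_ERRORS = (
    "Website not reachable",
    "Anti-bot system",
    "No existing or clear CMP on the page",
    "No refuse options found",
    "No accept button found",
    "Not able to click on the button in the page",
    "General Error",
)


def classify_status(field: str) -> list[str]:
    """Return the known status messages found in a collect status field."""
    text = field.strip()
    if text.upper() == "SUCCESS":
        return ["SUCCESS"]
    lowered = text.lower()
    # Substring matching avoids guessing how multiple messages are separated.
    found = [msg for msg in KNOWN_ERRORS if msg.lower() in lowered]
    # Anything unrecognised is surfaced rather than silently dropped.
    return found or ([f"UNRECOGNISED: {text}"] if text else [])


if __name__ == "__main__":
    print(classify_status("2. Anti-bot system; 6. General Error"))
```

The same function can be applied to COLLECT_CONSENT_TO_ALL, COLLECT_REFUSE_TO_ALL, and COLLECT_NO_USER_CHOICE to tally error categories per property.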

Use Cases:

  • High-level compliance monitoring across your website portfolio
  • Identifying properties with collection issues or errors
  • Monitoring scan success rates and troubleshooting failed collections
  • Executive reporting on overall compliance monitoring health

File Formats

CSV Format

  • Current Implementation: All exports are delivered in CSV format
  • Advantages: Excel-compatible, human-readable, easy to import into analytics tools
  • Structure: Header row with column names, UTF-8 encoding
  • File Organization: Files are organized in a partitioned folder structure by data type and export date:
CSV/
├── acm_trackers/
│   └── date=<date_of_export>/
│       └── file.csv
├── acm_vendors/
│   └── date=<date_of_export>/
│       └── file.csv
└── acm_properties/
    └── date=<date_of_export>/
        └── file.csv
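
When processing exported files downstream, it can help to recover the data type and export date from each path. The sketch below parses a path that follows the layout above. It keeps the date as a plain string because the exact format of <date_of_export> is not specified here. The date in the usage example is a placeholder.

```python
import re
from typing import NamedTuple, Optional

DATA_TYPES = ("acm_trackers", "acm_vendors", "acm_properties")
_PATH_RE = re.compile(
    r"(?:^|/)CSV/(?P<data_type>[^/]+)/date=(?P<date>[^/]+)/(?P<filename>[^/]+\.csv)$"
)


class ExportFile(NamedTuple):
    data_type: str
    export_date: str  # kept as a string; the date format is not specified
    filename: str


def parse_export_path(path: str) -> Optional[ExportFile]:
    """Split an export path into data type, date partition, and file name."""
    m = _PATH_RE.search(path)
    if m is None or m["data_type"] not in DATA_TYPES:
        return None
    return ExportFile(m["data_type"], m["date"], m["filename"])


if __name__ == "__main__":
    # Placeholder path; "2024-05-01" is an assumed date format.
    print(parse_export_path("exports/CSV/acm_vendors/date=2024-05-01/file.csv"))
```

Paths that do not match the layout return None, so unrelated objects in the same bucket or directory can be skipped safely.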

Coming Soon

  • Parquet Format: Optimized columnar storage format for big data analytics

Export Scheduling

Current Implementation

  • Frequency: Daily exports
  • Data Coverage: Previous day's collected data
  • Timing: Automated execution (specific time may vary)
  • Configuration: Pre-set schedule (not user-configurable)

Setup Process

Step 1: Access the Data Export Configuration

  1. Navigate to the Didomi Marketplace
  2. Locate the "Data Export" feature
  3. Ensure your organization has the required premium permissions

Step 2: Configure Export Settings

  1. Select Data Types: Choose from raw trackers, aggregated vendors, and/or aggregated properties
  2. Choose Destination: Select AWS S3 or SFTP server (format is automatically set to CSV)

Step 3: Provide Authentication Credentials

Enter the required credentials for your chosen destination:

  • AWS S3: Bucket name, bucket key (optional), region, and STS Role ARN
  • SFTP: Server host, username, and password
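
Before entering credentials in the console, you may want to sanity-check them locally. The sketch below models the two destinations with the required and optional fields listed in this article and reports anything missing. It is a local pre-check only, not the validation Didomi performs, and the class and function names are illustrative.

```python
import re
from dataclasses import dataclass


@dataclass
class S3Destination:
    bucket_name: str
    region: str
    sts_role_arn: str
    bucket_key: str = ""  # optional


@dataclass
class SftpDestination:
    host: str
    username: str
    password: str


# IAM role ARNs follow arn:<partition>:iam::<12-digit account id>:role/<name>.
_ROLE_ARN_RE = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def validate(dest) -> list[str]:
    """Return a list of problems; an empty list means the form looks complete."""
    problems = []
    for name, value in vars(dest).items():
        if name != "bucket_key" and not str(value).strip():
            problems.append(f"{name} is required")
    if (
        isinstance(dest, S3Destination)
        and dest.sts_role_arn
        and not _ROLE_ARN_RE.match(dest.sts_role_arn)
    ):
        problems.append("sts_role_arn does not look like an IAM role ARN")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration.
    print(validate(SftpDestination("sftp.example.com", "user", "")))
```

A passing pre-check does not guarantee that the connection test will succeed; it only catches empty fields and malformed role ARNs early.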

Step 4: Test Connection

  1. Click "Test Connection" to validate your configuration
  2. The system will attempt to write a test file to your destination
  3. Success: Configuration can be saved and activated
  4. Failure: Review error message and adjust credentials/configuration

Step 5: Save and Activate

  1. Save your configuration after successful connection test
  2. Your export will be automatically scheduled for daily execution starting the next day

Security Considerations

Data Protection

  • All credentials are encrypted using enterprise-grade encryption
  • Data transmission occurs over secure protocols (HTTPS/SFTP)
  • Export files contain only your organization's compliance data

Access Control

  • Feature requires premium permissions
  • Organization-level access controls apply

User Interface Access Only

The ACM Data Export feature is currently available exclusively through the Didomi Console user interface. There is no API access available for this feature at this time.

All configuration, management, and monitoring of data exports must be performed through the web-based console interface.


Monitoring and Status Tracking

Export Status Dashboard

  • View all configured exports
  • Check the date and time of the last export

Troubleshooting

Common Issues

Connection Test Failures

  • Verify credentials are correct and not expired
  • Ensure destination bucket/container exists
  • Check that permissions allow write access
  • Verify network connectivity if using SFTP

Missing Export Files

  • Verify destination path configuration
  • Ensure sufficient storage space at destination
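
To spot gaps in delivery, you can compare the date partitions present at your destination with the days you expect. The sketch below takes a list of file paths (obtained however you list your bucket or SFTP directory) and returns the days with no file for a given data type. It assumes the date partition uses the YYYY-MM-DD format, which this article does not confirm.

```python
import re
from datetime import date, timedelta


def missing_export_dates(paths, data_type: str, start: date, end: date) -> list[date]:
    """List days in [start, end] with no file under <data_type>/date=<day>/."""
    # Assumption: the partition value is an ISO date (YYYY-MM-DD).
    pattern = re.compile(
        rf"(?:^|/){re.escape(data_type)}/date=(\d{{4}}-\d{{2}}-\d{{2}})/"
    )
    seen = set()
    for p in paths:
        m = pattern.search(p)
        if m:
            seen.add(date.fromisoformat(m.group(1)))
    days = (end - start).days + 1
    return [start + timedelta(n) for n in range(days) if start + timedelta(n) not in seen]


if __name__ == "__main__":
    # Placeholder listing for illustration.
    listing = [
        "CSV/acm_trackers/date=2024-05-01/file.csv",
        "CSV/acm_trackers/date=2024-05-03/file.csv",
    ]
    print(missing_export_dates(listing, "acm_trackers", date(2024, 5, 1), date(2024, 5, 3)))
```

Since exports start the day after activation, choose a start date no earlier than the first scheduled run.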

Authentication Errors

  • Refresh expired credentials
  • Verify STS role permissions and trust policies (AWS)
  • Check SFTP server connectivity and credentials
  • Validate bucket access permissions


Limitations and Considerations

Current Limitations

  • Export destinations limited to AWS S3 and SFTP (Google Cloud Storage and Microsoft Azure coming soon)
  • Export format limited to CSV (Parquet format coming in future releases)
  • Daily export schedule is pre-set and not configurable
  • Maximum file size depends on data volume and destination limits

Performance Considerations

  • Large datasets may take longer to export
  • Network bandwidth affects transfer times
  • Destination storage costs apply based on usage

Data Retention

  • Export configurations retained until manually deleted
  • Exported data retention managed by your destination storage policies

Getting Started

Prerequisites

  1. Active Didomi organization with ACM monitoring enabled
  2. Premium feature permissions
  3. Access to AWS S3 bucket or SFTP server
  4. Appropriate credentials and permissions for chosen destination

Next Steps

  1. Contact your Didomi representative to enable premium permissions
  2. Prepare your destination storage and obtain required credentials
  3. Access the Data Export feature through the Didomi Marketplace
  4. Follow the setup process outlined above
  5. Monitor your first export execution and verify data delivery