ACM Data Export Feature - Customer Specifications
Overview
The Advanced Compliance Monitoring (ACM) Data Export feature enables organizations to automatically export their compliance monitoring data to external cloud storage platforms or secure SFTP servers. This feature provides seamless integration with your existing data infrastructure, allowing for enhanced analysis and reporting capabilities outside the Didomi ecosystem.
Access: This feature is available through the Didomi Marketplace and requires premium permissions.
Key Benefits
- Cloud & SFTP Support: Export to AWS S3 or secure SFTP servers (Google Cloud Storage and Microsoft Azure coming soon)
- Daily Automated Exports: Exports run automatically each day, delivering the previous day's data
- Multiple Data Types: Access to raw tracker data, aggregated vendor information, and property-level compliance metrics
- CSV Format: Export data in CSV format (Parquet format coming in future releases)
- Secure Authentication: Enterprise-grade security with encrypted credential storage
- Connection Testing: Built-in validation to ensure proper configuration before activation
Supported Export Destinations
Amazon Web Services (AWS) S3
- Requirements: S3 bucket with appropriate IAM role permissions
- Configuration Fields:
- Bucket name *: The name of your S3 bucket
- Bucket key: Path to the object inside the bucket (optional)
- Region *: AWS region where your bucket is located
- STS Role ARN *: AWS IAM role ARN for secure access
SFTP Server
- Requirements: SFTP server with write permissions
- Configuration Fields:
- SFTP Server Host *: Server hostname or IP address
- Username *: SFTP username for authentication
- Password *: SFTP password for authentication
Coming Soon
- Google Cloud Platform (GCP) Storage: Service account-based authentication
- Microsoft Azure Blob Storage: App registration with storage account permissions
*Required fields
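Before saving a destination, it can help to pre-check that every required field is filled in and roughly well-formed. The sketch below is an illustrative client-side check mirroring the required ("*") fields above; the field names, ARN pattern, and region pattern are assumptions for the example, not part of the Didomi API.

```python
import re

# Hypothetical pre-flight validation of export destination settings.
# Patterns are loose sanity checks, not authoritative AWS grammar.
ARN_PATTERN = re.compile(r"^arn:aws:iam::\d{12}:role/[\w+=,.@/-]+$")
REGION_PATTERN = re.compile(r"^[a-z]{2}-[a-z]+-\d$")  # e.g. "eu-west-1"

def validate_s3_config(config: dict) -> list:
    """Return a list of problems; an empty list means the config looks complete."""
    problems = []
    for field in ("bucket_name", "region", "sts_role_arn"):  # "bucket_key" is optional
        if not config.get(field):
            problems.append("missing required field: " + field)
    if config.get("sts_role_arn") and not ARN_PATTERN.match(config["sts_role_arn"]):
        problems.append("sts_role_arn does not look like an IAM role ARN")
    if config.get("region") and not REGION_PATTERN.match(config["region"]):
        problems.append("region does not look like an AWS region code")
    return problems

def validate_sftp_config(config: dict) -> list:
    """Check the three required SFTP fields are present."""
    problems = []
    for field in ("host", "username", "password"):
        if not config.get(field):
            problems.append("missing required field: " + field)
    return problems
```

A check like this catches typos before the built-in "Test Connection" step, which still remains the authoritative validation.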
Available Data Types
Raw Tracker Data
Description: Individual tracking events detected during compliance monitoring scans. Each row represents a single tracker (cookie, script, etc.) found on your websites during different consent scenarios.
Key Fields:
- ID: Unique identifier for this tracker instance within the scan
- EVENT_TIME: Timestamp when the crawler registered the event during scanning
- CREATED_AT: Timestamp when tracker data was processed at the end of the page scan (for data ingestion)
- COLLECT_ID: Unique identifier for the collect (scan scenario: accept/refuse/no action)
- COLLECT_PAGE_ID: Unique identifier for the specific page scanned within a collect
- HOST: Domain hosting the tracker
- TYPE: Type of tracker (e.g., "cookie", "script")
- PAGE_URL: Full URL where the tracker was detected
- PAGE_URL_HOST: Host domain of the page being scanned
- PAGE_URL_SLD: Second-level domain of the page
- PROPERTY_ID: Your website property identifier
- REPORT_ID: Unique report identifier
- ORGANIZATION_ID: Your organization identifier
- IS_TC_STRING_IDENTICAL: Whether the TCF API tc_string matches the gdpr_consent query parameter
- INITIATOR: The vendor actually dropping/creating the tracker
- INITIAL_NAME: Original tracker name before processing
- NAME: Processed tracker name (after regex pattern matching)
- NAME_PATTERN: Regex pattern used to process the initial name
- VENDOR: Vendor that owns the tracker (if successfully matched in the database)
- VALUE: Tracker value/content
- LIFETIME_SECONDS: How long the tracker persists
- IS_THIRD_PARTY: Whether the tracker is from a third-party domain
- IS_HTTP_ONLY, IS_SECURE_ONLY, IS_PERSISTENT: Security and persistence flags
- SOURCE: How the tracker was created ("javascript" or "http request")
- CMP: Consent scenario context (JSON format showing when the tracker was detected)
- HASH_NAME_HOST_TYPE: Hash for aggregating trackers (processed name + host + type)
- HASH_INITIAL_NAME_HOST_TYPE: Hash for aggregating trackers (initial name + host + type)
Use Cases:
- Detailed tracker behavior analysis across consent scenarios
- Technical debugging of tracking implementations
- Understanding when specific trackers fire relative to consent actions
- Vendor compliance auditing at the individual tracker level
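As a small illustration of tracker-level auditing, the sketch below lists third-party cookies from a Raw Tracker Data export, longest-lived first. The sample rows are hypothetical and use only a handful of the columns described above; a real export carries the full column set.

```python
import csv
import io

# Hypothetical sample of a Raw Tracker Data CSV (subset of columns).
SAMPLE_CSV = """\
ID,TYPE,HOST,IS_THIRD_PARTY,LIFETIME_SECONDS
t1,cookie,ads.example.net,true,31536000
t2,script,cdn.example.com,false,0
t3,cookie,tracker.example.org,true,7776000
"""

def third_party_cookies(csv_text: str) -> list:
    """Return rows for third-party cookies, sorted by longest lifetime first."""
    rows = [
        row for row in csv.DictReader(io.StringIO(csv_text))
        if row["TYPE"] == "cookie" and row["IS_THIRD_PARTY"] == "true"
    ]
    return sorted(rows, key=lambda r: int(r["LIFETIME_SECONDS"]), reverse=True)
```

The same pattern extends to any of the fields above, e.g., grouping by HASH_NAME_HOST_TYPE to deduplicate trackers across pages.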
Aggregated Vendor Data
Description: Consolidated information about vendors detected across your properties, aggregated by consent scenarios. Each row represents a vendor's activity summary within a specific report.
Key Fields:
- REPORT_ID: Unique report identifier
- PROPERTY_ID: Website property identifier
- ORGANIZATION_ID: Your organization identifier
- ID: Vendor identifier (database ID if matched, or host domain if not)
- REQUEST_COUNT: Number of tracker requests from this vendor
- CMP: Consent scenarios where the vendor was active (JSON array format)
- PARTNER: Detailed vendor information and metadata (JSON format)
- CREATED_AT: Report generation timestamp
Data Relationships: Can be joined with Raw Tracker Data using the PARTNER/VENDOR fields to get detailed tracker information for each vendor.
Use Cases:
- Vendor performance reporting across consent scenarios
- Third-party risk assessment and vendor inventory
- Compliance dashboard creation showing vendor-level metrics
- Understanding which vendors are active in different consent contexts
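The join described under "Data Relationships" can be sketched in a few lines: attach each vendor's aggregated REQUEST_COUNT to its individual tracker rows by matching the tracker VENDOR field to the vendor ID. The sample records are hypothetical, and the join key is the simple case (database ID or host domain, as noted above).

```python
# Hypothetical sample records (subset of columns from the two data types).
trackers = [
    {"ID": "t1", "VENDOR": "vendor-42", "TYPE": "cookie"},
    {"ID": "t2", "VENDOR": "vendor-42", "TYPE": "script"},
    {"ID": "t3", "VENDOR": "unknown.example.net", "TYPE": "cookie"},
]
vendors = [
    {"ID": "vendor-42", "REQUEST_COUNT": "17"},
    {"ID": "unknown.example.net", "REQUEST_COUNT": "3"},
]

def join_trackers_to_vendors(trackers, vendors):
    """Left-join tracker rows onto vendor summaries keyed by vendor ID."""
    by_id = {v["ID"]: v for v in vendors}
    joined = []
    for t in trackers:
        vendor = by_id.get(t["VENDOR"], {})
        joined.append({**t, "REQUEST_COUNT": vendor.get("REQUEST_COUNT")})
    return joined
```

In practice the same join is a one-liner in SQL or pandas once both CSVs are loaded.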
Aggregated Properties Data
Description: Property-level compliance status and collection results. Each row represents the overall compliance status for a website property.
Key Fields:
- WEBSITE: Full website URL
- PROPERTY_ID: Unique property identifier
- CREATED_AT: Report generation timestamp
- REPORTSTATUS: Overall report processing status (e.g., "Processed", "Partially Processed", "Failed")
- COLLECT_CONSENT_TO_ALL: Status/errors for the "accept all" scenario collection
- COLLECT_REFUSE_TO_ALL: Status/errors for the "refuse all" scenario collection
- COLLECT_NO_USER_CHOICE: Status/errors for the "no user action" scenario collection
Status Codes: The collect status fields contain human-readable status messages indicating the outcome of each consent scenario collection:
Success Status:
- SUCCESS: Collection completed successfully for that scenario
Error Categories:
1. Website not reachable: Network issues, website down, or the crawler cannot access the site
2. Anti-bot system: Bot protection or anti-automation measures detected and blocking the crawler
3. No existing or clear CMP on the page: No Consent Management Platform found, or the CMP is not clearly identifiable
4. No refuse options found: CMP detected but no refuse/reject consent options available
5. No accept button found: CMP detected but no accept consent button found
6. Not able to click on the button in the page: UI interaction failed; the button exists but cannot be clicked
7. General Error: Other unspecified errors not covered by the above categories
Important Notes:
- Multiple status codes can appear in a single field if multiple issues occurred
- The "no_user_choice" scenario typically has fewer error types as it doesn't require button interactions
- Status codes help identify specific areas for website optimization or troubleshooting
Use Cases:
- High-level compliance monitoring across your website portfolio
- Identifying properties with collection issues or errors
- Monitoring scan success rates and troubleshooting failed collections
- Executive reporting on overall compliance monitoring health
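A portfolio-health summary can be derived directly from the three collect status fields: count how many scenarios succeeded per property. The sketch below uses hypothetical rows; as noted above, a field may carry several status codes, so the simple equality check against SUCCESS is the conservative interpretation.

```python
# Hypothetical sample of Aggregated Properties Data rows (subset of columns).
properties = [
    {"WEBSITE": "https://a.example", "COLLECT_CONSENT_TO_ALL": "SUCCESS",
     "COLLECT_REFUSE_TO_ALL": "SUCCESS", "COLLECT_NO_USER_CHOICE": "SUCCESS"},
    {"WEBSITE": "https://b.example", "COLLECT_CONSENT_TO_ALL": "SUCCESS",
     "COLLECT_REFUSE_TO_ALL": "No refuse options found",
     "COLLECT_NO_USER_CHOICE": "SUCCESS"},
]

SCENARIO_FIELDS = (
    "COLLECT_CONSENT_TO_ALL",
    "COLLECT_REFUSE_TO_ALL",
    "COLLECT_NO_USER_CHOICE",
)

def scan_health(rows):
    """Map each website to (successful scenarios, total scenarios)."""
    return {
        row["WEBSITE"]: (
            sum(1 for f in SCENARIO_FIELDS if row[f] == "SUCCESS"),
            len(SCENARIO_FIELDS),
        )
        for row in rows
    }
```

Properties scoring below 3/3 are the ones to investigate against the error categories listed above.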
File Formats
CSV Format
- Current Implementation: All exports are delivered in CSV format
- Advantages: Excel-compatible, human-readable, easy to import into analytics tools
- Structure: Header row with column names, UTF-8 encoding
- File Organization: Files are organized in a partitioned folder structure by data type and export date:
CSV/
├── acm_trackers/
│ └── date=<date_of_export>/
│ └── file.csv
├── acm_vendors/
│ └── date=<date_of_export>/
│ └── file.csv
└── acm_properties/
└── date=<date_of_export>/
└── file.csv
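Downstream jobs can locate a given day's files by building the partition prefix shown above. The date partition format below is assumed to be ISO (YYYY-MM-DD); verify against an actual export before relying on it.

```python
from datetime import date

# The three data-type folders from the partitioned layout above.
DATA_TYPES = ("acm_trackers", "acm_vendors", "acm_properties")

def export_prefix(data_type: str, export_date: date) -> str:
    """Build the folder prefix for one data type on one export date.

    Assumes an ISO date partition, e.g. CSV/acm_trackers/date=2024-05-01/.
    """
    if data_type not in DATA_TYPES:
        raise ValueError("unknown data type: " + data_type)
    return "CSV/{}/date={}/".format(data_type, export_date.isoformat())
```

This `key=value` folder convention is also what partition-aware query engines (e.g., Athena, Spark) expect, so the exports can be queried in place.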
Coming Soon
- Parquet Format: Optimized columnar storage format for big data analytics
Export Scheduling
Current Implementation
- Frequency: Daily exports
- Data Coverage: Previous day's collected data
- Timing: Automated execution (specific time may vary)
- Configuration: Pre-set schedule (not user-configurable)
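Since each daily export covers the previous day's data, a monitoring job run on a given date should look for the partition of the day before. This one-line helper spells out that mapping; treating "previous day" as exactly run date minus one calendar day is an assumption based on the description above.

```python
from datetime import date, timedelta

def expected_data_date(run_date: date) -> date:
    """Return the data-coverage date for an export executed on run_date.

    Assumes the daily export delivers exactly the previous calendar day's data.
    """
    return run_date - timedelta(days=1)
```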
Setup Process
Step 1: Access the Data Export Configuration
- Navigate to the Didomi Marketplace
- Locate the "Data Export" feature
- Ensure your organization has the required premium permissions
Step 2: Configure Export Settings
- Select Data Types: Choose from raw trackers, aggregated vendors, and/or aggregated properties
- Choose Destination: Select AWS S3 or SFTP server (format is automatically set to CSV)
Step 3: Provide Authentication Credentials
Enter the required credentials for your chosen destination:
- AWS S3: Bucket name, bucket key (optional), region, and STS Role ARN
- SFTP: Server host, username, and password
Step 4: Test Connection
- Click "Test Connection" to validate your configuration
- The system will attempt to write a test file to your destination
- Success: Configuration can be saved and activated
- Failure: Review error message and adjust credentials/configuration
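The test-connection flow can be summarized as: attempt to write a small test file, then report success or surface the error. The sketch below abstracts the destination behind a writer callable, since the real S3/SFTP client is not exposed; the function and file names are illustrative, not Didomi's implementation.

```python
def run_connection_test(write_file):
    """Attempt a test-file write; return (ok, message).

    write_file(path, payload) stands in for the destination client
    and is expected to raise on failure.
    """
    try:
        write_file("didomi_connection_test.txt", b"connection test")
        return True, "Configuration can be saved and activated"
    except Exception as exc:
        return False, "Review error message and adjust configuration: " + str(exc)
```

Usage: pass a callable backed by your real destination client; a raised exception (bad credentials, missing bucket, denied write) surfaces as the failure message.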
Step 5: Save and Activate
- Save your configuration after successful connection test
- Your export will be automatically scheduled for daily execution starting the next day
Security Considerations
Data Protection
- All credentials are encrypted using enterprise-grade encryption
- Data transmission occurs over secure protocols (HTTPS/SFTP)
- Export files contain only your organization's compliance data
Access Control
- Feature requires premium permissions
- Organization-level access controls apply
User Interface Access Only
The ACM Data Export feature is currently available exclusively through the Didomi Console user interface. There is no API access available for this feature at this time.
All configuration, management, and monitoring of data exports must be performed through the web-based console interface.
Monitoring and Status Tracking
Export Status Dashboard
- View all configured exports
- Last export date and time
Troubleshooting
Common Issues
Connection Test Failures
- Verify credentials are correct and not expired
- Ensure destination bucket/container exists
- Check that permissions allow write access
- Verify network connectivity if using SFTP
Missing Export Files
- Verify destination path configuration
- Ensure sufficient storage space at destination
Authentication Errors
- Refresh expired credentials
- Verify STS role permissions and trust policies (AWS)
- Check SFTP server connectivity and credentials
- Validate bucket access permissions
Limitations and Considerations
Current Limitations
- Export destinations limited to AWS S3 and SFTP (Google Cloud Storage and Microsoft Azure coming soon)
- Export format limited to CSV (Parquet format coming in future releases)
- Daily export schedule is pre-set and not configurable
- Maximum file size depends on data volume and destination limits
Performance Considerations
- Large datasets may take longer to export
- Network bandwidth affects transfer times
- Destination storage costs apply based on usage
Data Retention
- Export configurations retained until manually deleted
- Exported data retention managed by your destination storage policies
Getting Started
Prerequisites
- Active Didomi organization with ACM monitoring enabled
- Premium feature permissions
- Access to AWS S3 bucket or SFTP server
- Appropriate credentials and permissions for chosen destination
Next Steps
- Contact your Didomi representative to enable premium permissions
- Prepare your destination storage and obtain required credentials
- Access the Data Export feature through the Didomi Marketplace
- Follow the setup process outlined above
- Monitor your first export execution and verify data delivery