Data Catalog

The Data Catalog lets you explore and manage the structure of your connected databases. You can view schemas, tables, columns, and relationships, and enrich metadata to make your data easier to understand.

Features

📊 Structure Exploration

Navigate through schemas, tables, and columns

📝 Metadata

Database and manual descriptions

🔗 Relationships

Foreign keys, inferred relationships, and ERD diagram

🔒 Visibility

Column-level access control

Structure Exploration

The Data Catalog displays the complete structure of your database:

Schemas and Tables

Lists all schemas available in the connection
For each schema, shows contained tables
Displays column count per table
Indicates tables with primary keys and indexes

Columns

For each table, you can see:

Information	Description
Name	Column name
Type	Data type (VARCHAR, INTEGER, etc.)
Nullable	Whether it accepts null values
PK	Whether it's part of the primary key
FK	Whether it's a foreign key
Default	Default value, if any

Metadata

The Data Catalog supports two types of descriptions:

Database Descriptions

Comments defined directly in the database via COMMENT ON:

COMMENT ON TABLE customers IS 'Active customer registry';
COMMENT ON COLUMN customers.email IS 'Primary contact email';

These descriptions are automatically imported during synchronization.

Manual Descriptions

Descriptions added by your team through the Console:

Complement or override database descriptions
Linked to the connection in Console
Don't modify the original database
Can be edited at any time

AI Enrichment

Solução42 can automatically suggest descriptions based on:

Column and table names
Data type
Common industry patterns
Context from other columns

Review Suggestions

Always review AI-suggested descriptions before applying them. They are based on patterns and may not reflect specific usage in your organization.

Relationships

Foreign Keys

The Data Catalog automatically imports FKs defined in the database:

Shows source table and column
Shows target table and column
Indicates cardinality (1:N, N:M)

Inferred Relationships

For databases without explicit FKs, the system can infer relationships by convention:

Columns named *_id are mapped to corresponding tables
Example: customer_id → customers table
Inferred relationships are marked as "suggested"
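The naming convention above can be sketched in a few lines of Python. This is an illustration only, not the product's actual inference logic: the function name `infer_relationships`, the input shape, and the naive pluralization rule (append "s") are all assumptions made for the example.

```python
def infer_relationships(tables):
    """Suggest relationships from *_id column names.

    `tables` maps a table name to its list of column names
    (a hypothetical input shape chosen for this sketch).
    """
    suggestions = []
    for table, columns in tables.items():
        for column in columns:
            if not column.endswith("_id"):
                continue
            stem = column[: -len("_id")]
            # Naive pluralization: try "customers" first, then "customer".
            for candidate in (stem + "s", stem):
                if candidate in tables and candidate != table:
                    suggestions.append(
                        {
                            "source": f"{table}.{column}",
                            "target": candidate,
                            "status": "suggested",
                        }
                    )
                    break
    return suggestions


tables = {
    "customers": ["id", "email"],
    "orders": ["id", "customer_id", "warehouse_id"],
}
print(infer_relationships(tables))
```

In this sample, `orders.customer_id` is matched to `customers`, while `warehouse_id` produces no suggestion because no matching table exists.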

ERD Diagram

Visualize relationships graphically:

On the connection page, click ERD
The diagram shows all tables and their relationships
Use zoom and pan to navigate
Click on a table to highlight its relationships
Filter by schema to focus on specific areas

Data Samples

The Data Catalog can display data samples to facilitate understanding:

Limit: Up to 10 rows per table
Visibility: Respects visibility settings
Updates: Data is fetched on demand, not stored

Sensitive Data

Columns configured as restricted or pseudonymized appear masked in samples, even for administrators.

Data Visibility

Control which data in your organization can be viewed in queries, visualizations, and dashboards. Visibility settings are automatically applied to all queries, ensuring sensitive data is never accidentally exposed.

Why Use It?

PII Protection: Hide personal data such as email addresses, SSNs, and phone numbers
Compliance: Meet LGPD, GDPR, and HIPAA requirements
Security: Prevent accidental exposure of sensitive data
Safe Analytics: Enable analysis without exposing raw data

Visibility Levels

Table Visibility

Level	Description
Public	Table can be queried. Individual column visibility is respected.
Restricted	All table columns are hidden, regardless of individual settings.

Column Visibility

Level	What appears in query
Public	Original data value
Restricted	`[RESTRICTED]`
Pseudonymized	SHA-256 hash of the value (allows anonymous JOINs)

How Data Appears

Public Column:

│ email                       │
├─────────────────────────────┤
│ [email protected]            │
│ [email protected]            │

Restricted Column:

│ email                       │
├─────────────────────────────┤
│ [RESTRICTED]                │
│ [RESTRICTED]                │

Pseudonymized Column:

│ email                                                            │
├──────────────────────────────────────────────────────────────────┤
│ a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456 │
│ b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef12345678 │

Pseudonymization and JOINs

The hash is deterministic: the same value always generates the same hash. This allows JOINs between tables using pseudonymized columns without revealing the original data.
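A minimal Python sketch of why deterministic hashing keeps JOINs possible. It assumes a plain SHA-256 over the UTF-8 bytes of the value. Whether Solução42 adds a salt or normalizes values before hashing is not stated here, so treat `pseudonymize` and the sample rows as hypothetical.

```python
import hashlib


def pseudonymize(value: str) -> str:
    # Assumption: plain SHA-256 of the UTF-8 bytes, no salt or normalization.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


customers = [{"email": "ana@example.com", "plan": "pro"}]
tickets = [
    {"email": "ana@example.com", "subject": "Billing"},
    {"email": "bob@example.com", "subject": "Login"},
]

# Both tables expose only the hash of the email column.
masked_customers = [{**row, "email": pseudonymize(row["email"])} for row in customers]
masked_tickets = [{**row, "email": pseudonymize(row["email"])} for row in tickets]

# Equal inputs give equal hashes, so the join key still matches.
plans_by_hash = {row["email"]: row["plan"] for row in masked_customers}
joined = [
    {"subject": t["subject"], "plan": plans_by_hash[t["email"]]}
    for t in masked_tickets
    if t["email"] in plans_by_hash
]
print(joined)
```

The join matches the "Billing" ticket to the "pro" plan without either side of the join containing the original email address.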

Configuring Visibility

By Column

Access the connection's Data Catalog
Navigate to the desired table
Click on the column you want to configure
In Visibility, select the desired level
Click Save

By Table

Access the connection's Data Catalog
Click on the desired table
In the details panel, locate Table Visibility
Select Public or Restricted
Click Save

Priority

Table visibility takes precedence over column visibility. If a table is restricted, all its columns will also be restricted.
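The precedence rule can be expressed as a small function. This is a sketch under stated assumptions: the function name `effective_visibility` and the lowercase level strings are illustrative, not part of the product's API.

```python
def effective_visibility(table_level: str, column_level: str) -> str:
    """Resolve the level actually applied to a column.

    Levels are the lowercase strings "public", "restricted", and
    "pseudonymized" (a naming assumption for this sketch).
    """
    # A restricted table hides every column, whatever the column setting is.
    if table_level == "restricted":
        return "restricted"
    # A public table defers to the column's own setting.
    return column_level


print(effective_visibility("restricted", "public"))
print(effective_visibility("public", "pseudonymized"))
```

The first call resolves to `restricted` because the table setting wins. The second resolves to `pseudonymized` because a public table respects the column setting.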

Visibility Validation

Columns can have two validation states:

State	Icon	Description
Not validated	Gray	Default or AI-suggested configuration
Validated	Green	Configuration reviewed and confirmed by a user

Recommendation

Review and validate visibility for all sensitive columns after connecting a new database.

AI Suggestions for Visibility

Solução42 can automatically suggest appropriate visibility based on:

Column name: email, cpf, password, ssn, etc.
Data type: Long text fields may contain PII
Industry patterns: Common conventions for sensitive data

To apply suggestions:

In the Data Catalog, look for columns with the suggestion icon (lightbulb)
Click on the column to see the suggestion
Review the recommendation
Click Apply Suggestion or adjust manually

Visibility Use Cases

Personal Data (PII)

Column	Recommendation	Justification
Email	Pseudonymized	Allows cohort analysis without exposing identity
SSN/Tax ID	Restricted	Unique identifier, should not be exposed
Phone	Restricted	Sensitive personal data
Full name	Restricted or Pseudonymized	Depends on analysis needs

Financial Data

Column	Recommendation	Justification
Card number	Restricted	Should never be exposed
CVV	Restricted	Should never be stored visibly
Balance	Restricted	Sensitive financial data

Health Data (HIPAA)

Column	Recommendation	Justification
Patient ID	Pseudonymized	Allows analysis without identification
Diagnosis	Restricted	Protected medical information
Medications	Restricted	Protected medical information

Automatic Enforcement

Visibility is automatically enforced in:

SQL Queries: Results respect configured visibility
Visualizations and Dashboards: Charts and filters don't expose restricted values
AI Analytics: The AI assistant doesn't access restricted values
Exports: All exports apply the same rules

Visibility Auditing

All visibility changes are recorded:

Who made the change
When the change was made
Previous value
New value

To generate compliance reports, go to Data Catalog → Export Report → Visibility Report.

How to Use

Accessing the Data Catalog

In the sidebar menu, click Connections
Select the desired connection
Click Data Catalog

Navigating the Structure

Use the sidebar tree to navigate through schemas
Expand a schema to see its tables
Click on a table to see its columns
Use search to find specific tables or columns

Adding Descriptions

Navigate to the desired table or column
In the details panel, click Edit description
Enter the description
Click Save

Syncing Metadata

Metadata is synced automatically when you configure a connection. To update it manually:

Access the connection page
Click Settings
Click Sync Metadata

Incremental Sync

Synchronization detects only changes since the last run, making the process fast even for large databases.
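One way to picture change detection is a diff between two schema snapshots. The sketch below is an assumption about the general idea only; how Solução42 actually detects changes is not described here, and `diff_snapshots` and the snapshot shape are hypothetical.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots shaped as {table: {column: data_type}}."""
    changes = []
    for table in sorted(previous.keys() | current.keys()):
        old_cols = previous.get(table)
        new_cols = current.get(table)
        if old_cols is None:
            changes.append(("table_added", table))
        elif new_cols is None:
            changes.append(("table_removed", table))
        else:
            for column in sorted(old_cols.keys() | new_cols.keys()):
                if column not in old_cols:
                    changes.append(("column_added", f"{table}.{column}"))
                elif column not in new_cols:
                    changes.append(("column_removed", f"{table}.{column}"))
                elif old_cols[column] != new_cols[column]:
                    changes.append(("type_changed", f"{table}.{column}"))
    return changes


previous = {"customers": {"id": "INTEGER", "email": "VARCHAR"}}
current = {
    "customers": {"id": "INTEGER", "email": "TEXT", "phone": "VARCHAR"},
    "orders": {"id": "INTEGER"},
}
print(diff_snapshots(previous, current))
```

Only the changed items are reported, so unchanged tables and columns need no further processing.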

Best Practices

Documentation

Add descriptions for all main tables
Document columns with technical or abbreviated names
Use AI as a starting point, then refine manually

Visibility

✅ Configure visibility before granting data access
✅ Use pseudonymization for columns used in JOINs
✅ Review visibility after each sync
✅ Validate all sensitive columns before granting access
❌ Don't leave sensitive columns as public
❌ Don't ignore columns in staging/temp tables
❌ Don't apply AI suggestions without review

Maintenance

Sync metadata after schema changes
Review inferred relationships periodically
Keep descriptions updated with business changes

Additional Resources

Security - Security and compliance practices
