ivan-afanasiev commented Oct 1, 2025

🚀 Add Version-Specific Data Structure Download Functionality

📋 Summary

This PR extends the snowplow-cli ds download command with comprehensive version-specific download capabilities, allowing users to download specific versions, all versions, or filter by environment. The implementation includes proper file naming conventions, robust error handling, and maintains full backward compatibility.

✨ Features Added

🎯 Core Functionality

  • Specific Version Download: Download a particular version of a data structure
  • All Versions Download: Download all available versions of a data structure
  • Environment Filtering: Filter deployments by environment (DEV/PROD)
  • Smart File Naming: Automatic version suffix inclusion in filenames
  • Backward Compatibility: Existing functionality remains unchanged

🔧 New Command-Line Flags

--vendor string     # Vendor of the specific data structure
--name string       # Name of the specific data structure  
--format string     # Format of the data structure (default: jsonschema)
--version string    # Specific version to download
--all-versions      # Download all versions (mutually exclusive with --version)
--env string        # Filter by environment (DEV, PROD)

🎯 Use Cases

1. Download Specific Version

# Download user entity version 2-0-0
snowplow-cli ds download --vendor com.example --name user --format jsonschema --version 2-0-0
# Creates: com.example/user_2-0-0.yaml

2. Download All Versions

# Download all versions of user entity
snowplow-cli ds download --vendor com.example --name user --format jsonschema --all-versions
# Creates: com.example/user_1-0-0.yaml, com.example/user_2-0-0.yaml

3. Environment-Specific Downloads

# Download only production versions
snowplow-cli ds download --vendor com.example --name user --format jsonschema --all-versions --env PROD

4. Latest Version (Existing Behavior)

# Download latest version (no version suffix)
snowplow-cli ds download --vendor com.example --name user --format jsonschema
# Creates: com.example/user.yaml

🏗️ Technical Implementation

📁 Files Modified

Core Logic

  • cmd/ds/download.go: Extended command with new flags and download logic
  • internal/console/requests_ds.go: Added version-specific API functions
  • internal/util/files.go: Enhanced file creation with version naming

New Functions Added

// Hash generation for data structure identification
GenerateDataStructureHash(orgId, vendor, name, format string) string

// Version-specific downloads
GetSpecificDataStructureVersion(ctx, client, dsHash, version string) (*DataStructure, error)
GetAllDataStructureVersions(ctx, client, dsHash, envFilter string) ([]DataStructure, error)

// Enhanced file creation with version support
CreateDataStructuresWithVersions(dss []DataStructure, isPlain bool, includeVersions bool) error

🔄 API Integration

The implementation uses a two-step API approach for reliable version retrieval:

  1. Listing API: /data-structures/v1/{hash} - Get metadata and deployments
  2. Versions API: /data-structures/v1/schemas/versions - Get actual schema content

This approach ensures we can retrieve any version, not just the latest, by matching the requested version from the comprehensive versions list.
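The matching step described above can be sketched as follows. This is a minimal illustration, not the PR's code: `schemaVersion`, `findVersion`, and `filterByEnv` are hypothetical names, and the real shape of the versions response is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// schemaVersion is a hypothetical, trimmed-down view of one entry in the
// versions response; the field names are assumptions made for this sketch.
type schemaVersion struct {
	Version string
	Env     string // e.g. "DEV" or "PROD"
	Schema  string
}

var errVersionNotFound = errors.New("schema data not found")

// findVersion scans the full versions list for the requested version.
func findVersion(all []schemaVersion, want string) (schemaVersion, error) {
	for _, v := range all {
		if v.Version == want {
			return v, nil
		}
	}
	return schemaVersion{}, fmt.Errorf("%w for version %s", errVersionNotFound, want)
}

// filterByEnv keeps only entries deployed to env; an empty env keeps everything.
func filterByEnv(all []schemaVersion, env string) []schemaVersion {
	if env == "" {
		return all
	}
	var out []schemaVersion
	for _, v := range all {
		if v.Env == env {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	all := []schemaVersion{
		{Version: "1-0-0", Env: "PROD", Schema: "{}"},
		{Version: "2-0-0", Env: "DEV", Schema: "{}"},
	}
	if _, err := findVersion(all, "3-0-0"); err != nil {
		fmt.Println(err) // schema data not found for version 3-0-0
	}
	fmt.Println(len(filterByEnv(all, "PROD"))) // 1
}
```

Wrapping a sentinel error lets callers distinguish "version does not exist" from transport failures with `errors.Is`.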

🎨 Smart File Naming

The system automatically determines when to include version suffixes:

| Download Type | Filename Pattern | Example |
|---|---|---|
| Specific Version | {name}_{version}.{ext} | user_2-0-0.yaml |
| All Versions | {name}_{version}.{ext} | user_1-0-0.yaml, user_2-0-0.yaml |
| Latest Version | {name}.{ext} | user.yaml |
| Bulk Download | {name}.{ext} | user.yaml |
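
The naming rule in the table can be sketched as a small helper. `versionedFileName` and its parameters are hypothetical names chosen for illustration; the actual logic in internal/util/files.go is not shown in this conversation and may differ.

```go
package main

import "fmt"

// versionedFileName is a hypothetical helper (not taken from the PR) that
// mirrors the table: the version suffix is added only when the caller asks
// for it, i.e. for --version and --all-versions downloads.
func versionedFileName(name, version, ext string, includeVersion bool) string {
	if includeVersion && version != "" {
		return fmt.Sprintf("%s_%s.%s", name, version, ext)
	}
	return fmt.Sprintf("%s.%s", name, ext)
}

func main() {
	fmt.Println(versionedFileName("user", "2-0-0", "yaml", true))  // user_2-0-0.yaml
	fmt.Println(versionedFileName("user", "2-0-0", "yaml", false)) // user.yaml
}
```

Keeping the unsuffixed name as the default is what preserves the existing file layout for latest-version and bulk downloads.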

🧪 Comprehensive Testing

📊 Test Coverage

  • 94 total tests with 100% pass rate
  • 15 new test cases specifically for version functionality
  • 3 new test files created for comprehensive coverage

🧪 Test Categories

1. File Operations (internal/util)

✅ TestCreateDataStructuresWithVersions_IncludeVersionsTrue
✅ TestCreateDataStructuresWithVersions_IncludeVersionsFalse
✅ TestCreateDataStructuresWithVersions_MultipleVersionsSameName
✅ TestCreateDataStructuresWithVersions_JsonFormat
✅ TestCreateDataStructuresWithVersions_BackwardCompatibility
✅ TestCreateDataStructuresWithVersions_ComplexVersionNames

2. API Functions (internal/console)

✅ TestGetSpecificDataStructureVersion_Success
✅ TestGetSpecificDataStructureVersion_VersionNotFound
✅ TestGetAllDataStructureVersions_Success
✅ TestGetAllDataStructureVersions_WithEnvironmentFilter

3. Command Logic (cmd/ds)

✅ TestDownloadCommand_FlagValidation (5 sub-tests)
✅ TestDownloadCommand_IncludeVersionsLogic (4 sub-tests)

🔧 Test Features

  • Mock HTTP servers with realistic API responses
  • Temporary file systems for isolated testing
  • Error scenario coverage including network failures
  • Edge case handling for complex version numbers
  • Backward compatibility verification

🛡️ Error Handling & Validation

✅ Input Validation

  • Mutual exclusivity: --version and --all-versions cannot be used together
  • Required parameters: --vendor, --name, and --format must be provided together
  • Environment validation: --env accepts only DEV/PROD values
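
The three rules above could be checked in one place, as in the sketch below. `downloadFlags` and `validate` are hypothetical names, and the error wording (other than the mutual-exclusivity message quoted in the review) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// downloadFlags is a hypothetical container for the parsed flag values.
type downloadFlags struct {
	Vendor, Name, Format string
	Version              string
	AllVersions          bool
	Env                  string
}

// validate applies the three rules from the list: mutual exclusivity,
// all-or-nothing for vendor/name/format, and an allow-list for --env.
func validate(f downloadFlags) error {
	if f.Version != "" && f.AllVersions {
		return errors.New("--version and --all-versions are mutually exclusive")
	}
	set := 0
	for _, v := range []string{f.Vendor, f.Name, f.Format} {
		if v != "" {
			set++
		}
	}
	if set != 0 && set != 3 {
		return errors.New("--vendor, --name and --format must be provided together")
	}
	switch f.Env {
	case "", "DEV", "PROD": // empty means "no environment filter"
		return nil
	default:
		return fmt.Errorf("invalid --env %q: expected DEV or PROD", f.Env)
	}
}

func main() {
	fmt.Println(validate(downloadFlags{Version: "1-0-0", AllVersions: true}))
	fmt.Println(validate(downloadFlags{Env: "STAGING"}))
}
```

Returning an error rather than exiting inside the check keeps the rules easy to unit-test.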

🚨 Error Scenarios Handled

  • Non-existent versions: Clear error messages with version information
  • API failures: Graceful degradation with informative logging
  • Network issues: Proper timeout and retry handling
  • Invalid responses: Robust JSON parsing with fallback behavior

📝 User-Friendly Messages

# Clear error for missing version
❌ Schema data not found for version 3-0-0

# Informative success messages
✅ Downloaded specific version: vendor=com.example name=user version=2-0-0
✅ Downloaded all versions: vendor=com.example name=user count=3 env=PROD

🔄 Backward Compatibility

✅ Existing Functionality Preserved

  • Bulk downloads continue to work exactly as before
  • Latest version downloads maintain original behavior
  • All existing flags remain functional
  • File naming unchanged for existing use cases

📝 Documentation Updates

📖 Help Text Enhanced

$ snowplow-cli ds download --help

Download data structures from BDP Console.

By default, downloads the latest versions of all data structures from your development environment.

You can download specific data structures using --vendor, --name, and --format flags.
You can also download a specific version using --version flag, or all versions using --all-versions flag.
Use --env flag to filter deployments by environment (DEV, PROD).

Examples:
  # Download a specific data structure
  $ snowplow-cli ds download --vendor com.example --name user_event --format jsonschema

  # Download a specific version
  $ snowplow-cli ds download --vendor com.example --name user_event --format jsonschema --version 1-0-0

  # Download all versions
  $ snowplow-cli ds download --vendor com.example --name user_event --format jsonschema --all-versions

  # Download only production deployments
  $ snowplow-cli ds download --vendor com.example --name user_event --format jsonschema --all-versions --env PROD

Comment on lines +99 to +103
// Validate mutually exclusive flags
if version != "" && allVersions {
	snplog.LogFatalMsg("validation error", fmt.Errorf("--version and --all-versions are mutually exclusive"))
}

Collaborator

This can be expressed with cobra's MarkFlagsMutuallyExclusive

downloadCmd.PersistentFlags().Bool("plain", false, "Don't include any comments in yaml files")

// New flags for specific data structure download
downloadCmd.PersistentFlags().String("vendor", "", "Vendor of the specific data structure to download (requires --name and --format)")
Collaborator

requires --name and --format

This should be expressed as MarkFlagsRequiredTogether


// GenerateDataStructureHash generates a SHA-256 hash for a data structure
// based on organization ID, vendor, name, and format as per Snowplow API documentation
func GenerateDataStructureHash(orgId, vendor, name, format string) string {
Collaborator

The logic is correct, but it's done differently on line 426. Either both should use this function, or none
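
For context, a single shared helper along the following lines would let both call sites agree. This is a sketch under an assumption: the four fields are joined with "-" before hashing, which should be checked against the Snowplow API documentation. `dataStructureHash` is an illustrative name, and the organization ID is a made-up placeholder.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// dataStructureHash is an illustrative stand-in for GenerateDataStructureHash.
// Assumption: the fields are joined with "-" before hashing; the PR's actual
// input format is not shown in this conversation.
func dataStructureHash(orgID, vendor, name, format string) string {
	input := strings.Join([]string{orgID, vendor, name, format}, "-")
	sum := sha256.Sum256([]byte(input))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Placeholder organization ID, not a real one.
	orgID := "00000000-0000-0000-0000-000000000000"
	fmt.Println(dataStructureHash(orgID, "com.example", "user", "jsonschema"))
}
```

Routing every caller through one function means a change to the input format cannot leave two code paths computing different identifiers.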

}

// Parse the listing response
var listingResp struct {
Collaborator

We already have a struct for this

dsHash := console.GenerateDataStructureHash(org, vendor, name, formatFlag)

if allVersions {
	// Download all versions
Collaborator

nit: You could consider removing some of the inline comments—particularly around this code block and line 116 and here. The logic is already quite clear and self-explanatory, so the comments may not be necessary.

// Set up the command
cmd.SetArgs(args)

// Execute the command
Collaborator

nit: same for these comments. thanks!

Collaborator

gleb-lobov commented Oct 2, 2025

Hi @ivan-afanasiev
Thanks for your contribution

Could you please describe the use case that you have for the single version download? I'm not against implementing it, but so far this tool was built to assist with git-ops, where your versions would be stored in the git history. I understand that there is a case when you first switch to git and don't have the history, but I don't see how the single version download helps there

Short: "Download data structures from BDP Console",
Args: cobra.MaximumNArgs(1),
- Long: `Downloads the latest versions of all data structures from BDP Console.
+ Long: `Downloads data structures from BDP Console.
Collaborator

Could you please update any references from "BDP" to just "Console"? We’ve recently completed a rename at the company level. Thanks a lot for your help with this!

Collaborator

gibbok-snowplow commented Oct 2, 2025

Thanks for the PR and the effort! Just a suggestion—I’d love to hear your thoughts on trying out a slightly less verbose API and using the flags we have already when possible:

# Download latest version with jsonschema format (format is default, so can be omitted)
$ snowplow-cli ds download --match com.example/user_event

# Or explicitly specify jsonschema format
$ snowplow-cli ds download --match com.example/user_event --output-format jsonschema

# Download specific version with jsonschema
$ snowplow-cli ds download --match com.example/user_event@1-0-0

# Download all jsonschema versions
$ snowplow-cli ds download --match com.example/user_event --all-versions

# Download all jsonschema versions from production only
$ snowplow-cli ds download --match com.example/user_event --all-versions --env PROD

# If you want to be explicit about everything
$ snowplow-cli ds download --match com.example/user_event@1-0-0 --output-format jsonschema --env PROD
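
If the `vendor/name@version` form proposed above were adopted, parsing it could look like the sketch below. The `@version` suffix is only a suggestion in this thread, and `matchSpec` and `parseMatch` are hypothetical names; nothing here reflects how the existing `--match` flag is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// matchSpec is a hypothetical parsed form of a --match argument.
type matchSpec struct {
	Vendor  string
	Name    string
	Version string // empty means "latest"
}

// parseMatch splits "vendor/name" or "vendor/name@version" into its parts.
func parseMatch(s string) (matchSpec, error) {
	path, version, hasAt := strings.Cut(s, "@")
	if hasAt && version == "" {
		return matchSpec{}, fmt.Errorf("empty version in %q", s)
	}
	vendor, name, ok := strings.Cut(path, "/")
	if !ok || vendor == "" || name == "" || strings.Contains(name, "/") {
		return matchSpec{}, fmt.Errorf("expected vendor/name[@version], got %q", s)
	}
	return matchSpec{Vendor: vendor, Name: name, Version: version}, nil
}

func main() {
	spec, err := parseMatch("com.example/user_event@1-0-0")
	fmt.Printf("%+v %v\n", spec, err)
}
```

An empty `Version` naturally maps to the existing "latest version" behavior, so the unversioned form stays backward compatible.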

@ivan-afanasiev
Author

> Hi @ivan-afanasiev
> Thanks for your contribution
>
> Could you please describe the use case that you have for the single version download? I'm not against implementing it, but so far this tool was built to assist with git-ops, where your versions would be stored in the git history. I understand that there is a case when you first switch to git and don't have the history, but I don't see how the single version download helps there

Hi @gleb-lobov — thanks for the quick review!

My use case is a bit different: I'm building a code-generation tool for Snowplow event specifications because Snowtype doesn't match our workflow. Specifications can reference specific versions of events and entities, so I need a way to fetch a single version of a particular data structure so the generator can produce code for that exact schema. While calling the Console API directly is possible, I thought it would make sense to add this kind of functionality to the CLI to simplify our code generator logic.

I'm not a Go expert, so please feel free to challenge the implementation and suggest improvements — happy to iterate.

Thanks again!

@gibbok-snowplow
Collaborator

@ivan-afanasiev Thanks for your message.

Regarding this point:

> I need a way to fetch a single version of a particular data structure so the generator can produce code for that exact schema.

In Snowtype, you can target specific data structures (including their exact versions) by defining them in your snowtype.config.json file. For example:

{
  "dataStructures": [
    "xxx/aaa/jsonschema/3-0-1",
    "xxx/aaa/jsonschema/3-0-2"
  ]
}

I hope it helps.
