Merged
2 changes: 1 addition & 1 deletion .editorconfig
@@ -37,7 +37,7 @@ end_of_line = crlf
indent_size = 2

# Json files
-[*.json]
+[*.{json,jsonc}]
end_of_line = crlf

# Linux scripts
34 changes: 33 additions & 1 deletion .github/copilot-instructions.md
@@ -172,6 +172,7 @@ After the final push, sweep-resolve stale older threads for removed code paths.
The main public API for working with language tags:

**Static Factory Methods:**

- `Parse(string tag)`: Parse a language tag string, returns null on failure
- `TryParse(string tag, out LanguageTag? result)`: Safe parsing with out parameter
- `ParseOrDefault(string tag, LanguageTag? defaultTag = null)`: Parse with fallback to "und"
@@ -182,6 +183,7 @@ The main public API for working with language tags:
- `FromLanguageScriptRegion(string language, string script, string region)`: Factory for full tags

**Properties:**

- `Language`: Primary language subtag (internal set)
- `ExtendedLanguage`: Extended language subtag (internal set)
- `Script`: Script subtag (internal set)
@@ -192,6 +194,7 @@ The main public API for working with language tags:
- `IsValid`: Property to check if tag is valid

**Instance Methods:**

- `Validate()`: Verify structural correctness
- `Normalize()`: Return normalized copy of tag (does not validate)
- `ToString()`: String representation
@@ -200,6 +203,7 @@ The main public API for working with language tags:
- Operators: `==`, `!=`

**Design Characteristics:**

- Implements `IEquatable<LanguageTag>`
- Constructors are internal, use factory methods or builder
- Properties use internal setters to maintain immutability for public API
@@ -210,6 +214,7 @@ The main public API for working with language tags:
Fluent builder for constructing language tags:

**Methods:**

- `Language(string value)`: Set primary language
- `ExtendedLanguage(string value)`: Set extended language
- `Script(string value)`: Set script
@@ -242,10 +247,12 @@ Fluent builder for constructing language tags:
Provides language code conversion and matching:

**Properties:**

- `Undetermined`: Constant for "und" (undetermined language)
- `Overrides`: User-defined (IETF, ISO) mapping pairs

**Methods:**

- `GetIetfFromIso(string languageTag)`: Convert ISO to IETF format
- `GetIsoFromIetf(string languageTag)`: Convert IETF to ISO format
- `IsMatch(string prefix, string languageTag)`: Prefix matching for content selection
@@ -255,24 +262,29 @@ Provides language code conversion and matching:
Static class for configuring global logging for the entire library:

**Properties:**

- `LoggerFactory`: Gets or sets the global logger factory for creating category loggers

**Methods:**

- `SetFactory(ILoggerFactory loggerFactory)`: Configure the library to use a logger factory
- `TrySetFactory(ILoggerFactory loggerFactory)`: Set factory only if none is configured

**Logger Resolution Priority:**

1. `LoggerFactory` property (when not `NullLoggerFactory`)
2. `NullLogger.Instance` (default fallback)

**Important Notes:**

- Loggers are created and cached at time of use by each class instance
- Changes to `LoggerFactory` after a logger is created do not affect existing cached loggers
- Only new logger requests use updated configuration

### Data Models

#### Iso6392Data.cs

- ISO 639-2 language codes (3-letter bibliographic/terminologic codes)
- **Public Methods:**
- `Create()`: Load embedded data
@@ -283,6 +295,7 @@ Static class for configuring global logging for the entire library:
- **Record Properties:** `Part2B`, `Part2T`, `Part1`, `RefName`

#### Iso6393Data.cs

- ISO 639-3 language codes (comprehensive language codes)
- **Public Methods:**
- `Create()`: Load embedded data
@@ -293,6 +306,7 @@ Static class for configuring global logging for the entire library:
- **Record Properties:** `Id`, `Part2B`, `Part2T`, `Part1`, `Scope`, `LanguageType`, `RefName`, `Comment`

#### Rfc5646Data.cs

- RFC 5646 / BCP 47 language subtag registry
- **Public Methods:**
- `Create()`: Load embedded data
@@ -309,13 +323,15 @@ Static class for configuring global logging for the entire library:
#### Supporting Classes

**ExtensionTag (sealed record):**

- `Prefix`: Single-character extension prefix (char)
- `Tags`: ImmutableArray of extension values
- `ToString()`: Format as "prefix-tag1-tag2"
- `Normalize()`: Returns normalized copy with sorted, lowercase tags
- `Equals()`: Case-insensitive equality comparison

**PrivateUseTag (sealed record):**

- `Prefix`: Constant 'x'
- `Tags`: ImmutableArray of private use values
- `ToString()`: Format as "x-tag1-tag2"
@@ -325,11 +341,13 @@ Static class for configuring global logging for the entire library:
### Language Tag Structure

Per RFC 5646, language tags follow this format:
```

```text
[Language]-[Extended language]-[Script]-[Region]-[Variant]-[Extension]-[Private Use]
```

Examples:

- `zh`: Simple language tag
- `zh-yue-hk`: Language with extended language and region
- `en-latn-gb-boont-r-extended-sequence-x-private`: Full tag with all components
@@ -347,7 +365,9 @@ Examples:
## API Design Patterns

### Factory Pattern

Use static factory methods instead of public constructors:

```csharp
// Good
LanguageTag tag = LanguageTag.Parse("en-US");
@@ -358,7 +378,9 @@ LanguageTag tag = LanguageTag.FromLanguage("en");
```

### Builder Pattern

Use fluent builder for complex tag construction:

```csharp
LanguageTag tag = LanguageTag.CreateBuilder()
    .Language("en")
@@ -367,12 +389,15 @@ LanguageTag tag = LanguageTag.CreateBuilder()
```

### Immutability Pattern

- All properties are immutable after construction
- Use `Normalize()` to get modified copies
- Collections are exposed as `ImmutableArray<T>`

### Safe Parsing

Always use safe parsing patterns:

```csharp
// TryParse pattern
if (LanguageTag.TryParse(input, out LanguageTag? tag))
@@ -407,6 +432,7 @@ LanguageTag tag = LanguageTag.ParseOrDefault(input); // Falls back to "und"
## Recent API Changes

### Changed (Breaking)

- `LanguageTagParser` is now internal (use `LanguageTag.Parse()` instead)
- Properties changed from `IList<string>` to `ImmutableArray<string>`:
- `VariantList` → `Variants`
@@ -417,6 +443,7 @@ LanguageTag tag = LanguageTag.ParseOrDefault(input); // Falls back to "und"
- Tag construction requires use of factory methods or builder (constructors are internal)

### Added (Non-Breaking)

- `LanguageTag.ParseOrDefault()`: Safe parsing with fallback
- `LanguageTag.ParseAndNormalize()`: Combined parse and normalize
- `LanguageTag.IsValid`: Property for validation
@@ -429,6 +456,7 @@ LanguageTag tag = LanguageTag.ParseOrDefault(input); // Falls back to "und"
## Future Improvements

Consider these areas for enhancement:

- Use a BNF parser or parser generator (ANTLR4, Eto.Parse, etc.) instead of hand-parsing
- Implement comprehensive subtag content validation against registry data
- Add more language lookup and validation features
@@ -443,6 +471,7 @@ Consider these areas for enhancement:
## Common Patterns

### Creating Tags

```csharp
// Simple parsing
LanguageTag? tag = LanguageTag.Parse("en-US");
Expand All @@ -468,6 +497,7 @@ LanguageTag tag = LanguageTag.CreateBuilder()
```

### Normalizing Tags

```csharp
// Parse and normalize separately
LanguageTag? tag = LanguageTag.Parse("en-latn-us");
@@ -478,6 +508,7 @@ LanguageTag? tag = LanguageTag.ParseAndNormalize("en-latn-us"); // "en-US"
```

### Accessing Tag Components

```csharp
LanguageTag tag = LanguageTag.Parse("en-latn-gb-boont-r-extended-x-private")!;

@@ -490,6 +521,7 @@ PrivateUseTag privateUse = tag.PrivateUse; // { Tags=["private"] }
```

### Comparing Tags

```csharp
LanguageTag? tag1 = LanguageTag.Parse("en-US");
LanguageTag? tag2 = LanguageTag.Parse("en-us");
14 changes: 14 additions & 0 deletions .markdownlint-cli2.jsonc
@@ -0,0 +1,14 @@
{
  "config": {
    // Prose paragraphs and data-heavy tables/URLs are intentionally long;
    // reflowing at 80 cols hurts readability and churns diffs.
    "MD013": false,
    // Inline HTML is used for reference-link section dividers.
    "MD033": false,
    // Require fenced code blocks over the legacy 4-space-indented style.
    "MD046": { "style": "fenced" },
    // Wide tables are intentional where wrapping cells breaks GitHub rendering.
    "MD060": false
  },
  "gitignore": true
}