Remove control characters from JSON SPDX docs as those corrupt RDF/XML output #854

@zbleness

The Android Soong build system can generate mostly correct SPDX documents.
It includes the external LICENSE text for all components through hasExtractedLicensingInfos / extractedText.
Some extractedText values contain the JSON formfeed escape sequence: \f.
In Python, that escape deserializes to \x0c (the ASCII formfeed control character).
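To illustrate the deserialization step, here is a minimal sketch using Python's standard json module. The JSON fragment and the LicenseRef-example identifier are made up for illustration and are not taken from a real Soong-generated document.

```python
import json

# Hypothetical fragment shaped like one extracted-license entry.
# The doubled backslash puts a literal \f escape into the JSON text.
raw = '{"licenseId": "LicenseRef-example", "extractedText": "Page one\\fPage two"}'

entry = json.loads(raw)
text = entry["extractedText"]

# The JSON escape \f is now a single formfeed character (code point 0x0C).
print(repr(text))
print(hex(ord(text[8])))
```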

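The problem is that, when the document is converted to XML, these control characters corrupt the RDF/XML output.
XML 1.0 forbids every C0 control character other than tab, line feed, and carriage return, and XML 1.1 allows the rest (NUL excepted) only as character references, never as literal characters.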
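The failure can be reproduced with the standard library alone. This is a rough sketch, assuming the license text is written into an element as literal character data; the element name and the is_well_formed helper are illustrative and not part of any SPDX tool.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if the XML 1.0 parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


license_text = "Page one\x0cPage two"

# escape() handles &, < and > but leaves the formfeed untouched.
broken = "<extractedText>" + escape(license_text) + "</extractedText>"
clean = "<extractedText>" + escape(license_text.replace("\x0c", "")) + "</extractedText>"

print(is_well_formed(broken))
print(is_well_formed(clean))
```

Standard XML escaping does not help here, because the formfeed is not a markup character; it is simply outside the set of characters XML 1.0 allows.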

The proposed fix (being implemented) is to strip the \b and \f JSON control characters, as those have no semantic value.
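One possible shape for such a fix is sketched below. It is not the project's actual implementation: the strip_control_chars helper is hypothetical, and it assumes the stripping is applied to every string value of the parsed JSON document before conversion.

```python
import json

# Translation table that deletes backspace (\x08) and formfeed (\x0c).
STRIP_TABLE = str.maketrans("", "", "\x08\x0c")


def strip_control_chars(value):
    """Recursively remove \\b and \\f from all string values."""
    if isinstance(value, str):
        return value.translate(STRIP_TABLE)
    if isinstance(value, list):
        return [strip_control_chars(item) for item in value]
    if isinstance(value, dict):
        return {key: strip_control_chars(item) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


# Made-up document fragment containing \f, \b, and a newline.
doc = json.loads('{"hasExtractedLicensingInfos": [{"extractedText": "A\\fB\\bC\\nD"}]}')
cleaned = strip_control_chars(doc)

print(repr(cleaned["hasExtractedLicensingInfos"][0]["extractedText"]))
```

Newlines and tabs are left alone, since XML accepts them and they carry layout in license texts.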
