Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML parser #4180

Merged
merged 62 commits into from
Jun 21, 2023
Merged

XML parser #4180

merged 62 commits into from
Jun 21, 2023

Conversation

bbert
Copy link
Contributor

@bbert bbert commented May 5, 2023

Continuation of #3412

The goal of this PR is to provide a new XML parser in order to speed up parsing time.
Parsing time may be criticial for long contents with SegmentTimelines.

What has been achieved in this PR:

  • integration of tXml parser: https://github.com/TobiasNickel/tXml
  • modification of the tXml parser in order to apply simplification process (https://github.com/TobiasNickel/tXml#xmlsimplify-txml_dom_object) on-the-fly during parsing, while also applying matchers on the attributes values and managing children that shall be stored as arrays
  • remove the StringMatcher since it is not useful anymore as every attribute is parsed by default as a string (BTW simply removing this matcher could reduce up to 30-40% the original parsing processing time)
  • update of the NumericMatcher in order to avoid convert into numbers attributes that shall be in string format but that can be filled as a numeric value ("id" attribute for example)
  • remove all references to '_asArray' notation (no more duplication of child nodes as a single object and as an array of object)

Here are some numbers for the XML parsing processing time (in ms) for a 240mn content having one AdaptationSet with a SegmentTimeline without repeat pattern:

  Chrome desktop  (core i7 2x2.7GHz 16Go) Chromecast 3rd Gen Chromecast Ultra
dash.js 110 2 500 1 800
dash.js + tXml 30 500 350

@bbert bbert mentioned this pull request May 5, 2023
@dsilhavy dsilhavy added this to the 5.0.0 milestone May 16, 2023
@dsilhavy dsilhavy changed the base branch from development to v5.0.0 May 30, 2023 08:27
import ObjectIron from './objectiron';
import X2JS from '../../../externals/xml2json';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove xml2json from externals folder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

const acc = new DescriptorType();
return acc.init(audioChanCfg);
});
}

function getAudioChannelConfigurationForRepresentation(representation) {
if (!representation || !representation.hasOwnProperty(DashConstants.AUDIOCHANNELCONFIGURATION_ASARRAY) || !representation[DashConstants.AUDIOCHANNELCONFIGURATION_ASARRAY].length) return [];
return representation[DashConstants.AUDIOCHANNELCONFIGURATION_ASARRAY].map( audioChanCfg => {
if (!representation || !representation.hasOwnProperty(DashConstants.AUDIO_CHANNEL_CONFIGURATION) || !representation[DashConstants.AUDIO_CHANNEL_CONFIGURATION].length) return [];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function is defined two times, probably a merge mistake

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed, a merge mistake

constructor() {
super(
(attr, nodeName) => {
const stringAttrsInElements = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this per default always converted to string which is why we dont need this matcher anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly

@dsilhavy
Copy link
Collaborator

  • Additional findings: Events dont seem to be dispatched anymore: listening-to-SCTE-EMSG-events.html demo page does not show any events

@bbert
Copy link
Contributor Author

bbert commented Jun 21, 2023

  • Additional findings: Events dont seem to be dispatched anymore: listening-to-SCTE-EMSG-events.html demo page does not show any events

Fixed, InbandEventStream was not in the list of nodes being represented as an array

@dsilhavy dsilhavy merged commit 0274eed into v5.0.0 Jun 21, 2023
@dsilhavy dsilhavy deleted the feature/xml-parser branch August 7, 2023 13:35
@Andrews54757
Copy link
Contributor

This XML parser does not handle escape sequences correctly. As a result, urls containing the & (&) sequence will be malformed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

3 participants