Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #801 libxml_disable_entity_loader() is changed #802

Closed
wants to merge 7 commits into from
Closed

Fixes #801 libxml_disable_entity_loader() is changed #802

wants to merge 7 commits into from

Conversation

stephane-de
Copy link

@stephane-de stephane-de commented Dec 4, 2018

This would be one way to fix #801

Somehow when I try to load a PDF the XmlScanner::getInstance() is called many times:
A new getInstance() is called before the __destruct() of the previous one, causing the previousLibxmlDisableEntityLoaderValue to contain false even when the original value was true.

My solution was to share $previousLibxmlDisableEntityLoaderValue accross all instances, and be sure to not overwrite the original value.

This would be one way to fix #787

Somehow when I try to load a PDF the `XmlScanner::getInstance()` is called many times:
A new `getInstance()` is called before the `__destruct()` of the previous one, causing the `previousLibxmlDisableEntityLoaderValue` to contain `false` even when the original value was `true`.

My solution was to share `$previousLibxmlDisableEntityLoaderValue` accross all instances, and be sure to not overwrite the original value.
@stephane-de stephane-de changed the title Fixes #787 libxml_disable_entity_loader() is changed Fixes #801 libxml_disable_entity_loader() is changed Dec 4, 2018
Copy link
Member

@PowerKiKi PowerKiKi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but before merging, unit tests should be added to cover the cases you describe.

@philipp-kolesnikov
Copy link

philipp-kolesnikov commented Dec 13, 2018

We also faced issues new XmlScanner behavior and I'm afraid that proposed solution will not work in our case.

Code below will print "failed to load external entity "/data/test.xml"" for any valid test.xml file

$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();

libxml_use_internal_errors(true);
libxml_clear_errors();

$document = new DOMDocument();
$document->load("test.xml");

echo libxml_get_last_error()->message. "\n";

Because libxml_disable_entity_loader(true) disables loading from file.

Both issues can be fixed by moving all enabling/disabling logic out of constructor/desctructor to XmlScanner::scan($xml) function like this

public function scan($xml)
{
    if ($this->libxmlDisableEntityLoader) {
        $this->previousLibxmlDisableEntityLoaderValue = libxml_disable_entity_loader(true);
    }
// all old code here 		
    if ($this->libxmlDisableEntityLoader) {
        libxml_disable_entity_loader($this->previousLibxmlDisableEntityLoaderValue);
    }
    return $xml;
}

What do you think?

@stephane-de
Copy link
Author

stephane-de commented Dec 13, 2018 via email

@philipp-kolesnikov
Copy link

Should I submit a new PR, or you will update this one?

Hello, Yes, moving the logic to scan() seams looks good to me (works for my case too), changing global parameters for the shorter time possible is always a good idea. Just be sure to never call a scan() within a scan()

@PowerKiKi
Copy link
Member

I think the point of using the constructor/destructor is to be certain that it will be done. So if we move the logic to the scan method, it should use a try...finally block.

@philipp-kolesnikov feel free to open a new PR, but be sure to include unit tests, in order to be merged faster.

Also on a sidenote, XmlScanner::getInstance() gives the wrong impression that it is a singleton of some sort. But it's not the case at all. I'd be in favor of removing that static method, and making the constructor accept a IReader and use a private method to detect the pattern needed. But that's slightly out of topic and could be done in a separate PR.

@PowerKiKi PowerKiKi closed this in 8918888 Jan 1, 2019
guillaume-ro-fr pushed a commit to guillaume-ro-fr/PhpSpreadsheet that referenced this pull request Jun 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants