To give you a bit of diversification, I'll sandwich an RC of Documents between all those modern scroll builds 😉 …
The changelog contains only a single item this time, but it's the most important part of the whole extension:
new: more efficient and flexible algorithm detecting more links
Here's the download link for Documents 2.0.3 :up:
And for those who like to know how stuff works, here's a bit of an inside view: :sherlock:
The algorithm I had used for a really long time was pretty simple, but it worked nicely:
it checked whether the link ends with a . followed by the file type of a selected document.
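As a rough sketch (in Python, since the extension's actual code isn't shown here; the function name and the file-type list are made up for illustration), that first check boils down to a single end-anchored regular expression:

```python
import re

# Placeholder list - the real set of document types is configurable.
DOC_TYPES = ("pdf", "doc", "docx")

# Hypothetical reconstruction of the original check: the link must
# end with a "." followed by one of the selected file types.
V1_PATTERN = re.compile(r"\.(%s)$" % "|".join(DOC_TYPES))

def is_document_v1(url):
    return V1_PATTERN.search(url) is not None
```

A link like http://example.com/files/report.pdf matches, while one with a page anchor after the file type slips through, since the pattern is anchored to the very end of the address.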
After a bug report, I altered that a bit in version 2.0.1 to allow for page numbers:
now it checks whether the link ends with a ., the file type, and optionally the wording "#page" plus a number.
Additionally, in both versions I separately checked for a ? to ensure that the matched document isn't actually a variable.
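Sketched the same way (again with made-up names and a placeholder type list, not the extension's real code), the 2.0.1 variant adds the optional page anchor, while the ? test stays a separate check:

```python
import re

DOC_TYPES = ("pdf", "doc", "docx")  # placeholder; the real list is configurable

# Hypothetical reconstruction of the 2.0.1 check: optionally allow
# "#page" plus a number after the file type.
V201_PATTERN = re.compile(r"\.(%s)(#page=?\d+)?$" % "|".join(DOC_TYPES))

def is_document_v201(url):
    # The separate "?" check: documents passed as a variable
    # (e.g. viewer.php?doc=report.pdf) must not match.
    return "?" not in url and V201_PATTERN.search(url) is not None
```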
After another bug report, I took a completely different approach for version 2.0.3:
Instead of checking the end of the link, it now starts at the beginning.
The algorithm basically checks for a . followed by a /, another ., and finally the file type.
Before and between them, the algorithm allows any number of characters except the ?; after the file type, it allows any number of any characters.
Excluding the ? makes sure that we don't match documents passed as a variable of a regular website (so I could combine both regular expressions into one, which is of course more efficient). 😉
The first . ensures that we start within the host, and the following / guarantees that everything behind it lies within a folder of this host. Finally, there's the same check as in earlier versions, but allowing for junk data (e.g. Datamover-10.11.pdf?version=2&modificationDate=1289922519381) behind it.
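Put together as one expression — again a reconstruction in sketch form, with a placeholder type list; the real extension code may differ — the 2.0.3 check looks roughly like this:

```python
import re

DOC_TYPES = ("pdf", "doc", "docx")  # placeholder; the real list is configurable

# Hypothetical reconstruction of the 2.0.3 pattern: a "." (still inside
# the host), any non-"?" characters, a "/" (so the rest lies in a folder
# of that host), more non-"?" characters, then "." plus the file type.
# Everything after the file type - the junk data - is simply ignored,
# because the pattern isn't anchored to the end of the address.
V203_PATTERN = re.compile(r"\.[^?]*/[^?]*\.(%s)" % "|".join(DOC_TYPES))

def is_document_v203(url):
    return V203_PATTERN.search(url) is not None
```

This way a link carrying junk data like ?version=2&modificationDate=1289922519381 after the file type still matches, even though it no longer ends in .pdf.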
Since this effectively exchanges the heart of the extension, it's not unlikely that there will be some regressions. If you stumble across a link that doesn't work, or one that Documents opens accidentally, please let me know!
As I had already guessed, this algorithm did accidentally try to open some web pages, e.g. "http://www2.le.ac.uk/Members/davidwickins/old/test.docx/view". Hence, I had to adjust it one more time, and this is what I ended up with:
The difference from the previous variant is that the whole address is now checked, from the first to the last character.
The front part is identical, but I now specified that the allowed junk data has to begin with a ? or #.
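In the same sketch notation (once more a reconstruction, not the extension's actual code), anchoring both ends and constraining the junk data gives:

```python
import re

DOC_TYPES = ("pdf", "doc", "docx")  # placeholder; the real list is configurable

# Hypothetical reconstruction of the final pattern: the whole address
# is checked from the first to the last character, and any junk data
# after the file type now has to begin with "?" or "#".
FINAL_PATTERN = re.compile(r"^[^?]*\.[^?]*/[^?]*\.(%s)([?#].*)?$" % "|".join(DOC_TYPES))

def is_document(url):
    return FINAL_PATTERN.match(url) is not None
```

With this, the .docx/view page from the bug report no longer matches (the /view after the file type doesn't start with ? or #), while documents followed by ?version=… junk data still do.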
I think the risk of accidental matches is next to zero now, but you never know. I'll give you guys another week to test it before it's made public, just to be on the safe side 😉
Other than that: Have fun with it :cheers: