Conversation
|
This pull request introduces 4 alerts when merging 9757e66 into 9f396f7 - view on LGTM.com new alerts:
|
|
This pull request fixes 1 alert when merging 9886ba8 into 9f396f7 - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
👍 @artlowel : Code looks good here, and I've tested and this works as described (UI rewrites the URL when flag is enabled). I've not tested the new docs on setting up the proxy server, but those can be tweaked as needed in the future.
Minor comments/questions inline. But, I think this is a good first implementation which we can have Google Scholar test against.
| Indexers such as Google Scholar require that files are hosted on the same domain as the page that links them. In DSpace 7, Bitstreams are served from the REST server. So if you use different servers for the REST api and the UI you'll want to ensure that Bitstream downloads are proxied through the UI server. | ||
|
|
||
| In order to achieve this we'll need to do two things: | ||
| - **Proxy the Bitstream downloads through the UI server.** You'll need to put a webserver such as httpd or nginx in front of the UI server in order to achieve this. [Below](#apache-http-server-config) you'll find a section explaining how to do it in httpd. |
There was a problem hiding this comment.
Silly question perhaps, but I have to ask.... What is the reason for requiring a separate server-level proxy configuration, instead of just using something like http-proxy-middleware (which we already use to proxy sitemaps)?
I'm assuming the server-level proxy likely results in better performance here?
Just thought I'd ask as I'm sure others will ask this same thing eventually, once they notice we are auto-proxying sitemaps, but not auto-proxying downloads.
There was a problem hiding this comment.
The reason is both performance, and the ease with which it would be possible to tweak or add e.g. caching to this config. That isn't supported out of the box for http-proxy-middleware and would require quite a bit of JS code to add.
For sitemaps it would likely also be a better idea to proxy them using httpd, that is something we should consider adding to our installation documentation. But because sitemaps are relatively small files, and because every installation of dspace 7 will need it, we decided to make it available out of the box for people that want a simple setup.
A setup with a rest server and a UI on different hosts is more complicated by definition, so it is likely we can expect it won't have the express server for the UI or the tomcat for rest being the direct access points for clients
There was a problem hiding this comment.
@artlowel : Thanks for clarifying. I wanted to ensure this decision was documented clearly in this PR, which your comment now does. I think this is perfectly OK for now, I just wanted us to clarify why file downloads are (at least currently) proxied differently from sitemaps....and I do agree that auto-proxying sitemaps is much more important here, as we want to ensure those are always enabled by default (for SEO purposes). I'm OK with the extra steps regarding file downloads if you choose a more complex installation (of different hosts for UI vs REST)
|
This pull request fixes 1 alert when merging 764cfc5 into c67d2e6 - view on LGTM.com fixed alerts:
|
|
Merging as my minor feedback/questions were addressed above. Thanks @artlowel ! |
References
Description
This PR adds an option to rewrite URLs for Bitstream downloads and the
citation_pdf_url<meta>tag using the UI domain. It also fixes thecitation_abstract_html_urlfield, and adds docs on how to configure the proxy for bitstream downloads using httpd.Instructions for Reviewers
This PR adds a
rewriteDownloadUrlsenvironment property, that defaults tofalse. When set totrueit will change download URLs and thecitation_pdf_url<meta>tag on item pages to be based on the UI origin instead of the REST origin.That way, if you're hosting the rest api and the UI on different domains, download links will still appear to be hosted on the UI.
It also ensures the
citation_abstract_html_url<meta>tag always uses the UI origin.Finally it adds some documentation on how to configure the proxy for bitstream downloads using httpd.
In order to test this PR:
citation_pdf_url<meta>tag are unaffected: they use the REST originrewriteDownloadUrlstotruein your environment file, the bitstream download URLs and thecitation_pdf_url<meta>tag use the UI origincitation_abstract_html_url<meta>tag uses the UI origin.Checklist
yarn run lintpackage.json), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.