Is it possible to extract ALL the contents of a web site?
Thread poster: Elena Miguel
Elena Miguel Spain Local time: 04:16 English to Spanish + ...
Oct 24, 2005
I have to quote the translation of a web site for a regular client who did not design it and therefore cannot provide me with the files. Is it possible to extract ALL the contents of a web site? The site is full of pages and subpages, and it is nearly impossible to do it "manually"... Thank you in advance.
RWSTranslation Germany Local time: 04:16 German to English + ...
Maybe
Oct 24, 2005
Hello,
Maybe it is possible, if you find a way to see all the text. If the pages are dynamically created, you will often see only some of the text.
You can try tools like Offline Explorer to save a website locally. But you can only save the information that the web server sends in response to the settings and requests of your web browser.
Hans
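To illustrate the limitation Hans describes, here is a minimal sketch of the link-following step that offline download tools generally rely on. It is not how Offline Explorer or any particular tool is implemented; the class and function names, the sample HTML, and the example.com URLs are all made up for illustration. The point is that a page reachable only through a script never shows up as a link in the served HTML, so it is never downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html):
    """Return absolute URLs on the same host as page_url, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute = urljoin(page_url, href).split("#")[0]
        if urlparse(absolute).netloc == host and absolute not in found:
            found.append(absolute)
    return found


# Hypothetical page: /hidden.html is only reachable through a script,
# so it never appears as an <a> tag in the HTML the server sends.
sample = """
<a href="/about.html">About</a>
<a href="products/list.html#top">Products</a>
<a href="http://other.example.org/">Partner</a>
<script>window.open("/hidden.html");</script>
"""
links = same_site_links("http://www.example.com/index.html", sample)
print(links)
```

A crawler built on this would visit about.html and products/list.html but would never learn that hidden.html exists, which is one reason a downloaded copy can be incomplete.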
Roberto Tokuda Local time: 23:16 Member (2005) Japanese to Spanish + ...
utility
Oct 24, 2005
You can use WebStripper to download all the files of the web page(s), and the pages they link to, onto your computer.
Gerard de Noord France Local time: 04:16 Member (2003) English to Dutch + ...
No it's impossible
Oct 24, 2005
DSC wrote:
Hello,
Maybe it is possible, if you find a way to see all the text. If the pages are dynamically created, you will often see only some of the text.
You can try tools like Offline Explorer to save a website locally. But you can only save the information that the web server sends in response to the settings and requests of your web browser.
Hans
For the same reasons that Hans says maybe, I say no. You and your client can't be sure that the site is 100% HTML, and only in that case can you sponge it successfully with the tools Hans mentioned. And even then you can't be sure that the HTML code isn't altered in the process.
Don't do it. I did it once some years ago and I regretted it.
Regards, Gerard
Heinrich Pesch Finland Local time: 05:16 Member (2003) Finnish to German + ...
If you get the password for FTP
Oct 25, 2005
...you can transfer all the files concerned, translate them, and be sure everything is OK. Otherwise it's a risky thing to do. I wonder why someone wants to translate a site without having access to it. Regards, Heinrich
Elena Miguel Spain Local time: 04:16 English to Spanish + ...
TOPIC STARTER
Thank you
Oct 25, 2005
Thank you, everybody! I finally managed to extract most of the files with WebStripper, and I hope this is enough to create a "tentative" quote, which is the only thing they need for the moment. Regards.
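For a tentative quote based on downloaded HTML files, a rough word count may be enough. The sketch below is only one possible approach, not something any poster in this thread used: it counts the words that appear as text outside script and style blocks. The class name, function name, and sample HTML are invented for illustration, and the count ignores text held in attributes (such as image alt text), in images, or in pages the download missed.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text found outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words(html):
    """Return the number of whitespace-separated words in the visible text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return len(" ".join(parser.chunks).split())


# Hypothetical page content, for illustration only.
sample = (
    "<html><head><title>Our products</title>"
    "<style>p {color: red}</style></head>"
    "<body><p>Welcome to our <b>new</b> shop.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(count_words(sample))  # prints 7
```

Running count_words over every saved .html file and summing the results gives a ballpark figure, which should be presented to the client as an estimate only.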
Samuel Murray Netherlands Local time: 04:16 Member (2006) English to Afrikaans + ...
No, it is not
Oct 25, 2005
Delelis wrote: Is it possible to extract ALL the contents of a web site?
No. What the web server serves you is based on what user agent you make the request with.
The site is full of pages and subpages and it is nearly impossible to do it "manually"...
You could use a download manager such as Getleft to download as much of the site as possible.
But even if the client doesn't have the original content... doesn't he have his own internet connection? Can't he download the site himself? He's probably too lazy, yes? Then your option is to download it yourself and send it to him (zipped) and ask him to indicate whether those pages are the pages he wants translated. Get the client to indicate exactly which pages he wants translated.
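The user-agent point in this post can be tried out with a small sketch like the one below. It only prepares two requests for the same address that announce themselves differently; the URL and both user-agent strings are hypothetical, and the fetch helper needs a network connection, so it is defined but not called here.

```python
import urllib.request


def build_request(url, user_agent):
    """Create a GET request that announces the given User-Agent string."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(request):
    """Send the request and return the response body as bytes (needs network)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


# Hypothetical URL and user-agent strings, for illustration only.
URL = "http://www.example.com/"
as_browser = build_request(URL, "Mozilla/5.0 (compatible; ExampleBrowser/1.0)")
as_crawler = build_request(URL, "ExampleSiteGrabber/0.1")

# urllib stores the header name in the capitalized form "User-agent".
print(as_browser.get_header("User-agent"))
print(as_crawler.get_header("User-agent"))

# To compare what the server sends to each client (not run here):
# same_content = fetch(as_browser) == fetch(as_crawler)
```

If the two bodies differ, the copy produced by a download tool is not necessarily what a visitor sees in a browser, which is a good reason to have the client confirm the downloaded pages.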
Gwidon Naskrent Poland Local time: 04:16 English to Polish + ...
Another solution
Nov 11, 2005
If the site is dynamically generated, perhaps you could coax the client into providing you with the PHP source files. Or is there a tool suited to translating PHP content?