Is it possible to extract ALL the contents of a web site?
Thread poster: Elena Miguel
Elena Miguel
Elena Miguel  Identity Verified
Spain
Local time: 04:16
English to Spanish
+ ...
Oct 24, 2005

I have to quote the translation of a web site for a regular client who has not designed it, and thus, he cannot provide me with the files.
Is it possible to extract ALL the contents of a web site?
The site is full of pages and subpages and it is nearly impossible to do it "manually"...
Thank you in advance.


 
RWSTranslation
RWSTranslation
Germany
Local time: 04:16
German to English
+ ...
Maybe Oct 24, 2005

Hello,

Maybe it is possible, if you can find a way to see all the text. With dynamically generated pages, you will often see only some of the text.

You can try a tool such as Offline Explorer to save a website locally. But you can only save what the web server sends in response to your web browser's settings and requests.
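To give an idea of what such offline tools do, here is a minimal Python sketch of a same-site crawler. It is only an illustration, not how Offline Explorer itself works: the names `LinkParser`, `fetch` and `crawl` are made up for this example, and it only follows plain `<a href>` links, so anything generated by scripts or forms is missed, which is exactly the limitation described above.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download one page as text (hypothetical helper, no error handling)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=200):
    """Breadth-first walk over links that stay on the start URL's host."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch_page(url)
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

Calling `crawl("http://www.example.com/")` would return a dictionary of page URL to HTML text; the address is a placeholder, and `max_pages` is an arbitrary safety limit.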

Hans


 
Roberto Tokuda
Roberto Tokuda  Identity Verified
Local time: 23:16
Member (2005)
Japanese to Spanish
+ ...
utility Oct 24, 2005

You can use WebStripper to download all the files of the web page(s), along with the pages they link to, to your computer:

http://webstripper.net/

Regards


 
Gerard de Noord
Gerard de Noord  Identity Verified
France
Local time: 04:16
Member (2003)
English to Dutch
+ ...
No it's impossible Oct 24, 2005

DSC wrote:

Hello,

Maybe it is possible, if you can find a way to see all the text. With dynamically generated pages, you will often see only some of the text.

You can try a tool such as Offline Explorer to save a website locally. But you can only save what the web server sends in response to your web browser's settings and requests.

Hans


For the same reasons Hans says maybe, I say no. You and your client can't be sure that the site is 100% HTML, and only in that case can you copy it successfully with the tools Hans mentioned. And even then you can't be sure that the HTML code isn't altered during the process.

Don't do it. I did it once some years ago and I regretted it.

Regards,
Gerard


 
Heinrich Pesch
Heinrich Pesch  Identity Verified
Finland
Local time: 05:16
Member (2003)
Finnish to German
+ ...
If you get the password for ftp Oct 25, 2005

...you can transfer all files concerned, translate them and be sure all is ok. Otherwise it's a risky thing to do. I wonder why someone wants to translate a site without having access to it.
Regards
Heinrich


 
Elena Miguel
Elena Miguel  Identity Verified
Spain
Local time: 04:16
English to Spanish
+ ...
TOPIC STARTER
Thank you Oct 25, 2005

Thank you everybody!
I finally managed to extract most of the files with WebStripper, and I hope this is enough to create a "tentative" quote, which is the only thing they need for the moment.
Regards.
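For a tentative quote from downloaded files, a rough word count of the visible text can help. The following is a minimal Python sketch under some assumptions: the pages were saved as `.htm`/`.html` files in one folder, the names `TextExtractor`, `count_words` and `count_folder` are invented for this example, and text hidden in attributes (image `alt` texts, meta descriptions) is not counted, so the real volume is probably somewhat higher.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Keeps text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words(html):
    """Return the number of whitespace-separated words in the visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def count_folder(folder):
    """Map each saved .htm/.html file under `folder` to its word count."""
    totals = {}
    for path in sorted(Path(folder).rglob("*.htm*")):
        text = path.read_text(encoding="utf-8", errors="replace")
        totals[str(path)] = count_words(text)
    return totals
```

Summing the values, for example `sum(count_folder("downloaded_site").values())`, gives a ballpark figure; `downloaded_site` is a placeholder folder name.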


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 04:16
Member (2006)
English to Afrikaans
+ ...
No, it is not Oct 25, 2005

Delelis wrote:
Is it possible to extract ALL the contents of a web site?


No. What the web server serves you is based on what user agent you make the request with.
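To see whether a given site does this, one could request the same page with two different user agents and compare the results. A minimal Python sketch follows; the two User-Agent strings are only illustrative values, and `build_request` and `fetch_variants` are names made up for this example.

```python
from urllib.request import Request, urlopen

# Illustrative User-Agent strings (assumptions, not taken from any real setup).
USER_AGENTS = {
    "desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "mobile": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
}


def build_request(url, user_agent):
    """Create a request that identifies itself with the given user agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_variants(url, opener=urlopen):
    """Fetch the same URL once per user agent and return the raw bodies."""
    variants = {}
    for label, agent in USER_AGENTS.items():
        with opener(build_request(url, agent)) as response:
            variants[label] = response.read()
    return variants
```

If the returned bodies differ, the server is tailoring its output to the user agent, and a single download will not capture everything. Pages that include timestamps or rotating content can differ between any two requests, so a difference is a hint rather than proof.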

The site is full of pages and subpages and it is nearly impossible to do it "manually"...


You could use a download manager such as Getleft to download as much of the site as possible.

But even if the client doesn't have the original content... doesn't he have his own internet connection? Can't he download the site himself? He's probably too lazy, yes? Then your option is to download it yourself and send it to him (zipped) and ask him to indicate whether those pages are the pages he wants translated. Get the client to indicate exactly which pages he wants translated.


 
Gwidon Naskrent
Gwidon Naskrent  Identity Verified
Poland
Local time: 04:16
English to Polish
+ ...
Another solution Nov 11, 2005

If the site is dynamically generated, perhaps you could coax the client into providing you with the PHP source files. Or is there a tool suited to translating PHP content?
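As a very rough starting point, the static HTML parts of a PHP file can be separated from the code with a few lines of Python. This is only a sketch under stated assumptions: `inline_html_segments` is a made-up name, it recognises just the `<?php ... ?>` and `<?= ... ?>` tags, and it will be fooled by a `?>` inside a PHP string. Text that the PHP code itself prints (for example strings passed to `echo`) is not extracted at all.

```python
import re

# Matches a PHP block, up to its closing tag or to the end of the file.
PHP_BLOCK = re.compile(r"<\?(?:php|=).*?(?:\?>|\Z)", re.DOTALL)


def inline_html_segments(source):
    """Return the non-empty chunks of a PHP file that lie outside PHP tags."""
    segments = PHP_BLOCK.split(source)
    return [segment.strip() for segment in segments if segment.strip()]
```

The returned chunks are still HTML, so they would need the same tag-aware handling as any other page before translation.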

 

