Is it possible to add .pdf/.xlsx dictionaries to GoldenDict?
Thread poster: Danesh
Danesh
Danesh
Local time: 19:25
English to Persian (Farsi)
Jun 9, 2019


In the “GoldenDict help” (GoldenDict-1.5.0-RC2-372-gc3ff15f (QT_5123) (64bit)), I see no explicit mention of .pdf/.xlsx local files as supported dictionary formats, but the “Other resources” section of the help file says that external programs are supported. Does this mean we can add .pdf/.xlsx files to GoldenDict? If yes, how?

Thank you very much in advance for your step-by-step instructions in simple language.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 16:55
Member (2006)
English to Afrikaans
+ ...
@Danesh Jun 9, 2019

Danesh wrote:
Is it possible to add .pdf/.xlsx dictionaries to GoldenDict?


No. If you want to use such dictionaries in GoldenDict, you have to convert them to one of the formats supported by GoldenDict, and all of them are quite complicated. It's odd, in my view, that GoldenDict supports only highly complex dictionary formats. It is possible to create fairly simple Lingvo DSL dictionaries, but I'm not sure how well they are supported by GoldenDict.

Converting Excel glossaries to DSL should be fairly simple. DSL is a plain-text format encoded in UTF-16 LE, with the file extension ".dsl", and formatted like this:

source term 1
[space]target term
[space]description
source term 2
[space]target term
[space]another target term
[space]description
source term 3
[space]target term
[space]description 1
[space]description 2
[space]description 3
etc.
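
As a rough illustration of the layout above, here is a minimal Python sketch that turns a list of entries into that text. The function name `render_dsl` and the sample entries are made up for this example, and it only covers the headword-plus-indented-lines structure described in this post, nothing else a real DSL file might contain.

```python
def render_dsl(entries):
    """Render (headword, body_lines) pairs in the layout shown above."""
    lines = []
    for headword, body_lines in entries:
        lines.append(headword)            # headword starts at the beginning of the line
        for body in body_lines:
            lines.append(" " + body)      # every non-headword line is indented by one space
    return "\n".join(lines) + "\n"


# Hypothetical sample data mirroring the example layout.
entries = [
    ("source term 1", ["target term", "description"]),
    ("source term 2", ["target term", "another target term", "description"]),
]
dsl_text = render_dsl(entries)
print(dsl_text)
```

The result is still an ordinary Python string; it would have to be saved as UTF-16 LE before GoldenDict could read it.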

To convert a PDF, you'd have to first convert it to Excel.


 
Danesh
Danesh
Local time: 19:25
English to Persian (Farsi)
TOPIC STARTER
Thank you, Samuel Jun 10, 2019

Thank you very much indeed, dear Samuel, for your great answers.


 
esperantisto
esperantisto  Identity Verified
Local time: 18:55
Member (2006)
English to Russian
+ ...
SITE LOCALIZER
Yes, DSL Jun 11, 2019

DSL support is quite good in GoldenDict. Actually, it's even a bit better than in ABBYY Lingvo: you can create multiple articles with the same heading. The simplest way: export your dictionary from XLSX to tab-separated text in UTF-16 and replace tabs with tabs + carriage returns. Or copy and paste as a text table to Word, convert the table to tab-separated text, replace tabs with tabs + new paragraphs, export to plain UTF-16 text.

 
Danesh
Danesh
Local time: 19:25
English to Persian (Farsi)
TOPIC STARTER
How to export from XLSX to tab-separated text in UTF-16? Jun 14, 2019

esperantisto wrote:

The simplest way: export your dictionary from XLSX to tab-separated text in UTF-16 and replace tabs with tabs + carriage returns. Or copy and paste as a text table to Word, convert the table to tab-separated text, replace tabs with tabs + new paragraphs, export to plain UTF-16 text.


Dear esperantisto,
Could you please give me step-by-step instructions, in simple language, on how I can export my dictionary from XLSX to tab-separated text in UTF-16 and replace tabs with tabs + carriage returns?
Thank you for your time and expertise in advance.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 16:55
Member (2006)
English to Afrikaans
+ ...
@Danesh Jun 14, 2019

Danesh wrote:
esperantisto wrote:
The simplest way: export your dictionary from XLSX to tab-separated text in UTF-16 and...

How I can export my dictionary from XLSX to tab-separated text in UTF-16 and replace tabs with tabs + carriage returns?


How many columns do you have in your Excel file? What are in those columns?


 
Danesh
Danesh
Local time: 19:25
English to Persian (Farsi)
TOPIC STARTER
2 columns Jun 15, 2019

Samuel Murray wrote:
How many columns do you have in your Excel file? What are in those columns?


Hi dear Samuel,

2 columns: the first column contains the English terms, and the second column Persian ones; or vice versa.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 16:55
Member (2006)
English to Afrikaans
+ ...
@Danesh Jun 15, 2019

Danesh wrote:
Samuel Murray wrote:
How many columns do you have in your Excel file? What are in those columns?

2 columns: the first column contains the English terms, and the second column Persian ones; or vice versa.


Okay, you need a text editor as well. I use Akelpad.

So, open the Excel file, select the two columns together, press Ctrl+C (copy), and paste them into Akelpad (Ctrl+V). Then save the file in Akelpad (Ctrl+S) as something.dsl, making sure the "Code page" is set to "1200 (UTF-16 LE)". Next, press Ctrl+H to perform a Find/Replace operation. Make sure "Esc-sequences" and "Beginning" are ticked. Find "\t", replace with "\n | ", and click "Replace all". Save the file again (Ctrl+S) when the replacement is done.

[Screenshot: Akelpad Find/Replace dialog]

[Or, if you follow Esperantisto's advice, find "\t", replace with "\t\n". I'm not sure if that will work, though, since AFAIK DSL requires a space before each non-headword line, and doesn't care about tabs at the ends of lines... but I haven't tested this extensively.]
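
For anyone who prefers a script to a text editor, the same copy, replace, and save steps can be sketched in Python. This is only a sketch under some assumptions: the function names, the sample rows, and the output file name are invented for the example; each translation is indented with a single space; and the file is written with a byte-order mark, which the thread does not confirm GoldenDict needs.

```python
import codecs
import tempfile
from pathlib import Path


def tsv_to_dsl(tsv_text):
    """Turn 'term<TAB>translation' rows into headword + indented body lines."""
    out = []
    for row in tsv_text.splitlines():
        if not row.strip():
            continue                          # skip empty rows
        headword, *translations = row.split("\t")
        out.append(headword.strip())          # first column becomes the headword
        for translation in translations:
            if translation.strip():
                out.append(" " + translation.strip())
    return "\n".join(out) + "\n"


def save_dsl(dsl_text, path):
    """Write the text as UTF-16 LE, preceded by a BOM (the BOM is an assumption)."""
    data = codecs.BOM_UTF16_LE + dsl_text.encode("utf-16-le")
    Path(path).write_bytes(data)


# Hypothetical two-column sample, as it would look when copied from Excel.
sample = "book\tکتاب\nwater\tآب\n"
dsl_text = tsv_to_dsl(sample)

# Written to the temp directory here; in practice this would be your dictionary folder.
out_path = Path(tempfile.gettempdir()) / "glossary.dsl"
save_dsl(dsl_text, out_path)
```

If the Persian terms are in the first column instead, the same code applies; the first column always becomes the headword.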

[Edited at 2019-06-15 11:21 GMT]


 
Danesh
Danesh
Local time: 19:25
English to Persian (Farsi)
TOPIC STARTER
Thank you very much for your great help Jun 15, 2019

Thank you very much for your great help

 
esperantisto
esperantisto  Identity Verified
Local time: 18:55
Member (2006)
English to Russian
+ ...
SITE LOCALIZER
Space(s) or tab(s) Jun 17, 2019

Samuel Murray wrote:

DSL requires a space before each non-headword line


Any number of spaces or tabs is OK.
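
To make that rule concrete, here is a small Python sketch that reads such text back and treats any line starting with a space or a tab as belonging to the headword above it. The function name and the sample text are made up for the example, and anything else a real DSL file may contain (header lines, for instance) is ignored.

```python
def parse_dsl_body(text):
    """Return a list of (headword, body_lines) pairs.

    A list is used rather than a dict so that several articles
    with the same heading can coexist.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue                              # ignore blank lines
        if line[0] in " \t":
            if entries:                           # indented line: belongs to the last headword
                entries[-1][1].append(line.strip())
        else:
            entries.append((line.strip(), []))    # unindented line: new headword
    return entries


# Hypothetical sample mixing one space, tabs, and several spaces as indentation.
sample = "cat\n gorbe\n\t\tpet animal\ndog\n   sag\n"
parsed = parse_dsl_body(sample)
print(parsed)
```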


 

