Subtitles Transmitted on Teletext Pages

Archive7

Guest
As you probably know, many channels transmit subtitles either as DVB subtitles or on teletext pages, such as page 777 or page 150.
There is a problem with these teletext subtitles, especially on channels from Poland: some of the special Polish characters do not come through as normal characters but appear as other symbols.
For example (part of a generated srt file):
Entries 1 and 2 are almost perfect, but entries 3 and 4 contain erroneous characters.
1
00:00:00,600 --> 00:00:03,640
Chytrze pomyǁlane.

2
00:00:03,680 --> 00:00:07,560
Zapora naftowa.
Kto przez to przejdzie?

3
00:00:07,600 --> 00:00:11,360
Z lini@ Maginota te¼ sobie
nie poradz@.

4
00:00:11,400 --> 00:00:15,520
Hitler jest g¾upcem,
porywaj@c si─ na Francj─.

Even if the channel is closed, you can still see the teletext subtitles if you select any of the TTX pages. At least one of them will show the subtitles on a black screen.

There is no problem if the subtitles are transmitted as DVB subtitles.

I can't imagine anybody in Poland would be happy to see these incorrect characters, especially on their national channels.

With enough data, I can build a conversion table, but I am hoping that there is an easier way.

Perhaps there is a special plugin for Enigma 2 boxes that can correct these problems?
 

Captain Jack

Burnt out human
Joined
Oct 21, 2006
Messages
11,805
Reaction score
7,990
Points
113
My Satellite Setup
See signature
My Location
North Somerset
This is probably to do with the font/charset in use rather than anything Enigma2 related.
 
Archive7

Guest
I am not so sure about this.
I am certain that the provider would make sure the information is transmitted correctly, even on the teletext (TTX) pages.
Perhaps a Polish receiver doesn't have this problem, which would mean it is an Enigma 2 problem (or a problem with any other standard receiver, for that matter).
 

Channel Hopper

Suffering fools, so you don't have to.
Staff member
Joined
Jan 1, 2000
Messages
35,595
Reaction score
8,576
Points
113
Age
59
Website
www.sat-elite.uk
My Satellite Setup
A little less analogue, and a lot more crap.
My Location
UK
I can't imagine anybody in Poland would be happy to see these incorrect characters, especially on their national channels.

With enough data, I can build a conversion table, but I am hoping that there is an easier way.

An observer in Poland for example
 
archive10

Guest
CJ is right. Teletext uses a specific character set, which exists in several variants.
The main difference between them is the language-specific characters.
A bit like codepages on old PCs.

On most STBs, the teletext fonts are bitmapped, and this determines which characters appear.
If the teletext font in your STB does not have the proper characters for Polish, you will get lines like the ones you posted.
So you need a different teletext character set on your STB for this to display properly.

In contrast, DVB subtitles are pre-rendered bitmaps, which are merely displayed on a graphics layer.
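
The codepage comparison can be sketched in a few lines of Python. This is only an illustration: the code positions are my understanding of where the language-specific characters sit in the Latin teletext set and should be treated as an assumption, the tables are partial, and the names DEFAULT_TABLE, POLISH_TABLE and decode are made up for the example. The glyphs in DEFAULT_TABLE are simply the ones seen in the garbled srt earlier in this thread.

```python
# Partial tables: code position -> glyph (assumed positions, not a full set).
# DEFAULT_TABLE holds the glyphs seen in the garbled srt in this thread.
DEFAULT_TABLE = {0x24: "$", 0x40: "@", 0x5E: "↑", 0x5F: "#",
                 0x60: "─", 0x7B: "¼", 0x7C: "ǁ", 0x7D: "¾"}
# POLISH_TABLE holds what a Polish variant would show at the same positions.
POLISH_TABLE = {0x24: "ń", 0x40: "ą", 0x5E: "ć", 0x5F: "ó",
                0x60: "ę", 0x7B: "ż", 0x7C: "ś", 0x7D: "ł"}


def decode(codes: bytes, table: dict[int, str]) -> str:
    """Map each code to a glyph; positions not in the table stay plain ASCII."""
    return "".join(table.get(code, chr(code)) for code in codes)


# The same transmitted codes, rendered with two different tables.
raw = b"porywaj@c si` na Francj`."
print(decode(raw, DEFAULT_TABLE))
print(decode(raw, POLISH_TABLE))
```

The transmitted codes are identical in both cases; only the table used to turn them into glyphs differs, which is why the fix is on the receiving side.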
 
Archive7

Guest
Alright, I finally made the cross-reference table (didn't need the Enigma machine).

FYI, there are 9 special Polish characters:

ą ć ę ł ń ó ś ź ż


and the teletext characters representing them are

@ > ą
↑ > ć
─ > ę
¾ > ł
$ > ń
# > ó
ǁ > ś
← > ź
¼ > ż


So if you extract a subtitle file based on teletext pages from a ts video file (DVB subtitles don't have this problem), you can easily convert the weird-looking characters into the correct Polish characters.

For example (each garbled line is followed by its corrected version):

1
00:00:03,770 --> 00:00:09,077
I znowu suszenie.
W ten spos#b powstaje smak.
W ten sposób powstaje smak.

2
00:00:09,751 --> 00:00:13,258
Dostrzegamy kilka much,
opr#cz nich ¼adnych napastnik#w.
oprócz nich żadnych napastników.

Hope someone might find this useful.
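
A minimal sketch of that conversion in Python, using the table above. The names TELETEXT_TO_POLISH and fix_polish are made up for the example. It covers lowercase letters only, and it would also replace a genuine @, # or $ in the subtitle text, so treat it as a starting point.

```python
# Wrong glyph -> intended Polish letter, copied from the table above.
# Lowercase only; other receivers may produce different glyphs.
TELETEXT_TO_POLISH = str.maketrans({
    "@": "ą",
    "↑": "ć",
    "─": "ę",
    "¾": "ł",
    "$": "ń",
    "#": "ó",
    "ǁ": "ś",
    "←": "ź",
    "¼": "ż",
})


def fix_polish(text: str) -> str:
    """Replace the substituted glyphs with the Polish letters they stand for."""
    return text.translate(TELETEXT_TO_POLISH)


print(fix_polish("opr#cz nich ¼adnych napastnik#w."))
# prints: oprócz nich żadnych napastników.
```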
 
Archive7

Guest
The character cross-reference shown above is for my receiver.
It is possible that other receivers show different characters, or even the correct ones.
 

Topper

Amo Amas Amant Admin
Staff member
Joined
Nov 18, 2004
Messages
23,991
Reaction score
4,014
Points
113
Age
69
My Satellite Setup
Has gone to a good home elsewhere
My Location
Blackburn, Lancashire
Not much good if you do not know Polish of course:-lol
 

davemurgtroyd

Regular Member
Joined
Apr 8, 2009
Messages
1,314
Reaction score
709
Points
113
Age
74
Location
Oxford
My Satellite Setup
See signature
My Location
Oxford
Does the same apply to Russian and other Cyrillic character sets?
 

Captain Jack

Burnt out human
Are you using actual teletext and picking a page number, or just selecting subtitles? In teletext you can select the character set to use: press teletext, then menu, and it will be there somewhere.

Let me know the channel and I'll test it
 
Archive7

Guest
Well, one can always try to learn Polish, or any other language, just for fun. It is not so difficult if you really try.
Once the subtitles are extracted as a file this way from a recorded *.ts file, it is very easy to convert them to any other language (English, if you prefer) using Google Translate, and then watch any program with English subtitles.
 
Archive7

Guest
Brilliant. I see what you mean now.
I found a configuration page in teletext which was set to German by default. I turned that off and then selected Polish.
Now I see the special Polish characters showing as they should on my TV.
BUT this has not changed the recorded subtitle stream, which still shows the special Polish characters incorrectly.
It is not a big issue now. I have created a macro in Notepad++ that changes the characters in the *.srt file to what they should be.
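
The same replacement can also be scripted rather than recorded as a macro. A rough sketch in Python, assuming the extracted *.srt file is UTF-8 (the encoding your receiver writes may differ) and that the glyphs match the ones from my receiver. The name fix_srt_file is made up for the example. Cue numbers and timing lines are left untouched.

```python
from pathlib import Path

# Glyphs as produced by one receiver in this thread -> Polish letters.
GLYPH_FIXES = str.maketrans({
    "@": "ą", "↑": "ć", "─": "ę", "¾": "ł", "$": "ń",
    "#": "ó", "ǁ": "ś", "←": "ź", "¼": "ż",
})


def fix_srt_file(src: str, dst: str, encoding: str = "utf-8") -> int:
    """Write a corrected copy of src to dst; return the number of changed lines."""
    changed = 0
    out = []
    for line in Path(src).read_text(encoding=encoding).splitlines():
        # Cue numbers and timing lines carry no text, so skip them.
        if line.strip().isdigit() or "-->" in line:
            out.append(line)
            continue
        fixed = line.translate(GLYPH_FIXES)
        if fixed != line:
            changed += 1
        out.append(fixed)
    Path(dst).write_text("\n".join(out) + "\n", encoding=encoding)
    return changed
```

Writing to a separate output file keeps the original extraction intact in case the table turns out to be wrong for a given channel.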
 
Archive7

Guest
Is there a Russian channel on Hotbird 13E that transmits subtitles via teletext?
I can test this for you.
As there are many more characters in Cyrillic, the conversion table is going to be longer.
 

Captain Jack

Burnt out human
There are none. There are a couple on 5E but they are DVB subtitles.

The only one I know that transmits teletext subtitles is Channel 1 Russia in C band on 40E (TI2-MI).
 
Archive7

Guest
What about Ukrainian channels? Don't they use Cyrillic characters?
Unfortunately I can't check: I don't have access to the Amos 4W Europe footprint, only the ME one.
 

davemurgtroyd

Regular Member
No particular channel at present, but I will soon be trying to relearn the Russian I was taught over 50 years ago and haven't really used since. Actual speech is a lot harder to follow initially, possibly down to accents.
 
Archive7

Guest
Great.
You might find that Russian movies, and possibly TV series, are occasionally broadcast by non-Russian channels.
I am sure that I have come across a few movies in the past on premium Polish channels such as Canal+ or Canal+ Films.

The hardest part of learning Russian for me was learning the Cyrillic characters. I think it took me a few weeks to grasp them, as at first my brain read the Russian characters as if they were English letters, with their English sounds.
For example, Работа (Rabota, meaning work): I thought it sounded like Pabota.
Does learning a new language help elderly people keep Alzheimer's disease away? I sincerely hope so.
 

davemurgtroyd

Regular Member
The alphabet is not a problem. I can still remember that, count in Russian, and remember a few phrases (enough to order a meal etc.), but most of the vocabulary has been lost in the mists of time. I have to think in a foreign language to be able to speak or write it.

A few years back my German vocab was getting rusty, but satellite TV brought it back up to scratch before my holiday in North Germany. Even more years ago I was fluent, but Germans used to think I was from Bavaria (Bayerisch - the German equivalent of being called a Geordie), as I had spent some time around Lake Constance (Bodensee) and it showed in my nuances and pronunciation. A very pleasant area, with even a semi-tropical island (Mainau) in a mountain lake, probably due to solar reflection from the snow on the surrounding mountains.
 