Notepads: [Bug] Pasting Arabic text may make it left-to-right

Describe the bug

Right-to-left scripts (Arabic, Hebrew, etc.) are not treated as RTL.

To Reproduce

Type this string into Notepads:

العربية

This is the word “Arabic” in Arabic. It contains 7 letters:

  • ا Alif;
  • ل Lam;
  • ع Ayin;
  • ر Ra;
  • ب Ba;
  • ي Ya;
  • ة Ta Marbutah.
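For reference, the snippet below is a minimal sketch that prints the code points of the word in the order they are stored. The class and method names (`LogicalOrderDemo`, `CodePoints`) are made up for illustration and are not part of Notepads. It only shows that the string is stored in logical (typing) order, with Alif at index 0, regardless of how it is displayed.

```csharp
using System;
using System.Collections.Generic;

static class LogicalOrderDemo
{
    // Hypothetical helper: formats each UTF-16 code unit as U+XXXX, in storage order.
    public static List<string> CodePoints(string text)
    {
        var result = new List<string>();
        foreach (char c in text)
        {
            result.Add($"U+{(int)c:X4}");
        }
        return result;
    }

    static void Main()
    {
        // The word from this report, written with escapes so the source is unambiguous.
        string word = "\u0627\u0644\u0639\u0631\u0628\u064A\u0629";

        // Prints: U+0627 U+0644 U+0639 U+0631 U+0628 U+064A U+0629
        // (Alif, Lam, Ayin, Ra, Ba, Ya, Ta Marbutah)
        Console.WriteLine(string.Join(" ", CodePoints(word)));
    }
}
```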

Expected behavior

The string should be rendered with its letters arranged from right to left, showing:

  • Alif in initial-isolated form;
  • Lam in initial form;
  • Ayin in medial form;
  • Ra in final form;
  • Ba in initial form;
  • Ya in medial form;
  • Ta Marbutah in final form.

Screenshots

The word appears to be treated as left-to-right text and then shaped. It is currently displayed as follows, which is wrong:

  • Alif and Lam shown in a Lam-Alif ligature in its final form;
  • Ayin in initial form;
  • Ra in final form;
  • Ba in medial form;
  • Ya in initial form;
  • Ta Marbutah in isolated form.
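The two sets of forms can be reproduced with a small sketch of Arabic joining behaviour. This is an illustration, not the shaping engine Notepads or Windows actually uses: the table covers only the seven letters of the example word, the names `JoiningSketch` and `Forms` are hypothetical, and ligatures are not modelled. Under those assumptions, shaping the word in logical order gives the expected forms, while shaping the same letters in reversed order gives the forms seen in the screenshot (a medial Lam followed by a final Alif is what renders as a final Lam-Alif ligature). That is consistent with the text being reordered before shaping, although this sketch cannot confirm the actual cause.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class JoiningSketch
{
    // Hypothetical table for the example word only.
    // Dual = true: the letter joins on both sides.
    // Dual = false: the letter joins only to the letter before it (right-joining).
    static readonly Dictionary<char, (string Name, bool Dual)> Letters =
        new Dictionary<char, (string Name, bool Dual)>
        {
            ['\u0627'] = ("Alif", false),
            ['\u0644'] = ("Lam", true),
            ['\u0639'] = ("Ayin", true),
            ['\u0631'] = ("Ra", false),
            ['\u0628'] = ("Ba", true),
            ['\u064A'] = ("Ya", true),
            ['\u0629'] = ("Ta Marbutah", false),
        };

    // Returns "Name: form" for each letter, in the order given.
    public static List<string> Forms(string text)
    {
        var result = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            // A letter connects backwards only if the previous letter is dual-joining.
            bool joinsPrev = i > 0 && Letters[text[i - 1]].Dual;
            // A letter connects forwards only if it is dual-joining and a letter follows.
            bool joinsNext = Letters[text[i]].Dual && i + 1 < text.Length;

            string form = joinsPrev
                ? (joinsNext ? "medial" : "final")
                : (joinsNext ? "initial" : "isolated");
            result.Add($"{Letters[text[i]].Name}: {form}");
        }
        return result;
    }

    static void Main()
    {
        string logical = "\u0627\u0644\u0639\u0631\u0628\u064A\u0629";
        string reversed = new string(logical.Reverse().ToArray());

        Console.WriteLine(string.Join("; ", Forms(logical)));
        Console.WriteLine(string.Join("; ", Forms(reversed)));
    }
}
```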

Desktop (please complete the following information):

  • OS: 18363.720
  • Version: 1.1.7.0

Additional context

Please note that BiDi support covers not only text that flows from right to left, but also mixtures of LTR and RTL scripts and the various BiDi control characters. For more detail, see: https://en.wikipedia.org/wiki/Bidirectional_text
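To illustrate why mixed text needs more than a single "RTL on/off" switch, here is a rough sketch of picking a paragraph's base direction from its first strongly directional character, in the spirit of rules P2 and P3 of the Unicode Bidirectional Algorithm (UAX #9). Everything here is an assumption for illustration: `BaseDirectionSketch` and its methods are hypothetical, the code point ranges are coarse block ranges rather than the real Bidi_Class property, and isolates, embeddings and other control characters are ignored.

```csharp
using System;

static class BaseDirectionSketch
{
    // Coarse approximation: treats whole blocks as right-to-left.
    // A real implementation would look up Bidi_Class (R / AL) per code point.
    static bool IsRtlCodePoint(int cp) =>
        (cp >= 0x0590 && cp <= 0x08FF)        // Hebrew, Arabic, Syriac, Thaana, N'Ko and nearby blocks
        || (cp >= 0x1E900 && cp <= 0x1E95F);  // Adlam

    // Returns "RTL" or "LTR" based on the first strong character found; defaults to "LTR".
    public static string BaseDirection(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            bool pair = char.IsSurrogatePair(text, i);
            int cp = pair ? char.ConvertToUtf32(text, i) : text[i];

            if (IsRtlCodePoint(cp)) return "RTL";
            // Simplification: any other BMP letter counts as strong left-to-right.
            if (!pair && char.IsLetter(text[i])) return "LTR";

            if (pair) i++; // skip the low surrogate
        }
        return "LTR";
    }

    static void Main()
    {
        // Pure Arabic word: first strong character is Alif.
        Console.WriteLine(BaseDirection("\u0627\u0644\u0639\u0631\u0628\u064A\u0629")); // RTL
        // A Latin label before the Arabic text makes the first strong character 'A'.
        Console.WriteLine(BaseDirection("Arabic:\t\u0639\u0646\u062F"));                 // LTR
        // Digits and spaces are skipped until the Hebrew letter.
        Console.WriteLine(BaseDirection("123 \u05E1"));                                  // RTL
    }
}
```

Note that a line such as "Arabic:", a tab, and then Arabic text resolves to LTR under this rule, so the Arabic run still has to be laid out right to left inside a left-to-right paragraph. That is the mixed-direction case mentioned above.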

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

Test String:

Adlam:	𞤑𞤵𞥅𞤤𞤢𞤤 𞤺𞤢𞤣𞤢𞤲𞤢𞤤 𞤋𞤲𞥆𞤢𞤥𞤢 𞤢𞥄𞤣𞤫𞥅𞤶𞤭 𞤬𞤮𞤬 𞤨𞤮𞤼𞤭⹁ 𞤲'𞤣𞤭𞤥𞤯𞤭𞤣𞤭 𞤫 𞤶𞤭𞤦𞤭𞤲𞤢𞤲𞥆𞤣𞤫 𞤼𞤮 𞤦𞤢𞤲𞥆𞤺𞤫 𞤸𞤢𞤳𞥆𞤫𞥅𞤶𞤭.
Arabic:	عندما يريد العالم أن ‪يتكلّم ‬ ، فهو يتحدّث بلغة يونيكود. تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود (Unicode Conference)، الذي سيعقد في 10-12 آذار 1997 بمدينة مَايِنْتْس، ألمانيا.
Hebrew:	סעיף א. כל בני אדם נולדו בני חורין ושווים בערכם ובזכויותיהם. כולם חוננו בתבונה ובמצפון, לפיכך חובה עליהם לנהוג איש ברעהו ברוח של אחוה.
N'ko:		ߞߏ ߡߍ߲ ߞߵߊ߬ ߞߍ߫ ߊ߲ ߛߋ߫ ߘߊ߫ ߞߊ߬ ߕߟߋ߬ߓߊ߰ߓߟߐߟߐ ߘߊߦߟߍ߬ ߒߞߏ ߦߋ߫ ߸ 
Syriac:	ܫܠܡ ܘܠܐܠܗܐ ܏ܫܘܒ܆ ܘܠܕܘܝܐ ܕܣܡ ܗܠܝܢ ܫܘܒܩܢܐ܀
Thaana:	ވަނަ މާއްދާ ހުރިހާ އިންސާނުންވެސް ދުނިޔެއަށް އުފަންވަނީ، މިނިވަންކަމުގައި، ހަމަހަމަ ޙައްޤުތަކަކާއެކު، ހަމަހަމަ ދަރަޖައެއްގައި ކަމޭހިތެވިގެންވާ ބައެއްގެ ގޮތުގައެވެ.

Slightly modifying PastePlainTextFromWindowsClipboard could fix the IME-paste issue:

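// Group the paste into a single undo step.
Document.BeginUndoGroup();
// Replace the current selection with the pasted plain text.
Document.Selection.SetText(TextSetOptions.None, text);
// Set the text script of the pasted range to Ansi.
Document.Selection.CharacterFormat.TextScript = TextScript.Ansi;
// Collapse the selection so the caret ends up after the pasted text.
Document.Selection.StartPosition = Document.Selection.EndPosition;
Document.EndUndoGroup();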