source/data/translit/fa_fa_Latn_BGN.txt - Issue 2440913002: Update ICU to 58.1

Patch Set: Created 4 years, 2 months ago

Index: source/data/translit/fa_fa_Latn_BGN.txt

diff --git a/source/data/translit/Persian_Latin_BGN.txt b/source/data/translit/fa_fa_Latn_BGN.txt

similarity index 63%

rename from source/data/translit/Persian_Latin_BGN.txt

rename to source/data/translit/fa_fa_Latn_BGN.txt

index 6082a862d551128b2bb8575c38560f799d09f400..0fd2d1181ea9fcabe1e8084e39403e77023b328c 100644

--- a/source/data/translit/Persian_Latin_BGN.txt

+++ b/source/data/translit/fa_fa_Latn_BGN.txt

@@ -1,22 +1,53 @@

-# ***************************************************************************

-# *

-# ***************************************************************************

-# File: Persian_Latin_BGN.txt

-# Generated from CLDR

+# License & terms of use: http://www.unicode.org/copyright.html#License

+# File: fa_fa_Latn_BGN.txt

+# Generated from CLDR

+########################################################################

+# BGN/PCGN 1956 System

+# This system was adopted by the BGN in 1946 and by the PCGN in 1958.

+# It is used for the romanization of geographic names in Iran and

+# for Persian-language names in Afghanistan.

+# Originally prepared by Michael Everson <everson@evertype.com>

+########################################################################

+# MINIMAL FILTER: Persian-Latin

:: [[:arabic:][:block=ARABIC:][ءآابةتثجحخدذرزسشصضطظعغفقكلمنهویي\u064E\u064F\u0650\u0651\u0652٠١٢٣٤٥٦٧٨٩پچژگی]] ;

:: NFKD (NFC) ;

+########################################################################

+# Define All Transformation Variables

+########################################################################

$alef = ’;

$ayin = ‘;

$disambig = \u0331 ;

+# Use this $wordBoundary until bug 2034 is fixed in ICU:

+# http://bugs.icu-project.org/cgi-bin/icu-bugs/transliterate?id=2034;expression=boundary;user=guest

$wordBoundary = [^[:L:][:M:][:N:]] ;

+########################################################################

+# non-letters

[:Nd:]{٫}[:Nd:] ↔ [:Nd:]{','}[:Nd:] ; # ARABIC DECIMAL SEPARATOR

[:Nd:]{٬}[:Nd:] ↔ [:Nd:]{'.'}[:Nd:] ; # ARABIC THOUSANDS SEPARATOR

٫ ↔ ',' $disambig ; # ARABIC DECIMAL SEPARATOR

٬ ↔ '.' $disambig ; # ARABIC THOUSANDS SEPARATOR

+# ٭ ↔ ; # ARABIC FIVE POINTED STAR // no need to transliterate

، ↔ ',' ; # ARABIC COMMA

؛ ↔ ';' ; # ARABIC SEMICOLON

؟ ↔ '?' ; # ARABIC QUESTION MARK

@@ -41,10 +72,46 @@ $wordBoundary = [^[:L:][:M:][:N:]] ;

۷ ↔ 7 ; # EXTENDED ARABIC-INDIC DIGIT SEVEN

۸ ↔ 8 ; # EXTENDED ARABIC-INDIC DIGIT EIGHT

۹ ↔ 9 ; # EXTENDED ARABIC-INDIC DIGIT NINE

+########################################################################

+# Rules moved to front to avoid masking

+########################################################################

+# BGN Page 89 Rule 4

+# The character sequences كه , زه , سه , and گه may be romanized k·h, z·h,

+# s·h, and g·h in order to differentiate those romanizations from the

+# digraphs kh, zh, sh, and gh.

+########################################################################

كه → k·h ; # ARABIC LETTER KAF + HEH

زه → z·h ; # ARABIC LETTER ZAIN + HEH

سه → s·h ; # ARABIC LETTER SEEN + HEH

گه → g·h ; # ARABIC LETTER GAF + HEH

+########################################################################

+# End Rule 4

+########################################################################

+# BGN Page 91 Rule 7

+# Doubles consonant sounds are represented in Arabic script by

+# placing a shaddah ( \u0651 ) over a consonant character. In romanization

+# the letter should be doubled. [The remainder of this rule deals with

+# the definite article and is lexical.]

+########################################################################

ب\u0651 → bb ; # ARABIC LETTER BEH + SHADDA

پ\u0651 → pp ; # ARABIC LETTER PEH + SHADDA

ت\u0651 → tt ; # ARABIC LETTER TEH + SHADDA

@@ -75,6 +142,20 @@ $wordBoundary = [^[:L:][:M:][:N:]] ;

ه\u0651 → hh ; # ARABIC LETTER HEH + SHADDA

و\u0651 → ww ; # ARABIC LETTER WAW + SHADDA

ی\u0651 → yy ; # ARABIC LETTER FARSI YEH + SHADDA

+########################################################################

+# End Rule 7

+########################################################################

+# Start of Transformations

+########################################################################

$wordBoundary{ء → ; # ARABIC LETTER HAMZA

ء → $alef ; # ARABIC LETTER HAMZA

$wordBoundary{ا → ; # ARABIC LETTER ALEF

@@ -122,3 +203,7 @@ $wordBoundary{ا → ; # ARABIC LETTER ALEF

\u064F → o ; # ARABIC DAMMA

\u0652 → ; # ARABIC SUKUN

::NFC (NFD) ;

+########################################################################
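
Note on the "Rules moved to front to avoid masking" section: the Rule 4 and Rule 7 sequences only take effect if they are tried before the single-letter rules that follow them. The Python below is a minimal sketch of that first-match-wins behaviour, not ICU's implementation. The `RULES` table and `transliterate` helper are hypothetical names, and the single-letter mappings are assumed for illustration because the diff does not show those lines.

```python
# Hypothetical, simplified rule table: (source, target) pairs tried in order.
RULES = [
    # BGN Rule 4: consonant + HEH, listed first so the digraph reading
    # (kh, zh, sh, gh) cannot mask it.
    ("\u0643\u0647", "k\u00B7h"),  # KAF + HEH
    ("\u0632\u0647", "z\u00B7h"),  # ZAIN + HEH
    ("\u0633\u0647", "s\u00B7h"),  # SEEN + HEH
    ("\u06AF\u0647", "g\u00B7h"),  # GAF + HEH
    # BGN Rule 7: letter + SHADDA doubles the consonant.
    ("\u0628\u0651", "bb"),        # BEH + SHADDA
    ("\u062A\u0651", "tt"),        # TEH + SHADDA
    # Single-letter fallbacks (assumed here; not part of the visible hunks).
    ("\u0643", "k"), ("\u0632", "z"), ("\u0633", "s"), ("\u06AF", "g"),
    ("\u0647", "h"), ("\u0628", "b"), ("\u062A", "t"),
]


def transliterate(text, rules=RULES):
    """Apply the first rule that matches at each position, left to right."""
    out = []
    i = 0
    while i < len(text):
        for src, dst in rules:
            if text.startswith(src, i):
                out.append(dst)
                i += len(src)
                break
        else:
            # No rule matched: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Same rules, but with the single-letter rules tried first (stable sort by
# source length), to show the masking the patch avoids.
SINGLE_FIRST = sorted(RULES, key=lambda rule: len(rule[0]))

print(transliterate("\u0633\u0647"))                # SEEN + HEH, Rule 4 first
print(transliterate("\u0633\u0647", SINGLE_FIRST))  # SEEN + HEH, masked
```

With the multi-character rules first, SEEN + HEH yields `s·h`. With the single-letter rules first, the same input collapses to `sh`, which is indistinguishable from the digraph.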
